From 9fa4f5eb6b7a045bcee03e9e28831c0443384fb8 Mon Sep 17 00:00:00 2001
From: Michael Pratt
Date: Tue, 17 Oct 2017 21:04:35 -0700
Subject: [PATCH] gs: disable resumable uploads

By default, the GCS Go packages have an internal "chunk size" of 8MB,
used for blob uploads. Media().Do() will buffer a full 8MB from the
io.Reader (or less if EOF is reached), then write that full 8MB to the
network all at once.

This behavior does not play nicely with --limit-upload, which only
limits the Reader passed to Media. While the long-term average upload
rate will be correctly limited, the actual network bandwidth will be
very spiky. For example, if an 8MB/s connection is limited to 1MB/s,
Media().Do() will spend 8s reading from the rate-limited reader
(performing no network requests), then 1s writing to the network at
8MB/s.

This is bad for network connections hurt by full-speed uploads,
particularly when writing 8MB will take several seconds.

Disable resumable uploads entirely by setting the chunk size to zero.
This causes the io.Reader to be passed further down the request stack,
where there is less (but still some) buffering.

My connection is around 1.5MB/s up, with nominal ~15ms ping times to
8.8.8.8. Without this change, --limit-upload 1024 results in several
seconds of ~200ms ping times (uploading), followed by several seconds
of ~15ms ping times (reading from the rate-limited reader). A bandwidth
monitor reports this as several seconds of ~1.5MB/s followed by several
seconds of 0.0MB/s.

With this change, --limit-upload 1024 results in ~20ms ping times and
the bandwidth monitor reports a constant ~1MB/s.

I've elected to make this change unconditional rather than tying it to
--limit-upload, because resumable uploads shouldn't provide much
benefit anyway: restic already uploads mostly small blobs and already
has a retry mechanism.

--limit-download is not affected by this problem, as Get().Download()
returns the real http.Response.Body without any internal buffering.

Updates #1216
---
 CHANGELOG.md              |  1 +
 internal/backend/gs/gs.go | 29 ++++++++++++++++++++++++++++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index c2a778363..e7cc465fc 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -32,6 +32,7 @@ Important Changes in 0.X.Y
   `--limit-download` flags.
   https://github.com/restic/restic/issues/1216
   https://github.com/restic/restic/pull/1336
+  https://github.com/restic/restic/pull/1358
 
 * Failed backend requests are now automatically retried.
   https://github.com/restic/restic/pull/1353

diff --git a/internal/backend/gs/gs.go b/internal/backend/gs/gs.go
index ca0164527..1141497a1 100644
--- a/internal/backend/gs/gs.go
+++ b/internal/backend/gs/gs.go
@@ -214,10 +214,37 @@ func (be *Backend) Save(ctx context.Context, h restic.Handle, rd io.Reader) (err
 
 	debug.Log("InsertObject(%v, %v)", be.bucketName, objName)
 
+	// Set chunk size to zero to disable resumable uploads.
+	//
+	// With a non-zero chunk size (the default is
+	// googleapi.DefaultUploadChunkSize, 8MB), Insert will buffer data from
+	// rd in chunks of this size so it can upload these chunks in
+	// individual requests.
+	//
+	// This chunking allows the library to automatically handle network
+	// interruptions and re-upload only the last chunk rather than the full
+	// file.
+	//
+	// Unfortunately, this buffering doesn't play nicely with
+	// --limit-upload, which applies a rate limit to rd. This rate limit
+	// ends up only limiting the read from rd into the buffer rather than
+	// the network traffic itself. This results in poor network rate limit
+	// behavior, where individual chunks are written to the network at full
+	// bandwidth for several seconds, followed by several seconds of no
+	// network traffic as the next chunk is read through the rate limiter.
+	//
+	// By disabling chunking, rd is passed further down the request stack,
+	// where there is less (but some) buffering, which ultimately results
+	// in better rate limiting behavior.
+	//
+	// restic typically writes small blobs (4MB-30MB), so the resumable
+	// uploads are not providing significant benefit anyway.
+	cs := googleapi.ChunkSize(0)
+
 	info, err := be.service.Objects.Insert(be.bucketName, &storage.Object{
 		Name: objName,
-	}).Media(rd).Do()
+	}).Media(rd, cs).Do()
 
 	be.sem.ReleaseToken()
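
Note (illustration, not part of the patch): --limit-upload works by
wrapping the reader handed to the backend in a rate limiter. The sketch
below shows a minimal such wrapper using golang.org/x/time/rate; the
limitedReader name and its construction are hypothetical, not restic's
actual implementation. It makes the failure mode concrete: the limiter
throttles reads from rd, so if the library drains 8MB into a buffer
before touching the network, the limiter slows the buffering, not the
network writes themselves.

    package main

    import (
    	"context"
    	"io"

    	"golang.org/x/time/rate"
    )

    // limitedReader throttles how quickly its caller can drain r.
    type limitedReader struct {
    	r   io.Reader
    	lim *rate.Limiter
    }

    func (l *limitedReader) Read(p []byte) (int, error) {
    	// Never ask for more than the limiter's burst, so the WaitN
    	// call below can always be satisfied.
    	if len(p) > l.lim.Burst() {
    		p = p[:l.lim.Burst()]
    	}
    	n, err := l.r.Read(p)
    	if n > 0 {
    		// Block until the limiter releases n bytes of budget.
    		if werr := l.lim.WaitN(context.Background(), n); werr != nil {
    			return n, werr
    		}
    	}
    	return n, err
    }

For example, wrapping a reader with rate.NewLimiter(rate.Limit(1<<20),
1<<20) caps reads at roughly 1MB/s, matching the --limit-upload 1024
setting used in the measurements above.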
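
Note (illustration, not part of the patch): the core of the fix,
extracted into a standalone sketch against the google.golang.org/api
storage v1 client. The bucket and object names are placeholders, and
credentials are assumed to come from Application Default Credentials.

    package main

    import (
    	"context"
    	"log"
    	"os"

    	"golang.org/x/oauth2/google"
    	"google.golang.org/api/googleapi"
    	storage "google.golang.org/api/storage/v1"
    )

    func main() {
    	ctx := context.Background()

    	// Authenticate via Application Default Credentials.
    	client, err := google.DefaultClient(ctx, storage.DevstorageReadWriteScope)
    	if err != nil {
    		log.Fatal(err)
    	}

    	service, err := storage.New(client)
    	if err != nil {
    		log.Fatal(err)
    	}

    	f, err := os.Open("blob.bin")
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer f.Close()

    	// ChunkSize(0) disables resumable uploads: instead of buffering
    	// 8MB chunks from f, the library passes the reader further down
    	// the request stack, so a rate limit applied to f shapes the
    	// actual network traffic.
    	_, err = service.Objects.Insert("my-bucket", &storage.Object{
    		Name: "blob.bin",
    	}).Media(f, googleapi.ChunkSize(0)).Do()
    	if err != nil {
    		log.Fatal(err)
    	}
    }

The trade-off is the one described above: with chunking disabled, a
failed upload must be retried from the beginning, which is acceptable
here because restic's blobs are small and it retries failed requests
anyway.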