Sometimes an S3 ListObjects response for a directory includes an entry
for the directory itself. The restic s3 backend doesn't expect that and
returns an error.
The symptom is:
ReadDir: invalid key name restic/key/, removing prefix
restic/key/ yielded empty string
I'm not sure when s3 does that; I'm unable to reproduce it myself.
But in any case, it seems correct to ignore that when it happens.
Fixes #1068
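A minimal sketch of the idea, not the backend's actual code: when the
prefix is stripped from each listed key, an empty result means the listing
returned the directory itself, and that entry is skipped. The function name
entryNames is hypothetical.

```go
package main

import "strings"

// entryNames converts listed object keys into names relative to prefix.
// A key equal to the prefix (the "directory" entry) yields an empty name
// and is skipped instead of being reported as an error.
func entryNames(prefix string, keys []string) []string {
	var names []string
	for _, key := range keys {
		name := strings.TrimPrefix(key, prefix)
		if name == "" {
			// The listing included the directory itself; ignore it.
			continue
		}
		names = append(names, name)
	}
	return names
}
```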
By default, the GCS Go packages have an internal "chunk size" of 8MB,
used for blob uploads.
Media().Do() will buffer a full 8MB from the io.Reader (or less if EOF
is reached) then write that full 8MB to the network all at once.
This behavior does not play nicely with --limit-upload, which only
limits the Reader passed to Media. While the long-term average upload
rate will be correctly limited, the actual network bandwidth will be
very spiky.
e.g., if an 8MB/s connection is limited to 1MB/s, Media().Do() will
spend 8s reading from the rate-limited reader (performing no network
requests), then 1s writing to the network at 8MB/s.
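The arithmetic behind that example can be written out as a tiny helper.
This is only an illustration of the numbers above; burstCycle is a
hypothetical name and nothing like it exists in restic or the GCS packages.

```go
package main

// burstCycle returns how long one buffered chunk spends being filled from
// the rate-limited reader, and how long it then takes to flush to the
// network at full line speed. Sizes are in MB, rates in MB/s.
func burstCycle(chunkMB, limitMBps, lineMBps float64) (fillSec, flushSec float64) {
	return chunkMB / limitMBps, chunkMB / lineMBps
}
```

With an 8MB chunk, a 1MB/s limit and an 8MB/s line, burstCycle(8, 1, 8)
gives 8 seconds of silence followed by a 1 second burst.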
This is bad for network connections hurt by full-speed uploads,
particularly when writing 8MB will take several seconds.
Disable resumable uploads entirely by setting the chunk size to zero.
This causes the io.Reader to be passed further down the request stack,
where there is less (but still some) buffering.
My connection is around 1.5MB/s up, with nominal ~15ms ping times to
8.8.8.8.
Without this change, --limit-upload 1024 results in several seconds of
~200ms ping times (uploading), followed by several seconds of ~15ms ping
times (reading from rate-limited reader). A bandwidth monitor reports
this as several seconds of ~1.5MB/s followed by several seconds of
0.0MB/s.
With this change, --limit-upload 1024 results in ~20ms ping times and
the bandwidth monitor reports a constant ~1MB/s.
I've elected to make this change regardless of --limit-upload because
resumable uploads shouldn't be providing much benefit anyway, as
restic already uploads mostly small blobs and already has a retry
mechanism.
--limit-download is not affected by this problem, as Get().Download()
returns the real http.Response.Body without any internal buffering.
Updates #1216
This PR adds support for chaining credentials providers, so that
restic tries several sources of credentials in turn.
Currently supported mechanisms are:
- static (user-provided)
- IAM profile (only valid inside configured ec2 instances)
- Standard AWS envs (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
- Standard Minio envs (MINIO_ACCESS_KEY, MINIO_SECRET_KEY)
Refer to https://github.com/restic/restic/issues/1341
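A rough sketch of what chaining means here: each provider is asked in turn
and the first one that yields credentials wins. All types below
(Credentials, Provider, StaticProvider, EnvProvider, Chain) are hypothetical
and are not the types used by restic or by its S3 client library. The IAM
provider is left out because it needs network access.

```go
package main

import "errors"

// Credentials is a hypothetical access key pair.
type Credentials struct {
	AccessKey string
	SecretKey string
}

// Provider is a hypothetical source of credentials.
type Provider interface {
	Retrieve() (Credentials, error)
}

// StaticProvider returns user-provided credentials, if any were given.
type StaticProvider struct{ Creds Credentials }

func (p StaticProvider) Retrieve() (Credentials, error) {
	if p.Creds.AccessKey == "" || p.Creds.SecretKey == "" {
		return Credentials{}, errors.New("no static credentials")
	}
	return p.Creds, nil
}

// EnvProvider reads a key pair from two environment variables through an
// injectable lookup function (os.Getenv in real use).
type EnvProvider struct {
	AccessVar, SecretVar string
	Getenv               func(string) string
}

func (p EnvProvider) Retrieve() (Credentials, error) {
	c := Credentials{
		AccessKey: p.Getenv(p.AccessVar),
		SecretKey: p.Getenv(p.SecretVar),
	}
	if c.AccessKey == "" || c.SecretKey == "" {
		return Credentials{}, errors.New("credentials not set in environment")
	}
	return c, nil
}

// Chain tries each provider in order and returns the first credentials found.
func Chain(providers ...Provider) (Credentials, error) {
	for _, p := range providers {
		if c, err := p.Retrieve(); err == nil {
			return c, nil
		}
	}
	return Credentials{}, errors.New("no provider returned credentials")
}
```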
Windows, and to a lesser extent OS X, don't conform to XDG and have
their own preferred locations for caches.
On Windows, use %LOCALAPPDATA%/restic (i.e., ~/AppData/Local/restic). I
can't find authoritative documentation from Microsoft recommending
specifically which of %APPDATA%, %LOCALAPPDATA%, and %TEMP% should be
used for caches, but %LOCALAPPDATA% is where browsers store their
caches, so it seems like a good fit.
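The per-platform choice made in this change (%LOCALAPPDATA%/restic on
Windows, ~/Library/Caches/restic on OS X) could be sketched roughly as
follows. The function name and its parameters are hypothetical, and the
fallback branch assumes the usual XDG convention of $XDG_CACHE_HOME with
~/.cache as the default; it is not copied from restic.

```go
package main

import "path/filepath"

// defaultCacheDir picks a cache directory for the given operating system.
// goos would be runtime.GOOS and getenv would be os.Getenv in real use;
// both are parameters here so the function is easy to exercise.
func defaultCacheDir(goos, home string, getenv func(string) string) string {
	switch goos {
	case "windows":
		return filepath.Join(getenv("LOCALAPPDATA"), "restic")
	case "darwin":
		return filepath.Join(home, "Library", "Caches", "restic")
	default:
		// Assumption: XDG convention for everything else.
		if dir := getenv("XDG_CACHE_HOME"); dir != "" {
			return filepath.Join(dir, "restic")
		}
		return filepath.Join(home, ".cache", "restic")
	}
}
```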
On OS X, use ~/Library/Caches/restic, which is recommended by the Apple
documentation. They do suggest using the application "bundle identifier"
as the base folder name, but restic doesn't have one, so I just used
"restic".
If the service account used with restic does not have the
storage.buckets.get permission (in the "Storage Admin" role), Create
cannot use Get to determine if the bucket is accessible.
Rather than always trying to create the bucket on Get error, gracefully
fall back to assuming the bucket is accessible. If it is, restic init
will complete successfully. If it is not, it will fail on a later call.
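One way the fallback could look, as a sketch only. It assumes that a
missing storage.buckets.get permission shows up as a "forbidden" error from
Get, and that any other Get error still leads to a create attempt. The
names ensureBucket, errForbidden and errNotFound are hypothetical stand-ins
for the real client calls and API errors.

```go
package main

import "errors"

// Hypothetical stand-ins for the errors returned by the storage API.
var (
	errForbidden = errors.New("forbidden")
	errNotFound  = errors.New("not found")
)

// ensureBucket checks the bucket with get and only calls insert when get
// fails for a reason other than missing permission.
func ensureBucket(bucket string, get, insert func(string) error) error {
	err := get(bucket)
	if err == nil {
		return nil // bucket exists and is visible
	}
	if errors.Is(err, errForbidden) {
		// Assumption: we lack permission to look at the bucket. It may
		// still be usable, so carry on; a later call fails if it is not.
		return nil
	}
	return insert(bucket)
}
```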
Here is what init looks like now in different cases.
Service account without "Storage Admin":
Bucket exists and is accessible (this is the case that didn't work
before):
$ ./restic init -r gs:this-bucket-does-exist:/
enter password for new backend:
enter password again:
created restic backend c02e2edb67 at gs:this-bucket-does-exist:/
Please note that knowledge of your password is required to access
the repository. Losing your password means that your data is
irrecoverably lost.
Bucket exists but is not accessible:
$ ./restic init -r gs:this-bucket-does-exist:/
enter password for new backend:
enter password again:
create key in backend at gs:this-bucket-does-exist:/ failed:
service.Objects.Insert: googleapi: Error 403:
my-service-account@myproject.iam.gserviceaccount.com does not have
storage.objects.create access to object this-bucket-exists/keys/0fa714e695c8ecd58cb467cdeb04d36f3b710f883496a90f23cae0315daf0b93., forbidden
Bucket does not exist:
$ ./restic init -r gs:this-bucket-does-not-exist:/
create backend at gs:this-bucket-does-not-exist:/ failed:
service.Buckets.Insert: googleapi: Error 403:
my-service-account@myproject.iam.gserviceaccount.com does not have storage.buckets.create access to bucket this-bucket-does-not-exist., forbidden
Service account with "Storage Admin":
Bucket exists and is accessible: Same
Bucket exists but is not accessible: Same. Previously this would fail
when Create tried to create the bucket. Now it fails when trying to
create the keys.
Bucket does not exist:
$ ./restic init -r gs:this-bucket-does-not-exist:/
enter password for new backend:
enter password again:
created restic backend c3c48b481d at gs:this-bucket-does-not-exist:/
Please note that knowledge of your password is required to access
the repository. Losing your password means that your data is
irrecoverably lost.
This commit adds code to synchronize downloading files to the cache.
Before, requests that came in for files currently downloading would fail
because the file was not completed in the cache. Now, the code waits
until the download is completed.
Closes #1278
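A minimal sketch of this kind of synchronization, assuming one channel per
file that is currently downloading. The type downloadGuard and its methods
are hypothetical and not the cache's real API. A caller that finds a
download in progress waits for it and is then expected to read the file
from the cache.

```go
package main

import "sync"

// downloadGuard tracks which files are currently being downloaded.
type downloadGuard struct {
	mu         sync.Mutex
	inProgress map[string]chan struct{}
}

func newDownloadGuard() *downloadGuard {
	return &downloadGuard{inProgress: make(map[string]chan struct{})}
}

// fetch runs download for name unless a download of the same name is
// already running; in that case it blocks until that download is done and
// returns nil without downloading again.
func (g *downloadGuard) fetch(name string, download func() error) error {
	g.mu.Lock()
	if done, ok := g.inProgress[name]; ok {
		g.mu.Unlock()
		<-done // wait for the running download to finish
		return nil
	}
	done := make(chan struct{})
	g.inProgress[name] = done
	g.mu.Unlock()

	err := download()

	g.mu.Lock()
	delete(g.inProgress, name)
	g.mu.Unlock()
	close(done) // wake up everyone who waited
	return err
}
```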
This commit adds a function to the cache which can decide to proactively
load the complete pack file and store it in the cache. This is helpful
for pack files containing only tree blobs, as it is likely that the same
file is accessed again in the future.
This commit adds rudimentary support for a cache directory, enabled by
default. The cache directory is created if it does not exist. The cache
is used if there's anything in it, and newly created snapshot and index
files are written to the cache automatically.
In the manual, state which standard roles the service account must
have to work correctly, as well as the specific permissions required,
for users who want to create more narrowly scoped custom roles.
This was a bit tricky: We start the ssh binary, but we want it to ignore
SIGINT. In contrast, restic itself should process SIGINT and clean up
properly. Before, we used `setsid()` to give the ssh process its own
process group, but that means it cannot prompt the user for a password
because the tty is gone.
So now we pass in two functions: one makes the process ignore SIGINT just
before the ssh process is started, and the other re-installs the SIGINT
handler after the start.
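A sketch of the two-hook approach, assuming a Unix-like system where a
child process inherits ignored signals across exec. The names
startWithHooks and startSSH are hypothetical; signal.Ignore and
signal.Notify are the standard os/signal functions.

```go
package main

import (
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

// startWithHooks calls preExec right before start and postExec right
// after it, whether or not start succeeded.
func startWithHooks(start func() error, preExec, postExec func()) error {
	preExec()
	err := start()
	postExec()
	return err
}

// startSSH starts ssh while SIGINT is ignored, so the child inherits the
// ignored signal, then routes SIGINT to sigCh again so the parent can
// clean up. No setsid() is involved, so ssh keeps access to the tty.
func startSSH(sigCh chan<- os.Signal, args ...string) (*exec.Cmd, error) {
	cmd := exec.Command("ssh", args...)
	cmd.Stderr = os.Stderr
	err := startWithHooks(cmd.Start,
		func() { signal.Ignore(syscall.SIGINT) },
		func() { signal.Notify(sigCh, syscall.SIGINT) },
	)
	return cmd, err
}
```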
The code now bundles tree blobs and data blobs into different pack
files, so we'll end up with pack files that either only contain data or
trees. This is in preparation for adding a cache (#1040), because
tree-only pack files can easily be cached later on.
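As an illustration of the separation only: blobs are routed by type so
that no pack mixes trees and data. The types and the function groupByType
are hypothetical and much simpler than a real packer, which would also
split by size.

```go
package main

// BlobType distinguishes file content from directory structure.
type BlobType int

const (
	DataBlob BlobType = iota
	TreeBlob
)

// Blob is a hypothetical blob reference.
type Blob struct {
	Type BlobType
	ID   string
}

// groupByType collects blob IDs per type, so each group can be written to
// its own pack file.
func groupByType(blobs []Blob) map[BlobType][]string {
	groups := make(map[BlobType][]string)
	for _, b := range blobs {
		groups[b.Type] = append(groups[b.Type], b.ID)
	}
	return groups
}
```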
At the moment, when two items to be saved have the same directory name,
restic only saves the first one to the repo. Let's say we have a
structure like this:
dir1
└── subdir
    └── file
dir2
└── subdir
    └── file
When restic is run on `dir1/subdir` and `dir2/subdir`, it will only save
the first `subdir`:
$ restic backup dir1/subdir dir2/subdir
[...]
$ restic ls -l latest
drwxr-xr-x 1000 100 0 2017-08-27 20:56:39 /subdir
-rw-r--r-- 1000 100 17 2017-08-27 20:56:39 /subdir/file
That's obviously a bad thing, caused by an early decision to strip the
full path to the files/dirs to save and only leave the last directory.
This commit partly resolves this by handling colliding names and
resolving the conflicts. Restic will now append a counter to the file
(`-123`) until the conflict is resolved. So in the example above, we'll
end up with the following structure:
$ restic ls -l latest
drwxr-xr-x 1000 100 0 2017-08-27 20:56:39 /subdir
-rw-r--r-- 1000 100 17 2017-08-27 20:56:39 /subdir/file
drwxr-xr-x 1000 100 0 2017-08-27 20:56:46 /subdir-1
-rw-r--r-- 1000 100 17 2017-08-27 20:56:46 /subdir-1/file
This partly addresses #549 and closes #1179.
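The counter scheme can be sketched like this. It is a simplified
illustration, not the archiver's code; uniqueName and the used map are
hypothetical.

```go
package main

import "fmt"

// uniqueName returns name if it is not taken yet, otherwise name-1,
// name-2, and so on until an unused name is found. The chosen name is
// recorded in used.
func uniqueName(used map[string]bool, name string) string {
	candidate := name
	for i := 1; used[candidate]; i++ {
		candidate = fmt.Sprintf("%s-%d", name, i)
	}
	used[candidate] = true
	return candidate
}
```

Calling it twice with "subdir" on an empty map yields "subdir" and then
"subdir-1", matching the listing above.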
At first I thought that the obvious correction would be to archive the
full path. But it turns out that collisions may still occur: Suppose you
have a file named `foo` in the current directory, and the parent directory
also contains a file `foo`. Archiving these with restic also causes a
collision, since restic strips the `../` from the first file:
$ restic backup ../foo foo
This also happens with `tar`, which does not handle the collision and
will happily archive two files called `foo`.
So, the best way forward is to handle name collisions and archive the
whole path. The latter will be tackled in a separate PR.