Conceptually the backend configuration should be validated when creating
or opening the backend, but not when filling in information from
environment variables into the configuration.
This unified construction removes most backend-specific code from
global.go. The backend registry will also enable integration tests to
use custom backends if necessary.
In order to change the backend initialization in `global.go` to be able
to generically call cfg.ApplyEnvironment() for supported backends, the
`interface{}` returned by `ParseConfig` must contain a pointer to the
configuration.
An alternative would be to use reflection to convert the type from
`interface{}(Config)` to `interface{}(*Config)` (from value to pointer
type). However, this would just complicate the type mess further.
The SemaphoreBackend now uniformly enforces the limit of concurrent
backend operations. In addition, it unifies the parameter validation.
The List() methods no longer uses a semaphore. Restic already never runs
multiple list operations in parallel.
By managing the semaphore in a wrapper backend, the sections that hold a
semaphore token grow slightly. However, the main bottleneck is IO, so
this shouldn't make much of a difference.
The key insight that enables the SemaphoreBackend is that all of the
complex semaphore handling in `openReader()` still happens within the
original call to `Load()`. Thus, getting and releasing the semaphore
tokens can be refactored to happen directly in `Load()`. This eliminates
the need for wrapping the reader in `openReader()` to release the token.
The Test method was only used in exactly one place, namely when trying
to create a new repository it was used to check whether a config file
already exists.
Use a combination of Stat() and IsNotExist() instead.
The only use cases in the code were in errors.IsFatal, backend/b2,
which needs a workaround, and backend.ParseLayout. The last of these
requires all backends to implement error unwrapping in IsNotExist.
All backends except gs already did that.
... called backend/sema. I resisted the temptation to call the main
type sema.Phore. Also, semaphores are now passed by value to skip a
level of indirection when using them.
This enables the backends to request the calculation of a
backend-specific hash. For the currently supported backends this will
always be MD5. The hash calculation happens as early as possible, for
pack files this is during assembly of the pack file. That way the hash
would even capture corruptions of the temporary pack file on disk.
The error returned when finishing the upload of an object was dropped.
This could cause silent upload failures and thus data loss in certain
cases. When a MD5 hash for the uploaded blob is specified, a wrong
hash/damaged upload would return its error via the Close() whose error
was dropped.
Bugs in the error handling while uploading a file to the backend could
cause incomplete files, e.g. https://github.com/golang/go/issues/42400
which could affect the local backend.
Proactively add sanity checks which will treat an upload as failed if
the reported upload size does not match the actual file size.
a gs service account may only have object permissions on an existing
bucket but no bucket create/get permissions.
these service accounts currently are blocked from initialization a
restic repository because restic can not determine if the bucket exists.
this PR updates the logic to assume the bucket exists when the bucket
attribute request results in a permissions denied error.
this way, restic can still initialize a repository if the service
account does have object permissions
fixes: https://github.com/restic/restic/issues/3100
In the Google Cloud Storage backend, support specifying access tokens
directly, as an alternative to a credentials file. This is useful when
restic is used non-interactively by some other program that is already
authenticated and eliminates the need to store long lived credentials.
The access token is specified in the GOOGLE_ACCESS_TOKEN environment
variable and takes precedence over GOOGLE_APPLICATION_CREDENTIALS.
This change removes the hardcoded Google auth mechanism for the GCS
backend, instead using Google's provided client library to discover and
generate credential material.
Google recommend that client libraries use their common auth mechanism
in order to authorise requests against Google services. Doing so means
you automatically support various types of authentication, from the
standard GOOGLE_APPLICATION_CREDENTIALS environment variable to making
use of Google's metadata API if running within Google Container Engine.
As mentioned in issue [#1560](https://github.com/restic/restic/pull/1560#issuecomment-364689346)
this changes the signature for `backend.Save()`. It now takes a
parameter of interface type `RewindReader`, so that the backend
implementations or our `RetryBackend` middleware can reset the reader to
the beginning and then retry an upload operation.
The `RewindReader` interface also provides a `Length()` method, which is
used in the backend to get the size of the data to be saved. This
removes several ugly hacks we had to do to pull the size back out of the
`io.Reader` passed to `Save()` before. In the `s3` and `rest` backend
this is actively used.
Before, all backend implementations were required to return an error if
the file that is to be written already exists in the backend. For most
backends, that means making a request (e.g. via HTTP) and returning an
error when the file already exists.
This is not accurate, the file could have been created between the HTTP
request testing for it, and when writing starts. In addition, apart from
the `config` file in the repo, all other file names have pseudo-random
names with a very very low probability of a collision. And even if a
file name is written again, the way the restic repo is structured this
just means that the same content is placed there again. Which is not a
problem, just not very efficient.
So, this commit relaxes the requirement to return an error when the file
in the backend already exists, which allows reducing the number of API
requests and thereby the latency for remote backends.
During the development of #1524 I discovered that the Google Cloud
Storage backend did not yet use the HTTP transport, so things such as
bandwidth limiting did not work. This commit does the necessary magic to
make the GS library use our HTTP transport.