The function supports efficiently loading a specified list of blobs from
a single pack in a streaming fashion. That is there's no need for
temporary files independent of the pack size.
This enables the backends to request the calculation of a
backend-specific hash. For the currently supported backends this will
always be MD5. The hash calculation happens as early as possible, for
pack files this is during assembly of the pack file. That way the hash
would even capture corruptions of the temporary pack file on disk.
This can be used to check how large a backup is or validate exclusions.
It does not actually write any data to the underlying backend. This is
implemented as a simple overlay backend that accepts writes without
forwarding them, passes through reads, and generally does the minimal
necessary to pretend that progress is actually happening.
Fixes #1542
Example usage:
$ restic -vv --dry-run . | grep add
new /changelog/unreleased/issue-1542, saved in 0.000s (350 B added)
modified /cmd/restic/cmd_backup.go, saved in 0.000s (16.543 KiB added)
modified /cmd/restic/global.go, saved in 0.000s (0 B added)
new /internal/backend/dry/dry_backend_test.go, saved in 0.000s (3.866 KiB added)
new /internal/backend/dry/dry_backend.go, saved in 0.000s (3.744 KiB added)
modified /internal/backend/test/tests.go, saved in 0.000s (0 B added)
modified /internal/repository/repository.go, saved in 0.000s (20.707 KiB added)
modified /internal/ui/backup.go, saved in 0.000s (9.110 KiB added)
modified /internal/ui/jsonstatus/status.go, saved in 0.001s (11.055 KiB added)
modified /restic, saved in 0.131s (25.542 MiB added)
Would add to the repo: 25.892 MiB
The io.Reader interface does not support contexts, such that it is
necessary to embed the context into the backendReaderAt struct. This has
the problem that a reader might suddenly stop working when it's
contained context is canceled. However, this is now problem here as the
reader instances never escape the calling function.
This is no change in behavior as a canceled context did later on cause
the config file creation to fail. Therefore this change just lets the
repository initialization fail a bit earlier.
This allows creating multiple repositories with identical chunker
parameters which is required for working deduplication when copying
snapshots between different repositories.
The slicing operator `slice[low:high]` default to 0 for the lower bound and
len(slice) for the upper bound when either or both are not specified.
Fix the code to use `cap(slice)` to check for the slice capacity.
If a blob in a pack file can be decrypted successfully but contains data
that results in a different hash than stated in the header pack, then
abort repacking. As both the pack header and the blob are
cryptographically verified this either means than a malicious entity
tampered with the backup or indicates hardware problems on the client.
prune should fail with an error in both cases.
The test now uses the fact that the sort is stable. It's not guaranteed
to be, but the test is cleaner and more exhaustive. sortCachedPacksFirst
no longer needs a return value.
The benchmark was actually testing the speed of index lookups.
name old time/op new time/op delta
SaveAndEncrypt-8 101ns ± 2% 31505824ns ± 1% +31311591.31% (p=0.000 n=10+10)
name old speed new speed delta
SaveAndEncrypt-8 41.7TB/s ± 2% 0.0TB/s ± 1% -100.00% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
SaveAndEncrypt-8 1.00B ± 0% 20989508.40B ± 0% +2098950740.00% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
SaveAndEncrypt-8 0.00 123.00 ± 0% +Inf% (p=0.000 n=10+9)
(The actual speed is ca. 131MiB/s.)
A side remark to the definition of Index.blob:
Another possibility would have been to use:
blob map[restic.BlobHandle]*indexEntry
This would have led to the following sizes:
key: 32 + 1 = 33 bytes
value: 8 bytes
indexEntry: 8 + 4 + 4 = 16 bytes
each packID: 32 bytes
To save N index entries, we would therefore have needed:
N * OF * (33 + 8) bytes + N * 16 + N * 32 bytes / BP = N * 82 bytes
More precicely, using a pointer instead of a direct entry is the better memory choice if:
OF * 8 bytes + entrysize < OF * entrysize <=> entrysize > 8 bytes * OF/(OF-1)
Under the assumption of OF=1.5, this means using pointers would have been the better choice
if sizeof(indexEntry) > 24 bytes.
- The SaveBlob method now checks for duplicates.
- Moves handling of pending blobs to MasterIndex.
-> also cleans up pending index entries when they are saved in the index
-> when using SaveBlob no need to care about index any longer
- Always check for full index and save it when storing packs.
-> removes the need of an index uploader
-> also removes the verbose "uploaded intermediate index" messages
- The Flush method now also saves the index
- Fix race condition when checking and saving full/non-finalized indexes
The username and hostname for new keys can be specified with the new
--user and --host flags, respectively. The flags are used only by the
`key add` command and are otherwise ignored.
This allows adding keys with for a desired user and host without having
to run restic as that particular user on that particular host, making
automated key management easier.
Co-authored-by: James TD Smith <ahktenzero@mohorovi.cc>
When loading a blob, restic first looks up pack files containing the
blob. To avoid unnecessary work an already cached pack file is preferred.
However, if there is only a single pack file to choose from (which is
the normal case) sorting the one-element list won't change anything.
Therefore avoid the unnecessary cache check in that case.
The previous benchmark spent much of its time allocating RNGs and
generating too many random numbers. It now spends 90% of its time
hashing and half of the rest writing to files.
name old time/op new time/op delta
PackerManager-8 319ms ± 1% 247ms ± 1% -22.48% (p=0.000 n=20+18)
name old speed new speed delta
PackerManager-8 143MB/s ± 1% 213MB/s ± 1% +48.63% (p=0.000 n=10+18)
name old alloc/op new alloc/op delta
PackerManager-8 635kB ± 0% 92kB ± 0% -85.48% (p=0.000 n=10+19)
name old allocs/op new allocs/op delta
PackerManager-8 1.64k ± 0% 1.43k ± 0% -12.76% (p=0.000 n=10+20)
The pool was used improperly, causing more allocations to be
performed than without it.
name old time/op new time/op delta
SaveAndEncrypt-8 36.8ms ± 2% 36.9ms ± 2% ~ (p=0.218 n=10+10)
name old speed new speed delta
SaveAndEncrypt-8 114MB/s ± 2% 114MB/s ± 2% ~ (p=0.218 n=10+10)
name old alloc/op new alloc/op delta
SaveAndEncrypt-8 21.1MB ± 0% 21.0MB ± 0% -0.44% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
SaveAndEncrypt-8 79.0 ± 0% 77.0 ± 0% -2.53% (p=0.000 n=10+10)
I was running "golangci-lint" and found this two warnings
internal/checker/checker.go:135:18: (*Checker).LoadIndex$3 - result 0 (error) is always nil (unparam)
final := func() error {
^
internal/repository/repository.go:457:18: (*Repository).LoadIndex$3 - result 0 (error) is always nil (unparam)
final := func() error {
^
It turns out that these functions are used only in "RunWorkers(...)",
which is used only two times in whole project right after this "final"
functions.
And because these "final" functions always return "nil", I've
descided, that it would be better to remove requriments for "final" func
to return error to avoid magick "return nil" at their end.
Restic used to quit if the repository password was typed incorrectly once.
Restic will now ask the user again for the repository password if typed incorrectly.
The user will now get three tries to input the correct password before restic quits.
This commit changes the signatures for repository.LoadAndDecrypt and
utils.LoadAll to allow passing in a []byte as the buffer to use. This
buffer is enlarged as needed, and returned back to the caller for
further use.
In later commits, this allows reducing allocations by reusing a buffer
for multiple calls, e.g. in a worker function.
As mentioned in issue [#1560](https://github.com/restic/restic/pull/1560#issuecomment-364689346)
this changes the signature for `backend.Save()`. It now takes a
parameter of interface type `RewindReader`, so that the backend
implementations or our `RetryBackend` middleware can reset the reader to
the beginning and then retry an upload operation.
The `RewindReader` interface also provides a `Length()` method, which is
used in the backend to get the size of the data to be saved. This
removes several ugly hacks we had to do to pull the size back out of the
`io.Reader` passed to `Save()` before. In the `s3` and `rest` backend
this is actively used.
The logging in these functions double the time they take to execute.
However, it is only really useful on failures, which are better
reported by the calling functions.
benchmark old ns/op new ns/op delta
BenchmarkMasterIndexLookupSingleIndex-6 897 395 -55.96%
BenchmarkMasterIndexLookupMultipleIndex-6 2001 1090 -45.53%
BenchmarkMasterIndexLookupSingleIndexUnknown-6 492 215 -56.30%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6 1649 912 -44.69%
benchmark old allocs new allocs delta
BenchmarkMasterIndexLookupSingleIndex-6 9 1 -88.89%
BenchmarkMasterIndexLookupMultipleIndex-6 19 1 -94.74%
BenchmarkMasterIndexLookupSingleIndexUnknown-6 6 0 -100.00%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6 16 0 -100.00%
benchmark old bytes new bytes delta
BenchmarkMasterIndexLookupSingleIndex-6 160 96 -40.00%
BenchmarkMasterIndexLookupMultipleIndex-6 240 96 -60.00%
BenchmarkMasterIndexLookupSingleIndexUnknown-6 48 0 -100.00%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6 128 0 -100.00%