fd05037e1a changed the allocation batch
size from 256 to 128 under the assumption that an indexEntry is 60 bytes
on amd64, but it's 64: structs are padded out to a multiple of 8 for
alignment reasons. That means we'd waste no space in malloc even without
the batch allocation, at least on 64-bit machines. While that strategy
cuts the overallocation down dramatically for many small indexes, it also
seems to slow allocation down (Go 1.18, Linux, amd64, -benchtime=2s):
name old time/op new time/op delta
DecodeIndex-8 4.67s ± 5% 4.60s ± 1% ~ (p=0.953 n=10+5)
DecodeIndexParallel-8 4.67s ± 3% 4.60s ± 1% ~ (p=0.953 n=10+5)
IndexHasUnknown-8 37.8ns ± 8% 36.5ns ±14% ~ (p=0.841 n=5+5)
IndexHasKnown-8 38.5ns ±12% 37.7ns ±10% ~ (p=0.968 n=5+5)
IndexAlloc-8 615ms ±18% 607ms ± 1% ~ (p=1.000 n=10+5)
IndexAllocParallel-8 245ms ±11% 285ms ± 6% +16.40% (p=0.001 n=10+5)
MasterIndexAlloc-8 286ms ± 9% 275ms ± 2% ~ (p=1.000 n=10+5)
LoadIndex/v1-8 27.0ms ± 4% 26.8ms ± 1% ~ (p=0.690 n=5+5)
LoadIndex/v2-8 22.4ms ± 1% 22.8ms ± 2% +1.48% (p=0.016 n=5+5)
name old alloc/op new alloc/op delta
IndexAlloc-8 446MB ± 0% 446MB ± 0% -0.00% (p=0.000 n=8+4)
IndexAllocParallel-8 446MB ± 0% 446MB ± 0% -0.00% (p=0.008 n=8+5)
MasterIndexAlloc-8 213MB ± 0% 159MB ± 0% -25.47% (p=0.000 n=10+5)
name old allocs/op new allocs/op delta
IndexAlloc-8 913k ± 0% 2632k ± 0% +188.19% (p=0.008 n=5+5)
IndexAllocParallel-8 913k ± 0% 2632k ± 0% +188.21% (p=0.008 n=5+5)
MasterIndexAlloc-8 318k ± 0% 1172k ± 0% +267.86% (p=0.008 n=5+5)
Instead, this patch sets a batch size of 4, which means no space is
wasted by malloc on 64-bit and very little on 32-bit. It still gets very
close to the savings from not allocating in batches, without requiring
special code for bits.UintSize==64. Benchmark results, again for
Linux/amd64:
name old time/op new time/op delta
DecodeIndex-8 4.67s ± 5% 4.83s ± 9% ~ (p=0.315 n=10+10)
DecodeIndexParallel-8 4.67s ± 3% 4.68s ± 4% ~ (p=0.315 n=10+10)
IndexHasUnknown-8 37.8ns ± 8% 44.5ns ±19% ~ (p=0.095 n=5+5)
IndexHasKnown-8 38.5ns ±12% 36.9ns ± 8% ~ (p=0.690 n=5+5)
IndexAlloc-8 615ms ±18% 628ms ±18% ~ (p=0.218 n=10+10)
IndexAllocParallel-8 245ms ±11% 262ms ± 9% +7.02% (p=0.043 n=10+10)
MasterIndexAlloc-8 286ms ± 9% 287ms ±13% ~ (p=1.000 n=10+10)
LoadIndex/v1-8 27.0ms ± 4% 26.8ms ± 0% ~ (p=1.000 n=5+5)
LoadIndex/v2-8 22.4ms ± 1% 22.5ms ± 0% ~ (p=0.056 n=5+5)
name old alloc/op new alloc/op delta
IndexAlloc-8 446MB ± 0% 446MB ± 0% ~ (p=1.000 n=8+10)
IndexAllocParallel-8 446MB ± 0% 446MB ± 0% -0.00% (p=0.000 n=8+8)
MasterIndexAlloc-8 213MB ± 0% 160MB ± 0% -25.02% (p=0.000 n=10+9)
name old allocs/op new allocs/op delta
IndexAlloc-8 913k ± 0% 1333k ± 0% +45.94% (p=0.000 n=8+10)
IndexAllocParallel-8 913k ± 0% 1333k ± 0% +45.94% (p=0.000 n=8+8)
MasterIndexAlloc-8 318k ± 0% 525k ± 0% +64.99% (p=0.000 n=10+10)
The allocation method indexmap.newEntry has also been rewritten in a
form that is a few instructions shorter.
Apparently SMB/CIFS on Linux/macOS returns somewhat random errnos when
trying to sync a windows share which does not support calling fsync for
a directory.
This functionality has gone unused since
4b3dc415ef changed hashing.Reader's only
client to use ioutil.ReadAll on a bufio.Reader wrapping the hashing
Reader.
Reverts bcb852a8d0.
This removes RunWorkers, which had become mere overhead by successive
refactors. It also ensures that each former user of that function
returns any context error that occurs, so failure to complete an
operation is always reported as an error.
Apparently it can take a moment between closing a tempfile marked as
DELETE_ON_CLOSE and it actually being deleted. During that time the file
is inaccessible. Thus just skip deleting the temp file on windows.
Tree packs are cached locally at clients and thus benefit a lot from
being compressed. Ensure this be having prune always repack pack files
containing uncompressed trees.
A compressed index is only about one third the size of an uncompressed
one. Thus increase the number of entries in an index to avoid cluttering
the repository with small indexes.
The config file is not compressed as it should remain readable by older
restic versions such that these can return a proper error.
As the old format for unpacked data does not include a version header,
make use of a trick: The old data is always encoded as JSON. Thus it can
only start with '{' or '['. For any other value the first byte indicates
a versioned format. The version is set to 2 for now. Then the zstd
compressed data follows.
As repack streams packs these occupy one backend connection. Uploading a
new pack also requires a backend connection. To prevent a deadlock
during repack when reaching the backend connections limit, simply limit
the repackWorker count to always leave one connection for uploading.
* Write new file payload to a temp file before touching the original
binary. Minimizes the possibility of failing mid-write and corrupting
the binary.
* On Windows, move the original binary out to a temp file rather than
removing it as the running binary is locked. Fixes issue #2248.
When resolving snapshotIDs in FindFilteredSnapshots either
FindLatestSnapshot or FindSnapshot is called. Both operations issue a
list operation to the backend. When for example passing a long list of
snapshot ids to `forget` this could lead to a large number of list
operations.
These commands filter the snapshots according to some criteria which
essentially requires loading the index before filtering the snapshots.
Thus create a copy of the snapshots list beforehand and use it later on.
Fixes #3687. Uses the cast suggested by @MichaelEischer, except that the
contant isn't cast along, because it's untyped and will be converted by
the compiler as necessary.
The repack operation copies all selected blobs from a set of pack files
into new pack files. For prune the source and destination repositories
are identical. To implement copy, just use a different source and
destination repository.
This is quite similar to gitignore. If a pattern is suffixed by an
exclamation mark and match a file that was previously matched by a
regular pattern, the match is cancelled. Notably, this can be used
with `--exclude-file` to cancel the exclusion of some files.
Like for gitignore, once a directory is excluded, it is not possible
to include files inside the directory. For example, a user wanting to
only keep `*.c` in some directory should not use:
~/work
!~/work/*.c
But:
~/work/*
!~/work/*.c
I didn't write documentation or changelog entry. I would like to get
feedback if this is the right approach for excluding/including files
at will for backups. I use something like this as an exclude file to
backup my home:
$HOME/**/*
!$HOME/Documents
!$HOME/code
!$HOME/.emacs.d
!$HOME/games
# [...]
node_modules
*~
*.o
*.lo
*.pyc
# [...]
$HOME/code/linux/*
!$HOME/code/linux/.git
# [...]
There are some limitations for this change:
- Patterns are not mixed accross methods: patterns from file are
handled first and if a file is excluded with this method, it's not
possible to reinclude it with `--exclude !something`.
- Patterns starting with `!` are now interpreted as a negative
pattern. I don't think anyone was relying on that.
- The whole list of patterns is walked for each match. We may
optimize later by exiting early if we know no pattern is starting
with `!`.
Fix #233
Load tree blobs with more than 50MB only from a single goroutine. Very
large tree blobs with for example 400 MB size can otherwise require
roughly 1GB * streamTreeParallelism memory.
Create a temporary file with a sufficiently random name to essentially
avoid any chance of conflicts. Once the upload has finished remove the
temporary suffix. Interrupted upload thus will be ignored by restic.
This ensures that restic won't create lots of new lock files without
deleting them later on.
In some cases a Delete operation on a backend can return a "File does
not exist" error even though the Delete operation succeeded. This can
for example be caused by request retries. This caused restic to forget
about the new lock file and continue trying to remove the old (already
deleted) lock file.
The function supports efficiently loading a specified list of blobs from
a single pack in a streaming fashion. That is there's no need for
temporary files independent of the pack size.
The archiver uses FS.OpenFile, where FS is an instance of the FS
interface. This is different from fs.OpenFile, which uses the OpenFile
method provided by the fs package.
Citing Kerrisk, The Linux Programming Interface:
The O_NOATIME flag is intended for use by indexing and backup
programs. Its use can significantly reduce the amount of disk
activity, because repeated disk seeks back and forth across the
disk are not required to read the contents of a file and to update
the last access time in the file’s i-node[.]
restic used to do this, but the functionality was removed along with the
fadvise call in #670.
Currently, `restic backup` (if a `--parent` is not provided)
will choose the most recent matching snapshot as the parent snapshot.
This makes sense in the usual case,
where we tag the snapshot-being-created with the current time.
However, this doesn't make sense if the user has passed `--time`
and is currently creating a snapshot older than the latest snapshot.
Instead, choose the most recent snapshot
which is not newer than the snapshot-being-created's timestamp,
to avoid any time travel.
Impetus for this change:
I'm using restic for the first time!
I have a number of existing BTRFS snapshots
I am backing up via restic to serve as my initial set of backups.
I initially `restic backup`'d the most recent snapshot to test,
then started backing up each of the other snapshots.
I noticed in `restic cat snapshot <id>` output
that all the remaining snapshots have the most recent as the parent.
The missing eof with http2 when a response included a content-length
header but no data, has been fixed in golang 1.17.3/1.16.10. Therefore
just drop the canary test and schedule it for removal once go 1.18 is
required as minimum version by restic.
Because there is no guarantee that a cleanup of these directories will occur
after the "restic check", we extend the behavior to detect and manage these
specific cache directories and allow their cleanup too.
Package internal/dump has been reworked so its API consists of a single
type Dumper that handles tar and zip formats. Tree loading and node
writing happen concurrently.
When deleting a file, B2 sometimes returns a "500 Service Unavailable"
error but nevertheless correctly deletes the file. Due to retries in
the B2 library blazer, we sometimes also see a "400 File not present"
error. The retries of restic for the delete request then fail with
"404 File with such name does not exist.".
As we have to rely on request retries in a distributed system to handle
temporary errors, also consider a delete request to be successful if the
file is reported as not existing. This should be safe as B2 claims to
provide a strongly consistent bucket listing and thus a missing file
shouldn't mysteriously show up again later on.
restic dump uses bloblru.Cache to keep buffers alive, but also reuses
evicted buffers. That means large buffers may be used to store small
blobs, causing the cache to think it's using less memory than it
actually does.
This can be caused when the test has uploaded four blobs, then queues
two blobs for upload which are delayed. Then a seventh file can be
opened which lead to a test failure.
The rest config normally uses prepareURL to sanitize URLs and ensures
that the URL ends with a slash. However, the test used an URL without a
trailing slash, which after the rest server changes causes test
failures.
For files below 256MB this uses the md5 hash calculated while assembling
the pack file. For larger files the hash for each 100MB part is
calculated on the fly. That hash is also reused as temporary filename.
As restic only uploads encrypted data which includes among others a
random initialization vector, the file hash shouldn't be susceptible to
md5 collision attacks (even though the algorithm is broken).
This enables the backends to request the calculation of a
backend-specific hash. For the currently supported backends this will
always be MD5. The hash calculation happens as early as possible, for
pack files this is during assembly of the pack file. That way the hash
would even capture corruptions of the temporary pack file on disk.
This can be used to check how large a backup is or validate exclusions.
It does not actually write any data to the underlying backend. This is
implemented as a simple overlay backend that accepts writes without
forwarding them, passes through reads, and generally does the minimal
necessary to pretend that progress is actually happening.
Fixes #1542
Example usage:
$ restic -vv --dry-run . | grep add
new /changelog/unreleased/issue-1542, saved in 0.000s (350 B added)
modified /cmd/restic/cmd_backup.go, saved in 0.000s (16.543 KiB added)
modified /cmd/restic/global.go, saved in 0.000s (0 B added)
new /internal/backend/dry/dry_backend_test.go, saved in 0.000s (3.866 KiB added)
new /internal/backend/dry/dry_backend.go, saved in 0.000s (3.744 KiB added)
modified /internal/backend/test/tests.go, saved in 0.000s (0 B added)
modified /internal/repository/repository.go, saved in 0.000s (20.707 KiB added)
modified /internal/ui/backup.go, saved in 0.000s (9.110 KiB added)
modified /internal/ui/jsonstatus/status.go, saved in 0.001s (11.055 KiB added)
modified /restic, saved in 0.131s (25.542 MiB added)
Would add to the repo: 25.892 MiB
Add comment that the check is based on the stdlib HTTP2 client. Refactor
the checks into a function. Return an error if the value in the
Content-Length header cannot be parsed.
Ensure that only snapshots made in the past are taken into account when running restic forget with the within switches (--keep-within, --keep-within- hourly, and friends)
Allow keeping hourly/daily/weekly/monthly/yearly snapshots for a given time period.
This adds the following flags/parameters to restic forget:
--keep-within-hourly duration
--keep-within-daily duration
--keep-within-weekly duration
--keep-within-monthly duration
--keep-within-yearly duration
Includes following changes:
- Add tests for --keep-within-hourly (and friends)
- Add documentation for --keep-within-hourly (and friends)
- Add changelog for --keep-within-hourly (and friends)
The first test function ensures that the workaround works as expected.
And the second test function is intended to fail as soon as the issue
has been fixed in golang to allow us to eventually remove the
workaround.
The golang http client does not return an error when a HTTP2 reply
includes a non-zero content length but does not return any data at all.
This scenario can occur e.g. when using rclone when a file stored in a
backend seems to be accessible but then fails to download.
Failed pack/blob downloads should be retried. For blobs that fail
decryption assume that the pack file is really damaged and try to
restore the remaining blobs.
* Stop prepending the operation name: it's already part of os.PathError,
leading to repetitive errors like "Chmod: chmod /foo/bar: operation not
permitted".
* Use errors.Is to check for specific errors.
Since the fileInfos are returned in a []interface, they're already
allocated on the heap. Making them pointers explicitly means the
compiler doesn't need to generate fileInfo and *fileInfo versions of the
methods on this type. The binary becomes about 7KiB smaller on
Linux/amd64.
mintty on windows always uses pipes to connect stdout between processes
and for the terminal output. The previous implementation always assumed
that stdout connected to a pipe means that stdout is displayed on a
mintty terminal. However, this detection breaks when using pipes to
connect processes and for powershell which uses pipes when redirecting
to a file.
Now the pipe filename is queried and matched against the pattern used by
msys / cygwin when connected to the terminal. In all other cases assume
that a pipe is just a regular pipe.
Previously the progress bar / status update interval used
stdoutIsTerminal to determine whether it is possible to update the
progress bar or not. However, its implementation differed from the
detection within the backup command which included additional checks to
detect the presence of mintty on Windows. mintty behaves like a terminal
but uses pipes for communication.
This adds stdoutCanUpdateStatus() which calls the same terminal detection
code used by backup. This ensures that all commands consistently switch
between interactive and non-interactive terminal mode.
stdoutIsTerminal() now also returns true whenever stdoutCanUpdateStatus()
does so. This is required to properly handle the special case of mintty.
The error returned when finishing the upload of an object was dropped.
This could cause silent upload failures and thus data loss in certain
cases. When a MD5 hash for the uploaded blob is specified, a wrong
hash/damaged upload would return its error via the Close() whose error
was dropped.
The azureAdapter was used directly without a pointer, but the Len()
method was only defined with a pointer receiver (which means Len() is
not present on a azureAdapter{}, only on a pointer to it).
Bugs in the error handling while uploading a file to the backend could
cause incomplete files, e.g. https://github.com/golang/go/issues/42400
which could affect the local backend.
Proactively add sanity checks which will treat an upload as failed if
the reported upload size does not match the actual file size.
Before, the scanner would could files twice if they were included in the
list of backup targets twice, e.g. `restic backup foo foo/bar` would
could the file `foo/bar` twice.
This commit uses the tree structure from the archiver to run the
scanner, so both parts see the same files.
This assigns an id to each tree root and then keeps track of how many
tree loads (i.e. trees referenced for the first time) are pending per
tree root. Once a tree root and its subtrees were fully processed there
are no more pending tree loads and the tree root is reported as
processed.
When the tomb is created with a canceled context, then the workers
started via `t.Go` exist nearly immediately. Once for the first time all
started goroutines have been stopped, it is not allowed to issue further
calls to `t.Go`. This is a problem when the started goroutines exit
immediately, as for example the first goroutine might already have
stopped before starting the second one, which is not allowed as once the
first goroutines has stopped no goroutines were running.
To fix this race condition the startup and main task of the archiver now
also run within a `t.Go` function. This also allows unifying the error
handling as it is no longer necessary to distinguish between errors
returned by the workers or the saveTree processing. The tomb now just
returns the first error encountered, which should also be the most
descriptive one.
Depending on the used backend, operations started with a canceled
context may fail or not. For example the local backend still works in
large parts when called with a canceled context. Backends transfering
data via http don't work. It is also not possible to retry failed
operations in that state as the RetryBackend will abort with a 'context
canceled' error.
Ensure uniform behavior of all backends by checking for a canceled
context by checking for a canceled context as a first step in the
RetryBackend. This ensures uniform behavior across all backends, as
backends are always wrapped in a RetryBackend.
If the context was canceled then saveTree might receive a treeID or not
depending on the timing. This could cause saveTree to incorrectly return
a nil treeID as valid. Fix this always returning an error when the
context was canceled in the meantime.
A canceled background context lets the blob/tree/fileSavers exit
without reporting an error. The error handling previously replaced
a 'context canceled' error received by the main backup method with
the error reported by the savers. However, in case of a canceled
background context that error is nil, causing restic to loose the
error and save a snapshot with a nil tree.
The counter value needs to be aligned to 64 bit in memory for the
atomic functions to work on some platform (such as 32 bit ARM).
The atomic package says in its documentation:
> These functions require great care to be used correctly. Except for
> special, low-level applications, synchronization is better done with
> channels or the facilities of the sync package.
This commit replaces the atomic functions with a simple sync.Mutex, so
we don't have to care about alignment.
This adds support for the following environment variables, which were
previously missing:
OS_USER_ID User ID for keystone v3 authentication
OS_USER_DOMAIN_ID User domain ID for keystone v3 authentication
OS_PROJECT_DOMAIN_ID Project domain ID for keystone v3 authentication
OS_TRUST_ID Trust ID for keystone v3 authentication
The canUpdateStatus check was simplified in #2608, but it accidentally flipped
the condition. The correct check is as follows: If the output is a pipe then
restic probably runs in mintty/cygwin. In that case it's possible to
update the output status. In all other cases it isn't.
This commit inverts to condition again to offer the previous and correct
behavior.
On shutdown the backup commands waits for the terminal output goroutine
to stop. However while running in the background the goroutine ignored
the canceled context.
In #2584 this was changed to use the uid/gid of the root node. This
would be okay for the top-level directory of a snapshot, however, this
change also applied to normal directories within a snapshot. This
change reverts the problematic part and adds a test that directory
attributes are represented correctly.
This code is more strict in what it expects to find in the backend:
depending on the layout, either a directory full of files or a directory
full of such directories.
a gs service account may only have object permissions on an existing
bucket but no bucket create/get permissions.
these service accounts currently are blocked from initialization a
restic repository because restic can not determine if the bucket exists.
this PR updates the logic to assume the bucket exists when the bucket
attribute request results in a permissions denied error.
this way, restic can still initialize a repository if the service
account does have object permissions
fixes: https://github.com/restic/restic/issues/3100
UnusedBlobs now directly reads the list of existing blobs from the
repository index. This removes the need for the blobStatusExists flag,
which in turn allows converting the blobRefs map into a BlobSet.
By construction these two errors always show up in pairs: 'size could
not be found' is printed when the blob is not found in the repository
index. That blob is also part of the `blobs` array. Later on, check
iterates over that array and checks whether the blob is marked as
existing. Which cannot be the case as that mark is generated by
iterating over the repository index.
The merged warning no longer reports the blob index within a file. That
information could also be derived by printing the affected tree using
`cat` and searching for the blob.
Due to the return if !isFile, the IsDir branch in List was never taken
and subdirectories were traversed recursively.
Also replaced isFile by an IsRegular check, which has been equivalent
since Go 1.12 (golang/go@a2a3dd00c9).
The restic security model includes full trust of the local machine, so
this should not fix any actual security problems, but it's better to be
safe than sorry.
Fixes #2192.
The file permissions included a go specific directory bit which
accidentially forced the usage of the GNU header format. This leads
to problems with 7zip on Windows or when extended attributes are
used.
The VSS support works for 32 and 64-bit windows, this includes a check that
the restic version matches the OS architecture as required by VSS. The backup
operation will fail the user has not sufficient permissions to use VSS.
Snapshotting volumes also covers mountpoints but skips UNC paths.
The io.Reader interface does not support contexts, such that it is
necessary to embed the context into the backendReaderAt struct. This has
the problem that a reader might suddenly stop working when it's
contained context is canceled. However, this is now problem here as the
reader instances never escape the calling function.
The list operation used by RemoveStaleLocks or RemoveAllLocks will
already be canceled by the passed in context. Therefore we can also just
cancel the remove operation as the unlock command won't process all lock
files anyways.
This is no change in behavior as a canceled context did later on cause
the config file creation to fail. Therefore this change just lets the
repository initialization fail a bit earlier.
A pattern part containing "/" is used to mark a path or a pattern as
absolute. However, on Windows the path separator is "\" such that glob
patterns like "?" could match the marker. The code now explicitly skips
the marker when the pattern does not represent an absolute path.
We now use v4 of the module. `backoff.WithMaxRetries` no longer repeats
an operation endlessly when a retry count of 0 is specified. This
required a few fixes for the tests.
The file is already created with the proper permissions, thus the chmod
call is not critical. However, some file systems have don't allow
modifications of the file permissions. Similarly the chmod call in the Remove
operation should not prevent it from working.
Many instances of errors.Wrap in this package would produce messages
like "Open: open <filename>: no such file or directory"; those now omit
the first "Open:" (or "Stat:", or "MkdirAll"). The function readVersion
now appends its own name to the error message, rather than the function
that failed, to make it easier to spot. Other function names (e.g.,
Load) are already added further up in the call chain.
The archiver first called the Select function for a path before checking
whether the Lstat on that path actually worked. The RejectFuncs in
exclude.go worked around this by checking whether they received a nil
os.FileInfo. Checking first is more obvious and requires less code.
When the backup is interrupted for some reason while the scanner is
still active this could lead to a deadlock. Interruptions are triggered
by canceling the context object used by both the backup progress UI and
the scanner. It is possible that a context is canceled between the
respective check in the scanner and it calling the `ReportTotal` method
of the UI. The latter method sends a message to the UI goroutine.
However, a canceled context will also stop that goroutine, which can
cause the channel send operation to block indefinitely.
This is resolved by adding a `closed` channel which is closed once the
UI goroutine is stopped and serves as an escape hatch for reported UI
updates.
This change covers not just the ReportTotal method but all potentially
affected methods of the progress UI implementation.
Makes the following corrections to the "Applying Policy:" output:
- keep the last 1 snapshots snapshots => keep 1 latest snapshots
- keep the last 1 snapshots, 3 hourly, 5 yearly snapshots => keep 1 latest, 3 hourly, 5 yearly snapshots
This allows creating multiple repositories with identical chunker
parameters which is required for working deduplication when copying
snapshots between different repositories.
Previously the directory stats were reported immediately after calling
`SaveDir`. However, as the latter method saves the tree asynchronously
the stats were still initialized to their nil value. The stats are now
reported via a callback similar to the one used for the fileSaver.
The slicing operator `slice[low:high]` default to 0 for the lower bound and
len(slice) for the upper bound when either or both are not specified.
Fix the code to use `cap(slice)` to check for the slice capacity.
If a blob in a pack file can be decrypted successfully but contains data
that results in a different hash than stated in the header pack, then
abort repacking. As both the pack header and the blob are
cryptographically verified this either means than a malicious entity
tampered with the backup or indicates hardware problems on the client.
prune should fail with an error in both cases.
The seen BlobSet always contained a subset of the entries in blobs.
Thus use blobs instead and avoid the memory overhead of the second set.
Suggested-by: Alexander Weiss <alex@weissfam.de>
As the connection to the rclone child process is now closed after an
unexpected subprocess exit, later requests will cause the http2
transport to try to reestablish a new connection. As previously this never
should have happened, the connection called panic in that case. This
panic is now replaced with a simple error message, as it no longer
indicates an internal problem.
- Add Open() functionality to dir
- only access index for blobs when file is read
- Implement NodeOpener and put one-time file stuff there
- Add comment about locking as suggested by bazil.org/fuse
=> Thanks at Michael Eischer for suggesting the last two improvements
Calling `Close()` on the rclone backend sometimes failed during test
execution with 'signal: Broken pipe'. The stdio connection closed both
the stdin and stdout file descriptors at the same moment, therefore
giving rclone no chance to properly send any final http2 data frames.
Now the stdin connection to rclone is closed first and will only be
forcefully closed after a timeout. In case rclone exits before the
timeout then the stdio connection will be closed normally.
restic did not notice when the rclone subprocess exited unexpectedly.
restic manually created pipes for stdin and stdout and used these for the
connection to the rclone subprocess. The process creating a pipe gets
file descriptors for the sender and receiver side of a pipe and passes
them on to the subprocess. The expected behavior would be that reads or
writes in the parent process fail / return once the child process dies
as a pipe would now just have a reader or writer but not both.
However, this never happened as restic kept the reader and writer
file descriptors of the pipes. `cmd.StdinPipe` and `cmd.StdoutPipe`
close the subprocess side of pipes once the child process was started
and close the parent process side of pipes once wait has finished. We
can't use these functions as we need access to the raw `os.File` so just
replicate that behavior.
The test now uses the fact that the sort is stable. It's not guaranteed
to be, but the test is cleaner and more exhaustive. sortCachedPacksFirst
no longer needs a return value.
In the Google Cloud Storage backend, support specifying access tokens
directly, as an alternative to a credentials file. This is useful when
restic is used non-interactively by some other program that is already
authenticated and eliminates the need to store long lived credentials.
The access token is specified in the GOOGLE_ACCESS_TOKEN environment
variable and takes precedence over GOOGLE_APPLICATION_CREDENTIALS.
If a data blob and a tree blob with the same ID (= same content) exist,
then the checker did not report a data or tree blob as unused when the
blob of the other type was still in use.
The `DuplicateTree` flag is necessary to ensure that failures cannot be
swallowed. The old checker implementation ignores errors from LoadTree
if the corresponding tree was already checked.
Backups traverse the file tree in depth-first order and saves trees on
the way back up. This results in tree packs filled in a way comparable
to the reverse Polish notation. In order to check tree blobs in that
order, the treeFilter would have to delay the forwarding of tree nodes
until all children of it are processed which would complicate the
implementation.
Therefore do the next similar thing and traverse the tree in depth-first
order, but process trees already on the way down. The tree blob ids are
added in reverse order to the backlog, which is once again reverted when
removing the ids from the back of the backlog.
The blobRefs map and the processedTrees IDSet are merged to reduce the
memory usage. The blobRefs map now uses separate flags to track blob
usage as data or tree blob. This prevents skipping of trees whose
content is identical to an already processed data blob. A third flag
tracks whether a blob exists or not, which removes the need for the
blobs IDSet.
Even though the checkTreeWorker skips already processed chunks,
filterTrees did queue the same tree blob on every occurence. This
becomes a serious performance bottleneck for larger number of snapshots
that cover mostly the same directories. Therefore decode a tree blob
exactly once.
The benchmark was actually testing the speed of index lookups.
name old time/op new time/op delta
SaveAndEncrypt-8 101ns ± 2% 31505824ns ± 1% +31311591.31% (p=0.000 n=10+10)
name old speed new speed delta
SaveAndEncrypt-8 41.7TB/s ± 2% 0.0TB/s ± 1% -100.00% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
SaveAndEncrypt-8 1.00B ± 0% 20989508.40B ± 0% +2098950740.00% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
SaveAndEncrypt-8 0.00 123.00 ± 0% +Inf% (p=0.000 n=10+9)
(The actual speed is ca. 131MiB/s.)