After reading and chunking all data in a file, the FutureFile still has
to wait until the FutureBlobs are completed. This was done synchronously
which results in blocking the file saver and prevents the next file from
being read.
By replacing the FutureBlob with a callback, it becomes possible to
complete the FutureFile asynchronously.
SaveTree did not use the TreeSaver but rather managed the tree
collection and upload itself. This prevents using the parallelism
offered by the TreeSaver and duplicates all related code. Using the
TreeSaver can provide some speed-ups as all steps within the backup tree
now rely on FutureNodes. This can be especially relevant for backups
with large amounts of explicitly specified files.
The main difference between SaveTree and SaveDir is, that only the
former can save tree blobs in which nodes have a different name than the
actual file on disk. This is the result of resolving name conflicts
between multiple files with the same name. The filename that must be
used within the snapshot is now passed directly to
restic.NodeFromFileInfo. This ensures that a FutureNode already contains
the correct filename.
FutureBlob now uses a Take() method as a more memory-efficient way to
retrieve the futures result. In addition, futures are now collected
while saving the file. As only a limited number of blobs can be queued
for uploading, for a large file nearly all FutureBlobs already have
their result ready, such that the FutureBlob object just consumes
memory.
There is no real difference between the FutureTree and FutureFile
structs. However, differentiating both increases the size of the
FutureNode struct.
The FutureNode struct is now only 16 bytes large on 64bit platforms.
That way is has a very low overhead if the corresponding file/directory
was not processed yet.
There is a special case for nodes that were reused from the parent
snapshot, as a go channel seems to have 96 bytes overhead which would
result in a memory usage regression.
github.com/pkg/errors is no longer getting updates, because Go 1.13
went with the more flexible errors.{As,Is} function. Use those instead:
errors from pkg/errors already support the Unwrap interface used by 1.13
error handling. Also:
* check for io.EOF with a straight ==. That value should not be wrapped,
and the chunker (whose error is checked in the cases changed) does not
wrap it.
* Give custom Error methods pointer receivers, so there's no ambiguity
when type-switching since the value type will no longer implement error.
* Make restic.ErrAlreadyLocked private, and rename it to
alreadyLockedError to match the stdlib convention that error type
names end in Error.
* Same with rest.ErrIsNotExist => rest.notExistError.
* Make s3.Backend.IsAccessDenied a private function.
This isn't doing anything. Channels should get cleaned up by the GC when
the last reference to them disappears, just like all other data
structures. Also inlined BufferPool.Put in Buffer.Release, its only
caller.
The Save methods of the BlobSaver, FileSaver and TreeSaver return early
on when the archiver is stopped due to an error. For that they select on
both the tomb.Dying() and context.Done() channels, which can lead to a
race condition when the tomb is killed due to an error: The tomb first
closes its Dying channel before canceling all child contexts.
Archiver.SaveDir only aborts its execution once the context was
canceled. When the tomb killing is paused between closing its Dying
channel and canceling the child contexts, this lets the
FileSaver/TreeSaver.Save methods return immediately, however, ScanDir
still reads further files causing the test case to fail.
As a killed tomb always cancels all child contexts and as the Savers
always use a context bound to the tomb, it is sufficient to just use
context.Done() as escape hatch in the Save functions. This fixes the
mismatch between SaveDir and Save.
Adjust the tests to use contexts bound to the tomb for all interactions
with the Savers.
This commit changes how the worker goroutines for saving e.g. blobs
interact. Before, it was possible to get stuck sending an instruction to
archive a file or dir when no worker goroutines were available any more.
This commit introduces a `done` channel for each of the worker pools,
which is set to the channel returned by `tomb.Dying()`, so it is closed
when the first worker returned an error.
This commit changes the archiver so that low-level errors saving data to
the repo are returned to the caller (instead of being handled by the
error callback function). This correctly bubbles up errors like a full
temp file system and makes restic abort early and makes all other worker
goroutines exit.