Also make the errors a bit less verbose by not prepending the operation,
since pkg/xattr already does that. Old errors looked like
Listxattr: xattr.list /myfiles/.zfs/snapshot: invalid argument
After repacking every blob that should be kept must have been repacked.
We have seen a few cases in which a single blob went missing, which
could have been caused by a bitflip somewhere. This sanity check might
help catch some of these cases.
pkg/sftp.Client.MkdirAll(d) does a Stat to determine if d exists and is
a directory, then a recursive call to create the parent, so the calls
for data/?? each take three round trips. Doing a Mkdir first should
eliminate two round trips for 255/256 data directories as well as all
but one of the top-level directories.
Also, we can do all of the calls concurrently. This may reintroduce some
of the Stat calls when multiple goroutines try to create the same
parent, but at the default number of connections, that should not be
much of a problem.
FutureBlob now uses a Take() method as a more memory-efficient way to
retrieve the futures result. In addition, futures are now collected
while saving the file. As only a limited number of blobs can be queued
for uploading, for a large file nearly all FutureBlobs already have
their result ready, such that the FutureBlob object just consumes
memory.
There is no real difference between the FutureTree and FutureFile
structs. However, differentiating both increases the size of the
FutureNode struct.
The FutureNode struct is now only 16 bytes large on 64bit platforms.
That way is has a very low overhead if the corresponding file/directory
was not processed yet.
There is a special case for nodes that were reused from the parent
snapshot, as a go channel seems to have 96 bytes overhead which would
result in a memory usage regression.
Unused blobs are not a problem but rather expected to exist now that
prune by default does not remove every unused blob. However, the option
has caused questions from users whether a repository is damaged or not,
so just remove that option.
Note that the remaining code is left intact as it is still useful for
our test cases.
It wasn't clear that Google Cloud Storage and Google Drive are two different services and that one should use the rclone backend for the latter. This commit adds a note with this information.