RemoveUnpacked will eventually block removal of all filetypes other than
snapshots. However, getting there requires a major refactor to provide
some components with privileged access.
This ensures that the pack header is actually read completely.
Previously, for a truncated file it was possible to only read a part of
the header, as backend.Load(...) is not guaranteed to return as many
bytes as requested by the length parameter.
Due to the interface of streamPack, we cannot guarantee that operations
progress fast enough that the underlying connections remains open. This
introduces partial failures which massively complicate the error
handling.
Switch to a simpler approach that retrieves the pack in chunks of 32MB.
If a blob is larger than this limit, then it is downloaded separately.
To avoid multiple copies in memory, an auxiliary interface
`discardReader` is introduced that allows directly accessing the
downloaded byte slices, while still supporting the streaming used by the
`check` command.
To only stream the content of a pack file once, check used StreamPack
with a custom pack load function. This combination was always brittle
and complicates using StreamPack everywhere else. Now that StreamPack
internally uses PackBlobIterator use that primitive instead, which is a
much better fit for what the check command requires.
For now, the guide is only shown if the blob content does not match its
hash. The main intended usage is to handle data corruption errors when
using maximum compression in restic 0.16.0
The test had a 4% chance of not modified the data read from the
repository, in which case the test would fail. Change the data
manipulation to just modified each read operation.
TestRepository and its variants always returned no-op cleanup functions.
If they ever do need to do cleanup, using testing.T.Cleanup is easier
than passing these functions around.
The ioutil functions are deprecated since Go 1.17 and only wrap another
library function. Thus directly call the underlying function.
This commit only mechanically replaces the function calls.
Repositories with mixed packs are probably quite rare by now. When
loading data blobs from a mixed pack file, this will no longer trigger
caching that file. However, usually tree blobs are accessed first such
that this shouldn't make much of a difference.
The checker gets a simpler replacement.
Sending data through a channel at very high frequency is extremely
inefficient. Thus use simple callbacks instead of channels.
> name old time/op new time/op delta
> MasterIndexEach-16 6.68s ±24% 0.96s ± 2% -85.64% (p=0.008 n=5+5)