Iterating through the indexmap according to the bucket order has the
problem that all indexEntries are accessed in random order which is
rather cache inefficient.
As we already keep a list of all allocated blocks, just iterate through
it. This allows iterating through a batch of indexEntries without random
memory accesses. In addition, the packID will likely remain similar
across multiple blobs as all blobs of a pack file are added as a single
batch.
This data structure reduces the wasted memory to O(sqrt(n)). The
top-layer of the hashed array tree (HAT) also has a size of O(sqrt(n)),
which makes it cache efficient. The top-layer should be small enough to
easily fit into the CPU cache and thus only adds little overhead
compared to directly accessing an index entry via a pointer.
The indexEntry objects are now allocated in a separate array. References
to an indexEntry are now stored as array indices. This has the benefit
of allowing the garbage collector to ignore the indexEntry objects as
these do not contain pointers and are part of a single large allocation.
Use the logging methods from testing.TB to make use of tb.Helper(). This
allows the tests to log the filename and line number in which the test
helper was called. Previously the test helper was logged which is rarely
useful.
This function casts its argument to int32 before passing it to the
system call, so that big-endian CPUs read the lower rather than the
upper 32 bits of the pid.
This also gets rid of the last import of "unsafe" in the Unix build.
I changed syscall to x/sys/unix while I was at it, to remove one more
import line. The constants and types there are aliases for their syscall
counterparts.
As the `Fatal` error type only includes a string, it becomes impossible
to inspect the contained error. This is for a example a problem for the
fuse implementation, which must be able to detect context.Canceled
errors.
Co-authored-by: greatroar <61184462+greatroar@users.noreply.github.com>
For hardlinked files, only the first instance of that file increases the
amount of bytes to restore. All later instances only increase the file
count but not the restore size.
restic must be able to refresh lock files in time. However, large
uploads over slow connections can cause the lock refresh to be stuck
behind the large uploads and thus time out.
The test had a 4% chance of not modified the data read from the
repository, in which case the test would fail. Change the data
manipulation to just modified each read operation.
This adds support for caching already rewritten trees, handling of load
errors and disabling the check that the serialization doesn't lead to
data loss.
The more generic RewriteNode callback replaces the SelectByName and
PrintExclude functions. The main part of this change is a preparation to
allow using the TreeRewriter for the `repair snapshots` command.