This data structure reduces the wasted memory to O(sqrt(n)). The
top-layer of the hashed array tree (HAT) also has a size of O(sqrt(n)),
which makes it cache efficient. The top-layer should be small enough to
easily fit into the CPU cache and thus only adds little overhead
compared to directly accessing an index entry via a pointer.
The indexEntry objects are now allocated in a separate array. References
to an indexEntry are now stored as array indices. This has the benefit
of allowing the garbage collector to ignore the indexEntry objects as
these do not contain pointers and are part of a single large allocation.
New and its helpers used to create the cache directories several times
over. They now only do so once. The added test ensures that the cache is
produced in a consistent state when parts are deleted.
Use the logging methods from testing.TB to make use of tb.Helper(). This
allows the tests to log the filename and line number in which the test
helper was called. Previously the test helper was logged which is rarely
useful.
This function casts its argument to int32 before passing it to the
system call, so that big-endian CPUs read the lower rather than the
upper 32 bits of the pid.
This also gets rid of the last import of "unsafe" in the Unix build.
I changed syscall to x/sys/unix while I was at it, to remove one more
import line. The constants and types there are aliases for their syscall
counterparts.
As the `Fatal` error type only includes a string, it becomes impossible
to inspect the contained error. This is for a example a problem for the
fuse implementation, which must be able to detect context.Canceled
errors.
Co-authored-by: greatroar <61184462+greatroar@users.noreply.github.com>
For hardlinked files, only the first instance of that file increases the
amount of bytes to restore. All later instances only increase the file
count but not the restore size.
restic must be able to refresh lock files in time. However, large
uploads over slow connections can cause the lock refresh to be stuck
behind the large uploads and thus time out.
The test had a 4% chance of not modified the data read from the
repository, in which case the test would fail. Change the data
manipulation to just modified each read operation.
This adds support for caching already rewritten trees, handling of load
errors and disabling the check that the serialization doesn't lead to
data loss.
The more generic RewriteNode callback replaces the SelectByName and
PrintExclude functions. The main part of this change is a preparation to
allow using the TreeRewriter for the `repair snapshots` command.
The builtin mechanism to capture a stacktrace in Go is to send a SIGQUIT
to the running process. However, this mechanism is not avaiable on
Windows. Thus, tweak the SIGINT handler to dump a stacktrace if the
environment variable `RESTIC_DEBUG_STACKTRACE_SIGINT` is set.
The SemaphoreBackend now uniformly enforces the limit of concurrent
backend operations. In addition, it unifies the parameter validation.
The List() methods no longer uses a semaphore. Restic already never runs
multiple list operations in parallel.
By managing the semaphore in a wrapper backend, the sections that hold a
semaphore token grow slightly. However, the main bottleneck is IO, so
this shouldn't make much of a difference.
The key insight that enables the SemaphoreBackend is that all of the
complex semaphore handling in `openReader()` still happens within the
original call to `Load()`. Thus, getting and releasing the semaphore
tokens can be refactored to happen directly in `Load()`. This eliminates
the need for wrapping the reader in `openReader()` to release the token.
The snapshot filtering internally converts relative paths to absolute
ones to ensure that the parent snapshots selection works for backups of
relative paths.
x/text/width.LookupRune has to re-encode its argument as UTF-8,
while LookupString operates on the UTF-8 directly.
The uint casts get rid of a bounds check.
Benchmark results, with b.ResetTimer introduced first:
name old time/op new time/op delta
TruncateASCII-8 69.7ns ± 1% 55.2ns ± 1% -20.90% (p=0.000 n=20+18)
TruncateUnicode-8 350ns ± 1% 171ns ± 1% -51.05% (p=0.000 n=20+19)
Since 0.15 (#4020), inodes are generated as hashes of names, xor'd with
the parent inode. That means that the inode of a/b/b is
h(a/b/b) = h(a) ^ h(b) ^ h(b) = h(a).
I.e., the grandchild has the same inode as the grandparent. GNU find
trips over this because it thinks it has encountered a loop in the
filesystem, and fails to search a/b/b. This happens more generally when
the same name occurs an even number of times.
Fix this by multiplying the parent by a large prime, so the combining
operation is not longer symmetric in its arguments. This is what the FNV
hash does, which we used prior to 0.15. The hash is now
h(a/b/b) = h(b) ^ p*(h(b) ^ p*h(a))
Note that we already ensure that h(x) is never zero.
Collisions can still occur, but they should be much less likely to occur
within a single path.
Fixes #4253.
Tests in cmd_forget_test.go need the same convenience function
that was implemented in snapshot_policy_test.go.
Function parseDuration(...) was moved to testing.go and renamed to
ParseDurationOrPanic(...).
This turns snapshotFilterOptions from cmd into a restic.SnapshotFilter
type and makes restic.FindFilteredSnapshot and FindFilteredSnapshots
methods on that type. This fixes #4211 by ensuring that hosts and paths
are named struct fields instead of unnamed function arguments in long
lists of such.
Timestamp limits are also included in the new type. To avoid too much
pointer handling, the convention is that time zero means no limit.
That's January 1st, year 1, 00:00 UTC, which is so unlikely a date that
we can sacrifice it for simpler code.
This method had a buffer argument, but that was nil at all call sites.
That's removed, and instead LoadUnpacked now reuses whatever it
allocates inside its retry loop.