As long as only a small fraction of the data in a repository is
rewritten, the keepBlobs set will be rather small after cleaning it up.
As golang maps do not shrink their memory usage, just copy the contents
over to a new map. However, only copy the map if the cleanup removed at
least half the entries.
The set covers necessary, existing and duplicate blobs. This removes the
duplicate sets used to track whether all necessary blobs also exist.
This reduces the memory usage of prune by about 20-30%.
The RetryBackend tests depend on the mock backend. When the Backend
interface is eventually split from the restic package, this will lead to
a dependency cycle between backend and backend/mock. Thus split the
RetryBackend into a separate package to avoid this problem.
Archiver.Save queries the current time multiple times. This commit
removes one of these calls as they showed up while profiling a backup of
a nearly unchanged dataset containing 3 million files.
The string form was presumably useful before the introduction of
layouts, but right now it just makes call sequences and garbage
collection more expensive (the latter because every string contains
a pointer to be scanned).
if x { return true } return false => return x
fmt.Sprintf("%v", x) => fmt.Sprint(x) or x.String()
The fmt.Sprintf idiom is still used in the SecretString tests, where it
serves security hardening.
ID.UnmarshalJSON accepted non-JSON input with ' as the string delimiter.
Also, the error message for non-hex input was less informative than it
could be and it performed too many checks.
Changed ParseID to keep the error messages consistent.
FindFilteredSnapshots no longer prints errors during snapshot loading on
stderr, but instead passes the error to the callback to allow the caller
to decide on what to do.
In addition, it moves the logic to handle an explicit snapshot list from
the main package to restic.
The helper function uidGidInt used strconv.ParseInt instead of
ParseUint, so it silently ignored some invalid user/group IDs.
Also, improve the error message. "Invalid UID" is more informative than
having "ParseInt" twice (*strconv.NumError displays the function name).
Finally, the user.User struct can be passed by pointer to get reduce
code size.
This package is no longer needed, since we can use the stdlib's
http.NewRequestWithContext.
backend/rclone already did, but it needed a different error check due to
a difference between net/http and ctxhttp.
Also, store the http.Client by value in the REST backend (changed to a
pointer when ctxhttp was introduced) and use errors.WithStack instead
of errors.Wrap where the message was no longer accurate. Errors from
http.NewRequestWithContext will start with "net/http" or "net/url", so
they're easy to identify.
The backup command failed if a directory contains duplicate entries.
Downgrade the severity of this problem from fatal error to a warning.
This allows users to still create a backup.
SaveTree did not use the TreeSaver but rather managed the tree
collection and upload itself. This prevents using the parallelism
offered by the TreeSaver and duplicates all related code. Using the
TreeSaver can provide some speed-ups as all steps within the backup tree
now rely on FutureNodes. This can be especially relevant for backups
with large amounts of explicitly specified files.
The main difference between SaveTree and SaveDir is, that only the
former can save tree blobs in which nodes have a different name than the
actual file on disk. This is the result of resolving name conflicts
between multiple files with the same name. The filename that must be
used within the snapshot is now passed directly to
restic.NodeFromFileInfo. This ensures that a FutureNode already contains
the correct filename.