Commit Graph

221 Commits

Author SHA1 Message Date
greatroar c9557b2822 internal/repository: Fix LoadBlob + fuzz test
When given a buf that is big enough for a compressed blob but not its
decompressed contents, the copy at the end of LoadBlob would skip the
last part of the contents.

Fixes #3783.
2022-06-06 17:02:28 +02:00
MichaelEischer b2a2e5f727
Merge pull request #3753 from greatroar/indexmap-alloc
repository: Re-tune indexmap allocation strategy
2022-05-14 15:44:08 +02:00
greatroar 5141228e0c repository: Re-tune indexmap allocation strategy
fd05037e1a changed the allocation batch
size from 256 to 128 under the assumption that an indexEntry is 60 bytes
on amd64, but it's 64: structs are padded out to a multiple of 8 for
alignment reasons. That means we'd waste no space in malloc even without
the batch allocation, at least on 64-bit machines. While that strategy
cuts the overallocation down dramatically for many small indexes, it also
seems to slow allocation down (Go 1.18, Linux, amd64, -benchtime=2s):

    name                   old time/op    new time/op    delta
    DecodeIndex-8             4.67s ± 5%     4.60s ± 1%      ~     (p=0.953 n=10+5)
    DecodeIndexParallel-8     4.67s ± 3%     4.60s ± 1%      ~     (p=0.953 n=10+5)
    IndexHasUnknown-8        37.8ns ± 8%    36.5ns ±14%      ~     (p=0.841 n=5+5)
    IndexHasKnown-8          38.5ns ±12%    37.7ns ±10%      ~     (p=0.968 n=5+5)
    IndexAlloc-8              615ms ±18%     607ms ± 1%      ~     (p=1.000 n=10+5)
    IndexAllocParallel-8      245ms ±11%     285ms ± 6%   +16.40%  (p=0.001 n=10+5)
    MasterIndexAlloc-8        286ms ± 9%     275ms ± 2%      ~     (p=1.000 n=10+5)
    LoadIndex/v1-8           27.0ms ± 4%    26.8ms ± 1%      ~     (p=0.690 n=5+5)
    LoadIndex/v2-8           22.4ms ± 1%    22.8ms ± 2%    +1.48%  (p=0.016 n=5+5)

    name                   old alloc/op   new alloc/op   delta
    IndexAlloc-8              446MB ± 0%     446MB ± 0%    -0.00%  (p=0.000 n=8+4)
    IndexAllocParallel-8      446MB ± 0%     446MB ± 0%    -0.00%  (p=0.008 n=8+5)
    MasterIndexAlloc-8        213MB ± 0%     159MB ± 0%   -25.47%  (p=0.000 n=10+5)

    name                   old allocs/op  new allocs/op  delta
    IndexAlloc-8               913k ± 0%     2632k ± 0%  +188.19%  (p=0.008 n=5+5)
    IndexAllocParallel-8       913k ± 0%     2632k ± 0%  +188.21%  (p=0.008 n=5+5)
    MasterIndexAlloc-8         318k ± 0%     1172k ± 0%  +267.86%  (p=0.008 n=5+5)

Instead, this patch sets a batch size of 4, which means no space is
wasted by malloc on 64-bit and very little on 32-bit. It still gets very
close to the savings from not allocating in batches, without requiring
special code for bits.UintSize==64. Benchmark results, again for
Linux/amd64:

    name                   old time/op    new time/op    delta
    DecodeIndex-8             4.67s ± 5%     4.83s ± 9%     ~     (p=0.315 n=10+10)
    DecodeIndexParallel-8     4.67s ± 3%     4.68s ± 4%     ~     (p=0.315 n=10+10)
    IndexHasUnknown-8        37.8ns ± 8%    44.5ns ±19%     ~     (p=0.095 n=5+5)
    IndexHasKnown-8          38.5ns ±12%    36.9ns ± 8%     ~     (p=0.690 n=5+5)
    IndexAlloc-8              615ms ±18%     628ms ±18%     ~     (p=0.218 n=10+10)
    IndexAllocParallel-8      245ms ±11%     262ms ± 9%   +7.02%  (p=0.043 n=10+10)
    MasterIndexAlloc-8        286ms ± 9%     287ms ±13%     ~     (p=1.000 n=10+10)
    LoadIndex/v1-8           27.0ms ± 4%    26.8ms ± 0%     ~     (p=1.000 n=5+5)
    LoadIndex/v2-8           22.4ms ± 1%    22.5ms ± 0%     ~     (p=0.056 n=5+5)

    name                   old alloc/op   new alloc/op   delta
    IndexAlloc-8              446MB ± 0%     446MB ± 0%     ~     (p=1.000 n=8+10)
    IndexAllocParallel-8      446MB ± 0%     446MB ± 0%   -0.00%  (p=0.000 n=8+8)
    MasterIndexAlloc-8        213MB ± 0%     160MB ± 0%  -25.02%  (p=0.000 n=10+9)

    name                   old allocs/op  new allocs/op  delta
    IndexAlloc-8               913k ± 0%     1333k ± 0%  +45.94%  (p=0.000 n=8+10)
    IndexAllocParallel-8       913k ± 0%     1333k ± 0%  +45.94%  (p=0.000 n=8+8)
    MasterIndexAlloc-8         318k ± 0%      525k ± 0%  +64.99%  (p=0.000 n=10+10)

The allocation method indexmap.newEntry has also been rewritten in a
form that is a few instructions shorter.
2022-05-11 21:22:14 +02:00
MichaelEischer df554e5f69
Merge pull request #3748 from greatroar/runworkers
repository: Remove RunWorkers, report ctx.Err()
2022-05-11 19:38:46 +02:00
greatroar 2e0f1f5113 repository: Remove RunWorkers, report ctx.Err()
This removes RunWorkers, which had become mere overhead by successive
refactors. It also ensures that each former user of that function
returns any context error that occurs, so failure to complete an
operation is always reported as an error.
2022-05-10 22:26:00 +02:00
Michael Eischer ae7e51382a Fix error on temp file deletion on windows
Apparently it can take a moment between closing a tempfile marked as
DELETE_ON_CLOSE and it actually being deleted. During that time the file
is inaccessible. Thus just skip deleting the temp file on windows.
2022-05-09 22:43:26 +02:00
Michael Eischer cf5cb673fb repository: Use existing method to collect pack ids 2022-04-30 19:14:21 +02:00
Michael Eischer b335cb6285 repository: Refactor index IDs collection 2022-04-30 19:14:21 +02:00
Michael Eischer 4b01b06f2f repository: Test compressed blobs in StreamPack 2022-04-30 11:34:10 +02:00
Michael Eischer ec2b25565a repository: test uncompressedLength field and index example 2022-04-30 11:34:10 +02:00
Michael Eischer 9ffb8920f1 repository: run blackbox tests using old and new repo version 2022-04-30 11:34:10 +02:00
Michael Eischer abe5935693 repository: unify repository version-specific initialization
Mark the master index as compressed also when initializing a new
repository. This is only relevant for testing.
2022-04-30 11:34:10 +02:00
Alexander Neumann 8776031f96 Leave allocating slices to the decompress code 2022-04-30 11:34:10 +02:00
Alexander Neumann 5eb05a0afe Configure zstd encoder/decoder 2022-04-30 11:34:10 +02:00
Michael Eischer 2f36e044db Cleanup pack header check 2022-04-30 11:34:10 +02:00
Alexander Neumann 8b11b86383 Add option global --compression 2022-04-30 11:34:10 +02:00
Michael Eischer 7132df529e repository: Increase index size for repo version 2
A compressed index is only about one third the size of an uncompressed
one. Thus increase the number of entries in an index to avoid cluttering
the repository with small indexes.
2022-04-30 11:34:10 +02:00
Michael Eischer 66f9048bce repository: Alloc zstd encoder/decoder on demand 2022-04-30 11:34:10 +02:00
Michael Eischer fd05037e1a repository: recalibrate index batch allocation size 2022-04-30 11:34:10 +02:00
Michael Eischer 6fb408d90e repository: implement pack compression 2022-04-30 11:34:10 +02:00
Michael Eischer 362ab06023 init: Add flag to specify created repository version 2022-04-30 10:07:42 +02:00
Michael Eischer 4b957e7373 repository: Implement index/snapshot/lock compression
The config file is not compressed as it should remain readable by older
restic versions such that these can return a proper error.

As the old format for unpacked data does not include a version header,
make use of a trick: The old data is always encoded as JSON. Thus it can
only start with '{' or '['. For any other value the first byte indicates
a versioned format. The version is set to 2 for now. Then the zstd
compressed data follows.
2022-04-30 10:07:42 +02:00
Michael Eischer e597b99b55 repository: Reduce repack workers to prevent deadlock
As repack streams packs these occupy one backend connection. Uploading a
new pack also requires a backend connection. To prevent a deadlock
during repack when reaching the backend connections limit, simply limit
the repackWorker count to always leave one connection for uploading.
2022-04-23 11:28:18 +02:00
Alexander Neumann a059ef90f8
Merge pull request #3702 from MichaelEischer/extend-config-error
Print used key name if config fails to load
2022-04-10 20:25:24 +02:00
Michael Eischer c2aabb2686 Print used key name if config fails to load 2022-04-09 22:38:18 +02:00
Alexander Neumann 04e054465a
Merge pull request #3475 from MichaelEischer/local-sftp-conn-limit
Limit concurrent operations for local / sftp backend
2022-04-09 21:33:00 +02:00
Michael Eischer 7b9ae91e04 copy: Load snapshots before indexes 2022-04-09 12:27:25 +02:00
Michael Eischer cd783358d3 local: Limit concurrent backend operations
Use a limit of 2 similar to the filereader concurrency in the archiver.
2022-04-09 12:21:38 +02:00
Michael Eischer 6408686973 repository: Simplify Blob equality check 2022-03-28 22:09:49 +02:00
Michael Eischer 243698680a crypto: Use helpers for size calculations 2022-03-28 22:09:49 +02:00
Michael Eischer f78bd14e28 repository: Remove pack implementation details from MasterIndex 2022-03-28 22:09:49 +02:00
Michael Eischer dc3d77dacc repository: make saveAndEncrypt private 2022-03-28 22:09:49 +02:00
Michael Eischer 6877e7edbb repository: Rename LoadAndDecrypt to LoadUnpacked
The method is the complement for SaveUnpacked and not for
SaveAndEncrypt. The latter assembles blobs into pack files.
2022-03-28 22:09:49 +02:00
Michael Eischer 537b4c310a copy: Implement by reusing repack
The repack operation copies all selected blobs from a set of pack files
into new pack files. For prune the source and destination repositories
are identical. To implement copy, just use a different source and
destination repository.
2022-03-26 20:47:15 +01:00
Alexander Neumann e682f7c0d6 Add tests for StreamPack 2022-03-21 21:15:03 +01:00
Michael Eischer bba8ba7a5b repository: cancel streampack context after error 2022-02-12 20:18:25 +01:00
Michael Eischer 47554a3428 repository: Fix error handling in repack
When storing a blob fails, this is a fatal error which must not be
retried.
2022-02-12 20:18:25 +01:00
Michael Eischer 930a00ad54 checker: reuse bufio reader 2022-02-12 20:18:25 +01:00
Michael Eischer 34ebafb8b6 repository: don't crash if blob size is too short 2022-02-12 20:18:25 +01:00
Michael Eischer becebf5d88 repository: remove unused DownloadAndHash 2022-02-12 20:18:25 +01:00
Michael Eischer f1e58e7c7f checker: rewrite ReadData to stream packs 2022-02-12 20:18:25 +01:00
Michael Eischer f40abd92fa restorer: convert to use StreamPack 2022-02-12 20:18:25 +01:00
Michael Eischer f00f690658 repository: stream packs during repacking 2022-02-12 20:18:25 +01:00
Michael Eischer c4a2bfcb39 repository: Add StreamPacks function
The function supports efficiently loading a specified list of blobs from
a single pack in a streaming fashion. That is there's no need for
temporary files independent of the pack size.
2022-02-12 20:18:25 +01:00
Michael Eischer 153e2ba859 repository: Implement lisiting blobs per pack file 2022-02-12 20:18:24 +01:00
greatroar 8d2996eaaa Replace siphash by hash/maphash
In Go 1.17.1, maphash has become quite a bit faster than siphash, so we
can drop one third-party dependency. maphash is just an interface to the
standard Go map's hash function, which we already trust for other use
cases.

Benchmark results on linux/amd64, -benchtime=3s:

name                                             old time/op    new time/op    delta
IndexHasUnknown-8                                  50.6ns ±10%    41.0ns ±19%  -18.92%  (p=0.000 n=9+10)
IndexHasKnown-8                                    52.6ns ±12%    41.5ns ±12%  -21.13%  (p=0.000 n=9+10)
IndexMapHash-8                                     3.64µs ± 1%    2.00µs ± 0%  -45.09%  (p=0.000 n=10+9)
IndexAlloc-8                                        700ms ± 1%     601ms ± 6%  -14.18%  (p=0.000 n=8+10)
IndexAllocParallel-8                                205ms ± 5%     192ms ± 8%   -6.18%  (p=0.043 n=10+10)
MasterIndexAlloc-8                                  319ms ± 1%     279ms ± 5%  -12.58%  (p=0.000 n=10+10)
MasterIndexLookupSingleIndex-8                      156ns ± 8%     147ns ± 6%   -5.46%  (p=0.023 n=10+10)
MasterIndexLookupMultipleIndex-8                    150ns ± 7%     142ns ± 8%   -5.69%  (p=0.007 n=10+10)
MasterIndexLookupSingleIndexUnknown-8              74.4ns ± 6%    72.0ns ± 9%     ~     (p=0.175 n=10+9)
MasterIndexLookupMultipleIndexUnknown-8            67.4ns ± 9%    65.5ns ± 7%     ~     (p=0.340 n=9+9)
MasterIndexLookupParallel/known,indices=25-8        461ns ± 2%     445ns ± 2%   -3.49%  (p=0.000 n=10+10)
MasterIndexLookupParallel/unknown,indices=25-8      408ns ±11%     378ns ± 5%   -7.22%  (p=0.035 n=10+9)
MasterIndexLookupParallel/known,indices=50-8        479ns ± 1%     437ns ± 4%   -8.82%  (p=0.000 n=10+10)
MasterIndexLookupParallel/unknown,indices=50-8      406ns ± 8%     343ns ±15%  -15.44%  (p=0.001 n=10+10)
MasterIndexLookupParallel/known,indices=100-8       480ns ± 1%     455ns ± 5%   -5.15%  (p=0.000 n=8+10)
MasterIndexLookupParallel/unknown,indices=100-8     391ns ±18%     382ns ± 8%     ~     (p=0.315 n=10+10)
MasterIndexLookupBlobSize-8                        71.0ns ± 8%    57.2ns ±11%  -19.36%  (p=0.000 n=9+10)
PackerManager-8                                     254ms ± 1%     254ms ± 1%     ~     (p=0.285 n=15+15)

name                                             old speed      new speed      delta
IndexMapHash-8                                   1.12GB/s ± 1%  2.05GB/s ± 0%  +82.13%  (p=0.000 n=10+9)
PackerManager-8                                   208MB/s ± 1%   207MB/s ± 1%     ~     (p=0.281 n=15+15)

name                                             old alloc/op   new alloc/op   delta
IndexMapHash-8                                      0.00B          0.00B          ~     (all equal)
IndexAlloc-8                                        400MB ± 0%     400MB ± 0%     ~     (p=1.000 n=9+10)
IndexAllocParallel-8                                401MB ± 0%     401MB ± 0%   +0.00%  (p=0.000 n=10+10)
MasterIndexAlloc-8                                  258MB ± 0%     262MB ± 0%   +1.42%  (p=0.000 n=9+10)
PackerManager-8                                    73.1kB ± 0%    73.1kB ± 0%     ~     (p=0.382 n=13+13)

name                                             old allocs/op  new allocs/op  delta
IndexMapHash-8                                       0.00           0.00          ~     (all equal)
IndexAlloc-8                                         907k ± 0%      907k ± 0%   -0.00%  (p=0.000 n=10+10)
IndexAllocParallel-8                                 907k ± 0%      907k ± 0%   +0.00%  (p=0.009 n=10+10)
MasterIndexAlloc-8                                   327k ± 0%      317k ± 0%   -3.06%  (p=0.000 n=10+10)
PackerManager-8                                       744 ± 0%       744 ± 0%     ~     (all equal)
2021-09-19 16:05:18 +02:00
Alexander Weiss 81876d5c1b Simplify cache logic 2021-09-03 21:01:00 +02:00
Michael Eischer 9aa2eff384 Add plumbing to calculate backend specific file hash for upload
This enables the backends to request the calculation of a
backend-specific hash. For the currently supported backends this will
always be MD5. The hash calculation happens as early as possible, for
pack files this is during assembly of the pack file. That way the hash
would even capture corruptions of the temporary pack file on disk.
2021-08-04 22:17:46 +02:00
Ryan Hitchman 77bf148460 backup: add --dry-run/-n flag to show what would happen.
This can be used to check how large a backup is or validate exclusions.
It does not actually write any data to the underlying backend. This is
implemented as a simple overlay backend that accepts writes without
forwarding them, passes through reads, and generally does the minimal
necessary to pretend that progress is actually happening.

Fixes #1542

Example usage:

$ restic -vv --dry-run . | grep add
new       /changelog/unreleased/issue-1542, saved in 0.000s (350 B added)
modified  /cmd/restic/cmd_backup.go, saved in 0.000s (16.543 KiB added)
modified  /cmd/restic/global.go, saved in 0.000s (0 B added)
new       /internal/backend/dry/dry_backend_test.go, saved in 0.000s (3.866 KiB added)
new       /internal/backend/dry/dry_backend.go, saved in 0.000s (3.744 KiB added)
modified  /internal/backend/test/tests.go, saved in 0.000s (0 B added)
modified  /internal/repository/repository.go, saved in 0.000s (20.707 KiB added)
modified  /internal/ui/backup.go, saved in 0.000s (9.110 KiB added)
modified  /internal/ui/jsonstatus/status.go, saved in 0.001s (11.055 KiB added)
modified  /restic, saved in 0.131s (25.542 MiB added)
Would add to the repo: 25.892 MiB
2021-08-04 21:19:29 +02:00
Alexander Neumann aef3658a5f Address review comments 2021-01-30 20:02:37 +01:00