Commit Graph

1238 Commits

Author SHA1 Message Date
greatroar c9557b2822 internal/repository: Fix LoadBlob + fuzz test
When given a buf that is big enough for a compressed blob but not its
decompressed contents, the copy at the end of LoadBlob would skip the
last part of the contents.

Fixes #3783.
2022-06-06 17:02:28 +02:00
Alexander Neumann d2c5843c68
Merge pull request #3704 from MichaelEischer/compression-migrations
Support migration to repository format with compression
2022-05-29 15:52:21 +02:00
Alexander Neumann 78a21bbccf
Merge pull request #3752 from MichaelEischer/fix-dir-sync-errors
local: Ignore additional errors for directory syncing
2022-05-29 12:54:51 +02:00
greatroar dde8e9e296 internal/restic: Custom ID.MarshalJSON
This skips an allocation. internal/archiver benchmarks, Linux/amd64:

name                     old time/op    new time/op    delta
ArchiverSaveFileSmall-8    3.94ms ± 6%    3.91ms ± 6%    ~     (p=0.947 n=20+20)
ArchiverSaveFileLarge-8     304ms ± 3%     301ms ± 4%    ~     (p=0.265 n=18+18)

name                     old speed      new speed      delta
ArchiverSaveFileSmall-8  1.04MB/s ± 6%  1.05MB/s ± 6%    ~     (p=0.803 n=20+20)
ArchiverSaveFileLarge-8   142MB/s ± 3%   143MB/s ± 4%    ~     (p=0.421 n=18+19)

name                     old alloc/op   new alloc/op   delta
ArchiverSaveFileSmall-8    17.9MB ± 0%    17.9MB ± 0%  -0.01%  (p=0.000 n=19+19)
ArchiverSaveFileLarge-8     382MB ± 2%     382MB ± 1%    ~     (p=0.687 n=20+19)

name                     old allocs/op  new allocs/op  delta
ArchiverSaveFileSmall-8       540 ± 1%       528 ± 0%  -2.19%  (p=0.000 n=19+19)
ArchiverSaveFileLarge-8     1.93k ± 3%     1.79k ± 4%  -7.06%  (p=0.000 n=20+20)
2022-05-27 12:26:37 +02:00
MichaelEischer 88a8701fb5
Merge pull request #3734 from lbausch/validate-patterns
Validate exclude patterns
2022-05-14 16:20:15 +02:00
MichaelEischer b2a2e5f727
Merge pull request #3753 from greatroar/indexmap-alloc
repository: Re-tune indexmap allocation strategy
2022-05-14 15:44:08 +02:00
MichaelEischer b52c631bd3
Merge pull request #3754 from greatroar/simplify-hashing
hashing: Fix up comments
2022-05-14 15:33:51 +02:00
Lorenz Bausch 36bd464e8c
Add tests for validating exclude patterns 2022-05-11 22:41:00 +02:00
greatroar 39a335e690 hashing: Fix up comments 2022-05-11 21:36:10 +02:00
greatroar 5141228e0c repository: Re-tune indexmap allocation strategy
fd05037e1a changed the allocation batch
size from 256 to 128 under the assumption that an indexEntry is 60 bytes
on amd64, but it's 64: structs are padded out to a multiple of 8 for
alignment reasons. That means we'd waste no space in malloc even without
the batch allocation, at least on 64-bit machines. While that strategy
cuts the overallocation down dramatically for many small indexes, it also
seems to slow allocation down (Go 1.18, Linux, amd64, -benchtime=2s):

    name                   old time/op    new time/op    delta
    DecodeIndex-8             4.67s ± 5%     4.60s ± 1%      ~     (p=0.953 n=10+5)
    DecodeIndexParallel-8     4.67s ± 3%     4.60s ± 1%      ~     (p=0.953 n=10+5)
    IndexHasUnknown-8        37.8ns ± 8%    36.5ns ±14%      ~     (p=0.841 n=5+5)
    IndexHasKnown-8          38.5ns ±12%    37.7ns ±10%      ~     (p=0.968 n=5+5)
    IndexAlloc-8              615ms ±18%     607ms ± 1%      ~     (p=1.000 n=10+5)
    IndexAllocParallel-8      245ms ±11%     285ms ± 6%   +16.40%  (p=0.001 n=10+5)
    MasterIndexAlloc-8        286ms ± 9%     275ms ± 2%      ~     (p=1.000 n=10+5)
    LoadIndex/v1-8           27.0ms ± 4%    26.8ms ± 1%      ~     (p=0.690 n=5+5)
    LoadIndex/v2-8           22.4ms ± 1%    22.8ms ± 2%    +1.48%  (p=0.016 n=5+5)

    name                   old alloc/op   new alloc/op   delta
    IndexAlloc-8              446MB ± 0%     446MB ± 0%    -0.00%  (p=0.000 n=8+4)
    IndexAllocParallel-8      446MB ± 0%     446MB ± 0%    -0.00%  (p=0.008 n=8+5)
    MasterIndexAlloc-8        213MB ± 0%     159MB ± 0%   -25.47%  (p=0.000 n=10+5)

    name                   old allocs/op  new allocs/op  delta
    IndexAlloc-8               913k ± 0%     2632k ± 0%  +188.19%  (p=0.008 n=5+5)
    IndexAllocParallel-8       913k ± 0%     2632k ± 0%  +188.21%  (p=0.008 n=5+5)
    MasterIndexAlloc-8         318k ± 0%     1172k ± 0%  +267.86%  (p=0.008 n=5+5)

Instead, this patch sets a batch size of 4, which means no space is
wasted by malloc on 64-bit and very little on 32-bit. It still gets very
close to the savings from not allocating in batches, without requiring
special code for bits.UintSize==64. Benchmark results, again for
Linux/amd64:

    name                   old time/op    new time/op    delta
    DecodeIndex-8             4.67s ± 5%     4.83s ± 9%     ~     (p=0.315 n=10+10)
    DecodeIndexParallel-8     4.67s ± 3%     4.68s ± 4%     ~     (p=0.315 n=10+10)
    IndexHasUnknown-8        37.8ns ± 8%    44.5ns ±19%     ~     (p=0.095 n=5+5)
    IndexHasKnown-8          38.5ns ±12%    36.9ns ± 8%     ~     (p=0.690 n=5+5)
    IndexAlloc-8              615ms ±18%     628ms ±18%     ~     (p=0.218 n=10+10)
    IndexAllocParallel-8      245ms ±11%     262ms ± 9%   +7.02%  (p=0.043 n=10+10)
    MasterIndexAlloc-8        286ms ± 9%     287ms ±13%     ~     (p=1.000 n=10+10)
    LoadIndex/v1-8           27.0ms ± 4%    26.8ms ± 0%     ~     (p=1.000 n=5+5)
    LoadIndex/v2-8           22.4ms ± 1%    22.5ms ± 0%     ~     (p=0.056 n=5+5)

    name                   old alloc/op   new alloc/op   delta
    IndexAlloc-8              446MB ± 0%     446MB ± 0%     ~     (p=1.000 n=8+10)
    IndexAllocParallel-8      446MB ± 0%     446MB ± 0%   -0.00%  (p=0.000 n=8+8)
    MasterIndexAlloc-8        213MB ± 0%     160MB ± 0%  -25.02%  (p=0.000 n=10+9)

    name                   old allocs/op  new allocs/op  delta
    IndexAlloc-8               913k ± 0%     1333k ± 0%  +45.94%  (p=0.000 n=8+10)
    IndexAllocParallel-8       913k ± 0%     1333k ± 0%  +45.94%  (p=0.000 n=8+8)
    MasterIndexAlloc-8         318k ± 0%      525k ± 0%  +64.99%  (p=0.000 n=10+10)

The allocation method indexmap.newEntry has also been rewritten in a
form that is a few instructions shorter.
2022-05-11 21:22:14 +02:00
Michael Eischer 48a0d83143 local: Ignore additional errors for directory syncing
Apparently SMB/CIFS on Linux/macOS returns somewhat random errnos when
trying to sync a windows share which does not support calling fsync for
a directory.
2022-05-11 20:37:59 +02:00
MichaelEischer ac36fda155
Merge pull request #3749 from greatroar/simplify-hashing
hashing: Remove io.WriterTo implementation
2022-05-11 20:03:43 +02:00
MichaelEischer df554e5f69
Merge pull request #3748 from greatroar/runworkers
repository: Remove RunWorkers, report ctx.Err()
2022-05-11 19:38:46 +02:00
greatroar 54b8337813 hashing: Remove io.WriterTo implementation
This functionality has gone unused since
4b3dc415ef changed hashing.Reader's only
client to use ioutil.ReadAll on a bufio.Reader wrapping the hashing
Reader.

Reverts bcb852a8d0.
2022-05-10 23:41:18 +02:00
greatroar 2e0f1f5113 repository: Remove RunWorkers, report ctx.Err()
This removes RunWorkers, which had become mere overhead by successive
refactors. It also ensures that each former user of that function
returns any context error that occurs, so failure to complete an
operation is always reported as an error.
2022-05-10 22:26:00 +02:00
MichaelEischer 47c56dea5c
Merge pull request #3746 from greatroar/cache-lstat
cache: Don't Lstat before creating CACHEDIR.TAG
2022-05-10 20:31:15 +02:00
greatroar 2da377c582 cache: Don't Lstat before creating the tag file
The tag file is opened with O_CREATE|O_EXCL and ErrExist is handled, so
we don't need to check for existence first.
2022-05-10 18:52:39 +02:00
Michael Eischer ae7e51382a Fix error on temp file deletion on windows
Apparently it can take a moment between closing a tempfile marked as
DELETE_ON_CLOSE and it actually being deleted. During that time the file
is inaccessible. Thus just skip deleting the temp file on windows.
2022-05-09 22:43:26 +02:00
Michael Eischer c1bbbcd0dc migrate: Allow migrations to request a check run
This is currently only used by upgrade_repo_v2.
2022-05-09 22:31:30 +02:00
Michael Eischer 5815f727ee checker: convert error type to use pointer-receivers 2022-05-09 22:31:30 +02:00
Michael Eischer e36a40db10 upgrade_repo_v2: Use atomic replace for supported backends 2022-05-09 22:31:30 +02:00
Michael Eischer 5406743102 prune: Automatically repack uncompressed trees for repo v2
Tree packs are cached locally at clients and thus benefit a lot from
being compressed. Ensure this be having prune always repack pack files
containing uncompressed trees.
2022-05-09 22:31:30 +02:00
Alexander Neumann c8c0d659ec Add migration to compress all data 2022-05-09 22:31:30 +02:00
Alexander Neumann 8c244214bf Add tests for upgrade migration 2022-05-09 22:31:30 +02:00
Alexander Neumann a5f1d318ac Try to make repo upgrade migration more failsafe 2022-05-09 22:31:30 +02:00
Alexander Neumann 82ed5a3a15 Add repo upgrade migration 2022-05-09 22:31:30 +02:00
Michael Eischer 92816fa966 init: Enable compression support by default 2022-05-09 22:31:30 +02:00
Lorenz Bausch 9fb81c4246
Validate exclude patterns 2022-05-07 21:12:47 +02:00
Lorenz Bausch e7fd200237
Keep original pattern for later use 2022-05-07 21:08:09 +02:00
Michael Eischer cf5cb673fb repository: Use existing method to collect pack ids 2022-04-30 19:14:21 +02:00
Michael Eischer b335cb6285 repository: Refactor index IDs collection 2022-04-30 19:14:21 +02:00
Daniel Gröber f31b4f29c1 Use config file modes to derive new dir/file modes
Fixes #2351
2022-04-30 15:59:51 +02:00
Michael Eischer 4b01b06f2f repository: Test compressed blobs in StreamPack 2022-04-30 11:34:10 +02:00
Michael Eischer bcab548617 pack: slightly expand testing of compressed blobs 2022-04-30 11:34:10 +02:00
Michael Eischer ec2b25565a repository: test uncompressedLength field and index example 2022-04-30 11:34:10 +02:00
Michael Eischer 9ffb8920f1 repository: run blackbox tests using old and new repo version 2022-04-30 11:34:10 +02:00
Michael Eischer abe5935693 repository: unify repository version-specific initialization
Mark the master index as compressed also when initializing a new
repository. This is only relevant for testing.
2022-04-30 11:34:10 +02:00
Alexander Neumann 8776031f96 Leave allocating slices to the decompress code 2022-04-30 11:34:10 +02:00
Alexander Neumann 5eb05a0afe Configure zstd encoder/decoder 2022-04-30 11:34:10 +02:00
Michael Eischer 2f36e044db Cleanup pack header check 2022-04-30 11:34:10 +02:00
Alexander Neumann 8b11b86383 Add option global --compression 2022-04-30 11:34:10 +02:00
Michael Eischer 7132df529e repository: Increase index size for repo version 2
A compressed index is only about one third the size of an uncompressed
one. Thus increase the number of entries in an index to avoid cluttering
the repository with small indexes.
2022-04-30 11:34:10 +02:00
Michael Eischer 66f9048bce repository: Alloc zstd encoder/decoder on demand 2022-04-30 11:34:10 +02:00
Michael Eischer fd05037e1a repository: recalibrate index batch allocation size 2022-04-30 11:34:10 +02:00
Michael Eischer 6fb408d90e repository: implement pack compression 2022-04-30 11:34:10 +02:00
Michael Eischer 362ab06023 init: Add flag to specify created repository version 2022-04-30 10:07:42 +02:00
Michael Eischer 4b957e7373 repository: Implement index/snapshot/lock compression
The config file is not compressed as it should remain readable by older
restic versions such that these can return a proper error.

As the old format for unpacked data does not include a version header,
make use of a trick: The old data is always encoded as JSON. Thus it can
only start with '{' or '['. For any other value the first byte indicates
a versioned format. The version is set to 2 for now. Then the zstd
compressed data follows.
2022-04-30 10:07:42 +02:00
Michael Eischer e597b99b55 repository: Reduce repack workers to prevent deadlock
As repack streams packs these occupy one backend connection. Uploading a
new pack also requires a backend connection. To prevent a deadlock
during repack when reaching the backend connections limit, simply limit
the repackWorker count to always leave one connection for uploading.
2022-04-23 11:28:18 +02:00
Michael Eischer ee627cd832 backend/mem: Actually enforce connection limit
This will allow tests to detect deadlocks related to the connections
limit.
2022-04-23 11:22:00 +02:00
Michael Eischer 4f97492d28 Backend: Expose connections parameter 2022-04-23 11:13:08 +02:00