Commit Graph

17 Commits

Author SHA1 Message Date
Michael Eischer cb9cbe55d9 repository: store oversized blobs in separate pack files
Store oversized blobs in separate pack files as the blobs is large
enough to warrant its own pack file. This simplifies the garbage
collection of such blobs and keeps the cache smaller, as oversize (tree)
blobs only have to be downloaded if they are actually used.
2023-10-17 22:52:16 +02:00
Michael Eischer 0a6fa602c8 add option for setting min pack size 2022-08-05 23:47:12 +02:00
Michael Eischer 753e56ee29 repository: Limit to a single pending pack file
Use only a single not completed pack file to keep the number of open and
active pack files low. The main change here is to defer hashing the pack
file to the upload step. This prevents the pack assembly step to become
a bottleneck as the only task is now to write data to the temporary pack
file.

The tests are cleaned up to no longer reimplement packer manager
functions.
2022-07-02 22:42:34 +02:00
Michael Eischer 120ccc8754 repository: Rework blob saving to use an async pack uploader
Previously, SaveAndEncrypt would assemble blobs into packs and either
return immediately if the pack is not yet full or upload the pack file
otherwise. The upload will block the current goroutine until it
finishes.

Now, the upload is done using separate goroutines. This requires changes
to the error handling. As uploads are no longer tied to a SaveAndEncrypt
call, failed uploads are signaled using an errgroup.

To count the uploaded amount of data, the pack header overhead is no
longer returned by `packer.Finalize` but rather by
`packer.HeaderOverhead`. This helper method is necessary to continue
returning the pack header overhead directly to the responsible call to
`repository.SaveBlob`. Without the method this would not be possible,
as packs are finalized asynchronously.
2022-07-02 22:42:34 +02:00
Michael Eischer a6e9e08034 Account for pack header overhead at each entry
This will miss the pack header crypto overhead and the length field,
which only amount to a few bytes per pack file.
2022-07-02 18:55:58 +02:00
Michael Eischer 6fb408d90e repository: implement pack compression 2022-04-30 11:34:10 +02:00
Michael Eischer 9aa2eff384 Add plumbing to calculate backend specific file hash for upload
This enables the backends to request the calculation of a
backend-specific hash. For the currently supported backends this will
always be MD5. The hash calculation happens as early as possible, for
pack files this is during assembly of the pack file. That way the hash
would even capture corruptions of the temporary pack file on disk.
2021-08-04 22:17:46 +02:00
aawsome 0fed6a8dfc
Use "pack file" instead of "data file" (#2885)
- changed variable names, especially changed DataFile into PackFile
- changed in some comments
- always use "pack file" in docu
2020-08-16 11:16:38 +02:00
greatroar b592614061 Improve PackerManager benchmark
The previous benchmark spent much of its time allocating RNGs and
generating too many random numbers. It now spends 90% of its time
hashing and half of the rest writing to files.

name             old time/op    new time/op    delta
PackerManager-8     319ms ± 1%     247ms ± 1%  -22.48%  (p=0.000 n=20+18)

name             old speed      new speed      delta
PackerManager-8   143MB/s ± 1%   213MB/s ± 1%  +48.63%  (p=0.000 n=10+18)

name             old alloc/op   new alloc/op   delta
PackerManager-8     635kB ± 0%      92kB ± 0%  -85.48%  (p=0.000 n=10+19)

name             old allocs/op  new allocs/op  delta
PackerManager-8     1.64k ± 0%     1.43k ± 0%  -12.76%  (p=0.000 n=10+20)
2020-03-05 22:30:03 +01:00
greatroar b7c3039eb2 Remove Go 1.5 compatibility code from PackerManager benchmark
This alone is enough to speed up the benchmark by ~10%.
2020-03-05 22:29:06 +01:00
Gábor Lipták e5d7879622
Correct ineffassign
Signed-off-by: Gábor Lipták <gliptak@gmail.com>
2018-10-19 16:58:14 -04:00
Alexander Neumann 99f7fd74e3 backend: Improve Save()
As mentioned in issue [#1560](https://github.com/restic/restic/pull/1560#issuecomment-364689346)
this changes the signature for `backend.Save()`. It now takes a
parameter of interface type `RewindReader`, so that the backend
implementations or our `RetryBackend` middleware can reset the reader to
the beginning and then retry an upload operation.

The `RewindReader` interface also provides a `Length()` method, which is
used in the backend to get the size of the data to be saved. This
removes several ugly hacks we had to do to pull the size back out of the
`io.Reader` passed to `Save()` before. In the `s3` and `rest` backend
this is actively used.
2018-03-03 15:49:44 +01:00
Alexander Neumann 3541d06d07 repo: Split packers for tree and data
The code now bundles tree blobs and data blobs into different pack
files, so we'll end up with pack files that either only contain data or
trees. This is in preparation to adding a cache (#1040), because
tree-only pack files can easily be cached later on.
2017-09-22 15:36:47 +02:00
Alexander Neumann db0e3cd772 repo: Remove packer limits
This commit simplifies finding a packer: The first open packer is taken,
and the upper limit for the pack file is removed.
2017-09-22 15:36:47 +02:00
Alexander Neumann 23c903074c Move restic package to internal/restic 2017-07-24 17:43:32 +02:00
Alexander Neumann 6caeff2408 Run goimports 2017-07-23 14:21:03 +02:00
Alexander Neumann 83d1a46526 Moves files 2017-07-23 14:19:13 +02:00