Commit Graph

11 Commits

Author SHA1 Message Date
Michael Eischer c4fc5c97f9 prune: Use a single CountedBlobSet to track blobs
The set covers necessary, existing and duplicate blobs. This removes the
duplicate sets used to track whether all necessary blobs also exist.
This reduces the memory usage of prune by about 20-30%.
2022-10-22 18:45:12 +02:00
Michael Eischer fbcbd5318c repository: extract LoadTree/SaveTree
The repository has no real idea what a Tree is. So these methods never
belonged there.
2022-07-17 13:11:28 +02:00
Michael Eischer 6f53ecc1ae adapt workers based on whether an operation is CPU or IO-bound
Use runtime.GOMAXPROCS(0) as worker count for CPU-bound tasks,
repo.Connections() for IO-bound task and a combination if a task can be
both. Streaming packs is treated as IO-bound as adding more worker
cannot provide a speedup.

Typical IO-bound tasks are download / uploading / deleting files.
Decoding / Encoding / Verifying are usually CPU-bound. Several tasks are
a combination of both, e.g. for combined download and decode functions.
In the latter case add both limits together. As the backends have their
own concurrency limits restic still won't download more than
repo.Connections() files in parallel, but the additional workers can
decode already downloaded data in parallel.
2022-07-03 12:19:26 +02:00
Michael Eischer 254c8743fc Limit number of large tree blobs loaded in parallel by StreamTrees
Load tree blobs with more than 50MB only from a single goroutine. Very
large tree blobs with for example 400 MB size can otherwise require
roughly 1GB * streamTreeParallelism memory.
2022-02-19 12:26:09 +01:00
Michael Eischer eda8c67616 restic: let FindUsedBlobs handle multiple snapshots at once 2021-01-28 11:08:43 +01:00
Michael Eischer 258ce0c1e5 parallel: report progress for StreamTrees
This assigns an id to each tree root and then keeps track of how many
tree loads (i.e. trees referenced for the first time) are pending per
tree root. Once a tree root and its subtrees were fully processed there
are no more pending tree loads and the tree root is reported as
processed.
2021-01-28 11:08:43 +01:00
Michael Eischer f2a1b125cb restic: Actually parallelize FindUsedBlobs 2021-01-28 11:08:43 +01:00
Michael Eischer af66a62c04 FindUsedBlobs: Test that seen blobs are skipped
This also copies the TreeLoader interface from internal/walker to allow
stubbing the repository in the call to `FindUsedBlobs`.
2020-08-01 12:29:16 +02:00
Michael Eischer 9ea1a78bd4 FindUsedBlobs: Check for seen blobs before loading trees
The only effective change in behavior is that that toplevel nodes can
also be skipped.
2020-08-01 12:29:16 +02:00
Michael Eischer 184103647a FindUsedBlobs: merge seen into blobs BlobSet
The seen BlobSet always contained a subset of the entries in blobs.
Thus use blobs instead and avoid the memory overhead of the second set.

Suggested-by: Alexander Weiss <alex@weissfam.de>
2020-08-01 12:29:16 +02:00
Alexander Neumann 23c903074c Move restic package to internal/restic 2017-07-24 17:43:32 +02:00