Commit Graph

42 Commits

Author SHA1 Message Date
Michael Eischer 1ebd57247a repository: optimize MasterIndex.Each
Sending data through a channel at very high frequency is extremely
inefficient. Thus use simple callbacks instead of channels.

> name                old time/op  new time/op  delta
> MasterIndexEach-16   6.68s ±24%   0.96s ± 2%  -85.64%  (p=0.008 n=5+5)
2022-09-24 12:21:59 +02:00
Michael Eischer 77b1980d8e repository: MasterIndex.Packs: reduce allocations 2022-08-19 21:10:43 +02:00
Michael Eischer 6ff9517e45 repository: MasterIndex.ListPacks / Index.EachByPack allow earlier GC
Allow earlier garbage collection of some of the intermediate data
structures.
2022-08-19 21:06:33 +02:00
Michael Eischer 6f53ecc1ae adapt workers based on whether an operation is CPU or IO-bound
Use runtime.GOMAXPROCS(0) as worker count for CPU-bound tasks,
repo.Connections() for IO-bound task and a combination if a task can be
both. Streaming packs is treated as IO-bound as adding more worker
cannot provide a speedup.

Typical IO-bound tasks are download / uploading / deleting files.
Decoding / Encoding / Verifying are usually CPU-bound. Several tasks are
a combination of both, e.g. for combined download and decode functions.
In the latter case add both limits together. As the backends have their
own concurrency limits restic still won't download more than
repo.Connections() files in parallel, but the additional workers can
decode already downloaded data in parallel.
2022-07-03 12:19:26 +02:00
Michael Eischer 04c23fa95d rebuild-index: correctly rebuild index for mixed packs
For mixed packs, data and tree blobs were stored in separate index
entries. This results in warning from the check command and maybe other
problems.
2022-07-02 19:24:02 +02:00
Michael Eischer 2cd7e90ad1 repository: cleanup 2022-07-02 18:39:59 +02:00
Michael Eischer 1974ad7ce2 repository: hide MasterIndex.FinalizeFullIndexes / FinalizeNotFinalIndexes 2022-07-02 18:39:59 +02:00
Michael Eischer ef53ca4a5a repository: remove MasterIndex.All() 2022-07-02 18:39:59 +02:00
Michael Eischer bf81bf0795 repository: Properly set id for finalized index
As MergeFinalIndex and index uploads can occur concurrently, it is
necessary for MergeFinalIndex to check whether the IDs for an index were
already set before merging it. Otherwise, we'd loose the ID of an index
which is set _after_ uploading it.
2022-07-02 18:39:59 +02:00
Michael Eischer e0a7852b8b repository: remove unused (Master)Index.Count 2022-07-02 18:39:58 +02:00
Michael Eischer ed8aa15376 repository: add Save method to MasterIndex interface 2022-07-02 18:38:56 +02:00
Michael Eischer a77d5c4d11 repository: index saving belongs into the MasterIndex 2022-07-02 18:38:56 +02:00
greatroar 2e0f1f5113 repository: Remove RunWorkers, report ctx.Err()
This removes RunWorkers, which had become mere overhead by successive
refactors. It also ensures that each former user of that function
returns any context error that occurs, so failure to complete an
operation is always reported as an error.
2022-05-10 22:26:00 +02:00
Michael Eischer b335cb6285 repository: Refactor index IDs collection 2022-04-30 19:14:21 +02:00
Michael Eischer 7132df529e repository: Increase index size for repo version 2
A compressed index is only about one third the size of an uncompressed
one. Thus increase the number of entries in an index to avoid cluttering
the repository with small indexes.
2022-04-30 11:34:10 +02:00
Michael Eischer f78bd14e28 repository: Remove pack implementation details from MasterIndex 2022-03-28 22:09:49 +02:00
Michael Eischer 153e2ba859 repository: Implement lisiting blobs per pack file 2022-02-12 20:18:24 +01:00
Alexander Weiss 81876d5c1b Simplify cache logic 2021-09-03 21:01:00 +02:00
Alexander Neumann 16313bfcc9 errcheck: Add error check for MergeFinalIndexes() 2021-01-30 20:02:37 +01:00
Michael Eischer de99207046 repository: tweak comment for packs method 2020-12-22 23:01:58 +01:00
Alexander Weiss 68b74e359e Count packs directly in RebuildIndexFiles 2020-12-22 23:01:58 +01:00
Alexander Weiss aa7a5f19c2 Use BlobHandle in index methods 2020-11-22 20:41:12 +01:00
Alexander Weiss e3013271a6 Harmonize naming 2020-11-22 20:41:12 +01:00
Alexander Weiss ce5d630681 Add MasterIndex.PackSize() 2020-11-21 22:13:54 +01:00
Alexander Weiss 187c8fb259 Parallelize MasterIndex.Save() 2020-11-15 07:05:09 +01:00
Alexander Weiss 1ec628ddf5 Add extraObsolete to MasterIndex.Save 2020-11-15 07:05:09 +01:00
greatroar ddca699cd2 Replace restic.Progress with new progress.Counter
This fixes two race conditions while cleaning up the code.
2020-11-09 12:12:35 +01:00
Alexander Weiss 38cc4393f6 Add Masterindex.Save(); Add Index.Packs() 2020-11-06 20:23:30 +01:00
Sergio Rubio e708628cfd
Remove unused function
Not currently used, and it'd need to be added to the MasterIndex interface first.
2020-10-28 13:24:49 +01:00
Michael Eischer d6cfe857b7 pass proper context into MasterIndex.RebuildIndex 2020-10-09 22:39:07 +02:00
Michael Eischer d0329cf3eb Adjust comments to match name of exported methods 2020-09-05 10:07:16 +02:00
Alexander Weiss 9d1fb94c6c make Lookup() return all blobs
+ simplify syntax
2020-07-25 21:18:34 +02:00
Alexander Weiss e388d962a5 Merge final indexes together for faster index access 2020-07-22 21:54:02 +02:00
Michael Eischer d92e2c5769 simplify index code 2020-06-14 07:50:19 +02:00
Alexander Weiss 91906911b0 Fix non-intuitive repository behavior
- The SaveBlob method now checks for duplicates.
- Moves handling of pending blobs to MasterIndex.
  -> also cleans up pending index entries when they are saved in the index
  -> when using SaveBlob no need to care about index any longer
- Always check for full index and save it when storing packs.
  -> removes the need of an index uploader
  -> also removes the verbose "uploaded intermediate index" messages
- The Flush method now also saves the index
- Fix race condition when checking and saving full/non-finalized indexes
2020-06-11 13:05:23 +02:00
Alexander Neumann 663c57ab4d debug: Remove manual Str() call Log() 2018-01-25 20:49:41 +01:00
Matthew Dawson 3789e55e20
repostiory/index: Remove logging from Lookup function.
The logging in these functions double the time they take to execute.
However, it is only really useful on failures, which are better
reported by the calling functions.

benchmark                                            old ns/op     new ns/op     delta
BenchmarkMasterIndexLookupSingleIndex-6              897           395           -55.96%
BenchmarkMasterIndexLookupMultipleIndex-6            2001          1090          -45.53%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       492           215           -56.30%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     1649          912           -44.69%

benchmark                                            old allocs     new allocs     delta
BenchmarkMasterIndexLookupSingleIndex-6              9              1              -88.89%
BenchmarkMasterIndexLookupMultipleIndex-6            19             1              -94.74%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       6              0              -100.00%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     16             0              -100.00%

benchmark                                            old bytes     new bytes     delta
BenchmarkMasterIndexLookupSingleIndex-6              160           96            -40.00%
BenchmarkMasterIndexLookupMultipleIndex-6            240           96            -60.00%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       48            0             -100.00%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     128           0             -100.00%
2018-01-23 22:28:38 -05:00
Matthew Dawson df2c03a6a4
repository/master_index: Optimize Index.Lookup()
When looking up a blob in the master index, with several
indexes present in the master index, a significant amount of time
is spent generating errors for each failed lookup.  However, these
errors are often used to check if a blob is present, but the contents
are not inspected making the overhead of the error not useful.

Instead, change Index.Lookup (and Index.LookupSize) to instead return
a boolean denoting if the blob was found instead of an error.  Also change
all the calls to these functions to handle the new function signature.

benchmark                                            old ns/op     new ns/op     delta
BenchmarkMasterIndexLookupSingleIndex-6              820           897           +9.39%
BenchmarkMasterIndexLookupMultipleIndex-6            12821         2001          -84.39%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       5378          492           -90.85%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     17026         1649          -90.31%

benchmark                                            old allocs     new allocs     delta
BenchmarkMasterIndexLookupSingleIndex-6              9              9              +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6            59             19             -67.80%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       22             6              -72.73%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     72             16             -77.78%

benchmark                                            old bytes     new bytes     delta
BenchmarkMasterIndexLookupSingleIndex-6              160           160           +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6            3200          240           -92.50%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       1232          48            -96.10%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     4272          128           -97.00%
2018-01-23 22:25:56 -05:00
Felix Lee cc5ada63a4 fixes #1251, race when writing indexes 2017-10-07 05:11:42 -07:00
Alexander Neumann 23c903074c Move restic package to internal/restic 2017-07-24 17:43:32 +02:00
Alexander Neumann 6caeff2408 Run goimports 2017-07-23 14:21:03 +02:00
Alexander Neumann 83d1a46526 Moves files 2017-07-23 14:19:13 +02:00