102 Commits

Author SHA1 Message Date
Michael Eischer 66818a8f98
Merge pull request #3980 from MichaelEischer/prune-compression-stats
prune: Correctly count used/duplicate blobs for partially compressed repos
2022-11-12 20:06:56 +01:00
Michael Eischer 59a90943bb
Merge pull request #3983 from greatroar/formatting
Centralize and fix formatting of bytes, percentages, durations
2022-10-31 18:52:24 +01:00
greatroar 006380199e cmd, ui: Deduplicate formatting utilities 2022-10-23 13:40:07 +02:00
Michael Eischer ba58ccbe07 prune: add remark about non-deterministic blob selection 2022-10-22 19:46:10 +02:00
Michael Eischer 05651d6d4f prune: Correctly count used/duplicate blobs for partially compressed repos
Counting the first occurrence of a duplicate blob as used and counting
all others as duplicates, independent of which instance of the blob is
kept, is only accurate if all copies of the blob have the same size. This
is no longer the case for a repository containing both compressed and
uncompressed blobs.

Thus for duplicated blobs first count all instances as duplicates and
then subtract the actually used instance later on.
2022-10-22 19:24:36 +02:00
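
The two-step counting described in this commit message can be illustrated with a minimal sketch. This is not restic's actual code: `blobCopy`, `stats`, and `countBlobs` are hypothetical names, and the sketch assumes that every listed blob is needed and that a pack holds at most one copy of a given blob.

```go
package main

import "fmt"

// blobCopy is a hypothetical record for one stored instance of a blob.
type blobCopy struct {
	ID   string // blob ID, shortened for the example
	Pack string // pack file that holds this copy
	Size uint64 // stored size; compressed and uncompressed copies differ
}

// stats holds the byte counters that are reported to the user.
type stats struct {
	Used      uint64
	Duplicate uint64
}

// countBlobs first counts every copy of a duplicated blob as a duplicate.
// Once it is known which copy is kept (kept maps blob ID to pack name),
// the size of exactly that copy is moved from Duplicate to Used.
func countBlobs(copies []blobCopy, kept map[string]string) stats {
	perID := make(map[string]int)
	for _, c := range copies {
		perID[c.ID]++
	}

	var s stats
	for _, c := range copies {
		if perID[c.ID] == 1 {
			s.Used += c.Size
		} else {
			s.Duplicate += c.Size
		}
	}

	// Subtract the instance that is actually used.
	for _, c := range copies {
		if perID[c.ID] > 1 && kept[c.ID] == c.Pack {
			s.Duplicate -= c.Size
			s.Used += c.Size
		}
	}
	return s
}

func main() {
	copies := []blobCopy{
		{ID: "a", Pack: "p1", Size: 100}, // uncompressed copy
		{ID: "a", Pack: "p2", Size: 40},  // compressed copy of the same blob
		{ID: "b", Pack: "p1", Size: 10},
	}
	s := countBlobs(copies, map[string]string{"a": "p2"})
	fmt.Println(s.Used, s.Duplicate) // prints "50 100"
}
```

Counting the first occurrence as used would have reported 110 used bytes here, although the kept copy of blob `a` only takes 40 bytes.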
Michael Eischer d966c52707 prune: allow gc of set of repacked blobs before index rebuild 2022-10-22 18:45:12 +02:00
Michael Eischer 68c9cb9c6a prune: Shrink keepBlobs set if possible
As long as only a small fraction of the data in a repository is
rewritten, the keepBlobs set will be rather small after cleaning it up.
As golang maps do not shrink their memory usage, just copy the contents
over to a new map. However, only copy the map if the cleanup removed at
least half the entries.
2022-10-22 18:45:12 +02:00
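
A minimal sketch of the map-shrinking idea from this commit message. The helper name `shrinkIfSparse` and the use of string keys are assumptions made for illustration; the real code works on restic's own blob set type.

```go
package main

import "fmt"

// shrinkIfSparse returns a freshly allocated copy of set if the cleanup
// removed at least half of the sizeBefore entries. Otherwise the original
// map is returned unchanged. Copying lets the garbage collector free the
// old, oversized map, because deleting keys does not shrink a Go map.
func shrinkIfSparse(set map[string]struct{}, sizeBefore int) map[string]struct{} {
	if len(set)*2 > sizeBefore {
		// Less than half of the entries were removed: not worth a copy.
		return set
	}
	fresh := make(map[string]struct{}, len(set))
	for k := range set {
		fresh[k] = struct{}{}
	}
	return fresh
}

func main() {
	keep := make(map[string]struct{})
	for i := 0; i < 1000; i++ {
		keep[fmt.Sprintf("blob-%d", i)] = struct{}{}
	}
	sizeBefore := len(keep)

	// Cleanup: drop 900 of the 1000 entries.
	for i := 0; i < 900; i++ {
		delete(keep, fmt.Sprintf("blob-%d", i))
	}

	keep = shrinkIfSparse(keep, sizeBefore)
	fmt.Println(len(keep)) // prints "100"
}
```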
Michael Eischer c4fc5c97f9 prune: Use a single CountedBlobSet to track blobs
The set covers necessary, existing and duplicate blobs. This removes the
duplicate sets used to track whether all necessary blobs also exist.
This reduces the memory usage of prune by about 20-30%.
2022-10-22 18:45:12 +02:00
Michael Eischer 2e3f1c08c5 repository: split index into a separate package 2022-10-08 21:15:34 +02:00
Michael Eischer 6d2d297215 pass global context through cobra 2022-10-03 00:19:46 +02:00
Michael Eischer 928914f821 Prepare for context bound to lock lifetime 2022-10-03 00:19:46 +02:00
Michael Eischer 985722b102 Remove ctx from globalOptions
Previously the global context was either accessed via gopts.ctx,
stored in a local variable and then used within that function or
sometimes both. This makes it very hard to follow which ctx or a wrapped
version of it reaches which method.

Thus just drop the context from the globalOptions struct and pass it
explicitly to every command line handler method.
2022-10-03 00:19:46 +02:00
Michael Eischer 1ebd57247a repository: optimize MasterIndex.Each
Sending data through a channel at very high frequency is extremely
inefficient. Thus use simple callbacks instead of channels.

> name                old time/op  new time/op  delta
> MasterIndexEach-16   6.68s ±24%   0.96s ± 2%  -85.64%  (p=0.008 n=5+5)
2022-09-24 12:21:59 +02:00
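
To illustrate the difference between the two iteration styles, here is a minimal sketch. The `index` type and both methods are hypothetical and much simpler than `MasterIndex`; the sketch only shows the shape of the two APIs and does not reproduce the benchmark above.

```go
package main

import "fmt"

// index is a hypothetical stand-in for an index holding blob IDs.
type index struct {
	blobs []string
}

// EachChan is the channel-based variant: every element is handed over
// through a channel, which requires a goroutine and one synchronised
// send/receive per element.
func (idx *index) EachChan() <-chan string {
	ch := make(chan string)
	go func() {
		defer close(ch)
		for _, b := range idx.blobs {
			ch <- b
		}
	}()
	return ch
}

// Each is the callback-based variant: fn is called directly for every
// element, without goroutines or channel synchronisation.
func (idx *index) Each(fn func(blob string)) {
	for _, b := range idx.blobs {
		fn(b)
	}
}

func main() {
	idx := &index{blobs: []string{"a", "b", "c"}}

	viaChan := 0
	for range idx.EachChan() {
		viaChan++
	}

	viaCallback := 0
	idx.Each(func(string) { viaCallback++ })

	fmt.Println(viaChan, viaCallback) // prints "3 3"
}
```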
mattxtaz 01ab36336f Fix typo with double percentage in help text 2022-08-07 20:21:05 +01:00
Michael Eischer 55a11c1396 Reword prune --repack-small description 2022-08-05 23:48:36 +02:00
Michael Eischer 176b387d98 Always repack very small pack files 2022-08-05 23:48:36 +02:00
Michael Eischer 324935cb80 Only repack small files if there are multiple of them 2022-08-05 23:48:34 +02:00
Michael Eischer 1b076cda97 rename option to --pack-size 2022-08-05 23:47:43 +02:00
Michael Eischer 6a6d313c9a prune: reduce priority of repacking small packs 2022-08-05 23:47:12 +02:00
Kyle Brennan 0269381b8d prune: add repack-small parameter 2022-08-05 23:47:12 +02:00
Michael Eischer e85a21eda2 prune: move code 2022-07-30 17:37:07 +02:00
Michael Eischer d0590b7841 prune: Add internal integrity check
After repacking every blob that should be kept must have been repacked.
We have seen a few cases in which a single blob went missing, which
could have been caused by a bitflip somewhere. This sanity check might
help catch some of these cases.
2022-07-30 17:37:07 +02:00
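
A sanity check of this kind could look roughly like the following sketch. It assumes that the blobs to keep and the blobs actually repacked are both available as sets of IDs; `missingAfterRepack` is a hypothetical helper, not the function used in prune.

```go
package main

import (
	"fmt"
	"sort"
)

// missingAfterRepack returns the sorted IDs of all blobs that should have
// been kept but do not show up among the repacked blobs. An empty result
// means the check passed.
func missingAfterRepack(keep, repacked map[string]struct{}) []string {
	var missing []string
	for id := range keep {
		if _, ok := repacked[id]; !ok {
			missing = append(missing, id)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	keep := map[string]struct{}{"a": {}, "b": {}, "c": {}}
	repacked := map[string]struct{}{"a": {}, "c": {}}

	if missing := missingAfterRepack(keep, repacked); len(missing) > 0 {
		// A real implementation would abort here, before deleting old packs.
		fmt.Println("repack lost blobs:", missing) // prints "repack lost blobs: [b]"
	}
}
```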
Michael Eischer 5cbde03eae prune: split into smaller functions 2022-07-30 17:37:07 +02:00
Alexander Weiss 7643237da5 prune: separate collecting/printing/pruning 2022-07-30 17:37:07 +02:00
Michael Eischer 715d457aad prune: code cleanups 2022-07-17 11:41:56 +02:00
Michael Eischer 9be1bd2acc prune: handle very high duplication of some blobs
Suggested-By: Alexander Weiss <alex@weissfam.de>
2022-07-17 11:39:56 +02:00
Alexander Weiss 7478cbf70e prune: Enhance treatment of duplicates 2022-07-17 00:22:23 +02:00
Michael Eischer ec4dfa3c66 Wording: replace further repo occurrences with repository 2022-07-12 20:48:01 +02:00
Michael Eischer ed8aa15376 repository: add Save method to MasterIndex interface 2022-07-02 18:38:56 +02:00
Alexander Neumann 74f7fe2b98
Merge pull request #3767 from MichaelEischer/fix-prune-empty-snapshot
prune: Fix crash on snapshot loading error
2022-05-29 16:52:45 +02:00
Michael Eischer c8e1ac4049 prune: Don't print stack trace if snapshot can't be loaded 2022-05-23 22:38:45 +02:00
Michael Eischer 173695104c prune: Fix crash on empty snapshot 2022-05-23 22:32:59 +02:00
Michael Eischer 381bd94c6c prune: Add option to repack uncompressed data 2022-05-09 22:31:30 +02:00
Michael Eischer 5406743102 prune: Automatically repack uncompressed trees for repo v2
Tree packs are cached locally at clients and thus benefit a lot from
being compressed. Ensure this by having prune always repack pack files
containing uncompressed trees.
2022-05-09 22:31:30 +02:00
Michael Eischer dbbeac7174 prune: Add unsafe option to recover from no free space
The new option allows prune to operate with nearly no scratch space by only removing
no longer necessary pack files and first deleting the index before
rebuilding it. By first deleting the index it becomes safe to just
delete no longer necessary pack files. However, as a downside there's
now the risk that the repository becomes inaccessible if prune fails.

To recover from that problem a user might have to manually delete the
repository index and then run (a full) `rebuild-index` again.
2022-04-30 19:21:07 +02:00
Michael Eischer f5609d1d3c prune: Fail early if too few backend connections 2022-04-23 11:32:52 +02:00
Michael Eischer 3d29083e60 copy/find/ls/recover/stats: Memorize snapshot listing before index
These commands filter the snapshots according to some criteria which
essentially requires loading the index before filtering the snapshots.
Thus create a copy of the snapshots list beforehand and use it later on.
2022-04-09 12:26:30 +02:00
Michael Eischer 2ec0f3303a backup/diff/dump/restore/stats: List snapshots before index
During a backup the index is written before the corresponding snapshots.
To ensure that a concurrent/later restic run can read a snapshot's data,
restic thus must first load the snapshots and only afterwards the index.
Otherwise it is not possible to ensure that the loaded index is recent
enough to cover all of the snapshot's data.
2022-04-09 12:24:09 +02:00
Michael Eischer f78bd14e28 repository: Remove pack implementation details from MasterIndex 2022-03-28 22:09:49 +02:00
Michael Eischer 537b4c310a copy: Implement by reusing repack
The repack operation copies all selected blobs from a set of pack files
into new pack files. For prune the source and destination repositories
are identical. To implement copy, just use a different source and
destination repository.
2022-03-26 20:47:15 +01:00
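
The idea that one repack routine serves both prune and copy can be sketched with an in-memory model. Everything below is hypothetical: the `repository` type, `addPack`, and this `repack` signature are illustration only and do not match restic's API.

```go
package main

import "fmt"

// repository is a toy in-memory model: pack name -> blob ID -> blob data.
type repository struct {
	packs    map[string]map[string]string
	nextPack int
}

func newRepository() *repository {
	return &repository{packs: make(map[string]map[string]string)}
}

// addPack stores the given blobs as a new pack and returns its name.
func (r *repository) addPack(blobs map[string]string) string {
	r.nextPack++
	name := fmt.Sprintf("pack-%d", r.nextPack)
	r.packs[name] = blobs
	return name
}

// repack copies the selected blobs from the given packs of src into a new
// pack in dst and returns the name of that pack. src and dst may be the
// same repository (prune) or different ones (copy).
func repack(src, dst *repository, packs []string, keep map[string]bool) string {
	out := make(map[string]string)
	for _, name := range packs {
		for id, data := range src.packs[name] {
			if keep[id] {
				out[id] = data
			}
		}
	}
	return dst.addPack(out)
}

func main() {
	repo := newRepository()
	old := repo.addPack(map[string]string{"a": "data-a", "b": "data-b"})

	// Prune: source and destination are identical; the old pack is
	// removed once its still needed blobs have been rewritten.
	pruned := repack(repo, repo, []string{old}, map[string]bool{"a": true})
	delete(repo.packs, old)

	// Copy: same routine, but with a different destination repository.
	other := newRepository()
	copied := repack(repo, other, []string{pruned}, map[string]bool{"a": true})

	fmt.Println(len(repo.packs), len(other.packs[copied])) // prints "1 1"
}
```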
rawtaz abfbacf3d3
Merge pull request #3591 from MichaelEischer/prune-fix-max-repack
prune: Handle --max-repack-size=0 as expected
2022-01-13 03:53:20 +01:00
mattxtaz 6ff32ee4d3 Add missing colon in prune stats output and change padding to 14 chars to align the fields 2022-01-06 21:15:15 +00:00
Michael Eischer 0cfdb82ea4 prune: Handle --max-repack-size=0 as expected
Previously the flag was ignored and `--max-repack-size=1` had to be
used.
2021-12-27 15:48:56 +01:00
greatroar 7f0aa49f45 cmd/restic: Streamline progress printing
* PrintProgress no longer does unnecessary Sprintf calls, and performs
  fewer allocations in general
* newProgressMax's callback checks whether the terminal supports
  line updates once instead of once per call
* the callback looks up the terminal width once per call instead of
  twice (on Windows)
* the status shortening now uses the Unicode-aware version from
  internal/ui/termstatus (future-proofing)
2021-09-03 11:48:22 +02:00
Michael Eischer aad8864835 prune: Add missing newlines in error descriptions 2021-06-20 14:25:40 +02:00
Alexander Neumann 027a51529d prune: Improve error message for missing files
This commit changes the error message so that a list of file names is
printed. Before, just the raw map was printed, which is not a great user
interface.
2021-01-31 11:31:27 +01:00
Alexander Weiss e08e65dc30 prune: Simplify logic selecting packs to repack 2021-01-29 22:27:22 +01:00
Alexander Weiss daeb4cdf8f prune: Fix statistics for --repack-cacheable-only 2021-01-29 22:27:22 +01:00
Alexander Neumann a16ce65295
Merge pull request #3244 from MichaelEischer/better-damage-reports
Print more details about possible repository damages
2021-01-29 11:45:45 +01:00
Michael Eischer 58b5679f14 prune: reword missing blobs error
The previous wording could be understood such that the prune run did
damage the repository.
2021-01-28 21:48:24 +01:00