Commit Graph

116 Commits

Author SHA1 Message Date
Alexander Neumann bdfedf1f5b
Merge pull request #3173 from MichaelEischer/unify-index-loading
Unify index loading
2021-01-28 13:50:42 +01:00
Michael Eischer e2b0072441 check: add progress bar to the tree structure check 2021-01-28 11:10:50 +01:00
Michael Eischer 258ce0c1e5 parallel: report progress for StreamTrees
This assigns an id to each tree root and then keeps track of how many
tree loads (i.e. trees referenced for the first time) are pending per
tree root. Once a tree root and its subtrees have been fully processed,
there are no more pending tree loads and the tree root is reported as
processed.
2021-01-28 11:08:43 +01:00
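
A minimal sketch of the counting scheme this commit describes, assuming an
illustrative rootProgress type (not restic's actual implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// rootProgress tracks, per tree root, how many tree loads are still
// pending. Once the count for a root drops back to zero, the whole root
// is reported as processed. Illustrative sketch only.
type rootProgress struct {
	mu      sync.Mutex
	pending map[int]int      // tree root id -> pending tree loads
	done    func(rootID int) // called when a root is fully processed
}

// addPending is called when a tree belonging to rootID is referenced for
// the first time and therefore queued for loading.
func (p *rootProgress) addPending(rootID int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.pending[rootID]++
}

// finishLoad is called once a queued tree has been loaded and its
// subtrees have been queued. When no loads remain, the root is done.
func (p *rootProgress) finishLoad(rootID int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.pending[rootID]--
	if p.pending[rootID] == 0 {
		p.done(rootID)
	}
}

func main() {
	p := &rootProgress{
		pending: map[int]int{},
		done:    func(id int) { fmt.Println("tree root processed:", id) },
	}
	p.addPending(1) // root tree queued
	p.addPending(1) // a subtree referenced for the first time
	p.finishLoad(1) // subtree loaded
	p.finishLoad(1) // root loaded -> root 1 reported as processed
}
```
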
Michael Eischer 6e03f80ca2 check: Split the parallelized tree loader into a reusable component
The actual code change is minimal
2021-01-28 11:08:43 +01:00
Michael Eischer 1d7bb01a6b check: Cleanup tree loading and switch to use errgroup
The helper methods are now wired up in the Structure method.
2021-01-28 11:08:43 +01:00
Alexander Weiss 2a1add7538 check: remove file size counter 2020-12-23 02:34:31 +01:00
Michael Eischer 96904f8972 check: extract parallel index loading 2020-12-22 22:36:18 +01:00
Alexander Weiss 26f85779be Parallelize ForAllSnapshots 2020-12-06 05:09:58 +01:00
Alexander Weiss aa7a5f19c2 Use BlobHandle in index methods 2020-11-22 20:41:12 +01:00
Alexander Weiss a851c53cbe Use PackSize in checker 2020-11-21 22:13:54 +01:00
Alexander Weiss c3ddde9e7d Return hdrSize in ListPack 2020-11-21 22:13:54 +01:00
Michael Eischer 1f43cac12d check: Only track data blobs when unused blobs should be reported
This greatly improves the memory usage of check, as it now only has to
track tree blobs when run with the default parameters.
2020-11-15 18:43:07 +01:00
Michael Eischer 6da66c15d8 check: Simplify referenced blob tracking
The result is identical as long as the context is not canceled. However,
in that case the result is incomplete anyway.
2020-11-15 18:42:55 +01:00
Michael Eischer 3500f9490c check: Simplify blob status tracking
UnusedBlobs now directly reads the list of existing blobs from the
repository index. This removes the need for the blobStatusExists flag,
which in turn allows converting the blobRefs map into a BlobSet.
2020-11-15 18:42:42 +01:00
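
The resulting logic boils down to a set difference between the blobs
listed in the repository index and those marked as referenced. A sketch,
with plain string keys standing in for restic's blob handles:

```go
// unusedBlobs returns all blobs known to the repository index that were
// never marked as referenced while checking the snapshots.
// Illustrative sketch only.
func unusedBlobs(indexed, referenced map[string]struct{}) []string {
	var unused []string
	for id := range indexed {
		if _, ok := referenced[id]; !ok {
			unused = append(unused, id)
		}
	}
	return unused
}
```
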
Michael Eischer b8c7543a55 check: Merge 'size could not be found' and 'not found in index' errors
By construction these two errors always show up in pairs: 'size could
not be found' is printed when the blob is not found in the repository
index. That blob is also part of the `blobs` array. Later on, check
iterates over that array and checks whether the blob is marked as
existing, which cannot be the case, as that mark is generated by
iterating over the repository index.

The merged warning no longer reports the blob index within a file. That
information could also be derived by printing the affected tree using
`cat` and searching for the blob.
2020-11-15 18:41:50 +01:00
Alexander Weiss 17bb77b1f9 check: Also check blob length and offset 2020-11-14 00:42:49 +01:00
Alexander Weiss 80dcfca191 check: Check sizes computed from index and pack header 2020-11-14 00:42:49 +01:00
MichaelEischer 46d31ab86d
Merge pull request #3058 from greatroar/counter
Replace restic.Progress with new progress.Counter (fixes two race conditions)
2020-11-09 22:19:09 +01:00
Alexander Weiss 239931578c check: check index for packs that are read 2020-11-09 17:28:14 +01:00
greatroar 21b787a4d1 Stop Counters where they're constructed and started 2020-11-09 13:03:31 +01:00
greatroar ddca699cd2 Replace restic.Progress with new progress.Counter
This fixes two race conditions while cleaning up the code.
2020-11-09 12:12:35 +01:00
Alexander Weiss b44ecde8b0 Fix setting of ID in DecodeIndex 2020-10-17 09:12:58 +02:00
MichaelEischer 4ba237bb93
Merge pull request #3019 from greatroar/refactor-decodeindex
Refactor index decoding
2020-10-15 23:22:33 +02:00
greatroar b27375f5ce defer close(ch) outside repository.RunWorkers 2020-10-14 15:50:16 +02:00
greatroar 27db3ec262 Refactor index decoding
Decoding old-format indices no longer requires loading and decrypting
twice.
2020-10-13 20:47:50 +02:00
Michael Eischer c458e114d4 pass context to Find / FindSnapshot
This allows proper interruption of restic while it searches for
snapshots or key files.
2020-10-09 22:37:56 +02:00
Michael Eischer 4784540f04 repository: Simplify worker group code 2020-09-05 10:07:16 +02:00
aawsome 0fed6a8dfc
Use "pack file" instead of "data file" (#2885)
- changed variable names, especially DataFile into PackFile
- changed the term in some comments
- always use "pack file" in the documentation
2020-08-16 11:16:38 +02:00
MichaelEischer 34181b13a2
Merge pull request #2328 from MichaelEischer/no-repeated-checks
Fix duplicate tree checks within `restic check`
2020-07-22 22:08:02 +02:00
Alexander Weiss e388d962a5 Merge final indexes together for faster index access 2020-07-22 21:54:02 +02:00
Michael Eischer 9b0e718852 checker: Test that blob types are not confused 2020-07-20 23:43:47 +02:00
Michael Eischer ddf0b8cd0b checker: Properly distinguish between data and tree blobs
If a data blob and a tree blob with the same ID (= same content) exist,
then the checker did not report a data or tree blob as unused when the
blob of the other type was still in use.
2020-07-20 22:58:39 +02:00
Michael Eischer 2d0c138c9b checker: Test that check only decodes trees once
The `DuplicateTree` flag is necessary to ensure that failures cannot be
swallowed. The old checker implementation ignores errors from LoadTree
if the corresponding tree was already checked.
2020-07-20 22:51:53 +02:00
Michael Eischer ef325ffc02 checker: Cleanup error handling code
This change only moves code around but does not result in any change in
behavior.
2020-07-20 22:51:53 +02:00
Michael Eischer 7a165f32a9 checker: Traverse trees in depth-first order
Backups traverse the file tree in depth-first order and save trees on
the way back up. This results in tree packs filled in a way comparable
to reverse Polish notation. In order to check tree blobs in that
order, the treeFilter would have to delay the forwarding of tree nodes
until all of their children are processed, which would complicate the
implementation.

Therefore do the next best thing and traverse the tree in depth-first
order, but process trees already on the way down. The tree blob IDs are
added to the backlog in reverse order, which is reversed again when the
IDs are removed from the back of the backlog.
2020-07-20 22:51:53 +02:00
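
A sketch of that traversal, with loadSubtrees standing in for decoding a
tree blob (names are illustrative, not restic's actual functions):

```go
// walkDepthFirst visits trees in depth-first order while processing each
// tree on the way down. Subtree IDs are appended to the backlog in
// reverse order; popping from the back then yields them in their
// original order again.
func walkDepthFirst(root string, loadSubtrees func(id string) []string, process func(id string)) {
	backlog := []string{root}
	for len(backlog) > 0 {
		// pop the next tree from the back of the backlog
		id := backlog[len(backlog)-1]
		backlog = backlog[:len(backlog)-1]

		process(id) // process the tree on the way down

		subtrees := loadSubtrees(id)
		// append in reverse order so the first subtree is popped first
		for i := len(subtrees) - 1; i >= 0; i-- {
			backlog = append(backlog, subtrees[i])
		}
	}
}
```
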
Michael Eischer 36c69e3ca7 checker: Unify blobs, processed trees and referenced blobs map
The blobRefs map and the processedTrees IDSet are merged to reduce the
memory usage. The blobRefs map now uses separate flags to track blob
usage as data or tree blob. This prevents skipping of trees whose
content is identical to an already processed data blob. A third flag
tracks whether a blob exists or not, which removes the need for the
blobs IDSet.
2020-07-20 22:51:47 +02:00
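
The idea can be sketched as a single map from blob ID to a small set of
flags (the names here are illustrative, not the checker's actual types):

```go
// blobFlag records how a blob ID has been seen. Separate bits for data
// and tree usage prevent a tree blob from being skipped just because a
// data blob with the same content hash was already processed.
type blobFlag uint8

const (
	usedAsData   blobFlag = 1 << iota // referenced as a data blob
	usedAsTree                        // referenced as a tree blob
	existsInRepo                      // listed in the index; replaces the blobs IDSet
)

// blobRefs merges the previous blobRefs map, processedTrees IDSet and
// blobs IDSet into a single map. Illustrative sketch only.
var blobRefs = map[string]blobFlag{}

func markDataUsed(id string) { blobRefs[id] |= usedAsData }
func markTreeUsed(id string) { blobRefs[id] |= usedAsTree }
```
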
Michael Eischer 35d8413639 checker: Remove dead index map 2020-07-20 22:37:31 +02:00
Michael Eischer c66a0e408c checker: Reduce cost of debug log
Avoid duplicate allocation of the Subtree list.
2020-07-20 22:37:31 +02:00
Michael Eischer 70f4c014ef checker: Decode identical tree nodes only once
Even though the checkTreeWorker skips already processed chunks,
filterTrees still queued the same tree blob on every occurrence. This
becomes a serious performance bottleneck for a larger number of
snapshots that cover mostly the same directories. Therefore decode each
tree blob exactly once.
2020-07-20 22:37:31 +02:00
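
The deduplication itself is simple: remember which tree blobs have
already been queued and skip repeats. A sketch under those assumptions:

```go
// queueTree enqueues a tree blob only the first time it is encountered,
// so each tree is decoded exactly once even when many snapshots
// reference the same directories. Illustrative sketch only.
func queueTree(id string, seen map[string]bool, queue chan<- string) {
	if seen[id] {
		return // already queued or processed, skip the duplicate
	}
	seen[id] = true
	queue <- id
}
```
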
Michael Eischer f0d8710611 Add benchmark for checker scaling with snapshot count 2020-07-20 22:37:31 +02:00
Alexander Bruyako da48b925ff remove unnecessary error return
I was running "golangci-lint" and found these two warnings

internal/checker/checker.go:135:18: (*Checker).LoadIndex$3 - result 0 (error) is always nil (unparam)
        final := func() error {
                        ^
internal/repository/repository.go:457:18: (*Repository).LoadIndex$3 - result 0 (error) is always nil (unparam)
        final := func() error {
                        ^

It turns out that these functions are used only in "RunWorkers(...)",
which is used only twice in the whole project, right after these
"final" functions.
And because these "final" functions always return "nil", I decided
that it would be better to remove the requirement for the "final" func
to return an error, to avoid a magic "return nil" at their end.
2020-01-27 18:28:21 +03:00
Alexander Neumann 7304738872 check: Reduce default parallelism from 40 to 5 2019-04-13 13:38:39 +02:00
Alexander Neumann 66efa425bf Reuse buffer in worker functions 2019-04-13 13:38:39 +02:00
Alexander Neumann e046428c94 Replace FilesInParallel with an errgroup.Group 2019-04-13 13:38:39 +02:00
Chris Howie 1688713400 Add key hinting (#2097) 2018-11-25 09:13:18 -05:00
Alexander Neumann bfa18ee8ec DownloadAndHash: Check error returned by Load() 2018-10-28 21:28:56 +01:00
Alexander Neumann 754482fe6c checker: Disable size check for now 2018-07-15 21:52:38 +02:00
Alexander Neumann 38926d8576 Use new archiver code in tests 2018-04-25 14:42:45 +02:00
Alexander Neumann 83ca08245b checker: Check metadata size and blob sizes 2018-04-22 11:37:05 +02:00
Alexander Neumann 1c1fede399 Improve error message for orphaned pack files 2018-04-07 10:07:54 +02:00
Alexander Neumann e68a7fea8a check: Allow filling the cache during check
Closes #1665
2018-04-01 13:59:27 +02:00
Alexander Neumann 99f7fd74e3 backend: Improve Save()
As mentioned in issue [#1560](https://github.com/restic/restic/pull/1560#issuecomment-364689346)
this changes the signature for `backend.Save()`. It now takes a
parameter of interface type `RewindReader`, so that the backend
implementations or our `RetryBackend` middleware can reset the reader to
the beginning and then retry an upload operation.

The `RewindReader` interface also provides a `Length()` method, which is
used in the backend to get the size of the data to be saved. This
removes several ugly hacks we had to do to pull the size back out of the
`io.Reader` passed to `Save()` before. In the `s3` and `rest` backends
this is actively used.
2018-03-03 15:49:44 +01:00
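
A sketch of the interface shape described above, together with a simple
in-memory implementation; restic's actual `RewindReader` may differ in
detail:

```go
package backendsketch

import (
	"bytes"
	"io"
)

// RewindReader lets a backend or retry middleware restart an upload from
// the beginning and know the total size up front.
type RewindReader interface {
	io.Reader

	// Rewind resets the reader so the data can be read again from the start.
	Rewind() error

	// Length returns the total number of bytes the reader provides.
	Length() int64
}

// byteReader is a minimal RewindReader backed by a byte slice.
type byteReader struct {
	*bytes.Reader
	data []byte
}

func NewByteReader(data []byte) RewindReader {
	return &byteReader{Reader: bytes.NewReader(data), data: data}
}

func (r *byteReader) Rewind() error {
	_, err := r.Reader.Seek(0, io.SeekStart)
	return err
}

func (r *byteReader) Length() int64 {
	return int64(len(r.data))
}
```
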
Igor Fedorenko 07d080830e Add --read-data-subset flag to check command
Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
2018-02-18 23:31:27 -05:00
Igor Fedorenko ab040d8811 Introduced repository.DownloadAndHash helper
Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
2018-02-16 21:13:11 -05:00
Igor Fedorenko d58ae43317 Reworked Backend.Load API to retry errors during ongoing download
Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
2018-02-16 21:12:14 -05:00
Alexander Neumann cccb2fc7e7 Merge pull request #1583 from restic/close-open-backend-files
Close backend files in case of errors
2018-01-26 21:57:28 +01:00
Alexander Neumann 909d9273cc Close backend files in case of errors 2018-01-25 21:05:57 +01:00
Alexander Neumann 663c57ab4d debug: Remove manual Str() call to Log() 2018-01-25 20:49:41 +01:00
Alexander Neumann b0c6e53241 Fix calls to repo/backend.List() everywhere 2018-01-21 21:15:09 +01:00
Igor Fedorenko 231076fa4a checker: Optimize checker.Packs()
Use the result of a single repository.List() to find both missing and
orphaned data packs. For a 500GB repository this eliminates ~100K
repository.Test() calls and improves check time by >30 min in my
environment (~45 min before this change and ~7 min after).

Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
2018-01-18 20:50:39 -05:00
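
The change amounts to computing two set differences from one listing
instead of probing each pack individually. A sketch, with string keys
standing in for pack IDs:

```go
// comparePacks derives both problem sets from a single listing of the
// repository. Illustrative sketch only.
func comparePacks(indexed, listed map[string]struct{}) (missing, orphaned []string) {
	for id := range indexed {
		if _, ok := listed[id]; !ok {
			missing = append(missing, id) // referenced by the index, not in the repo
		}
	}
	for id := range listed {
		if _, ok := indexed[id]; !ok {
			orphaned = append(orphaned, id) // present in the repo, unknown to the index
		}
	}
	return missing, orphaned
}
```
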
Alexander Neumann 931e6ed2ac Use Seal/Open everywhere 2017-11-01 10:30:40 +01:00
George Armhold bcdebfb84e small cleanup:
- be explicit when discarding returned errors from .Close(), etc.
- remove named return values from funcs when naked return not used
- fix some "err" shadowing when redeclaration not needed
2017-10-25 12:03:55 -04:00
Tobias Klein 087c3fe1dc tests updated 2017-09-09 13:26:35 +02:00
Alexander Neumann 23c903074c Move restic package to internal/restic 2017-07-24 17:43:32 +02:00
Alexander Neumann 6caeff2408 Run goimports 2017-07-23 14:21:03 +02:00
Alexander Neumann 83d1a46526 Moves files 2017-07-23 14:19:13 +02:00