restic

mirror of https://github.com/octoleo/restic.git synced 2024-11-26 23:06:32 +00:00

Author	SHA1	Message	Date
Michael Eischer	9ea1a78bd4	FindUsedBlobs: Check for seen blobs before loading trees The only effective change in behavior is that top-level nodes can also be skipped.	2020-08-01 12:29:16 +02:00
Michael Eischer	184103647a	FindUsedBlobs: merge seen into blobs BlobSet The seen BlobSet always contained a subset of the entries in blobs. Thus use blobs instead and avoid the memory overhead of the second set. Suggested-by: Alexander Weiss <alex@weissfam.de>	2020-08-01 12:29:16 +02:00
Alexander Weiss	f8316948d1	Optimize FUSE - make command `restic mount` faster and consume less memory - Add Open() functionality to dir - only access index for blobs when file is read - Implement NodeOpener and put one-time file stuff there - Add comment about locking as suggested by bazil.org/fuse => Thanks to Michael Eischer for suggesting the last two improvements	2020-07-28 23:01:18 +02:00
Michael Eischer	ea97ff1ba4	rclone: Skip crash test when rclone is not found	2020-07-26 12:06:18 +02:00
Michael Eischer	01b9581453	rclone: Better field names for stdio conn	2020-07-26 00:29:25 +02:00
Michael Eischer	3cd927d180	rclone: Give rclone time to finish before closing stdin pipe Calling `Close()` on the rclone backend sometimes failed during test execution with 'signal: Broken pipe'. The stdio connection closed both the stdin and stdout file descriptors at the same moment, therefore giving rclone no chance to properly send any final http2 data frames. Now the stdin connection to rclone is closed first and will only be forcefully closed after a timeout. In case rclone exits before the timeout then the stdio connection will be closed normally.	2020-07-26 00:29:25 +02:00
greatroar	bf7b1f12ea	rclone: Add test for pipe handling when rclone exits	2020-07-26 00:29:25 +02:00
Michael Eischer	8554332894	rclone: Close rclone side of stdio_conn pipes restic did not notice when the rclone subprocess exited unexpectedly. restic manually created pipes for stdin and stdout and used these for the connection to the rclone subprocess. The process creating a pipe gets file descriptors for the sender and receiver side of a pipe and passes them on to the subprocess. The expected behavior would be that reads or writes in the parent process fail / return once the child process dies as a pipe would now just have a reader or writer but not both. However, this never happened as restic kept the reader and writer file descriptors of the pipes. `cmd.StdinPipe` and `cmd.StdoutPipe` close the subprocess side of pipes once the child process was started and close the parent process side of pipes once wait has finished. We can't use these functions as we need access to the raw `os.File` so just replicate that behavior.	2020-07-26 00:29:25 +02:00
greatroar	3e93b36ca4	Make rclone.New private	2020-07-26 00:28:45 +02:00
Michael Eischer	c847aace35	Rename Index interface to MasterIndex The interface is now only implemented by repository.MasterIndex.	2020-07-25 21:19:46 +02:00
Alexander Weiss	9d1fb94c6c	make Lookup() return all blobs + simplify syntax	2020-07-25 21:18:34 +02:00
MichaelEischer	020cab8e08	Merge pull request #2787 from greatroar/no-blobsize-cache Remove blob size cache from restic mount	2020-07-25 20:46:35 +02:00
greatroar	07da61baee	Remove blob size cache from restic mount	2020-07-25 19:39:14 +02:00
MichaelEischer	3d530dfc91	Merge pull request #2827 from aawsome/archiver-test-contents Make self-healing work when backing up with parent snapshot	2020-07-25 13:13:18 +02:00
MichaelEischer	bbc960f957	Merge pull request #2635 from greatroar/optimize-sortbycached Optimize sorting blobs by cache status	2020-07-25 12:35:42 +02:00
greatroar	309598c237	Simplify sortCachedPacksFirst test in internal/repository The test now uses the fact that the sort is stable. It's not guaranteed to be, but the test is cleaner and more exhaustive. sortCachedPacksFirst no longer needs a return value.	2020-07-25 12:12:59 +02:00
greatroar	03d23e6faa	Speed up blob sorting in internal/repository name old time/op new time/op delta SortCachedPacksFirst-8 208µs ± 3% 186µs ± 3% -10.74% (p=0.000 n=10+8) name old alloc/op new alloc/op delta SortCachedPacksFirst-8 213kB ± 0% 139kB ± 0% -34.62% (p=0.000 n=10+10) name old allocs/op new allocs/op delta SortCachedPacksFirst-8 1.03k ± 0% 1.03k ± 0% -0.19% (p=0.000 n=10+10)	2020-07-25 12:12:59 +02:00
greatroar	b10acd2af7	Test and benchmark blob sorting in internal/repository	2020-07-25 12:12:58 +02:00
Alexander Weiss	9175795fdb	Check contents in archiver When backing up with a parent snapshot and the file is not changed, also check if contents are still available in index.	2020-07-25 08:18:28 +02:00
rawtaz	5d8d70542f	Merge pull request #2852 from MichaelEischer/drop-go-1.11 Drop support for Go version 1.11	2020-07-25 01:17:36 +02:00
Michael Eischer	7c23381a2b	Drop support for Go version 1.11	2020-07-24 18:52:39 +02:00
MichaelEischer	34181b13a2	Merge pull request #2328 from MichaelEischer/no-repeated-checks Fix duplicate tree checks within `restic check`	2020-07-22 22:08:02 +02:00
Alexander Weiss	a666a6d576	Add tests and merge indexes in index benchmarks	2020-07-22 21:54:02 +02:00
Alexander Weiss	e388d962a5	Merge final indexes together for faster index access	2020-07-22 21:54:02 +02:00
Alexander Weiss	3b7a3711e6	Add more realistic index benchmarks + reduce test size of BenchmarkMasterIndexLookupParallel	2020-07-21 07:18:20 +02:00
Michael Eischer	9b0e718852	checker: Test that blob types are not confused	2020-07-20 23:43:47 +02:00
MichaelEischer	82c908871d	Merge pull request #2812 from greatroar/chaining Chaining hash table for repository.Index	2020-07-20 23:29:36 +02:00
Michael Eischer	ddf0b8cd0b	checker: Properly distinguish between data and tree blobs If a data blob and a tree blob with the same ID (= same content) exist, then the checker did not report a data or tree blob as unused when the blob of the other type was still in use.	2020-07-20 22:58:39 +02:00
Michael Eischer	2d0c138c9b	checker: Test that check only decodes trees once The `DuplicateTree` flag is necessary to ensure that failures cannot be swallowed. The old checker implementation ignores errors from LoadTree if the corresponding tree was already checked.	2020-07-20 22:51:53 +02:00
Michael Eischer	ef325ffc02	checker: Cleanup error handling code This change only moves code around but does not result in any change in behavior.	2020-07-20 22:51:53 +02:00
Michael Eischer	7a165f32a9	checker: Traverse trees in depth-first order Backups traverse the file tree in depth-first order and save trees on the way back up. This results in tree packs filled in a way comparable to reverse Polish notation. In order to check tree blobs in that order, the treeFilter would have to delay the forwarding of tree nodes until all of their children are processed, which would complicate the implementation. Therefore do the next most similar thing and traverse the tree in depth-first order, but process trees already on the way down. The tree blob ids are added in reverse order to the backlog, which is reversed once again when removing the ids from the back of the backlog.	2020-07-20 22:51:53 +02:00
Michael Eischer	36c69e3ca7	checker: Unify blobs, processed trees and referenced blobs map The blobRefs map and the processedTrees IDSet are merged to reduce the memory usage. The blobRefs map now uses separate flags to track blob usage as data or tree blob. This prevents skipping of trees whose content is identical to an already processed data blob. A third flag tracks whether a blob exists or not, which removes the need for the blobs IDSet.	2020-07-20 22:51:47 +02:00
Michael Eischer	35d8413639	checker: Remove dead index map	2020-07-20 22:37:31 +02:00
Michael Eischer	c66a0e408c	checker: Reduce cost of debug log Avoid duplicate allocation of the Subtree list.	2020-07-20 22:37:31 +02:00
Michael Eischer	70f4c014ef	checker: Decode identical tree nodes only once Even though the checkTreeWorker skips already processed chunks, filterTrees did queue the same tree blob on every occurrence. This becomes a serious performance bottleneck for a larger number of snapshots that cover mostly the same directories. Therefore decode a tree blob exactly once.	2020-07-20 22:37:31 +02:00
Michael Eischer	f0d8710611	Add benchmark for checker scaling with snapshot count	2020-07-20 22:37:31 +02:00
Alexander Neumann	8074879c5f	Remove 'go generate'	2020-07-19 17:28:42 +02:00
greatroar	7bda28f31f	Chaining hash table for repository.Index These are faster to construct but slower to access. The allocation rate is halved, the peak memory usage almost halved compared to standard map. Benchmark results on linux/amd64, -benchtime=3s -count=20: name old time/op new time/op delta PackerManager-8 178ms ± 0% 178ms ± 0% ~ (p=0.231 n=20+20) DecodeIndex-8 4.54s ± 0% 4.30s ± 0% -5.20% (p=0.000 n=18+17) DecodeIndexParallel-8 4.54s ± 0% 4.30s ± 0% -5.22% (p=0.000 n=19+18) IndexHasUnknown-8 44.4ns ± 5% 50.5ns ±11% +13.82% (p=0.000 n=19+17) IndexHasKnown-8 48.3ns ± 0% 51.5ns ±12% +6.68% (p=0.001 n=16+20) IndexAlloc-8 758ms ± 1% 616ms ± 1% -18.69% (p=0.000 n=19+19) IndexAllocParallel-8 234ms ± 3% 204ms ± 2% -12.60% (p=0.000 n=20+18) MasterIndexLookupSingleIndex-8 122ns ± 0% 145ns ± 9% +18.44% (p=0.000 n=14+20) MasterIndexLookupMultipleIndex-8 369ns ± 2% 429ns ± 8% +16.27% (p=0.000 n=20+20) MasterIndexLookupSingleIndexUnknown-8 68.4ns ± 5% 74.9ns ±13% +9.47% (p=0.000 n=20+20) MasterIndexLookupMultipleIndexUnknown-8 315ns ± 3% 369ns ±11% +17.14% (p=0.000 n=20+20) MasterIndexLookupParallel/known,indices=5-8 743ns ± 1% 816ns ± 2% +9.87% (p=0.000 n=17+17) MasterIndexLookupParallel/unknown,indices=5-8 238ns ± 1% 260ns ± 2% +9.14% (p=0.000 n=19+20) MasterIndexLookupParallel/known,indices=10-8 1.01µs ± 3% 1.11µs ± 2% +9.79% (p=0.000 n=19+20) MasterIndexLookupParallel/unknown,indices=10-8 222ns ± 0% 269ns ± 2% +20.83% (p=0.000 n=16+20) MasterIndexLookupParallel/known,indices=20-8 1.06µs ± 2% 1.19µs ± 2% +12.95% (p=0.000 n=19+18) MasterIndexLookupParallel/unknown,indices=20-8 413ns ± 1% 530ns ± 1% +28.19% (p=0.000 n=18+20) SaveAndEncrypt-8 30.2ms ± 1% 30.4ms ± 0% +0.71% (p=0.000 n=19+19) LoadTree-8 540µs ± 1% 576µs ± 1% +6.73% (p=0.000 n=20+20) LoadBlob-8 5.64ms ± 0% 5.64ms ± 0% ~ (p=0.883 n=18+17) LoadAndDecrypt-8 5.93ms ± 0% 5.95ms ± 1% ~ (p=0.247 n=20+19) LoadIndex-8 25.1ms ± 0% 24.5ms ± 1% -2.54% (p=0.000 n=18+17) name old speed new speed delta PackerManager-8 
296MB/s ± 0% 296MB/s ± 0% ~ (p=0.229 n=20+20) SaveAndEncrypt-8 139MB/s ± 1% 138MB/s ± 0% -0.71% (p=0.000 n=19+19) LoadBlob-8 177MB/s ± 0% 177MB/s ± 0% ~ (p=0.890 n=18+17) LoadAndDecrypt-8 169MB/s ± 0% 168MB/s ± 1% ~ (p=0.227 n=20+19) name old alloc/op new alloc/op delta PackerManager-8 91.8kB ± 0% 91.8kB ± 0% ~ (p=0.772 n=12+19) IndexAlloc-8 786MB ± 0% 400MB ± 0% -49.04% (p=0.000 n=20+18) IndexAllocParallel-8 786MB ± 0% 401MB ± 0% -49.04% (p=0.000 n=19+15) SaveAndEncrypt-8 21.0MB ± 0% 21.0MB ± 0% +0.00% (p=0.000 n=19+19) name old allocs/op new allocs/op delta PackerManager-8 1.41k ± 0% 1.41k ± 0% ~ (all equal) IndexAlloc-8 977k ± 0% 907k ± 0% -7.18% (p=0.000 n=20+20) IndexAllocParallel-8 977k ± 0% 907k ± 0% -7.17% (p=0.000 n=19+15) SaveAndEncrypt-8 73.0 ± 0% 73.0 ± 0% ~ (all equal)	2020-07-19 13:58:22 +02:00
greatroar	255ba83c4b	Parallel index benchmarks + benchmark optimizations createRandomIndex was using the global RNG, which locks on every call It was also using twice as many random numbers as necessary and doing a float division in every iteration of the inner loop. BenchmarkDecodeIndex was using too short an input, especially for a parallel version. (It may now be using one that is a bit large.) Results on linux/amd64, -benchtime=3s -count=20: name old time/op new time/op delta PackerManager-8 178ms ± 0% 178ms ± 0% ~ (p=0.165 n=20+20) DecodeIndex-8 13.6µs ± 2% 4539886.8µs ± 0% +33293901.38% (p=0.000 n=20+18) IndexHasUnknown-8 44.4ns ± 7% 44.4ns ± 5% ~ (p=0.873 n=20+19) IndexHasKnown-8 49.2ns ± 3% 48.3ns ± 0% -1.86% (p=0.000 n=20+16) IndexAlloc-8 802ms ± 1% 758ms ± 1% -5.51% (p=0.000 n=20+19) MasterIndexLookupSingleIndex-8 124ns ± 1% 122ns ± 0% -1.41% (p=0.000 n=20+14) MasterIndexLookupMultipleIndex-8 373ns ± 2% 369ns ± 2% -1.13% (p=0.001 n=20+20) MasterIndexLookupSingleIndexUnknown-8 67.8ns ± 3% 68.4ns ± 5% ~ (p=0.753 n=20+20) MasterIndexLookupMultipleIndexUnknown-8 316ns ± 3% 315ns ± 3% ~ (p=0.846 n=20+20) SaveAndEncrypt-8 30.5ms ± 1% 30.2ms ± 1% -1.09% (p=0.000 n=19+19) LoadTree-8 527µs ± 1% 540µs ± 1% +2.37% (p=0.000 n=19+20) LoadBlob-8 5.65ms ± 0% 5.64ms ± 0% -0.21% (p=0.000 n=19+18) LoadAndDecrypt-8 7.07ms ± 2% 5.93ms ± 0% -16.15% (p=0.000 n=19+20) LoadIndex-8 32.1ms ± 2% 25.1ms ± 0% -21.64% (p=0.000 n=20+18) name old speed new speed delta PackerManager-8 296MB/s ± 0% 296MB/s ± 0% ~ (p=0.159 n=20+20) SaveAndEncrypt-8 138MB/s ± 1% 139MB/s ± 1% +1.10% (p=0.000 n=19+19) LoadBlob-8 177MB/s ± 0% 177MB/s ± 0% +0.21% (p=0.000 n=19+18) LoadAndDecrypt-8 141MB/s ± 2% 169MB/s ± 0% +19.24% (p=0.000 n=19+20) name old alloc/op new alloc/op delta PackerManager-8 91.8kB ± 0% 91.8kB ± 0% ~ (p=0.826 n=19+12) IndexAlloc-8 786MB ± 0% 786MB ± 0% +0.01% (p=0.000 n=20+20) SaveAndEncrypt-8 21.0MB ± 0% 21.0MB ± 0% -0.00% (p=0.012 n=20+19) name old allocs/op new allocs/op delta 
PackerManager-8 1.41k ± 0% 1.41k ± 0% ~ (all equal) IndexAlloc-8 977k ± 0% 977k ± 0% +0.01% (p=0.022 n=20+20) SaveAndEncrypt-8 73.0 ± 0% 73.0 ± 0% ~ (all equal)	2020-07-19 13:58:05 +02:00
Lars Lehtonen	9ac90cf5cd	internal/fuse: fix dropped test error	2020-07-12 21:42:31 -07:00
greatroar	58719e1f47	Replace mount's per-file cache by a global LRU cache	2020-07-12 18:27:16 +02:00
greatroar	d42c169458	Fix quadratic file reading in restic mount	2020-07-12 18:27:16 +02:00
greatroar	02bec13ef2	Fix repository_test.BenchmarkSaveAndEncrypt The benchmark was actually testing the speed of index lookups. name old time/op new time/op delta SaveAndEncrypt-8 101ns ± 2% 31505824ns ± 1% +31311591.31% (p=0.000 n=10+10) name old speed new speed delta SaveAndEncrypt-8 41.7TB/s ± 2% 0.0TB/s ± 1% -100.00% (p=0.000 n=10+10) name old alloc/op new alloc/op delta SaveAndEncrypt-8 1.00B ± 0% 20989508.40B ± 0% +2098950740.00% (p=0.000 n=10+10) name old allocs/op new allocs/op delta SaveAndEncrypt-8 0.00 123.00 ± 0% +Inf% (p=0.000 n=10+9) (The actual speed is ca. 131MiB/s.)	2020-07-05 17:41:42 +02:00
MichaelEischer	212607dc8a	Merge pull request #2760 from greatroar/backend-benchmark Fix backend benchmarks + a micro-optimization	2020-06-17 23:17:05 +02:00
greatroar	190d8e2f51	Flatten backend.LimitedReadCloser structure This inlines the io.LimitedReader into the LimitedReadCloser body to achieve fewer allocations. Results on linux/amd64: name old time/op new time/op delta Backend/BenchmarkLoadPartialFile-8 412µs ± 4% 413µs ± 4% ~ (p=0.634 n=17+17) Backend/BenchmarkLoadPartialFileOffset-8 455µs ±13% 441µs ±10% ~ (p=0.072 n=20+18) name old speed new speed delta Backend/BenchmarkLoadPartialFile-8 10.2GB/s ± 3% 10.2GB/s ± 4% ~ (p=0.817 n=16+17) Backend/BenchmarkLoadPartialFileOffset-8 9.25GB/s ±12% 9.54GB/s ± 9% ~ (p=0.072 n=20+18) name old alloc/op new alloc/op delta Backend/BenchmarkLoadPartialFile-8 888B ± 0% 872B ± 0% -1.80% (p=0.000 n=15+15) Backend/BenchmarkLoadPartialFileOffset-8 888B ± 0% 872B ± 0% -1.80% (p=0.000 n=15+15) name old allocs/op new allocs/op delta Backend/BenchmarkLoadPartialFile-8 18.0 ± 0% 17.0 ± 0% -5.56% (p=0.000 n=15+15) Backend/BenchmarkLoadPartialFileOffset-8 18.0 ± 0% 17.0 ± 0% -5.56% (p=0.000 n=15+15)	2020-06-17 13:11:45 +02:00
greatroar	f4cd2a7120	Make backend benchmarks fairer by removing checks Checking whether the right data is returned takes up half the time in some benchmarks. Results for local backend benchmarks on linux/amd64: name old time/op new time/op delta Backend/BenchmarkLoadFile-8 4.89ms ± 0% 2.72ms ± 1% -44.26% (p=0.008 n=5+5) Backend/BenchmarkLoadPartialFile-8 936µs ± 6% 439µs ±15% -53.07% (p=0.008 n=5+5) Backend/BenchmarkLoadPartialFileOffset-8 940µs ± 1% 456µs ±10% -51.50% (p=0.008 n=5+5) Backend/BenchmarkSave-8 23.9ms ±14% 24.8ms ±41% ~ (p=0.690 n=5+5) name old speed new speed delta Backend/BenchmarkLoadFile-8 3.43GB/s ± 0% 6.16GB/s ± 1% +79.40% (p=0.008 n=5+5) Backend/BenchmarkLoadPartialFile-8 4.48GB/s ± 6% 9.63GB/s ±14% +114.78% (p=0.008 n=5+5) Backend/BenchmarkLoadPartialFileOffset-8 4.46GB/s ± 1% 9.22GB/s ±10% +106.74% (p=0.008 n=5+5) Backend/BenchmarkSave-8 706MB/s ±13% 698MB/s ±31% ~ (p=0.690 n=5+5)	2020-06-17 13:11:45 +02:00
Alexander Weiss	1361341c58	don't save duplicate packIDs when using internal/repository/Index.Store	2020-06-14 07:56:24 +02:00
Alexander Weiss	ce4a2f4ca6	save packIDs and duplicates separately A side remark to the definition of Index.blob: Another possibility would have been to use: blob map[restic.BlobHandle]indexEntry This would have led to the following sizes: key: 32 + 1 = 33 bytes value: 8 bytes indexEntry: 8 + 4 + 4 = 16 bytes each packID: 32 bytes To save N index entries, we would therefore have needed: N * OF * (33 + 8) bytes + N * 16 + N * 32 bytes / BP = N * 82 bytes More precisely, using a pointer instead of a direct entry is the better memory choice if: OF * 8 bytes + entrysize < OF * entrysize <=> entrysize > 8 bytes * OF/(OF-1) Under the assumption of OF=1.5, this means using pointers would have been the better choice if sizeof(indexEntry) > 24 bytes.	2020-06-14 07:56:21 +02:00
Alexander Weiss	cf979e2b81	make offset and length uint32	2020-06-14 07:50:19 +02:00
Michael Eischer	d92e2c5769	simplify index code	2020-06-14 07:50:19 +02:00
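
Note on commit ce4a2f4ca6: the break-even condition in its side remark, entrysize > 8 bytes * OF/(OF-1), can be checked with a few lines of Go. This is only a sketch of the arithmetic in the commit message. The function names are hypothetical and not part of restic, OF is taken to be the map overallocation factor, and the 8-byte pointer size assumes a 64-bit platform.

```go
package main

import "fmt"

// pointerSize is the size of a pointer in bytes (assumption: 64-bit platform).
const pointerSize = 8

// breakEvenEntrySize returns the entry size in bytes above which storing
// pointers in the map needs less memory than storing entries directly.
// of is the overallocation factor and must be greater than 1.
func breakEvenEntrySize(of float64) float64 {
	return pointerSize * of / (of - 1)
}

// pointerCost is the per-entry memory when the map holds pointers:
// the overallocated pointer slot plus one entry allocated elsewhere.
func pointerCost(of, entrySize float64) float64 {
	return of*pointerSize + entrySize
}

// directCost is the per-entry memory when the map holds entries directly:
// every overallocated slot has the full entry size.
func directCost(of, entrySize float64) float64 {
	return of * entrySize
}

func main() {
	of := 1.5 // overallocation factor assumed in the commit message
	fmt.Printf("break-even entry size: %.0f bytes\n", breakEvenEntrySize(of))

	// 16 bytes is the indexEntry size given in the commit message.
	for _, size := range []float64{16, 24, 48} {
		fmt.Printf("entry %2.0f bytes: pointer %.0f, direct %.0f\n",
			size, pointerCost(of, size), directCost(of, size))
	}
}
```

With OF = 1.5 the break-even size is 24 bytes, so for a 16-byte indexEntry the direct entry (24 bytes per slot) stays below the pointer variant (28 bytes), matching the conclusion of the commit message.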

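Note on commits 7a165f32a9 and 70f4c014ef: the traversal they describe (depth-first, trees processed on the way down, subtree ids pushed in reverse and taken from the back of a backlog, each tree handled once) can be sketched as below. The `forest` type, the string ids, and the `preorder` function are hypothetical stand-ins for illustration and are not the checker's actual data structures.

```go
package main

import "fmt"

// forest is a hypothetical stand-in for a repository: it maps a tree id
// to the ids of its subtrees, in directory order.
type forest map[string][]string

// preorder visits trees depth-first and processes each tree on the way down.
// Subtree ids are appended to the backlog in reverse order, so that taking
// ids from the back of the backlog restores the original order.
// Every tree id is processed at most once.
func preorder(f forest, root string) []string {
	var order []string
	seen := make(map[string]bool)
	backlog := []string{root}

	for len(backlog) > 0 {
		// Take the next id from the back of the backlog.
		id := backlog[len(backlog)-1]
		backlog = backlog[:len(backlog)-1]

		if seen[id] {
			continue // already processed via another snapshot or directory
		}
		seen[id] = true
		order = append(order, id)

		// Push subtrees in reverse so the first subtree is popped first.
		subtrees := f[id]
		for i := len(subtrees) - 1; i >= 0; i-- {
			backlog = append(backlog, subtrees[i])
		}
	}
	return order
}

func main() {
	f := forest{
		"root": {"a", "b"},
		"a":    {"a1", "a2"},
		"b":    {"b1"},
	}
	fmt.Println(preorder(f, "root")) // [root a a1 a2 b b1]
}
```

Using a slice as a stack keeps the backlog small compared to a breadth-first queue, since only the siblings along the current path are pending at any time.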

674 Commits