2
2
mirror of https://github.com/octoleo/restic.git synced 2024-06-01 16:40:50 +00:00
Commit Graph

1391 Commits

Author SHA1 Message Date
MichaelEischer
9ad3ad5972
Merge pull request #3850 from lbausch/go1.19
Update tests to Go 1.19
2022-08-07 14:56:17 +02:00
MichaelEischer
2930a102de
Merge pull request #3731 from metalsp0rk/feature/min-packsize-flag
Feature: min packsize flag
2022-08-07 14:54:45 +02:00
Michael Eischer
f3fdc66b32
restic: Use stable sorting in snapshot policy
sort.Sort is not guaranteed to be stable. Go 1.19 has changed the
sorting algorithm which resulted in changes of the sort order. When
comparing snapshots with identical timestamp but different paths and
tags lists, there is not meaningful order among them. So just keep their
order stable.
2022-08-07 14:10:40 +02:00
Michael Eischer
0b7291b8b2 mount: Fix parent inode used by snapshots dir 2022-08-07 13:03:32 +02:00
greatroar
cfa80e2c6b mount: remove unused inode field from root node 2022-08-07 13:03:26 +02:00
MichaelEischer
74ae76036f
Merge pull request #2913 from aawsome/mount-snapshot-slashes
mount: Make snapshots dir structure customizable
2022-08-07 12:27:59 +02:00
Michael Eischer
caa17988a3 fuse: Redesign snapshot dirstruct
Cleanly separate the directory presentation and the snapshot directory
structure. SnapshotsDir now translates the dirStruct into a format
usable by the fuse library and contains only minimal special case rules.
All decisions have moved into SnapshotsDirStructure which now creates a
fully preassembled tree data structure.
2022-08-07 12:13:06 +02:00
Michael Eischer
1ed775e3a8 debug: support roundtripper logging also for release builds
Different from debug builds do not use the eofDetectRoundTripper if
logging is disabled.
2022-08-05 23:49:39 +02:00
Michael Eischer
38becfc436 debug: enable debug support for release builds 2022-08-05 23:49:39 +02:00
Michael Eischer
82c268c917 Remove unused hooks mechanism 2022-08-05 23:49:39 +02:00
Michael Eischer
7266f07c87 repository: StreamPack in parts if there are too large gaps
For large pack sizes we might be only interested in the first and last
blob of a pack file. Thus stream a pack file in multiple parts if the
gaps between requested blobs grow too large.
2022-08-05 23:48:36 +02:00
Michael Eischer
7f3b2be1e8 s3: Disable multipart uploads below 200MB 2022-08-05 23:48:36 +02:00
Michael Eischer
1b076cda97 rename option to --pack-size 2022-08-05 23:47:43 +02:00
Kyle Brennan
1e3f05c3f1 repository: prevent header overfill 2022-08-05 23:47:12 +02:00
Michael Eischer
0a6fa602c8 add option for setting min pack size 2022-08-05 23:47:12 +02:00
Michael Eischer
2db7733ee3 fuse: remove unused MetaDir 2022-08-05 23:46:46 +02:00
Michael Eischer
f678f7cb04 fuse: cleanup test 2022-08-05 23:46:46 +02:00
Alexander Weiss
1751afae26 Make snapshots dirs in mount command customizable 2022-08-05 23:46:46 +02:00
Alexander Weiss
57f4003f2f Generalize fuse snapshot dirs implemetation
+ allow "/" in tags and snapshot template
2022-08-05 23:46:46 +02:00
Alexander Weiss
696c18e031 Add possibility to set snapshot ID (used in test) 2022-08-05 23:46:46 +02:00
MichaelEischer
04a8ee80fb
Merge pull request #3829 from MichaelEischer/prune-refactor
Split prune into slightly small functions
2022-08-05 23:29:52 +02:00
greatroar
ad6ac680af internal/restic: Handle EINVAL for xattr on Solaris
Also make the errors a bit less verbose by not prepending the operation,
since pkg/xattr already does that. Old errors looked like

    Listxattr: xattr.list /myfiles/.zfs/snapshot: invalid argument
2022-08-01 12:45:17 +02:00
Michael Eischer
73053674d9 repository: Test fallback to existing blobs 2022-07-30 17:37:07 +02:00
Michael Eischer
623770eebb repository: try to recover from invalid blob while repacking
If a blob that should be kept is invalid, Repack will now try to request
the blob using LoadBlob. Only return an error if that fails.
2022-07-30 17:37:07 +02:00
greatroar
2bdc40e612 Speed up restic init over slow SFTP links
pkg/sftp.Client.MkdirAll(d) does a Stat to determine if d exists and is
a directory, then a recursive call to create the parent, so the calls
for data/?? each take three round trips. Doing a Mkdir first should
eliminate two round trips for 255/256 data directories as well as all
but one of the top-level directories.

Also, we can do all of the calls concurrently. This may reintroduce some
of the Stat calls when multiple goroutines try to create the same
parent, but at the default number of connections, that should not be
much of a problem.
2022-07-30 13:09:08 +02:00
greatroar
23ebec717c Remove stale comments from backend/sftp
The preExec and postExec functions were removed in
0bdb131521 from 2018.
2022-07-30 13:07:25 +02:00
Michael Eischer
4a10ebed15 archiver: reduce memory usage for large files
FutureBlob now uses a Take() method as a more memory-efficient way to
retrieve the futures result. In addition, futures are now collected
while saving the file. As only a limited number of blobs can be queued
for uploading, for a large file nearly all FutureBlobs already have
their result ready, such that the FutureBlob object just consumes
memory.
2022-07-23 14:45:07 +02:00
Michael Eischer
b817681a11 archiver: Incrementally serialize tree nodes
That way it is not necessary to keep both the Nodes forming a Tree and
the serialized JSON version in memory.
2022-07-23 14:45:07 +02:00
Michael Eischer
c206a101a3 archiver: unify FutureTree/File into futureNode
There is no real difference between the FutureTree and FutureFile
structs. However, differentiating both increases the size of the
FutureNode struct.

The FutureNode struct is now only 16 bytes large on 64bit platforms.
That way is has a very low overhead if the corresponding file/directory
was not processed yet.

There is a special case for nodes that were reused from the parent
snapshot, as a go channel seems to have 96 bytes overhead which would
result in a memory usage regression.
2022-07-23 14:45:07 +02:00
Michael Eischer
32f4997733 archiver: remove unused fileInfo from progress callback 2022-07-23 14:16:23 +02:00
Michael Eischer
dcb00fd2d1 archiver: cleanup Saver interface 2022-07-23 14:16:23 +02:00
Michael Eischer
79321a195c archiver: remove dead attribute from FutureNode 2022-07-23 14:16:23 +02:00
Michael Eischer
5a6f2f9fa0 Fix S3 legacy layout migration 2022-07-23 11:19:32 +02:00
Michael Eischer
04e49924fb checker: Fix S3 legacy layout detection 2022-07-23 11:19:32 +02:00
Michael Eischer
fcb3ddf181 check: Complain about usage of s3 legacy layout 2022-07-23 11:19:32 +02:00
Michael Eischer
8b8bd4e8ac check: complain about mixed pack files 2022-07-23 11:19:32 +02:00
MichaelEischer
443cc49afd
Merge pull request #3830 from MichaelEischer/cleanup-repo
Extract Load/SaveTree/JSONUnpacked from repository
2022-07-23 10:46:13 +02:00
MichaelEischer
1f5369e072
Merge pull request #3831 from MichaelEischer/move-code
Move code out of the restic package and consolidate backend specific code
2022-07-23 10:33:05 +02:00
Michael Eischer
9729e6d7ef backend: extract readerat from restic package 2022-07-17 15:29:09 +02:00
Michael Eischer
c44b21d366 restorer: extract hardlinks index from restic package 2022-07-17 13:45:42 +02:00
Michael Eischer
8c11fc3ec9 crypto: move crypto buffer helpers 2022-07-17 13:42:23 +02:00
Michael Eischer
a0cef9f247 limiter: move to internal/backend 2022-07-17 13:40:15 +02:00
Michael Eischer
163ab9c025 mock: move to internal/backend 2022-07-17 13:40:06 +02:00
Michael Eischer
89d3ce852b repository: extract Load/StoreJSONUnpacked
A Load/Store method for each data type is much clearer. As a result the
repository no longer needs a method to load / store json.
2022-07-17 13:22:00 +02:00
Michael Eischer
fbcbd5318c repository: extract LoadTree/SaveTree
The repository has no real idea what a Tree is. So these methods never
belonged there.
2022-07-17 13:11:28 +02:00
Michael Eischer
5639c41b6a azure: Strip ? prefix from sas token 2022-07-16 23:55:18 +02:00
Roger Gammans
64a7ec5341 azure: add SAS authentication option 2022-07-16 23:55:18 +02:00
Lorenz Bausch
d6e3c7f28e
Wording: change repo to repository 2022-07-08 20:05:35 +02:00
Michael Eischer
ce89018902 Fix data race in blob_saver
After the `BlobSaver` job is submitted, the buffer can be released and
reused by another `FileSaver` even before `BlobSaver.Save` returns. That
FileSaver will the change `buf.Data` leading to wrong backup statistics.

Found by `go test -race ./...`:

WARNING: DATA RACE
Write at 0x00c0000784a0 by goroutine 41:
  github.com/restic/restic/internal/archiver.(*FileSaver).saveFile()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:176 +0x789
  github.com/restic/restic/internal/archiver.(*FileSaver).worker()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:242 +0x2af
  github.com/restic/restic/internal/archiver.NewFileSaver.func2()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:88 +0x5d
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /home/michael/go/pkg/mod/golang.org/x/sync@v0.0.0-20210220032951-036812b2e83c/errgroup/errgroup.go:57 +0x91

Previous read at 0x00c0000784a0 by goroutine 29:
  github.com/restic/restic/internal/archiver.(*BlobSaver).Save()
      /home/michael/Projekte/restic/restic/internal/archiver/blob_saver.go:57 +0x1dd
  github.com/restic/restic/internal/archiver.(*BlobSaver).Save-fm()
      <autogenerated>:1 +0xac
  github.com/restic/restic/internal/archiver.(*FileSaver).saveFile()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:191 +0x855
  github.com/restic/restic/internal/archiver.(*FileSaver).worker()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:242 +0x2af
  github.com/restic/restic/internal/archiver.NewFileSaver.func2()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:88 +0x5d
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /home/michael/go/pkg/mod/golang.org/x/sync@v0.0.0-20210220032951-036812b2e83c/errgroup/errgroup.go:57 +0x91
2022-07-03 14:47:53 +02:00
Michael Eischer
6f53ecc1ae adapt workers based on whether an operation is CPU or IO-bound
Use runtime.GOMAXPROCS(0) as worker count for CPU-bound tasks,
repo.Connections() for IO-bound task and a combination if a task can be
both. Streaming packs is treated as IO-bound as adding more worker
cannot provide a speedup.

Typical IO-bound tasks are download / uploading / deleting files.
Decoding / Encoding / Verifying are usually CPU-bound. Several tasks are
a combination of both, e.g. for combined download and decode functions.
In the latter case add both limits together. As the backends have their
own concurrency limits restic still won't download more than
repo.Connections() files in parallel, but the additional workers can
decode already downloaded data in parallel.
2022-07-03 12:19:26 +02:00
Michael Eischer
753e56ee29 repository: Limit to a single pending pack file
Use only a single not completed pack file to keep the number of open and
active pack files low. The main change here is to defer hashing the pack
file to the upload step. This prevents the pack assembly step to become
a bottleneck as the only task is now to write data to the temporary pack
file.

The tests are cleaned up to no longer reimplement packer manager
functions.
2022-07-02 22:42:34 +02:00
Michael Eischer
fa25d6118e archiver: Reduce tree saver concurrency
Large amount of tree savers have no obvious benefit, however they can
increase the amount of (potentially large) trees kept in memory.
2022-07-02 22:42:34 +02:00
Michael Eischer
bba1e81719 archiver: Limit blob saver count to GOMAXPROCS
Now with the asynchronous uploaders there's no more benefit from using
more blob savers than we have CPUs. Thus use just one blob saver for
each CPU we are allowed to use.
2022-07-02 22:42:34 +02:00
Michael Eischer
120ccc8754 repository: Rework blob saving to use an async pack uploader
Previously, SaveAndEncrypt would assemble blobs into packs and either
return immediately if the pack is not yet full or upload the pack file
otherwise. The upload will block the current goroutine until it
finishes.

Now, the upload is done using separate goroutines. This requires changes
to the error handling. As uploads are no longer tied to a SaveAndEncrypt
call, failed uploads are signaled using an errgroup.

To count the uploaded amount of data, the pack header overhead is no
longer returned by `packer.Finalize` but rather by
`packer.HeaderOverhead`. This helper method is necessary to continue
returning the pack header overhead directly to the responsible call to
`repository.SaveBlob`. Without the method this would not be possible,
as packs are finalized asynchronously.
2022-07-02 22:42:34 +02:00
MichaelEischer
3e1de52e0a
Merge pull request #3805 from greatroar/global
cmd/restic, limiter: Move config knowledge to internal packages
2022-07-02 21:56:35 +02:00
MichaelEischer
621023a50b
Merge pull request #3772 from MichaelEischer/fix-mixed-index
rebuild-index: correctly rebuild index for mixed packs
2022-07-02 20:10:02 +02:00
MichaelEischer
90e9c5c4cc
Merge pull request #3729 from MichaelEischer/full-ids-in-check
Include full IDs in check output
2022-07-02 20:09:39 +02:00
Michael Eischer
cdaf9b4f26 Don't crash if SecretString is uninitialized 2022-07-02 19:44:28 +02:00
Michael Eischer
5e0f1c3cef check: remove dead code 2022-07-02 19:28:57 +02:00
Michael Eischer
0df022fa6d check: Print full ids
The short ids are not always unique. In addition, recovering from
damages is easier when having the full ids as that makes it easier to
access the corresponding files.
2022-07-02 19:28:57 +02:00
Michael Eischer
04c23fa95d rebuild-index: correctly rebuild index for mixed packs
For mixed packs, data and tree blobs were stored in separate index
entries. This results in warning from the check command and maybe other
problems.
2022-07-02 19:24:02 +02:00
MichaelEischer
bb5f196b09
Merge pull request #3733 from restic/improve-stats
Improve stats
2022-07-02 19:07:31 +02:00
MichaelEischer
c16f989d4a
Merge pull request #3470 from MichaelEischer/sanitize-debug-log
Sanitize debug log
2022-07-02 19:00:54 +02:00
Michael Eischer
a6e9e08034 Account for pack header overhead at each entry
This will miss the pack header crypto overhead and the length field,
which only amount to a few bytes per pack file.
2022-07-02 18:55:58 +02:00
Alexander Neumann
6c4ceaf1e7 Print number of bytes added to the repo
This includes optional compression and crypto overhead.
2022-07-02 18:55:12 +02:00
Alexander Neumann
99634c0936 Return real size from SaveBlob 2022-07-02 18:55:12 +02:00
MichaelEischer
fdc53a9d32
Merge pull request #3787 from MichaelEischer/refactor-repository
repository: (Mostly) index-related cleanups
2022-07-02 18:54:04 +02:00
Michael Eischer
6923353c43 redact swift auth token in debug output 2022-07-02 18:47:35 +02:00
Michael Eischer
5a11d14082 redacted keys/token in backend config debug log 2022-07-02 18:47:35 +02:00
Michael Eischer
0936d864a4 redact http authorization header in debug log output 2022-07-02 18:47:35 +02:00
Michael Eischer
ec7c9ce88b drop unused repository.Loader interface 2022-07-02 18:39:59 +02:00
Michael Eischer
2cd7e90ad1 repository: cleanup 2022-07-02 18:39:59 +02:00
Michael Eischer
c1a8fa4290 repository: remove unused packIDToIndex field 2022-07-02 18:39:59 +02:00
Michael Eischer
e68c3a4e62 repository: simplify CreateIndexFromPacks 2022-07-02 18:39:59 +02:00
Michael Eischer
1974ad7ce2 repository: hide MasterIndex.FinalizeFullIndexes / FinalizeNotFinalIndexes 2022-07-02 18:39:59 +02:00
Michael Eischer
ef53ca4a5a repository: remove MasterIndex.All() 2022-07-02 18:39:59 +02:00
Michael Eischer
bf81bf0795 repository: Properly set id for finalized index
As MergeFinalIndex and index uploads can occur concurrently, it is
necessary for MergeFinalIndex to check whether the IDs for an index were
already set before merging it. Otherwise, we'd loose the ID of an index
which is set _after_ uploading it.
2022-07-02 18:39:59 +02:00
Michael Eischer
e0a7852b8b repository: remove unused (Master)Index.Count 2022-07-02 18:39:58 +02:00
Michael Eischer
8ef2968f28 repository: remove unused index.ListPack 2022-07-02 18:39:12 +02:00
Michael Eischer
e4f20dea61 repository: inline index.encode 2022-07-02 18:39:12 +02:00
Michael Eischer
fe5a8e137a repository: remove unused index.Store 2022-07-02 18:39:12 +02:00
Michael Eischer
628ae799ca repository: make flushPacks private 2022-07-02 18:39:12 +02:00
Michael Eischer
ed8aa15376 repository: add Save method to MasterIndex interface 2022-07-02 18:38:56 +02:00
Michael Eischer
a77d5c4d11 repository: index saving belongs into the MasterIndex 2022-07-02 18:38:56 +02:00
greatroar
a0fa9c6e9f Revert "restic prune: Merge three loops over the index"
This reverts commit 8bdfcf779f.
Should fix #3809. Also needed to make #3290 apply cleanly.
2022-06-30 15:27:34 +02:00
greatroar
90d2c0502b cmd/restic, limiter: Move config knowledge to internal packages
The GlobalOptions struct now embeds a backend.TransportOptions, so it
doesn't need to construct one in open and create. The upload and
download limits are similarly now a struct in internal/limiter that is
embedded in GlobalOptions.
2022-06-22 18:29:58 +02:00
MichaelEischer
bc96879d41
Merge pull request #3785 from MichaelEischer/replace-tomb-usage
Remove usage of tomb package
2022-06-19 14:42:48 +02:00
MichaelEischer
307f14604f
Merge pull request #3795 from greatroar/sema
backend: Move semaphores to a dedicated package
2022-06-18 17:12:01 +02:00
MichaelEischer
19581dbc18
Merge pull request #3786 from greatroar/prune
restic prune: Merge three loops over the index
2022-06-18 16:54:50 +02:00
greatroar
8bdfcf779f restic prune: Merge three loops over the index
There were three loops over the index in restic prune, to find
duplicates, to determine sizes (in pack.Size) and to generate packInfos.
These three are now one loop. This way, prune doesn't need to construct
a set of duplicate blobs, pack.Size doesn't need to contain special
logic for prune's use case (the onlyHdr argument) and pack.Size doesn't
need to construct a map only to have it immediately transformed into a
different map.

Some quick testing on a 160GiB local repo doesn't show running time or
memory use of restic prune --dry-run changing significantly.
2022-06-18 10:40:33 +02:00
greatroar
910d917b71 backend: Move semaphores to a dedicated package
... called backend/sema. I resisted the temptation to call the main
type sema.Phore. Also, semaphores are now passed by value to skip a
level of indirection when using them.
2022-06-18 10:01:58 +02:00
MichaelEischer
2c893fe43c
Merge pull request #3798 from greatroar/errors
all: Move away from pkg/errors, easy cases
2022-06-17 19:01:40 +02:00
greatroar
f92ecf13c9 all: Move away from pkg/errors, easy cases
github.com/pkg/errors is no longer getting updates, because Go 1.13
went with the more flexible errors.{As,Is} function. Use those instead:
errors from pkg/errors already support the Unwrap interface used by 1.13
error handling. Also:

* check for io.EOF with a straight ==. That value should not be wrapped,
  and the chunker (whose error is checked in the cases changed) does not
  wrap it.
* Give custom Error methods pointer receivers, so there's no ambiguity
  when type-switching since the value type will no longer implement error.
* Make restic.ErrAlreadyLocked private, and rename it to
  alreadyLockedError to match the stdlib convention that error type
  names end in Error.
* Same with rest.ErrIsNotExist => rest.notExistError.
* Make s3.Backend.IsAccessDenied a private function.
2022-06-14 08:36:38 +02:00
Jayson Wang
f144920ed5 fix handling of maxKeys in SearchKey 2022-06-12 14:19:06 +02:00
Alexander Neumann
1dd4b9b60e
Merge pull request #3788 from greatroar/sftp-posix-rename
backend/sftp: Support atomic rename
2022-06-06 19:39:48 +02:00
Alexander Neumann
07114ccb21
Merge pull request #3789 from greatroar/fix-loadblob
internal/repository: Fix LoadBlob + fuzz test
2022-06-06 19:33:42 +02:00
greatroar
c9557b2822 internal/repository: Fix LoadBlob + fuzz test
When given a buf that is big enough for a compressed blob but not its
decompressed contents, the copy at the end of LoadBlob would skip the
last part of the contents.

Fixes #3783.
2022-06-06 17:02:28 +02:00
greatroar
fa8f02292e backend/sftp: Support atomic rename
... if the server has posix-rename@openssh.com.
OpenSSH introduced this extension in 2008:
7c29661471
2022-06-06 13:40:42 +02:00
Michael Eischer
e002b09d57 archiver: free workers once finished 2022-06-05 15:48:10 +02:00
Michael Eischer
408ac1a0c2 archiver: remove tomb usage 2022-06-05 15:47:52 +02:00