If we decide to recalculate the metadata we shouldn't start from
whatever we loaded from the database, as that data is wrong. We should
start from a clean slate.
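
A minimal sketch of the clean-slate idea, assuming a simplified counter struct. The `metadata` type and the `recalcMeta` signature here are hypothetical and not the real lib/db types:

```go
package main

import "fmt"

// metadata is a hypothetical, simplified stand-in for per-folder counters.
type metadata struct {
	files int
	bytes int64
}

// recalcMeta rebuilds the counters from the file sizes alone. The stale
// value loaded from the database is deliberately ignored.
func recalcMeta(stale metadata, sizes []int64) metadata {
	_ = stale
	var fresh metadata // zero value: the clean slate
	for _, size := range sizes {
		fresh.files++
		fresh.bytes += size
	}
	return fresh
}

func main() {
	stale := metadata{files: 99, bytes: 12345} // wrong numbers from the database
	fmt.Printf("%+v\n", recalcMeta(stale, []int64{10, 20, 30}))
}
```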
I was working on indirecting version vectors, and that resulted in some
refactoring and improving the existing block indirection stuff. We may
or may not end up doing the version vector indirection, but I think
these changes are reasonable anyhow and will simplify the diff
significantly if we do go there. The main points are:
- A bunch of renaming to make the indirection and GC not about "blocks"
but about "indirection".
- Adding a cutoff so that we don't actually indirect for small block
lists. This gets us better performance when handling small files as it
cuts out the indirection for a quite small loss in space efficiency.
- Being paranoid and always recalculating the hash on put. This costs
some CPU, but the consequences if a buggy or malicious implementation
silently substituted the block list by lying about the hash would be bad.
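
The cutoff and the hash recalculation could be sketched roughly as follows. This is an illustration under assumptions: the cutoff value, the `store` and `entry` types and the use of SHA-256 over the concatenated block hashes are all made up for the example and are not taken from the actual change.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// indirectionCutoff is a hypothetical threshold; the real value is not
// stated in the text above.
const indirectionCutoff = 3

// entry holds either the block list inline or a hash pointing at it.
type entry struct {
	blocks     [][]byte // set when the list is stored inline
	blocksHash []byte   // set when the list is stored indirectly
}

// store is a toy in-memory stand-in for the database.
type store struct {
	indirect map[string][][]byte
}

func newStore() *store { return &store{indirect: make(map[string][][]byte)} }

// hashBlocks hashes the block list; each element is assumed to be a
// fixed-size block hash, so plain concatenation is unambiguous.
func hashBlocks(blocks [][]byte) []byte {
	h := sha256.New()
	for _, b := range blocks {
		h.Write(b)
	}
	return h.Sum(nil)
}

// put keeps short lists inline and indirects longer ones. The hash claimed
// by the caller is ignored and always recalculated.
func (s *store) put(blocks [][]byte, claimedHash []byte) entry {
	if len(blocks) < indirectionCutoff {
		return entry{blocks: blocks}
	}
	_ = claimedHash // never trusted
	sum := hashBlocks(blocks)
	s.indirect[string(sum)] = blocks
	return entry{blocksHash: sum}
}

func main() {
	s := newStore()
	small := s.put([][]byte{[]byte("h1")}, nil)
	big := s.put([][]byte{[]byte("h1"), []byte("h2"), []byte("h3")}, []byte("bogus"))
	fmt.Println(len(small.blocks), len(big.blocksHash), len(s.indirect))
}
```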
One of the causes of "panic: database is closed" is that we try to send
summaries after the database has been closed. Calculating summaries can
take a long time, and if we have a lot of folders it's not unreasonable
to think that we might be stopped in this loop, so prepare to bail here.
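
A sketch of bailing out of such a loop, assuming the stop signal arrives as a cancelled `context.Context`. The function names are hypothetical and the real code may signal shutdown differently:

```go
package main

import (
	"context"
	"fmt"
)

// sendSummaries calls summarize for each folder, but checks for
// cancellation before starting each one. It returns how many were sent.
func sendSummaries(ctx context.Context, folders []string, summarize func(string)) int {
	sent := 0
	for _, folder := range folders {
		select {
		case <-ctx.Done():
			return sent // stopped: don't touch the database again
		default:
		}
		summarize(folder)
		sent++
	}
	return sent
}

// demoSummaries simulates a shutdown arriving while the first summary is
// being calculated.
func demoSummaries() int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return sendSummaries(ctx, []string{"a", "b", "c"}, func(folder string) {
		if folder == "a" {
			cancel()
		}
	})
}

func main() {
	fmt.Println("summaries sent before bailing:", demoSummaries())
}
```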
During NAT discovery we block for 10s (NatTimeoutS) before returning.
This is mostly noticeable when Ctrl-C:ing Syncthing directly after
startup as we wait for those ten seconds before shutting down. This
makes it check the context a little bit more frequently.
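
One way to check the context more often is to split the long wait into shorter slices. The sketch below assumes a one second slice and an injected `wait` function. Both are assumptions for illustration, and only the ten second total comes from the text above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitSliced waits for total in slices of step, checking the context
// between slices. The wait function is injected so that callers (and
// tests) decide how the time is actually spent.
func waitSliced(ctx context.Context, total, step time.Duration, wait func(time.Duration)) error {
	for waited := time.Duration(0); waited < total; waited += step {
		if err := ctx.Err(); err != nil {
			return err // cancelled: stop waiting early
		}
		wait(step)
	}
	return nil
}

// demoWait cancels the context during the second slice and reports how
// many slices were started.
func demoWait() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	calls := 0
	err := waitSliced(ctx, 10*time.Second, time.Second, func(time.Duration) {
		calls++ // a real caller would block here, e.g. with time.Sleep
		if calls == 2 {
			cancel()
		}
	})
	return calls, err
}

func main() {
	calls, err := demoWait()
	fmt.Println(calls, err)
}
```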
This adds metadata updates to the same write batch as the underlying
file change. The odds of a metadata update going missing are greatly
reduced.
Bonus change: actually commit the transaction in recalcMeta.
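
A toy illustration of why sharing the batch helps, assuming a batch that is only applied to the database as a whole. The `batch` and `database` types and the key names are hypothetical, and a real backend provides the atomicity that this in-memory stand-in only imitates.

```go
package main

import "fmt"

// batch collects writes that are applied together or not at all.
type batch struct{ puts map[string]string }

func newBatch() *batch { return &batch{puts: make(map[string]string)} }

func (b *batch) put(key, value string) { b.puts[key] = value }

// database is a toy in-memory store.
type database struct{ data map[string]string }

func newDatabase() *database { return &database{data: make(map[string]string)} }

// write applies every put in the batch.
func (d *database) write(b *batch) {
	for k, v := range b.puts {
		d.data[k] = v
	}
}

// updateFile queues the file entry and the matching metadata update in the
// same batch, so one cannot reach the database without the other.
func updateFile(b *batch, name string, seq int) {
	b.put("file/"+name, fmt.Sprint(seq))
	b.put("meta/sequence", fmt.Sprint(seq))
}

func main() {
	d := newDatabase()
	b := newBatch()
	updateFile(b, "a.txt", 1)
	d.write(b)
	fmt.Println(d.data)
}
```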
lib/db: Recover sequence number and metadata on startup (fixes #6335)
If we crashed after writing new file entries but before updating
metadata in the database the sequence number and metadata will be wrong.
This fixes that.
We could potentially get a snapshot and then fail to get a releaser,
leaking the snapshot. This takes the releaser first and makes sure to
release it on snapshot error.
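
The ordering described above might look like this in outline. The `releaser` and `snapshot` types and the injected functions are hypothetical stand-ins, not the real lib/db API:

```go
package main

import (
	"errors"
	"fmt"
)

// releaser is a hypothetical handle that must be released exactly once.
type releaser struct{ released bool }

func (r *releaser) Release() { r.released = true }

// snapshot owns the releaser once it has been created successfully.
type snapshot struct{ rel *releaser }

// newSnapshot acquires the releaser first. If opening the snapshot then
// fails, the releaser is released before the error is returned.
func newSnapshot(acquire func() (*releaser, error), open func() error) (*snapshot, error) {
	rel, err := acquire()
	if err != nil {
		return nil, err // nothing acquired yet, nothing to clean up
	}
	if err := open(); err != nil {
		rel.Release()
		return nil, err
	}
	return &snapshot{rel: rel}, nil
}

// demoSnapshotError reports whether the releaser was released when opening
// the snapshot failed.
func demoSnapshotError() (bool, error) {
	rel := &releaser{}
	_, err := newSnapshot(
		func() (*releaser, error) { return rel, nil },
		func() error { return errors.New("snapshot failed") },
	)
	return rel.released, err
}

func main() {
	released, err := demoSnapshotError()
	fmt.Println(released, err)
}
```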
The readWriteTransaction offered both commit() (the one to use) and
Commit() (via embedding) where the latter didn't close the read
transaction. This removes the lower cased variant in order to prevent
the mistake.
The only place where the mistake was made was the new gc runner, where
it would leave a read snapshot open forever.
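
As background, Go promotes the methods of an embedded type, so an outer type exposes the embedded `Commit()` unless it defines its own. The sketch below shows one way to leave a single commit path, assuming the outer type shadows the promoted method and closes the read side first. All type names are hypothetical and this is not necessarily how the change itself is written.

```go
package main

import "fmt"

// writeTx is a toy write transaction.
type writeTx struct{ committed bool }

func (w *writeTx) Commit() error {
	w.committed = true
	return nil
}

// readTx is a toy read transaction that must be closed.
type readTx struct{ closed bool }

func (r *readTx) close() { r.closed = true }

// readWriteTx embeds writeTx, which would promote writeTx.Commit.
type readWriteTx struct {
	*writeTx
	read *readTx
}

// Commit shadows the promoted method, so callers cannot commit without
// also closing the read transaction.
func (t readWriteTx) Commit() error {
	t.read.close()
	return t.writeTx.Commit()
}

func newReadWriteTx() readWriteTx {
	return readWriteTx{writeTx: &writeTx{}, read: &readTx{}}
}

func main() {
	tx := newReadWriteTx()
	if err := tx.Commit(); err != nil {
		fmt.Println("commit failed:", err)
	}
	fmt.Println(tx.committed, tx.read.closed)
}
```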
During some other work I discovered these tests weren't great, so I've
rewritten them to be a little better. The real changes here are:
- Don't play games with not starting the folder and such, and don't
construct a fake folder instance -- just use the one the model has. The
folder starts and scans but the folder contents are empty at this point
so that's fine.
- Use a fakefs instead of a temp dir.
- To support the above, implement a fakefs option `?content=true` to
make the fakefs actually retain written content. Use sparingly,
obviously, but it means the fakefs can usually be used instead of an
on disk real directory.
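
To illustrate the idea behind the `?content=true` option, here is a toy in-memory filesystem that reads the option from its URI and only keeps written bytes when it is set. This is a hypothetical sketch and not the real fakefs: the type, its methods and the behavior when content is not retained are assumptions.

```go
package main

import (
	"fmt"
	"net/url"
)

// fakeFS is a toy filesystem that always tracks sizes but keeps content
// only when asked to.
type fakeFS struct {
	retain  bool
	sizes   map[string]int
	content map[string][]byte
}

// newFakeFS parses options from the query part of the URI.
func newFakeFS(uri string) (*fakeFS, error) {
	u, err := url.Parse(uri)
	if err != nil {
		return nil, err
	}
	return &fakeFS{
		retain:  u.Query().Get("content") == "true",
		sizes:   make(map[string]int),
		content: make(map[string][]byte),
	}, nil
}

// write records the size and, if enabled, a copy of the data.
func (f *fakeFS) write(name string, data []byte) {
	f.sizes[name] = len(data)
	if f.retain {
		f.content[name] = append([]byte(nil), data...)
	}
}

// read returns the retained content, if any.
func (f *fakeFS) read(name string) ([]byte, bool) {
	data, ok := f.content[name]
	return data, ok
}

func main() {
	fs, err := newFakeFS("testroot?content=true")
	if err != nil {
		fmt.Println(err)
		return
	}
	fs.write("file", []byte("hello"))
	data, ok := fs.read("file")
	fmt.Println(string(data), ok)
}
```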