Group the global list of files by version, instead of having one flat list for all devices. This removes lots of duplicate protocol.Vectors.
Co-authored-by: Jakob Borg <jakob@kastelo.net>
If we fail to take the rename shortcut we may crash on a later loop,
because we do trickiness with the indexes but the original buckets[key]
in "range buckets[key]" isn't re-evaluated so i exceeds the max index.
This adds indirection of large version vectors in the same manner as we
already to block lists. The effect is the same: less duplicated data in
some situations.
To mitigate the impact for when this indirection
wouldn't be needed I've added an indirection cutoff for both blocks and
the new version vector stuff: we don't do the indirection at all for
small block lists or small version vectors, instead storing it directly
like we used to do. This is faster for small files and small setups.
This makes version vector values clock based instead of just incremented
from zero. The effect is that a vector that is created from scratch
(after database reset) will have a higher value for the local device
than what it could have been previously, causing a conflict. That is, if
we are A and we had
{A: 42, B: 12}
in the old scheme, a reset and rescan would give us
{A: 1}
which is a strict ancestor of the older file (this might be wrong). With
the new scheme we would instead have
{A: someClockTime, b: otherClockTime}
and the new version after reset would become
{A: someClockTime+delta}
which is in conflict with the previous entry (better).
In case the clocks are wrong (current time is less than the value in the
vector) we fall back to just simple increment like today.
This scheme is ineffective if we suffer a database reset while at the
same time setting the clock back far into the past. It's however no
worse than what we already do.
This loses the ability to emit the "added" event, as we can't look for
the magic 1 entry any more. That event was however already broken
(#5541).
Another place where we infer meaning from the vector itself is in
receive only folders, but there the only criteria is that the vector is
one item long and includes just ourselves, which remains the case with
this change.
* wip
- In the few places where we wrap errors, use the new Go 1.13 "%w"
construction instead of %s or %v.
- Where we create errors with constant strings, consistently use
errors.New and not fmt.Errorf.
- Remove capitalization from errors in the few places where we had that.
One of the causes of "panic: database is closed" is that we try to send
summaries after it's been closed. Calculating summaries can take a long
time and if we have a lot of folders it's not unreasonable to think
that we might be stopped in this loop, so prepare to bail here.
* push
During some other work I discovered these tests weren't great, so I've
rewritten them to be a little better. The real changes here are:
- Don't play games with not starting the folder and such, and don't
construct a fake folder instance -- just use the one the model has. The
folder starts and scans but the folder contents are empty at this point
so that's fine.
- Use a fakefs instead of a temp dir.
- To support the above, implement a fakefs option `?content=true` to
make the fakefs actually retain written content. Use sparingly,
obviously, but it means the fakefs can usually be used instead of an
on disk real directory.