* lib/db: Add ExpirePendingFolders().
Use-case is to drop any no-longer-pending folders for a specific
device when parsing its ClusterConfig message where previously offered
folders are not mentioned any more.
The timestamp in ObservedFolder is stored with only second precision,
so round to seconds here as well. This allows calling the function
within the same second of adding or updating entries.
* lib/model: Weed out pending folders when receiving ClusterConfig.
Filter the entries by timestamp, which must be newer than or equal to
the reception time of the ClusterConfig. For just mentioned ones,
this assumption will hold as AddOrUpdatePendingFolder() updates the
timestamp.
* lib/model, gui: Notify when one or more pending folders expired.
Introduce new event type FolderOfferCancelled and use it to trigger a
complete refreshCluster() cycle. Listing individual entries would be
much more code and probably just as much work to answer the API
request.
* lib/model: Add comment and rename ExpirePendingFolders().
* lib/events: Rename FolderOfferCancelled to ClusterPendingChanged.
* lib/model: Reuse ClusterPendingChanged event for cleanPending()
Changing the config does not necessarily mean that the
/resut/cluster/pending endpoints need to be refreshed, but only if
something was actually removed. Detect this and indicate it through
the ClusterPendingChanged event, which is already hooked up to requery
respective endpoints within the GUI.
No more need for a separate refreshCluster() in reaction to
ConfigSaved event or calling refreshConfig().
* lib/model: Gofmt.
* lib/db: Warn instead of info log for failed removal.
* gui: Fix pending notifications not loading on GUI start.
* lib/db: Use short device ID in log message.
* lib/db: Return list of expired folder IDs after deleting them.
* lib/model: Refactor Pending...Changed events.
* lib/model: Adjust format of removed pending folders enumeration.
Use an array of objects with device / folder ID properties, matching
the other places where it's used.
* lib/db: Drop invalid entries in RemovePendingFoldersBeforeTime().
* lib/model: Gofmt.
My local gofmt did not complain here, strangely...
* gui: Handle PendingDevicesChanged event.
Even though it currently only holds one device at a time, wrap the
contents in an array under the "added" property name.
* lib/model: Fix null values in PendingFoldersChanged removed member.
* gui: Handle PendingFoldersChanged event.
* lib/model: Simplify construction of expiredPendingList.
* lib/model: Reduce code duplication in cleanPending().
Use goto and a label for the common parts of calling the DB removal
function and building the event data part.
* lib/events, gui: Mark ...Rejected events deprecated.
Extend comments explaining the conditions when the replacement event
types are emitted.
* lib/model: Wrap removed devices in array of objects as well.
* lib/db: Use iter.Value() instead of needless db.Get(iter.Key())
* lib/db: Add comment explaining RemovePendingFoldersBeforeTime().
* lib/model: Rename fields folderID and deviceID in event data.
* lib/db: Only list actually expired IDs as removed.
Skip entries where Delete() failed as well as invalid entries that got
removed automatically.
* lib/model: Gofmt
This adds two new configuration options:
// The number of connections at which we stop trying to connect to more
// devices, zero meaning no limit. Does not affect incoming connections.
ConnectionLimitEnough int
// The maximum number of connections which we will allow in total, zero
// meaning no limit. Affects incoming connections and prevents
// attempting outgoing connections.
ConnectionLimitMax int
These can be used to limit the number of concurrent connections in
various ways.
This removes the switch for using a Badger database, because it has bugs
that it seems there is no interest in fixing, and no actual bug tracker
to track them in.
It retains the actual implementation for the sole purpose of being able
to do the conversion back to LevelDB if anyone is actually running with
USE_BADGER. At some point in a couple of versions we can remove the
implementation as well.
Since iterators must be released before committing or discarding a
transaction we have the pattern of both deferring a release plus doing
it manually. But we can't release twice because we track this with a
WaitGroup that will panic when there are more Done()s than Add()s. This
just adds a boolean to let an iterator keep track.
If the GC finds a key k that it wants to keep, it records that in a
Bloom filter. If a key k' can be removed but its hash collides with k,
it will be kept. Since the old Bloom filter code was completely
deterministic, the next run would encounter the same collision, assuming
k must still be kept.
A randomized hash function that uses all the SHA-256 bits solves this
problem: the second run has a non-zero probability of removing k', as
long as the Bloom filter is not completely full.
This makes Go 1.15 test/vet happy, avoiding "conversion from untyped int
to string yields a string of one rune" warning where we do
string(KeyTypeWhatever) in namespaced.go.
It also clarifies and enforces the currently allowed range of these
numbers so I think it's fine.
This matches the convention of the stdlib and avoids ambiguity: when
customErr{} and &customErr{} both implement error, client code needs to
check for both.
Memory use should remain the same, since storing a non-pointer type in
an interface value still copies the value to the heap.
Group the global list of files by version, instead of having one flat list for all devices. This removes lots of duplicate protocol.Vectors.
Co-authored-by: Jakob Borg <jakob@kastelo.net>
This reduces the size of our write batches before we flush them. This
has two effects: reducing the amount of data lost if we crash when
updating the database, and reducing the amount of memory used when we do
large updates without checkpoint (e.g., deleting a folder).
I ran our SyncManyFiles benchmark as it is the one doing most
transactions, however there was no relevant change in any metric (it's
limited by our fsync I expect). This is good as any visible change would
just be a decrease in performance.
I don't have a benchmark on deleting a large folder, taking that part on
trust for now...
This adds indirection of large version vectors in the same manner as we
already to block lists. The effect is the same: less duplicated data in
some situations.
To mitigate the impact for when this indirection
wouldn't be needed I've added an indirection cutoff for both blocks and
the new version vector stuff: we don't do the indirection at all for
small block lists or small version vectors, instead storing it directly
like we used to do. This is faster for small files and small setups.
This makes the GC runner a service that will stop fairly quickly when
told to.
As a bonus, STTRACE=app will print the service tree on the way out,
including any errors they've flagged.
As of the latest database checker we are again putting files without
blocks. I'm not 100% convinced that's a great idea, but we also do it
for ignored files apparently so it looks like we probably should support
it. This adds an escape hatch that must be manually enabled...
- In the few places where we wrap errors, use the new Go 1.13 "%w"
construction instead of %s or %v.
- Where we create errors with constant strings, consistently use
errors.New and not fmt.Errorf.
- Remove capitalization from errors in the few places where we had that.
If we decide to recalculate the metadata we shouldn't start from
whatever we loaded from the database, as that data is wrong. We should
start from a clean slate.
I was working on indirecting version vectors, and that resulted in some
refactoring and improving the existing block indirection stuff. We may
or may not end up doing the version vector indirection, but I think
these changes are reasonable anyhow and will simplify the diff
significantly if we do go there. The main points are:
- A bunch of renaming to make the indirection and GC not about "blocks"
but about "indirection".
- Adding a cutoff so that we don't actually indirect for small block
lists. This gets us better performance when handling small files as it
cuts out the indirection for quite small loss in space efficiency.
- Being paranoid and always recalculating the hash on put. This costs
some CPU, but the consequences if a buggy or malicious implementation
silently substituted the block list by lying about the hash would be bad.
This adds metadata updates to the same write batch as the underlying
file change. The odds of a metadata update going missing is greatly
reduced.
Bonus change: actually commit the transaction in recalcMeta.