Two small behavior changes: don't "charge" the data to the global rate
limit until it's been accepted by the device specific limiter, and fix
the send/recv direction in the log print on per device rate limits.
clearAddresses write locks the struct and then calls notify. notify in turn tries to obtain a read lock on the same mutex. The result was a deadlock. This change unlocks the struct before calling notify.
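The locking order can be sketched roughly as below. This is a minimal illustration, not the actual code: the addressBook type and its fields are made up here, and only the names clearAddresses and notify come from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// addressBook is a hypothetical stand-in for the struct described above.
type addressBook struct {
	mut       sync.RWMutex
	addresses []string
}

// notify takes a read lock on the same mutex and reports how many
// addresses are currently known.
func (a *addressBook) notify() int {
	a.mut.RLock()
	defer a.mut.RUnlock()
	return len(a.addresses)
}

// clearAddresses releases the write lock before calling notify. Calling
// notify while still holding the write lock would block forever, because
// sync.RWMutex is not reentrant.
func (a *addressBook) clearAddresses() int {
	a.mut.Lock()
	a.addresses = nil
	a.mut.Unlock() // unlock first ...

	return a.notify() // ... then notify
}

func main() {
	a := &addressBook{addresses: []string{"tcp://192.0.2.1:22000"}}
	fmt.Println("addresses after clear:", a.clearAddresses())
}
```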
Given that we've taken on the responsibility of maintaining this forked
package I've added it to the Syncthing organization. We still vendor it
like an external package, because it's convenient to keep it as a fork
of upstream to make it easier to merge and file pull requests towards
them.
When dropping delta index IDs due to upgrade, only drop our local one.
Previously, when dropping all of them, we would trigger a full send in
both directions on first connect after upgrade. Then the other side
would upgrade, doing the same thing. Net effect is full index data gets
sent twice in both directions.
With this change we just drop our local ID, meaning we will send our
full index on first connect after upgrade. When the other side upgrades,
they will do the same. This is a bit less cruel.
Unignored files are marked as conflicting while scanning, and the conflict is
then resolved in the subsequent pull. Also automatically reconciles needed items
on send-only folders, if they do not actually differ except for internal metadata.
This doesn't happen today, but it might in the future if the block size
were increased or made variable and we were talking to a client from the
future.
* lib/db: Don't panic on negative counts (fixes #4659)
So, negative counts should never happen and hence the original idea to
panic. However, this sucks as the panic will happen in a folder runner,
be automatically swallowed by suture, and the runner gets restarted but
now we are in a bad state. (Related: #4758)
At the time of writing the global list is somewhat in flux (we've
changed how ignored files are handled, invalid bits, etc.) and I think
that can cause unusual conditions here. Hence just fixing up the numbers
instead until the next full recount.
When scanner.Walk detects a change, it now returns the new file info as well as the old file info. It also finds deleted and ignored files while scanning.
Also directory deletions are now always committed to db after their children to prevent temporary failure on remote due to non-empty directory.
This removes a number of timing related things, leaving just the total
test timeout now bumped to one minute. Normally we get the filesystem
events within a second or so, so this doesn't affect the test time in
the successful case. If we don't actually get the events we expect
within a minute I think we are legitimately in "failed" territory.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4715
LGTM: imsodin, AudriusButkevicius
Since #4340 pulls aren't happening every 10s anymore and may be delayed up to 1h.
This means that no folder error event reaches the web UI for a long time, thus no
failed items will show up for a long time. Now errors are populated when the
web UI is opened.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4650
LGTM: AudriusButkevicius
It turns out that ZFS doesn't do any normalization when storing files,
but does do normalization "as part of any comparison process".
In practice, this seems to mean that if you LStat a normalized filename,
ZFS will return the FileInfo for the un-normalized version of that
filename.
This meant that our test to see whether a separate file with a
normalized version of the filename already exists was failing, as we
were detecting the same file.
The fix is to use os.SameFile, to see whether we're getting the same
FileInfo from the normalized and un-normalized versions of the same
filename.
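A minimal sketch of that comparison, assuming plain os.Lstat calls rather than the filesystem wrapper the real code uses. The sameFile and demo names are made up for this illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// sameFile reports whether two names refer to the same underlying file.
// An error from either Lstat is treated as "not the same file".
func sameFile(a, b string) bool {
	infoA, err := os.Lstat(a)
	if err != nil {
		return false
	}
	infoB, err := os.Lstat(b)
	if err != nil {
		return false
	}
	return os.SameFile(infoA, infoB)
}

// demo creates two files in a temporary directory and compares a file
// with itself and with the other file.
func demo() (self, other bool, err error) {
	dir, err := os.MkdirTemp("", "samefile")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	a := filepath.Join(dir, "a")
	b := filepath.Join(dir, "b")
	for _, name := range []string{a, b} {
		if err := os.WriteFile(name, []byte("x"), 0o644); err != nil {
			return false, false, err
		}
	}
	return sameFile(a, a), sameFile(a, b), nil
}

func main() {
	self, other, err := demo()
	fmt.Println(self, other, err)
}
```

On a normalizing filesystem the two names passed in would be the normalized and un-normalized spellings of one filename; the temporary files here only stand in for that.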
One complication is that ZFS also seems to apply its magic to os.Rename,
meaning that we can't use it to rename an un-normalized file to its
normalized filename. Instead we have to move via a temporary object. If
the move to the temporary object fails, that's OK, we can skip it and
move on. If the move from the temporary object fails however, I'm not
sure of the best approach: the current one is to leave the temporary
file name as-is, and get Syncthing to synchronize it, so at least we
don't lose the file. I'm not sure if there are any implications of this
however.
As part of reworking normalizePath, I spotted that it appeared to be
returning the wrong thing: the doc and the surrounding code expected it
to return the normalized filename, but it was returning the
un-normalized one. I fixed this, but it seems suspicious that, if the
previous behaviour was incorrect, no one ever ran afoul of it. Maybe all
filesystems will do some searching and give you a normalized filename if
you request an unnormalized one.
As part of this, I found that TestNormalization was broken: it was
passing, when in fact one of the files it should have verified was
present was missing. Maybe this was related to the above issue with
normalizePath's return value, I'm not sure. Fixed en route.
Kindly tested by @khinsen on the forum, and it appears to work.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4646
This adds one new feature, that discovery servers can have ?nolookup to
be used only for announces. The default set of discovery servers is
changed to:
- discovery.s.n used for lookups. This is dual stack load balanced over
all discovery servers, and returns both IPv4 and IPv6 results when they
exist.
- discovery-v4.s.n used for announces. This has IPv4 addresses only and
the discovery servers will update the unspecified address with the IPv4
source address, as usual.
- discovery-v6.s.n which is exactly the same for IPv6.
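How the ?nolookup option could be detected is sketched below. This assumes the option is a bare query parameter on the server URL. The lookupAllowed name and the example.com hostnames are placeholders, not the real implementation or servers.

```go
package main

import (
	"fmt"
	"net/url"
)

// lookupAllowed reports whether a discovery server URL may be used for
// lookups. A "nolookup" query parameter, with or without a value,
// restricts the server to announces only.
func lookupAllowed(server string) (bool, error) {
	u, err := url.Parse(server)
	if err != nil {
		return false, err
	}
	_, noLookup := u.Query()["nolookup"]
	return !noLookup, nil
}

func main() {
	for _, server := range []string{
		"https://discovery.example.com/v2/",
		"https://discovery-v4.example.com/v2/?nolookup",
	} {
		ok, err := lookupAllowed(server)
		fmt.Println(server, ok, err)
	}
}
```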
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4647
This no longer pokes at model internals, and only touches the config.
As a result, model handles this in CommitConfiguration, which restarts
the folders if things change, which repopulate m.folderDevice, m.deviceFolder
and other internal mappings.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4639
These files always have the symlink bit set, because they are reparse
points. Nonetheless they are not symlinks, and Lstat reports a size for
them. We use this fact to disambiguate, and hope fervently that nothing
else matches this description so it comes back to bite us...
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4622
Just because there are a ton of people struggling to set env vars.
Perhaps this should live in advanced settings, and perhaps we should have a button to view the log.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4604
LGTM: calmh, imsodin
Also attempt to handle this nicer by ignoring the truncate failure when
it doesn't matter, and recover by deleting the temp file when it does.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4594
This keeps the data we need about sequence numbers and object counts
persistently in the database. The sizeTracker is expanded into a
metadataTracker that handles multiple folders, and the Counts struct is
made protobuf serializable. It gains a Sequence field to assist in
tracking that as well, and a collection of Counts become a CountsSet
(for serialization purposes).
The initial database scan is also a consistency check of the global
entries. This shouldn't strictly be necessary. Nonetheless I added a
created timestamp to the metadata and set a variable to compare against
that. When the time since the metadata creation is old enough, we drop
the metadata and rebuild from scratch like we used to, while also
consistency checking.
A new environment variable STCHECKDBEVERY can override this interval,
and for example be set to zero to force the check immediately.
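As a rough sketch of the override logic: the code below assumes the variable holds a Go duration string and uses a made-up default interval; neither detail is stated above. The function names are hypothetical too.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// defaultRecheckInterval is a made-up default for this sketch.
const defaultRecheckInterval = 30 * 24 * time.Hour

// recheckInterval returns the interval to use given the raw value of the
// environment variable. Empty or unparsable values fall back to the
// default. Assumption: the value is a Go duration string such as "24h".
func recheckInterval(raw string) time.Duration {
	if raw == "" {
		return defaultRecheckInterval
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return defaultRecheckInterval
	}
	return d
}

// needsRecheck reports whether metadata created at the given time is old
// enough to be dropped and rebuilt. An interval of zero always qualifies.
func needsRecheck(created, now time.Time, interval time.Duration) bool {
	return now.Sub(created) >= interval
}

func main() {
	interval := recheckInterval(os.Getenv("STCHECKDBEVERY"))
	created := time.Now().Add(-time.Hour)
	fmt.Println(interval, needsRecheck(created, time.Now(), interval))
}
```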
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4547
LGTM: imsodin
So STDEADLOCK seems to do the same thing as STDEADLOCKTIMEOUT, except in
the other package. Consolidate?
STDEADLOCKTHRESHOLD is actually called STLOCKTHRESHOLD, correct the help
text.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4598
Fix the folder restart behavior (ignore Label), improve the API for that
(imho).
Also removes the tab switch animation in the settings modal, because
annoying.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4577
This should address the issue described in https://forum.syncthing.net/t/stun-nig-party-with-paused-devices/10942/13
Essentially the model and the connection service go out of sync in terms of thinking if we are connected or not.
Resort to the model as the ultimate source of truth.
I can't immediately pin down how this happens, but I have some ideas.
ConfigSaved happens in a separate routine, so it's possible that a device gets removed while a connection for it comes in in parallel.
However, in this case the connection exists in the model and does not exist in the connection service, and the only way for the connection to be removed
in the connection service is device removal from the config.
Given the subject, this might also be related to the device being paused.
Also adds more info to the logs.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4533
We need to reset prevSeq so that we force a full check when someone
reconnects - the sequence number may not have changed due to the
reconnect. (This is a regression; we did this before f6ea2a7.)
Also add an optimization: we schedule a pull after scanning, but there
is no need to do so if no changes were detected. This matters now
because the scheduled pull actually traverses the database which is
expensive.
This, however, makes the pull not happen on initial scan if there were
no changes during the initial scan. Compensate by always scheduling a
pull after initial scan in the rwfolder itself.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4508
LGTM: imsodin, AudriusButkevicius
This is step one of a hundred fifty on the path to case insensitivity.
It brings in the basic case folding mechanism and adds it to the
mtimefs, as this is something outside the fileset that touches stuff in
the database based on name. No effort to convert or handle existing
entries when the insensitivity is changed, I don't think we need it...
Useless by itself but includes tests and will reduce the review load
along the way.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4521
This makes it OK to not have any listeners working. Specifically,
- We don't complain about an empty listener address
- We don't complain about not having anything to announce to global
discovery servers
- We don't send local discovery packets when there is nothing to
announce.
The last point also fixes a thing where the list of addresses for local
discovery was set at startup time and never refreshed.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4517
Well Tested(TM)
Introduces a potential issue where we always pick some connectable but dodgy connection that breaks
soon after the TLS handshake.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4489
Diff is large due to comment reformatting and indentation but all it
does is wrap the file mtime/size/permissions check in an "if
stat.IsRegular()".
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4507
This removes a significant, complex chunk of database code. The
"replace" operation walked both the old and new in lockstep and made the
relevant changes to make the new situation correct. But since delta
indexes we pretty much never need this - we just used replace to drop
the existing data and start over.
This makes that explicit and removes the complexity.
(This is one of those things that would be annoying to make case
insensitive, while the actual "drop and then insert" that we do is
easier.)
This is fairly well unit tested...
The one change to the tests is to cover the fact that previously replace
with something identical didn't bump the sequence number, while
obviously removing everything and re-inserting does. This is not
behavior we depend on anywhere.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4500
LGTM: imsodin, AudriusButkevicius
With VPNs and stuff we can get a single failure on an interface that
supposedly supports broadcasts without it being fatal.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4415
The folder marker conversion forgot to hide the .stfolder. This adds
that, for those who have not yet been converted.
Also adds Hide() calls to the folder start, to mend historical
unhidedness. (I'm sure this will upset someone who is manually managing
their .stignores in the other direction...)
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4384
When STHASHING is set, don't benchmark as it's already decided. If weak
hashing isn't set to "auto", don't benchmark that either.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4349
These functions were very naive and slow. We haven't done much about
them because they pretty much don't matter at all for Syncthing
performance. They are however called very often in the discovery server
and these optimizations have a huge effect on the CPU load on the
public discovery servers.
The code isn't exactly obvious, but we have good test coverage on all
these functions.
benchmark                old ns/op     new ns/op     delta
BenchmarkLuhnify-8       12458         1045          -91.61%
BenchmarkUnluhnify-8     12598         1074          -91.47%
BenchmarkChunkify-8      10792         104           -99.04%

benchmark                old allocs     new allocs     delta
BenchmarkLuhnify-8       18             1              -94.44%
BenchmarkUnluhnify-8     18             1              -94.44%
BenchmarkChunkify-8      44             2              -95.45%

benchmark                old bytes     new bytes     delta
BenchmarkLuhnify-8       1278          64            -94.99%
BenchmarkUnluhnify-8     1278          64            -94.99%
BenchmarkChunkify-8      42552         128           -99.70%
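To illustrate the kind of optimization involved, here is a sketch of a chunking function that writes into one preallocated buffer instead of building intermediate strings. It assumes chunkify groups an ID into dash-separated chunks of seven characters. Both the chunk size and the code are assumptions for illustration, not the committed implementation.

```go
package main

import "fmt"

// chunkSize is an assumption for this sketch.
const chunkSize = 7

// chunkify splits s into groups of chunkSize bytes joined by dashes,
// writing into a single buffer sized up front.
func chunkify(s string) string {
	if len(s) == 0 {
		return ""
	}
	chunks := (len(s) + chunkSize - 1) / chunkSize
	out := make([]byte, 0, len(s)+chunks-1)
	for i := 0; i < len(s); i += chunkSize {
		if i > 0 {
			out = append(out, '-')
		}
		end := i + chunkSize
		if end > len(s) {
			end = len(s)
		}
		out = append(out, s[i:end]...)
	}
	return string(out)
}

func main() {
	fmt.Println(chunkify("ABCDEFGHIJKLMNOPQRSTU"))
}
```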
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4346
Currently all errors during pulling and the first of these errors again on
finishing are logged to info. Besides that the errors logged when finishing
are stored in f.errors. This PR moves all logging during pulling to the debug
channel (they might still be relevant in some obscure debugging case) and
uses the stored errors to log the main error per fail when all pulling
iterations are done and failed.
Additionally, instead of trying 11 times it now only tries 3 times.
This is the first part of what is discussed here:
https://forum.syncthing.net/t/reduce-verboseness-of-puller/10261
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4338
This updates kcp and uses our own fork which:
1. Keys sessions not just by remote address, but by remote address +
   conversation id
2. Allows not to close connections that were passed directly to the
   library.
3. Resets cache key if the session gets terminated.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4339
LGTM: calmh
Prior to this, the following is possible:
- Create a symlink "foo -> /somewhere", it gets synced
- Delete "foo", it gets versioned
- Create "foo/bar", it gets synced
- Delete "foo/bar", it gets versioned in "/somewhere/bar"
With this change, versioners should never version symlinks.
Otherwise all the lines from includes will be shown in the web UI instead of
just the #include ... line. This problem was introduced in #3996.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4248
LGTM: calmh
This removes the special handling of minor versions as major when the
actual major is zero, and adds the special case that upgrades from 0.x
to 1.x are considered minor. 0.x to 2.x or 1.x to 2.x etc are still
considered major.
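The rule can be sketched as a small predicate. The function names and the simplified version parsing below are assumptions for illustration; the real comparison code handles more than the major number.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major version number from strings like "v0.14.43"
// or "1.2.0".
func majorOf(version string) (int, error) {
	version = strings.TrimPrefix(version, "v")
	return strconv.Atoi(strings.SplitN(version, ".", 2)[0])
}

// isMajorUpgrade reports whether going from one version to another counts
// as a major upgrade under the rule described above.
func isMajorUpgrade(from, to string) (bool, error) {
	f, err := majorOf(from)
	if err != nil {
		return false, err
	}
	t, err := majorOf(to)
	if err != nil {
		return false, err
	}
	switch {
	case t <= f:
		return false, nil // same major (or a downgrade): not a major upgrade
	case f == 0 && t == 1:
		return false, nil // special case: 0.x to 1.x is considered minor
	default:
		return true, nil
	}
}

func main() {
	for _, pair := range [][2]string{
		{"v0.14.43", "v0.15.0"},
		{"v0.14.43", "v1.0.0"},
		{"v0.14.43", "v2.0.0"},
		{"v1.5.0", "v2.0.0"},
	} {
		major, err := isMajorUpgrade(pair[0], pair[1])
		fmt.Println(pair[0], "->", pair[1], major, err)
	}
}
```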
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4226
This solves the erratic test failures on model.TestIgnores by ensuring
that the ignore patterns are reloaded even in the face of unchanged
timestamps.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4208
The folder already knew how to stop properly, but the fs.Walk() didn't
and can potentially take a very long time. This adds context support to
Walk and the underlying scanning stuff, and passes in an appropriate
context from above. The stop channel in model.folder is replaced with a
context for this purpose.
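The general shape of a context-aware walk is sketched below using the standard library's filepath.WalkDir. The real change touches Syncthing's own filesystem abstraction, so the function names here are made up and this only illustrates the idea.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io/fs"
	"path/filepath"
)

// walkWithContext walks root and calls fn for every entry, but aborts with
// ctx.Err() as soon as the context has been cancelled.
func walkWithContext(ctx context.Context, root string, fn func(path string) error) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if ctxErr := ctx.Err(); ctxErr != nil {
			return ctxErr // stop walking; the folder is being stopped
		}
		if err != nil {
			return err
		}
		return fn(path)
	})
}

// walkCancelled walks root with an already cancelled context and reports
// how many entries were visited and whether the walk ended because of the
// cancellation.
func walkCancelled(root string) (visited int, cancelled bool) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // stop before the walk even starts

	err := walkWithContext(ctx, root, func(string) error {
		visited++
		return nil
	})
	return visited, errors.Is(err, context.Canceled)
}

func main() {
	visited, cancelled := walkCancelled(".")
	fmt.Println(visited, cancelled)
}
```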
To test I added an infiniteFS that represents a large amount of data
(not actually infinite, but close) and verify that walking it is
properly stopped. For that to be implemented smoothly I moved out the
Walk function to its own type, as typically the implementer of a new
filesystem type might not need or want to reimplement Walk.
It's somewhat tricky to test that this actually works properly on the
actual sendReceiveFolder and so on, as those are started from inside the
model and the filesystem isn't easily pluggable etc. Instead I've tested
that part manually by adding a huge folder and verifying that pause,
resume and reconfig do the right things by looking at debug output.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4117
So, when first implementing the database layer I added panics on every
unexpected error condition mostly to be sure to flush out bugs and
inconsistencies. Then it became sort of standard, and we don't seem to
have many bugs here any more so the panics are usually caused by things
like checksum errors on read. But it's not an optimal user experience to
crash all the time.
Here I've weeded out most of the panics, while retaining a few "can't
happen" ones like errors on marshalling and write that we really can't
recover from.
For the rest, I'm mostly treating any read error as "entry didn't
exist". This should mean we'll rescan the file and correct the info (if
scanning) or treat it as a new file and do conflict handling (when
pulling). In some cases things like our global stats may be slightly
incorrect until a restart, if a database entry goes suddenly missing
during runtime.
All in all, I think this makes us a bit more robust and friendly without
introducing too many risks for the user. If the database is truly toast,
probably many other things on the system will be toast as well...
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4118
Harmonize how we use batches in the model, using ProtoSize() to judge
the actual weight of the entire batch instead of estimating. Use smaller
batches in the block map - I think we might have thought that batch.Len()
in the leveldb was the batch size in bytes, but it's actually number of
operations.
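A sketch of size-aware batching along these lines follows. The batch type, the limits and the record type are all made up for illustration; only the idea of summing ProtoSize() over the batch comes from the description above.

```go
package main

import "fmt"

// sizer is satisfied by generated protobuf types that expose ProtoSize().
type sizer interface{ ProtoSize() int }

// record is a hypothetical message whose encoded size is its length.
type record []byte

func (r record) ProtoSize() int { return len(r) }

// Hypothetical limits for this sketch.
const (
	maxBatchBytes = 1024
	maxBatchItems = 4
)

// batch accumulates items and flushes when either the item count or the
// summed ProtoSize reaches its limit.
type batch struct {
	items []sizer
	bytes int
	flush func([]sizer)
}

func (b *batch) add(item sizer) {
	b.items = append(b.items, item)
	b.bytes += item.ProtoSize()
	if len(b.items) >= maxBatchItems || b.bytes >= maxBatchBytes {
		b.done()
	}
}

// done flushes whatever is left in the batch.
func (b *batch) done() {
	if len(b.items) == 0 {
		return
	}
	b.flush(b.items)
	b.items = nil
	b.bytes = 0
}

func main() {
	b := &batch{flush: func(items []sizer) { fmt.Println("flushing", len(items), "items") }}
	b.add(record(make([]byte, 2000))) // one large item exceeds the byte limit
	for i := 0; i < 5; i++ {
		b.add(record(make([]byte, 10))) // small items hit the count limit
	}
	b.done()
}
```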
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4114
The mechanism to disallow manual scans before the initial scan completed
(#3996) had the side effect that if the initial scan failed, no further
scans were allowed. So this marks the initial scan as finished regardless of
whether it succeeded or not.
There was also redundant code in rofolder and a pointless check for folder
health in scanSubsIfHealthy (happens in internalScanFolderSubdirs as well).
This also moves logging from folder.go to ro/rw-folder.go to include the
information about whether it is send-only or send-receive.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4104
This adds a parameter "events" to the /rest/events endpoint. It should
be a comma separated list of the events the consumer is interested in.
When not given it defaults to the current set of events, so it's
backwards compatible.
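Parsing such a parameter could look roughly like this. The requestedEvents function, the placeholder default set and the event names in the example are illustrative assumptions, not the actual handler code.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// defaultEvents is a placeholder for "the current set of events".
var defaultEvents = []string{"DefaultEventA", "DefaultEventB"}

// requestedEvents extracts the comma separated "events" parameter from a
// raw query string, falling back to the default set when it is absent.
func requestedEvents(rawQuery string) ([]string, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return nil, err
	}
	var events []string
	for _, name := range strings.Split(q.Get("events"), ",") {
		if name = strings.TrimSpace(name); name != "" {
			events = append(events, name)
		}
	}
	if len(events) == 0 {
		return defaultEvents, nil
	}
	return events, nil
}

func main() {
	events, err := requestedEvents("since=10&events=ItemFinished,FolderErrors")
	fmt.Println(events, err)
}
```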
The API service then manages subscriptions, creating them as required
for each requested event mask. Old subscriptions are not "garbage
collected" - it's assumed that in normal usage the set of event
subscriptions will be small enough. Possibly lower than before, as we
will not set up the disk event subscription unless it's actually used.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4092
This deprecates the current minDiskFreePct setting and introduces
minDiskFree. The latter is, in its serialized form, a string with a
unit. We accept percentages ("2.35%") and absolute values ("250 k", "12.5
Gi"). Common suffixes are understood. The config editor lets the user
enter the string, and validates it.
We still default to "1 %", but the user can change that to an absolute
value at will.
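A minimal parser for strings of this shape might look like the following. The size type, the parseSize name and the exact suffix table (decimal k/M/G/T, binary Ki/Mi/Gi/Ti) are assumptions for this sketch and may not match what the config code accepts.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// size is a parsed value: either a percentage or an absolute byte count.
type size struct {
	value   float64
	percent bool
}

// multipliers maps unit suffixes to byte multipliers. Assumption: plain
// suffixes are decimal and the "i" variants are binary.
var multipliers = map[string]float64{
	"":   1,
	"k":  1000,
	"M":  1000 * 1000,
	"G":  1000 * 1000 * 1000,
	"T":  1000 * 1000 * 1000 * 1000,
	"Ki": 1024,
	"Mi": 1024 * 1024,
	"Gi": 1024 * 1024 * 1024,
	"Ti": 1024 * 1024 * 1024 * 1024,
}

// parseSize parses strings such as "2.35%", "1 %", "250 k" or "12.5 Gi".
func parseSize(s string) (size, error) {
	s = strings.TrimSpace(s)
	if strings.HasSuffix(s, "%") {
		v, err := strconv.ParseFloat(strings.TrimSpace(strings.TrimSuffix(s, "%")), 64)
		if err != nil {
			return size{}, err
		}
		return size{value: v, percent: true}, nil
	}

	// The number is everything up to and including the last digit or dot;
	// the rest is the unit.
	i := strings.LastIndexAny(s, "0123456789.") + 1
	number, unit := s[:i], strings.TrimSpace(s[i:])
	mult, ok := multipliers[unit]
	if !ok {
		return size{}, fmt.Errorf("unknown unit %q", unit)
	}
	v, err := strconv.ParseFloat(number, 64)
	if err != nil {
		return size{}, err
	}
	return size{value: v * mult}, nil
}

func main() {
	for _, s := range []string{"1 %", "2.35%", "250 k", "12.5 Gi", "12 parsecs"} {
		v, err := parseSize(s)
		fmt.Printf("%q -> %+v, %v\n", s, v, err)
	}
}
```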
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4087
LGTM: AudriusButkevicius, imsodin