Use the proper encoding function in the relay server when constructing
the URL. In the pool server, parse and re-encode the query values to
sanitize whatever the client sent.
This commit replaces `os.MkdirTemp` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.
Prior to this commit, temporary directory created using `os.MkdirTemp`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
defer func() {
if err := os.RemoveAll(dir); err != nil {
t.Fatal(err)
}
}
is also tedious, but `t.TempDir` handles this for us nicely.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* lib/api: Note ItemStarted and ItemFinished for default filtering.
The reasoning why LocalChangeDetected and RemoteChangeDetected events
are not included in the event stream by default (without explicit
filter mask requested) also holds for the ItemStarted and ItemFinished
events. They should be excluded as well when we start to break the
API compatibility for some reason.
* gui: Enumerate unused event types in the eventService.
Define constants for the unused event types as well, for completeness'
sake. They are intentionally not handled in the GUI currently.
* cmd/syncthing: Harmonize uppercase CLI argument placeholders.
Use ALL-UPPERCASE and connecting dashes to distinguish argument
placeholders from literal argument options (e.g. "cpu" or "heap" for
profiling). The dash makes it clear which words form a single
argument and where a new argument starts.
This style is already used for the "syncthing cli debug file" command.
* lib/model: Simplify event data structure.
Using map[string]interface{} is not necessary when all values are
known to be strings.
The intention was that if no peers are given, we shouldn't start the
listener. We did that anyway, because:
- splitting an empty string on comma returns a slice with one empty
string in it
- parsing the empty string as a device ID returns the empty device ID
so we end up with a valid replication peer which is the empty device ID.
* cmd/syncthing: Remove unnecessary function arguments.
The openGUI() function does not need a device ID to work, and there is
only one caller anyway which uses EmptyDeviceID.
The loadOrDefaultConfig() function is always called with the same
dummy values.
* cmd/syncthing: Avoid misleading info messages from monitor process.
In order to check whether panic reporting is enabled, the monitor
process utilizes the loadOrDefaultConfig() function. In case there is
no config file yet, info messages may be logged during creation if the
config Wrapper, which is discarded immediately after.
Stop using the DefaultConfig() utility function from lib/syncthing and
directly generate a minimal config instead to avoid these.
Add comments to loadOrDefaultConfig() explaining its limited purpose.
* cmd/syncthing/generate: Always write updated config file.
Previously, an existing config file was left untouched unless either
of the --gui-user or --gui-password options was given. Remove that
condition and simplify the checking code.
* lib/config: Factor out ProbeFreePorts().
* cmd/syncthing: Add option --skip-port-probing.
Applies to both the "generate" and "serve" subcommands, as well as the
deprecated --generate option, just as the --no-default-folder flag.
* cmd/syncthing: Remove unnecessary function arguments.
The openGUI() function does not need a device ID to work, and there is
only one caller anyway which uses EmptyDeviceID.
The loadOrDefaultConfig() function is always called with the same
dummy values.
* cmd/syncthing: Avoid misleading info messages from monitor process.
In order to check whether panic reporting is enabled, the monitor
process utilizes the loadOrDefaultConfig() function. In case there is
no config file yet, info messages may be logged during creation if the
config Wrapper, which is discarded immediately after.
Stop using the DefaultConfig() utility function from lib/syncthing and
directly generate a minimal config instead to avoid these.
Add comments to loadOrDefaultConfig() explaining its limited purpose.
* cmd/syncthing/generate: Always write updated config file.
Previously, an existing config file was left untouched unless either
of the --gui-user or --gui-password options was given. Remove that
condition and simplify the checking code.
The locking protocol in nat.Mapping was racy:
* Mapping.addressMap RLock'd, but then returned a map shared between
caller and Mapping, so the lock didn't do anything.
* Operations inside Service.{verifyExistingMappings,acquireNewMappings}
would lock the map for every update, but that means callers to
Mapping.ExternalAddresses can be looping over the map while the
Service methods are concurrently modifying it. When the Go runtime
detects that happening, it panics.
* Mapping.expires was read and updated without locking.
The Service methods now lock the map once and release the lock only when
done.
Also, subscribers no longer get the added and removed addresses, because
none of them were using the information. This was changed for a previous
attempt to retain the fine-grained locking and not reverted because it
simplifies the code.
Accept a subcommand as an alternative to the --generate option. It
accepts a custom config directory through either the --home or
--config options, using the default location if neither is given.
Add the options --gui-user and --gui-password to "generate", but not
the "serve --generate" option form. If either is given, an existing
config will not abort the command, but rather load, modify and save it
with the new credentials. The password can be read from standard
input by passing only a single dash as argument.
Config modification is skipped if the value matches what's already in
the config.
* cmd/syncthing: Utilize lib/locations package in generate().
Instead of manually joining paths with "magic" file names, get them
from the centralized locations helper lib.
* cmd/syncthing: Simplify logging for --generate option.
Visible change: No more timestamp prefixes.
LoadOrGenerateCertificate() takes two file path arguments, but then
uses the locations package to determine the actual path. Fix that
with a minimally invasive change, by using the arguments instead.
Factor out GenerateCertificate().
The only caller of this function is cmd/syncthing, which passes the
same values, so this is technically a no-op.
* lib/tlsutil: Make storing generated certificate optional. Avoid
temporary cert and key files in tests, keep cert in memory.
Consistently use double dashes and fix typos -conf, -data-dir and
-verify.
Applies also to tests running the syncthing binary for consistency.
* Fix mismatched option name --conf in cli subcommand.
According to the source code comments, the cli option flags should
mirror those from the serve subcommand where applicable. That one is
actually called --config though.
* cli: Fix help text option placeholders.
The urfave/cli package uses the Value field of StringFlag to provide a
default value, not to name the placeholder. That is instead done with
backticks around some part of the Usage field.
* cli: Add missing --data flag in subcommand help text.
The urfave/cli based option parsing uses a fake flags collection to
generate help texts matching the used global options. But the --data
option was omitted from it, although it is definitely required when
using --config as well. Note that it cannot just be ignored, as some
debug stuff actually uses the DB:
syncthing cli --data=/bar --config=/foo debug index dump
* cmd/strelaypoolsrv: Fix minor grammar, use https in links
Add a few minor grammatical/stylistic fixes. Use `https` instead of
`http` in the MaxMind link in the footer.
Signed-off-by: Tomasz Wilczyński <twilczynski@naver.com>
* wip
Signed-off-by: Tomasz Wilczyński <twilczynski@naver.com>
* cmd/syncthing: Don't fail early on api setup error (fixes 7558)
* switch to factory pattern
* refactor config command to show help on nothing
* wip
* wip
* already abort in before
This is a mostly pointless change to make security scanners and static
analysis tools happy, as they all hate seeing md5. None of our md5 uses
were security relevant, but still. Only visible effect of this change is
that our temp file names for very long file names become slightly longer
than they were previously...
This adds a couple of dummy asset files protected by the "noassets"
build tag. The purpose is that it should be possible for, for example,
CI tools and static analysis things to compile and analyze the source
tree without our custom asset generation step. Also makes `go test -tags
noassets ./...` work without building assets first.
This truncates times meant for API consumption to second precision,
where fractions won't typically matter or add any value. Exception to
this is timestamps on logs and events, and of course I'm not touching
things like file metadata.
I'm not 100% certain this is an exhaustive change, but it's the things I
found by grepping and following the breadcrumbs from lib/api...
I also considered general-but-ugly solutions, like having the API
serializer itself do reflection magic or even regexps on returned
objects, but decided against it because aurgh...
When cap(permanentRelays) >= len(permanentRelays) + len(knownRelays),
append(permanentRelays, knownRelays...)
returns a slice of the array underlying permanentRelays. The subsequent
rand.Shuffle then mixes the permanent and known relays. Sequential
requests may cause strelaypoolsrv to forget its permanent relays. Worse,
concurrent requests may cause shuffling of the same slice on multiple
processors concurrently.
Co-authored-by: greatroar <@>
With this change we emulate a case sensitive filesystem on top of
insensitive filesystems. This means we correctly pick up case-only renames
and throw a case conflict error when there would be multiple files differing
only in case.
This safety check has a small performance hit (about 20% more filesystem
operations when scanning for changes). The new advanced folder option
`caseSensitiveFS` can be used to disable the safety checks, retaining the
previous behavior on systems known to be fully case sensitive.
Co-authored-by: Jakob Borg <jakob@kastelo.net>
* Fix ui, hide report date
* Undo Goland madness
* UR now web scale
* Fix migration
* Fix marshaling, force tick on start
* Fix tests
* Darwin build
* Split "all" build target, add package name as a tag
* Remove pq and sql dep from syncthing, split build targets
* Empty line
* Revert "Empty line"
This reverts commit f74af2b067dadda8a343714123512bd545a643c3.
* Revert "Remove pq and sql dep from syncthing, split build targets"
This reverts commit 8fc295ad007c5bb7886c557f492dacf51be307ad.
* Revert "Split "all" build target, add package name as a tag"
This reverts commit f4dc88995106d2b06042f30bea781a0feb08e55f.
* Normalise contract types
* Fix build add more logging
This matches the convention of the stdlib and avoids ambiguity: when
customErr{} and &customErr{} both implement error, client code needs to
check for both.
Memory use should remain the same, since storing a non-pointer type in
an interface value still copies the value to the heap.
This extracts the extra tags from any `[foo]` stuff at the end of the
version and sends them to Sentry for indexing.
If I need to modify that regexp again I'll probably write a from scratch
tokenizer and parser for our version string instead...
Group the global list of files by version, instead of having one flat list for all devices. This removes lots of duplicate protocol.Vectors.
Co-authored-by: Jakob Borg <jakob@kastelo.net>
* cmd/stindex: Unify access to key from cached variable.
Avoid calling the Key() method from the iterator each time the value
is needed. Just reuse the cache variable already assigned before the
switch block.
* cmd/stindex: Display the prefix byte value for unknown key types.
Make it easier to diagnose corrupt / unknown key type entries by
showing their decimal value, correlating with the definitions in
keyer.go.
* cmd/stindex: Add missing KeyType values in stindex dump code.
Recently added DB key prefixes KeyTypeBlockListMap and KeyTypeVersion
were unknown to the stindex dumping tool. Add basic parsing to dump
their key structure.
This adds indirection of large version vectors in the same manner as we
already to block lists. The effect is the same: less duplicated data in
some situations.
To mitigate the impact for when this indirection
wouldn't be needed I've added an indirection cutoff for both blocks and
the new version vector stuff: we don't do the indirection at all for
small block lists or small version vectors, instead storing it directly
like we used to do. This is faster for small files and small setups.
Storing assets as []byte requires every compiled-in asset to be copied
into writable memory at program startup. That currently takes up 1.6MB
per syncthing process. Strings stay in the RODATA section and should be
shared between processes running the same binary.
This changes the build script to build all the things in one go
invocation, instead of one invocation per cmd. This is a lot faster
because it means more things get compiled concurrently. It's especially
a lot faster when things *don't* need to be rebuilt, possibly because it
only needs to build the dependency map and such once instead of once per
binary.
In order for this to work we need to be able to pass the same ldflags to
all the binaries. This means we can't set the program name with an
ldflag.
When it needs to rebuild everything (go clean -cache):
( ./old-build -gocmd go1.14.2 build all 2> /dev/null; ) 65.82s user 11.28s system 574% cpu 13.409 total
( ./new-build -gocmd go1.14.2 build all 2> /dev/null; ) 63.26s user 7.12s system 1220% cpu 5.766 total
On a subsequent run (nothing to build, just link the binaries):
( ./old-build -gocmd go1.14.2 build all 2> /dev/null; ) 26.58s user 7.53s system 582% cpu 5.853 total
( ./new-build -gocmd go1.14.2 build all 2> /dev/null; ) 18.66s user 2.45s system 1090% cpu 1.935 total
Successful LRU cache lookups modify the cache's recency list, so
RWMutex.RLock isn't enough protection.
Secondarily, multiple concurrent lookups with the same key should not
create separate rate limiters, so release the lock only when presence
of the key in the cache has been ascertained.
Co-authored-by: greatroar <@>
We set the STRESTART environment when starting the inner process after
the first time, but this didn't persist when restarting the monitor
process. Now it does.
- In the few places where we wrap errors, use the new Go 1.13 "%w"
construction instead of %s or %v.
- Where we create errors with constant strings, consistently use
errors.New and not fmt.Errorf.
- Remove capitalization from errors in the few places where we had that.
I was working on indirecting version vectors, and that resulted in some
refactoring and improving the existing block indirection stuff. We may
or may not end up doing the version vector indirection, but I think
these changes are reasonable anyhow and will simplify the diff
significantly if we do go there. The main points are:
- A bunch of renaming to make the indirection and GC not about "blocks"
but about "indirection".
- Adding a cutoff so that we don't actually indirect for small block
lists. This gets us better performance when handling small files as it
cuts out the indirection for quite small loss in space efficiency.
- Being paranoid and always recalculating the hash on put. This costs
some CPU, but the consequences if a buggy or malicious implementation
silently substituted the block list by lying about the hash would be bad.
Also retain the interval over restarts by storing last GC time in the
database. This to make sure that GC eventually happens even if the
interval is configured to a long time (say, a month).
* lib/db: Deduplicate block lists in database (fixes#5898)
This moves the block list in the database out from being just a field on
the FileInfo to being an object of its own. When putting a FileInfo we
marshal the block list separately and store it keyed by the sha256 of
the marshalled block list. When getting, if we are not doing a
"truncated" get, we do an extra read and unmarshal for the block list.
Old block lists are cleared out by a periodic GC sweep. The alternative
would be to use refcounting, but:
- There is a larger risk of getting that wrong and either dropping a
block list in error or keeping them around forever.
- It's tricky with our current database, as we don't have dirty reads.
This means that if we update two FileInfos with identical block lists in
the same transaction we can't just do read/modify/write for the ref
counters as we wouldn't see our own first update. See above about
tracking this and risks about getting it wrong.
GC uses a bloom filter for keys to avoid heavy RAM usage. GC can't run
concurrently with FileInfo updates so there is a new lock around those
operation at the lowlevel.
The end result is a much more compact database, especially for setups
with many peers where files get duplicated many times.
This is per-key-class stats for a large database I'm currently working
with, under the current schema:
```
0x00: 9138161 items, 870876 KB keys + 7397482 KB data, 95 B + 809 B avg, 1637651 B max
0x01: 185656 items, 10388 KB keys + 1790909 KB data, 55 B + 9646 B avg, 924525 B max
0x02: 916890 items, 84795 KB keys + 3667 KB data, 92 B + 4 B avg, 192 B max
0x03: 384 items, 27 KB keys + 5 KB data, 72 B + 15 B avg, 87 B max
0x04: 1109 items, 17 KB keys + 17 KB data, 15 B + 15 B avg, 69 B max
0x06: 383 items, 3 KB keys + 0 KB data, 9 B + 2 B avg, 18 B max
0x07: 510 items, 4 KB keys + 12 KB data, 9 B + 24 B avg, 41 B max
0x08: 1349 items, 12 KB keys + 10 KB data, 9 B + 8 B avg, 17 B max
0x09: 194 items, 0 KB keys + 123 KB data, 5 B + 634 B avg, 11484 B max
0x0a: 3 items, 0 KB keys + 0 KB data, 14 B + 7 B avg, 30 B max
0x0b: 181836 items, 2363 KB keys + 10694 KB data, 13 B + 58 B avg, 173 B max
Total 10426475 items, 968490 KB keys + 9202925 KB data.
```
Note 7.4 GB of data in class 00, total size 9.2 GB. After running the
migration we get this instead:
```
0x00: 9138161 items, 870876 KB keys + 2611392 KB data, 95 B + 285 B avg, 4788 B max
0x01: 185656 items, 10388 KB keys + 1790909 KB data, 55 B + 9646 B avg, 924525 B max
0x02: 916890 items, 84795 KB keys + 3667 KB data, 92 B + 4 B avg, 192 B max
0x03: 384 items, 27 KB keys + 5 KB data, 72 B + 15 B avg, 87 B max
0x04: 1109 items, 17 KB keys + 17 KB data, 15 B + 15 B avg, 69 B max
0x06: 383 items, 3 KB keys + 0 KB data, 9 B + 2 B avg, 18 B max
0x07: 510 items, 4 KB keys + 12 KB data, 9 B + 24 B avg, 41 B max
0x09: 194 items, 0 KB keys + 123 KB data, 5 B + 634 B avg, 11484 B max
0x0a: 3 items, 0 KB keys + 0 KB data, 14 B + 17 B avg, 51 B max
0x0b: 181836 items, 2363 KB keys + 10694 KB data, 13 B + 58 B avg, 173 B max
0x0d: 44282 items, 1461 KB keys + 61081 KB data, 33 B + 1379 B avg, 1637399 B max
Total 10469408 items, 969939 KB keys + 4477905 KB data.
```
Class 00 is now down to 2.6 GB, with just 61 MB added in class 0d.
There will be some additional reads in some cases which theoretically
hurts performance, but this will be more than compensated for by smaller
writes and better compaction.
On my own home setup which just has three devices and a handful of
folders the difference is smaller in absolute numbers of course, but
still less than half the old size:
```
0x00: 297122 items, 20894 KB keys + 306860 KB data, 70 B + 1032 B avg, 103237 B max
0x01: 115299 items, 7738 KB keys + 17542 KB data, 67 B + 152 B avg, 419 B max
0x02: 1430537 items, 121223 KB keys + 5722 KB data, 84 B + 4 B avg, 253 B max
...
Total 1947412 items, 151268 KB keys + 337485 KB data.
```
to:
```
0x00: 297122 items, 20894 KB keys + 37038 KB data, 70 B + 124 B avg, 520 B max
0x01: 115299 items, 7738 KB keys + 17542 KB data, 67 B + 152 B avg, 419 B max
0x02: 1430537 items, 121223 KB keys + 5722 KB data, 84 B + 4 B avg, 253 B max
...
0x0d: 18041 items, 595 KB keys + 71964 KB data, 33 B + 3988 B avg, 101109 B max
Total 1965447 items, 151863 KB keys + 139628 KB data.
```
* wip
* wip
* wip
* wip
This PR does two things, because one lead to the other:
- Move the leveldb specific stuff into a small "backend" package that
defines a backend interface and the leveldb implementation. This allows,
potentially, in the future, switching the db implementation so another
KV store should we wish to do so.
- Add proper error handling all along the way. The db and backend
packages are now errcheck clean. However, I drew the line at modifying
the FileSet API in order to keep this manageable and not continue
refactoring all of the rest of Syncthing. As such, the FileSet methods
still panic on database errors, except for the "database is closed"
error which is instead handled by silently returning as quickly as
possible, with the assumption that we're anyway "on the way out".