Lorenz Bausch
9fb81c4246
Validate exclude patterns
2022-05-07 21:12:47 +02:00
Lorenz Bausch
e7fd200237
Keep original pattern for later use
2022-05-07 21:08:09 +02:00
Michael Eischer
cd190bee14
filter: short circuit if no negative patterns
2022-03-20 13:33:08 +01:00
Vincent Bernat
2ee07ded2b
filter: ability to use negative patterns
...
This is quite similar to gitignore. If a pattern is suffixed by an
exclamation mark and match a file that was previously matched by a
regular pattern, the match is cancelled. Notably, this can be used
with `--exclude-file` to cancel the exclusion of some files.
Like for gitignore, once a directory is excluded, it is not possible
to include files inside the directory. For example, a user wanting to
only keep `*.c` in some directory should not use:
~/work
!~/work/*.c
But:
~/work/*
!~/work/*.c
I didn't write documentation or changelog entry. I would like to get
feedback if this is the right approach for excluding/including files
at will for backups. I use something like this as an exclude file to
backup my home:
$HOME/**/*
!$HOME/Documents
!$HOME/code
!$HOME/.emacs.d
!$HOME/games
# [...]
node_modules
*~
*.o
*.lo
*.pyc
# [...]
$HOME/code/linux/*
!$HOME/code/linux/.git
# [...]
There are some limitations for this change:
- Patterns are not mixed accross methods: patterns from file are
handled first and if a file is excluded with this method, it's not
possible to reinclude it with `--exclude !something`.
- Patterns starting with `!` are now interpreted as a negative
pattern. I don't think anyone was relying on that.
- The whole list of patterns is walked for each match. We may
optimize later by exiting early if we know no pattern is starting
with `!`.
Fix #233
2022-03-20 13:33:08 +01:00
Michael Eischer
12606b575f
filter: Cleanup variable naming
2022-03-20 13:33:08 +01:00
Michael Eischer
5f145f0c7e
filter: introduce pattern struct
2022-03-20 13:33:08 +01:00
Michael Eischer
55bea6e7a6
filter: Fix crash for '**' pattern
2021-05-14 23:50:31 +02:00
Michael Eischer
88c8e903d2
filter: Fix glob matching on absolute path marker on windows
...
A pattern part containing "/" is used to mark a path or a pattern as
absolute. However, on Windows the path separator is "\" such that glob
patterns like "?" could match the marker. The code now explicitly skips
the marker when the pattern does not represent an absolute path.
2020-10-09 16:11:05 +02:00
greatroar
740758a5fa
Optimize filter pattern matching
...
By replacing "**" with "", checking for this special path component can
be reduced to a length-zero check.
name old time/op new time/op delta
FilterLines-8 44.7ms ± 5% 44.9ms ± 5% ~ (p=0.631 n=10+10)
FilterPatterns/Relative-8 13.6ms ± 4% 13.4ms ± 5% ~ (p=0.165 n=10+10)
FilterPatterns/Absolute-8 10.9ms ± 5% 10.7ms ± 4% ~ (p=0.052 n=10+10)
FilterPatterns/Wildcard-8 53.7ms ± 5% 50.4ms ± 5% -6.00% (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-8 128ms ± 2% 95ms ± 1% -25.54% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
FilterPatterns/Relative-8 3.57MB ± 0% 3.57MB ± 0% ~ (p=1.000 n=9+8)
FilterPatterns/Absolute-8 3.57MB ± 0% 3.57MB ± 0% ~ (p=0.903 n=9+8)
FilterPatterns/Wildcard-8 19.7MB ± 0% 19.7MB ± 0% -0.00% (p=0.022 n=10+9)
FilterPatterns/ManyNoMatch-8 3.57MB ± 0% 3.57MB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
FilterPatterns/Relative-8 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Absolute-8 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Wildcard-8 88.7k ± 0% 88.7k ± 0% ~ (all equal)
FilterPatterns/ManyNoMatch-8 22.2k ± 0% 22.2k ± 0% ~ (all equal)
2020-10-09 15:51:14 +02:00
Michael Eischer
8388e67c4b
filter: cleanup path separator conversion
2020-10-07 21:14:07 +02:00
Michael Eischer
0acc3c5923
filter: special case patterns without globbing characters
...
In case a part of a path is a simple string, we can just check for
equality without complex parsing in filepath.Match.
name old time/op new time/op delta
FilterLines-4 34.8ms ±17% 41.2ms ±23% +18.36% (p=0.000 n=10+10)
FilterPatterns/Relative-4 21.7ms ± 6% 12.1ms ±23% -44.46% (p=0.000 n=10+10)
FilterPatterns/Absolute-4 10.0ms ± 5% 9.1ms ±11% -9.80% (p=0.006 n=10+9)
FilterPatterns/Wildcard-4 47.0ms ± 7% 42.2ms ± 5% -10.19% (p=0.000 n=9+10)
FilterPatterns/ManyNoMatch-4 190ms ± 1% 131ms ±20% -31.47% (p=0.000 n=8+10)
name old alloc/op new alloc/op delta
FilterPatterns/Relative-4 3.57MB ± 0% 3.57MB ± 0% ~ (p=0.870 n=9+9)
FilterPatterns/Absolute-4 3.57MB ± 0% 3.57MB ± 0% ~ (p=0.145 n=10+10)
FilterPatterns/Wildcard-4 14.3MB ± 0% 19.7MB ± 0% +37.91% (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4 3.57MB ± 0% 3.57MB ± 0% ~ (p=0.421 n=10+9)
name old allocs/op new allocs/op delta
FilterPatterns/Relative-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Absolute-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Wildcard-4 88.7k ± 0% 88.7k ± 0% ~ (all equal)
FilterPatterns/ManyNoMatch-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
2020-10-07 20:55:43 +02:00
Michael Eischer
bcc3bddcf4
filter: only check whether a child path could match when necessary
...
When checking excludes there is no need to test whether a child path
could also match the pattern, as it is by definition excluded.
Previously childMayMatch was calculated but then discarded. For simple
absolute paths this can account for half the time spent for checking
pattern matches.
name old time/op new time/op delta
FilterPatterns/Relative-4 23.3ms ± 9% 21.7ms ± 6% -6.68% (p=0.004 n=10+10)
FilterPatterns/Absolute-4 13.9ms ± 7% 10.0ms ± 5% -27.61% (p=0.000 n=10+10)
FilterPatterns/Wildcard-4 51.4ms ± 7% 47.0ms ± 7% -8.51% (p=0.001 n=9+9)
FilterPatterns/ManyNoMatch-4 551ms ± 9% 190ms ± 1% -65.41% (p=0.000 n=10+8)
name old alloc/op new alloc/op delta
FilterPatterns/Relative-4 3.57MB ± 0% 3.57MB ± 0% ~ (p=0.665 n=10+9)
FilterPatterns/Absolute-4 3.57MB ± 0% 3.57MB ± 0% ~ (p=0.480 n=9+10)
FilterPatterns/Wildcard-4 14.3MB ± 0% 14.3MB ± 0% ~ (p=0.431 n=9+10)
FilterPatterns/ManyNoMatch-4 3.57MB ± 0% 3.57MB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
FilterPatterns/Relative-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Absolute-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Wildcard-4 88.7k ± 0% 88.7k ± 0% ~ (all equal)
FilterPatterns/ManyNoMatch-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
2020-10-07 20:47:52 +02:00
Michael Eischer
17c53efb0d
filter: Optimize double wildcard expansion
...
This only allocates a single slice to expand the double wildcard and
only copies the pattern prefix once.
name old time/op new time/op delta
FilterPatterns/Relative-4 22.7ms ± 5% 23.3ms ± 9% ~ (p=0.353 n=10+10)
FilterPatterns/Absolute-4 14.2ms ±13% 13.9ms ± 7% ~ (p=0.853 n=10+10)
FilterPatterns/Wildcard-4 266ms ±16% 51ms ± 7% -80.67% (p=0.000 n=10+9)
FilterPatterns/ManyNoMatch-4 554ms ± 6% 551ms ± 9% ~ (p=0.436 n=10+10)
name old alloc/op new alloc/op delta
FilterPatterns/Relative-4 3.57MB ± 0% 3.57MB ± 0% ~ (p=0.349 n=10+10)
FilterPatterns/Absolute-4 3.57MB ± 0% 3.57MB ± 0% ~ (p=0.073 n=10+9)
FilterPatterns/Wildcard-4 141MB ± 0% 14MB ± 0% -89.89% (p=0.000 n=10+9)
FilterPatterns/ManyNoMatch-4 3.57MB ± 0% 3.57MB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
FilterPatterns/Relative-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Absolute-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Wildcard-4 1.63M ± 0% 0.09M ± 0% -94.56% (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
2020-10-07 20:47:48 +02:00
Michael Eischer
7959796269
filter: Special case for absolute paths
...
name old time/op new time/op delta
FilterPatterns/Relative-4 23.6ms ±20% 22.7ms ± 5% ~ (p=0.684 n=10+10)
FilterPatterns/Absolute-4 32.3ms ± 8% 14.2ms ±13% -56.01% (p=0.000 n=10+10)
FilterPatterns/Wildcard-4 334ms ±17% 266ms ±16% -20.56% (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4 709ms ± 7% 554ms ± 6% -21.89% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
FilterPatterns/Relative-4 3.57MB ± 0% 3.57MB ± 0% +0.00% (p=0.046 n=9+10)
FilterPatterns/Absolute-4 3.57MB ± 0% 3.57MB ± 0% ~ (p=0.464 n=10+10)
FilterPatterns/Wildcard-4 141MB ± 0% 141MB ± 0% ~ (p=0.163 n=9+10)
FilterPatterns/ManyNoMatch-4 3.57MB ± 0% 3.57MB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
FilterPatterns/Relative-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Absolute-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
FilterPatterns/Wildcard-4 1.63M ± 0% 1.63M ± 0% ~ (p=0.072 n=10+10)
FilterPatterns/ManyNoMatch-4 22.2k ± 0% 22.2k ± 0% ~ (all equal)
2020-10-07 20:47:29 +02:00
Michael Eischer
375c2a56de
filter: Parse filter patterns only once
...
name old time/op new time/op delta
FilterPatterns/Relative-4 30.3ms ±10% 23.6ms ±20% -22.12% (p=0.000 n=10+10)
FilterPatterns/Absolute-4 49.0ms ± 3% 32.3ms ± 8% -33.94% (p=0.000 n=8+10)
FilterPatterns/Wildcard-4 345ms ± 9% 334ms ±17% ~ (p=0.315 n=10+10)
FilterPatterns/ManyNoMatch-4 3.93s ± 2% 0.71s ± 7% -81.98% (p=0.000 n=9+10)
name old alloc/op new alloc/op delta
FilterPatterns/Relative-4 4.63MB ± 0% 3.57MB ± 0% -22.98% (p=0.000 n=9+9)
FilterPatterns/Absolute-4 8.54MB ± 0% 3.57MB ± 0% -58.20% (p=0.000 n=10+10)
FilterPatterns/Wildcard-4 146MB ± 0% 141MB ± 0% -2.93% (p=0.000 n=9+9)
FilterPatterns/ManyNoMatch-4 907MB ± 0% 4MB ± 0% -99.61% (p=0.000 n=9+9)
name old allocs/op new allocs/op delta
FilterPatterns/Relative-4 66.6k ± 0% 22.2k ± 0% -66.67% (p=0.000 n=10+10)
FilterPatterns/Absolute-4 88.7k ± 0% 22.2k ± 0% -75.00% (p=0.000 n=10+10)
FilterPatterns/Wildcard-4 1.70M ± 0% 1.63M ± 0% -3.92% (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4 4.46M ± 0% 0.02M ± 0% -99.50% (p=0.000 n=10+10)
2020-10-07 20:47:27 +02:00
Michael Eischer
b8eacd1364
filter: Reduce redundant path and pattern splitting
...
A single call to filter.List will split the path only once and also
split each search pattern only once and use it for both match and
childMatch.
name old time/op new time/op delta
FilterPatterns/Relative-4 62.1ms ±15% 30.3ms ±10% -51.22% (p=0.000 n=9+10)
FilterPatterns/Absolute-4 111ms ±10% 49ms ± 3% -56.08% (p=0.000 n=10+8)
FilterPatterns/Wildcard-4 393ms ±15% 345ms ± 9% -12.30% (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4 10.0s ± 3% 3.9s ± 2% -60.53% (p=0.000 n=10+9)
name old alloc/op new alloc/op delta
FilterPatterns/Relative-4 16.4MB ± 0% 4.6MB ± 0% -71.76% (p=0.000 n=10+9)
FilterPatterns/Absolute-4 31.4MB ± 0% 8.5MB ± 0% -72.77% (p=0.000 n=9+10)
FilterPatterns/Wildcard-4 168MB ± 0% 146MB ± 0% -13.19% (p=0.000 n=10+9)
FilterPatterns/ManyNoMatch-4 3.23GB ± 0% 0.91GB ± 0% -71.96% (p=0.000 n=10+9)
name old allocs/op new allocs/op delta
FilterPatterns/Relative-4 178k ± 0% 67k ± 0% -62.50% (p=0.000 n=10+10)
FilterPatterns/Absolute-4 266k ± 0% 89k ± 0% -66.67% (p=0.000 n=10+10)
FilterPatterns/Wildcard-4 1.87M ± 0% 1.70M ± 0% -9.47% (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4 17.7M ± 0% 4.5M ± 0% -74.87% (p=0.000 n=9+10)
2020-10-07 18:13:19 +02:00
Alexander Neumann
14aead94b3
filter: Allow double wildcard in ChildMatch
2018-06-09 23:18:13 +02:00
Michael Pratt
92eb1cbffd
filter: document recursive wildcards
...
Match/ChildMatch accept a ** pattern which is not noted in the doc
string, nor do any of the docs or tests specify whether the match is
greedy (i.e., can 'foo/**/bar' match paths with additional intermediate
bar directories?).
Add a note to the doc string and add test cases for greedy matches.
2017-09-04 14:38:48 -07:00
Loic Nageleisen
4a36993c19
Smarter filter when children won't match
...
This improves restore performance by several orders of magniture by not
going through the whole tree recursively when we can anticipate that no
match will ever occur.
2017-08-16 15:25:02 +02:00
Alexander Neumann
6caeff2408
Run goimports
2017-07-23 14:21:03 +02:00
Alexander Neumann
83d1a46526
Moves files
2017-07-23 14:19:13 +02:00