Commit Graph

28 Commits

Author SHA1 Message Date
Michael Eischer a1eb923876 remove no longer necessary conditional compiles 2022-11-27 13:18:44 +01:00
Michael Eischer 8e0ca80547 filter: deduplicate error handling for pattern validation 2022-09-09 23:12:41 +02:00
Lorenz Bausch 36bd464e8c
Add tests for validating exclude patterns 2022-05-11 22:41:00 +02:00
Lorenz Bausch 9fb81c4246
Validate exclude patterns 2022-05-07 21:12:47 +02:00
Lorenz Bausch e7fd200237
Keep original pattern for later use 2022-05-07 21:08:09 +02:00
Michael Eischer cd190bee14 filter: short circuit if no negative patterns 2022-03-20 13:33:08 +01:00
Vincent Bernat 2ee07ded2b filter: ability to use negative patterns
This is quite similar to gitignore. If a pattern is suffixed by an
exclamation mark and match a file that was previously matched by a
regular pattern, the match is cancelled. Notably, this can be used
with `--exclude-file` to cancel the exclusion of some files.

Like for gitignore, once a directory is excluded, it is not possible
to include files inside the directory. For example, a user wanting to
only keep `*.c` in some directory should not use:

    ~/work
    !~/work/*.c

But:

    ~/work/*
    !~/work/*.c

I didn't write documentation or changelog entry. I would like to get
feedback if this is the right approach for excluding/including files
at will for backups. I use something like this as an exclude file to
backup my home:

    $HOME/**/*
    !$HOME/Documents
    !$HOME/code
    !$HOME/.emacs.d
    !$HOME/games
    # [...]
    node_modules
    *~
    *.o
    *.lo
    *.pyc
    # [...]
    $HOME/code/linux/*
    !$HOME/code/linux/.git
    # [...]

There are some limitations for this change:

 - Patterns are not mixed accross methods: patterns from file are
   handled first and if a file is excluded with this method, it's not
   possible to reinclude it with `--exclude !something`.

 - Patterns starting with `!` are now interpreted as a negative
   pattern. I don't think anyone was relying on that.

 - The whole list of patterns is walked for each match. We may
   optimize later by exiting early if we know no pattern is starting
   with `!`.

Fix #233
2022-03-20 13:33:08 +01:00
Michael Eischer 12606b575f filter: Cleanup variable naming 2022-03-20 13:33:08 +01:00
Michael Eischer 5f145f0c7e filter: introduce pattern struct 2022-03-20 13:33:08 +01:00
Vincent Bernat 13c40d4199 filter: additional tests for filter.List() 2022-03-20 13:33:08 +01:00
Michael Eischer 55bea6e7a6 filter: Fix crash for '**' pattern 2021-05-14 23:50:31 +02:00
Michael Eischer 88c8e903d2 filter: Fix glob matching on absolute path marker on windows
A pattern part containing "/" is used to mark a path or a pattern as
absolute. However, on Windows the path separator is "\" such that glob
patterns like "?" could match the marker. The code now explicitly skips
the marker when the pattern does not represent an absolute path.
2020-10-09 16:11:05 +02:00
greatroar 740758a5fa Optimize filter pattern matching
By replacing "**" with "", checking for this special path component can
be reduced to a length-zero check.

name                          old time/op    new time/op    delta
FilterLines-8                   44.7ms ± 5%    44.9ms ± 5%     ~     (p=0.631 n=10+10)
FilterPatterns/Relative-8       13.6ms ± 4%    13.4ms ± 5%     ~     (p=0.165 n=10+10)
FilterPatterns/Absolute-8       10.9ms ± 5%    10.7ms ± 4%     ~     (p=0.052 n=10+10)
FilterPatterns/Wildcard-8       53.7ms ± 5%    50.4ms ± 5%   -6.00%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-8     128ms ± 2%      95ms ± 1%  -25.54%  (p=0.000 n=10+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-8       3.57MB ± 0%    3.57MB ± 0%     ~     (p=1.000 n=9+8)
FilterPatterns/Absolute-8       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.903 n=9+8)
FilterPatterns/Wildcard-8       19.7MB ± 0%    19.7MB ± 0%   -0.00%  (p=0.022 n=10+9)
FilterPatterns/ManyNoMatch-8    3.57MB ± 0%    3.57MB ± 0%     ~     (all equal)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-8        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-8        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-8        88.7k ± 0%     88.7k ± 0%     ~     (all equal)
FilterPatterns/ManyNoMatch-8     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-09 15:51:14 +02:00
Michael Eischer 8388e67c4b filter: cleanup path separator conversion 2020-10-07 21:14:07 +02:00
Michael Eischer 0acc3c5923 filter: special case patterns without globbing characters
In case a part of a path is a simple string, we can just check for
equality without complex parsing in filepath.Match.

name                          old time/op    new time/op    delta
FilterLines-4                   34.8ms ±17%    41.2ms ±23%  +18.36%  (p=0.000 n=10+10)
FilterPatterns/Relative-4       21.7ms ± 6%    12.1ms ±23%  -44.46%  (p=0.000 n=10+10)
FilterPatterns/Absolute-4       10.0ms ± 5%     9.1ms ±11%   -9.80%  (p=0.006 n=10+9)
FilterPatterns/Wildcard-4       47.0ms ± 7%    42.2ms ± 5%  -10.19%  (p=0.000 n=9+10)
FilterPatterns/ManyNoMatch-4     190ms ± 1%     131ms ±20%  -31.47%  (p=0.000 n=8+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.870 n=9+9)
FilterPatterns/Absolute-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.145 n=10+10)
FilterPatterns/Wildcard-4       14.3MB ± 0%    19.7MB ± 0%  +37.91%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4    3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.421 n=10+9)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-4        88.7k ± 0%     88.7k ± 0%     ~     (all equal)
FilterPatterns/ManyNoMatch-4     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-07 20:55:43 +02:00
Michael Eischer 54a124de3b filter: explicitly test separate ListWithChild function 2020-10-07 20:47:52 +02:00
Michael Eischer bcc3bddcf4 filter: only check whether a child path could match when necessary
When checking excludes there is no need to test whether a child path
could also match the pattern, as it is by definition excluded.
Previously childMayMatch was calculated but then discarded. For simple
absolute paths this can account for half the time spent for checking
pattern matches.

name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       23.3ms ± 9%    21.7ms ± 6%   -6.68%  (p=0.004 n=10+10)
FilterPatterns/Absolute-4       13.9ms ± 7%    10.0ms ± 5%  -27.61%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4       51.4ms ± 7%    47.0ms ± 7%   -8.51%  (p=0.001 n=9+9)
FilterPatterns/ManyNoMatch-4     551ms ± 9%     190ms ± 1%  -65.41%  (p=0.000 n=10+8)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.665 n=10+9)
FilterPatterns/Absolute-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.480 n=9+10)
FilterPatterns/Wildcard-4       14.3MB ± 0%    14.3MB ± 0%     ~     (p=0.431 n=9+10)
FilterPatterns/ManyNoMatch-4    3.57MB ± 0%    3.57MB ± 0%     ~     (all equal)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-4        88.7k ± 0%     88.7k ± 0%     ~     (all equal)
FilterPatterns/ManyNoMatch-4     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-07 20:47:52 +02:00
Michael Eischer 17c53efb0d filter: Optimize double wildcard expansion
This only allocates a single slice to expand the double wildcard and
only copies the pattern prefix once.

name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       22.7ms ± 5%    23.3ms ± 9%     ~     (p=0.353 n=10+10)
FilterPatterns/Absolute-4       14.2ms ±13%    13.9ms ± 7%     ~     (p=0.853 n=10+10)
FilterPatterns/Wildcard-4        266ms ±16%      51ms ± 7%  -80.67%  (p=0.000 n=10+9)
FilterPatterns/ManyNoMatch-4     554ms ± 6%     551ms ± 9%     ~     (p=0.436 n=10+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.349 n=10+10)
FilterPatterns/Absolute-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.073 n=10+9)
FilterPatterns/Wildcard-4        141MB ± 0%      14MB ± 0%  -89.89%  (p=0.000 n=10+9)
FilterPatterns/ManyNoMatch-4    3.57MB ± 0%    3.57MB ± 0%     ~     (all equal)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-4        1.63M ± 0%     0.09M ± 0%  -94.56%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-07 20:47:48 +02:00
Michael Eischer 7959796269 filter: Special case for absolute paths
name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       23.6ms ±20%    22.7ms ± 5%     ~     (p=0.684 n=10+10)
FilterPatterns/Absolute-4       32.3ms ± 8%    14.2ms ±13%  -56.01%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4        334ms ±17%     266ms ±16%  -20.56%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     709ms ± 7%     554ms ± 6%  -21.89%  (p=0.000 n=10+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       3.57MB ± 0%    3.57MB ± 0%   +0.00%  (p=0.046 n=9+10)
FilterPatterns/Absolute-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.464 n=10+10)
FilterPatterns/Wildcard-4        141MB ± 0%     141MB ± 0%     ~     (p=0.163 n=9+10)
FilterPatterns/ManyNoMatch-4    3.57MB ± 0%    3.57MB ± 0%     ~     (all equal)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-4        1.63M ± 0%     1.63M ± 0%     ~     (p=0.072 n=10+10)
FilterPatterns/ManyNoMatch-4     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-07 20:47:29 +02:00
Michael Eischer 375c2a56de filter: Parse filter patterns only once
name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       30.3ms ±10%    23.6ms ±20%  -22.12%  (p=0.000 n=10+10)
FilterPatterns/Absolute-4       49.0ms ± 3%    32.3ms ± 8%  -33.94%  (p=0.000 n=8+10)
FilterPatterns/Wildcard-4        345ms ± 9%     334ms ±17%     ~     (p=0.315 n=10+10)
FilterPatterns/ManyNoMatch-4     3.93s ± 2%     0.71s ± 7%  -81.98%  (p=0.000 n=9+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       4.63MB ± 0%    3.57MB ± 0%  -22.98%  (p=0.000 n=9+9)
FilterPatterns/Absolute-4       8.54MB ± 0%    3.57MB ± 0%  -58.20%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4        146MB ± 0%     141MB ± 0%   -2.93%  (p=0.000 n=9+9)
FilterPatterns/ManyNoMatch-4     907MB ± 0%       4MB ± 0%  -99.61%  (p=0.000 n=9+9)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        66.6k ± 0%     22.2k ± 0%  -66.67%  (p=0.000 n=10+10)
FilterPatterns/Absolute-4        88.7k ± 0%     22.2k ± 0%  -75.00%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4        1.70M ± 0%     1.63M ± 0%   -3.92%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     4.46M ± 0%     0.02M ± 0%  -99.50%  (p=0.000 n=10+10)
2020-10-07 20:47:27 +02:00
Michael Eischer b8eacd1364 filter: Reduce redundant path and pattern splitting
A single call to filter.List will split the path only once and also
split each search pattern only once and use it for both match and
childMatch.

name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       62.1ms ±15%    30.3ms ±10%  -51.22%  (p=0.000 n=9+10)
FilterPatterns/Absolute-4        111ms ±10%      49ms ± 3%  -56.08%  (p=0.000 n=10+8)
FilterPatterns/Wildcard-4        393ms ±15%     345ms ± 9%  -12.30%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     10.0s ± 3%      3.9s ± 2%  -60.53%  (p=0.000 n=10+9)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       16.4MB ± 0%     4.6MB ± 0%  -71.76%  (p=0.000 n=10+9)
FilterPatterns/Absolute-4       31.4MB ± 0%     8.5MB ± 0%  -72.77%  (p=0.000 n=9+10)
FilterPatterns/Wildcard-4        168MB ± 0%     146MB ± 0%  -13.19%  (p=0.000 n=10+9)
FilterPatterns/ManyNoMatch-4    3.23GB ± 0%    0.91GB ± 0%  -71.96%  (p=0.000 n=10+9)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4         178k ± 0%       67k ± 0%  -62.50%  (p=0.000 n=10+10)
FilterPatterns/Absolute-4         266k ± 0%       89k ± 0%  -66.67%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4        1.87M ± 0%     1.70M ± 0%   -9.47%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     17.7M ± 0%      4.5M ± 0%  -74.87%  (p=0.000 n=9+10)
2020-10-07 18:13:19 +02:00
Michael Eischer e73c281142 filter: Benchmark absolute paths, wildcards and long filter lists 2020-10-07 17:54:36 +02:00
Michael Eischer fdd3b14db3 filter: test some corner cases 2020-10-07 17:09:44 +02:00
Alexander Neumann 14aead94b3 filter: Allow double wildcard in ChildMatch 2018-06-09 23:18:13 +02:00
Michael Pratt 92eb1cbffd filter: document recursive wildcards
Match/ChildMatch accept a ** pattern which is not noted in the doc
string, nor do any of the docs or tests specify whether the match is
greedy (i.e., can 'foo/**/bar' match paths with additional intermediate
bar directories?).

Add a note to the doc string and add test cases for greedy matches.
2017-09-04 14:38:48 -07:00
Loic Nageleisen 4a36993c19 Smarter filter when children won't match
This improves restore performance by several orders of magniture by not
going through the whole tree recursively when we can anticipate that no
match will ever occur.
2017-08-16 15:25:02 +02:00
Alexander Neumann 6caeff2408 Run goimports 2017-07-23 14:21:03 +02:00
Alexander Neumann 83d1a46526 Moves files 2017-07-23 14:19:13 +02:00