From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>, dana@dana.is
Subject: Re: [PATCH] wildmatch: change behavior of "foo**bar" in WM_PATHNAME mode
Date: Mon, 29 Oct 2018 14:24:00 +0100
Message-ID: <87h8h4lwcv.fsf@evledraar.gmail.com>
In-Reply-To: <20181027084823.23382-1-pclouds@gmail.com>
On Sat, Oct 27 2018, Nguyễn Thái Ngọc Duy wrote:
> In WM_PATHNAME mode (or FNM_PATHNAME), '*' does not match '/' and '**'
> can but only in three patterns:
>
> - '**/' matches zero or more leading directories
> - '/**/' matches zero or more directories in between
> - '/**' matches zero or more trailing directories/files
>
> When '**' is present but not in one of these patterns, the current
> behavior is consider the pattern invalid and stop matching. In other
> words, 'foo**bar' never matches anything, whatever you throw at it.
>
> This behavior is arguably a bit confusing partly because we can't
> really tell the user their pattern is invalid so that they can fix
> it. So instead, tolerate it and make '**' act like two regular '*'s
> (which is essentially the same as a single asterisk). This behavior
> seems more predictable.
>
> Noticed-by: dana <dana@dana.is>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> Documentation/gitignore.txt | 3 ++-
> t/t3070-wildmatch.sh | 4 ++--
> wildmatch.c | 4 ++--
> wildmatch.h | 1 -
> 4 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
> index d107daaffd..1c94f08ff4 100644
> --- a/Documentation/gitignore.txt
> +++ b/Documentation/gitignore.txt
> @@ -129,7 +129,8 @@ full pathname may have special meaning:
> matches zero or more directories. For example, "`a/**/b`"
> matches "`a/b`", "`a/x/b`", "`a/x/y/b`" and so on.
>
> - - Other consecutive asterisks are considered invalid.
> + - Other consecutive asterisks are considered regular asterisks and
> + will match according to the previous rules.
>
> NOTES
> -----
> diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
> index 46aca0af10..891d4d7cb9 100755
> --- a/t/t3070-wildmatch.sh
> +++ b/t/t3070-wildmatch.sh
> @@ -237,7 +237,7 @@ match 0 0 0 0 foobar 'foo\*bar'
> match 1 1 1 1 'f\oo' 'f\\oo'
> match 1 1 1 1 ball '*[al]?'
> match 0 0 0 0 ten '[ten]'
> -match 0 0 1 1 ten '**[!te]'
> +match 1 1 1 1 ten '**[!te]'
> match 0 0 0 0 ten '**[!ten]'
> match 1 1 1 1 ten 't[a-g]n'
> match 0 0 0 0 ten 't[!a-g]n'
> @@ -253,7 +253,7 @@ match 1 1 1 1 ']' ']'
> # Extended slash-matching features
> match 0 0 1 1 'foo/baz/bar' 'foo*bar'
> match 0 0 1 1 'foo/baz/bar' 'foo**bar'
> -match 0 0 1 1 'foobazbar' 'foo**bar'
> +match 1 1 1 1 'foobazbar' 'foo**bar'
> match 1 1 1 1 'foo/baz/bar' 'foo/**/bar'
> match 1 1 0 0 'foo/baz/bar' 'foo/**/**/bar'
> match 1 1 1 1 'foo/b/a/z/bar' 'foo/**/bar'
> diff --git a/wildmatch.c b/wildmatch.c
> index d074c1be10..9e9e2a2f95 100644
> --- a/wildmatch.c
> +++ b/wildmatch.c
> @@ -104,8 +104,8 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
> dowild(p + 1, text, flags) == WM_MATCH)
> return WM_MATCH;
> match_slash = 1;
> - } else
> - return WM_ABORT_MALFORMED;
> + } else /* WM_PATHNAME is set */
> + match_slash = 0;
> } else
> /* without WM_PATHNAME, '*' == '**' */
> match_slash = flags & WM_PATHNAME ? 0 : 1;
> diff --git a/wildmatch.h b/wildmatch.h
> index b8c826aa68..5993696298 100644
> --- a/wildmatch.h
> +++ b/wildmatch.h
> @@ -4,7 +4,6 @@
> #define WM_CASEFOLD 1
> #define WM_PATHNAME 2
>
> -#define WM_ABORT_MALFORMED 2
> #define WM_NOMATCH 1
> #define WM_MATCH 0
> #define WM_ABORT_ALL -1
This patch looks good to me, but I think it's a bad state of affairs to
keep changing these semantics without having something like a
"gitwildmatch" doc where we document this matching syntax.
Also, I still need to dig up the work on using PCRE as an alternate
matching engine. The PCRE devs produced a bug-for-bug compatible version
of our wildmatch function (all the more reason to document it), so I
think they'll need to change it now that this is in, but I haven't
rebased those ancient patches yet.
Do you have any thoughts on how to proceed with getting this documented
/ into some stable state where we can specify it? Even if we don't end
up using PCRE as a matching engine (sometimes it was faster, sometimes
slower), I think it would be very useful if we could spew out "here's
your pattern as a regex" for self-documentation purposes.
Then that can be piped into e.g. "perl -Mre=debug" to see a step-by-step
guide for how the pattern compiles, and why it does or doesn't match a
given thing.
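To make the "here's your pattern as a regex" idea concrete, here is a
minimal sketch in C. It is not the PCRE work mentioned above and not code
from git: wm_to_regex() and its helpers are hypothetical names, and the
translation only assumes the WM_PATHNAME rules as described in the quoted
commit message (with the patch applied). Bracket expressions are rejected
rather than translated, an escaped slash after "**" is not given special
treatment, and "." is assumed to match any character in the target regex
dialect.

```c
#include <stddef.h>
#include <string.h>

/* Append src to out; return 0 on success, -1 if it would not fit. */
static int emit(char *out, size_t cap, size_t *len, const char *src)
{
	size_t n = strlen(src);

	if (*len + n + 1 > cap)
		return -1;
	memcpy(out + *len, src, n + 1); /* copies the NUL too */
	*len += n;
	return 0;
}

/* Append one literal character, backslash-escaping regex metacharacters. */
static int emit_literal(char *out, size_t cap, size_t *len, char c)
{
	char buf[3] = { '\\', c, '\0' };

	return emit(out, cap, len, strchr(".^$*+?()[]{}|\\", c) ? buf : buf + 1);
}

/*
 * Hypothetical helper, not part of git: translate a wildmatch pattern
 * into an anchored regex string, assuming WM_PATHNAME semantics.
 * Returns 0 on success, -1 on unsupported input or a full buffer.
 */
int wm_to_regex(const char *pat, char *out, size_t cap)
{
	const char *p = pat;
	size_t len = 0;

	if (emit(out, cap, &len, "^"))
		return -1;
	while (*p) {
		int rc;

		if (*p == '*') {
			const char *start = p;
			int special;

			while (*p == '*')
				p++;
			/* "**" only crosses '/' when it fills a whole path component */
			special = p - start >= 2 &&
				  (start == pat || start[-1] == '/') &&
				  (*p == '\0' || *p == '/');
			if (special && *p == '/') {
				p++; /* the slash is folded into the group */
				rc = emit(out, cap, &len, "(.*/)?");
			} else if (special) {
				rc = emit(out, cap, &len, ".*");
			} else {
				/* "foo**bar": acts like a regular '*' */
				rc = emit(out, cap, &len, "[^/]*");
			}
		} else if (*p == '?') {
			p++;
			rc = emit(out, cap, &len, "[^/]");
		} else if (*p == '[') {
			return -1; /* bracket expressions: not handled here */
		} else {
			if (*p == '\\' && !*++p)
				return -1; /* dangling backslash */
			rc = emit_literal(out, cap, &len, *p++);
		}
		if (rc)
			return -1;
	}
	return emit(out, cap, &len, "$");
}
```

Under these assumptions "foo**bar" comes out as "^foo[^/]*bar$" and
"foo/**/bar" as "^foo/(.*/)?bar$", which lines up with the updated
t3070 expectations for 'foobazbar' and 'foo/baz/bar' in pathname mode.
The resulting string is the sort of thing that could then be fed to a
regex debugger.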