From: Duy Nguyen <pclouds@gmail.com>
To: dana <dana@dana.is>
Cc: git@vger.kernel.org, "Ævar Arnfjörð" <avarab@gmail.com>,
"Junio C Hamano" <gitster@pobox.com>
Subject: Re: [BUG] gitignore documentation inconsistent with actual behaviour
Date: Sat, 20 Oct 2018 07:26:24 +0200
Message-ID: <20181020052624.GA31433@duynguyen.home>
In-Reply-To: <C16A9F17-0375-42F9-90A9-A92C9F3D8BBA@dana.is>
On Thu, Oct 11, 2018 at 05:19:06AM -0500, dana wrote:
> Hello,
>
> I'm a contributor to ripgrep, which is a grep-like tool that supports using
> gitignore files to control which files are searched in a repo (or any other
> directory tree). ripgrep's support for the patterns in these files is based on
> git's official documentation, as seen here:
>
> https://git-scm.com/docs/gitignore
>
> One of the most common reports on the ripgrep bug tracker is that it does not
> allow patterns like the following real-world examples, where a ** is used along
> with other text within the same path component:
>
> **/**$$*.java
> **.orig
> **local.properties
> !**.sha1
>
> The reason it doesn't allow them is that the gitignore documentation explicitly
> states that they're invalid:
>
> ...

I've checked the code and run some tests. There is a twist here: "**"
is only special when matched in "pathname" mode, that is, when the
pattern contains at least one slash. Of your patterns above, only the
first one qualifies.
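
As a rough illustration of that rule (a sketch, not git's actual code), the check below tests only for the presence of a slash. The helper name uses_pathname_mode is hypothetical, and the sketch deliberately ignores details such as the leading '!' of a negated pattern or any special handling of a trailing slash.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: returns true when a pattern
 * would be matched in "pathname" mode under the rule described above,
 * i.e. when it contains at least one slash.
 */
static bool uses_pathname_mode(const char *pattern)
{
	assert(pattern != NULL);
	return strchr(pattern, '/') != NULL;
}
```

Applied to the four reported patterns, only "**/**$$*.java" contains a slash, so it is the only one this check places in pathname mode.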

When '**' is special, it must appear as '**/', '/**/' or '/**'. Any
other form _is_ considered invalid (i.e. a bad pattern), and the
pattern will not match anything.
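
The validity rule can be sketched as a scan over the pattern. This is a simplified model under stated assumptions, not the logic in wildmatch.c: the helper name double_star_well_formed is hypothetical, and backslash escapes and bracket expressions are not handled. It accepts a run of two or more asterisks only when it is bounded on the left by the start of the pattern or a slash, and on the right by a slash or the end of the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not git's code: returns true when every run of
 * two or more asterisks sits between slashes (or the pattern's start
 * or end), which covers the forms "**" + "/", "/" + "**" + "/" and
 * "/" + "**".
 */
static bool double_star_well_formed(const char *pattern)
{
	size_t i = 0;

	assert(pattern != NULL);
	while (pattern[i] != '\0') {
		size_t start = i;

		if (pattern[i] != '*') {
			i++;
			continue;
		}
		while (pattern[i] == '*')
			i++;
		if (i - start < 2)
			continue;	/* a single '*' is always fine */
		/* left side: start of pattern or a slash */
		if (start > 0 && pattern[start - 1] != '/')
			return false;
		/* right side: end of pattern or a slash */
		if (pattern[i] != '\0' && pattern[i] != '/')
			return false;
	}
	return true;
}
```

Under this model, "**/**$$*.java" is rejected because its second '**' is followed by '$' rather than a slash, which is consistent with the first reported pattern not matching anything.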

The confusion comes from the remaining three patterns, where '**' is
not special: there it is treated as a regular '*' and still matches
stuff.

So, I think we have two options. The documentation could be clarified
with something like this:

-- 8< --
diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index d107daaffd..500cd43939 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -100,7 +100,8 @@ PATTERN FORMAT
    a shell glob pattern and checks for a match against the
    pathname relative to the location of the `.gitignore` file
    (relative to the toplevel of the work tree if not from a
-   `.gitignore` file).
+   `.gitignore` file). Note that the "two consecutive asterisks" rule
+   below does not apply.
 
  - Otherwise, Git treats the pattern as a shell glob: "`*`" matches
    anything except "`/`", "`?`" matches any one character except "`/`"
@@ -129,7 +130,8 @@ full pathname may have special meaning:
    matches zero or more directories. For example, "`a/**/b`"
    matches "`a/b`", "`a/x/b`", "`a/x/y/b`" and so on.
 
- - Other consecutive asterisks are considered invalid.
+ - Other consecutive asterisks are considered invalid and the pattern
+   is ignored.
 
 NOTES
 -----
-- 8< --

Or we could make the behavior consistent: if '**' is invalid, just
treat it as two separate regular '*'. Then all four of your patterns
will behave the same way. The change for that is quite simple:

-- 8< --
diff --git a/wildmatch.c b/wildmatch.c
index d074c1be10..64087bf02c 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -104,8 +104,10 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
 		    dowild(p + 1, text, flags) == WM_MATCH)
 			return WM_MATCH;
 		match_slash = 1;
-	} else
-		return WM_ABORT_MALFORMED;
+	} else {
+		/* without WM_PATHNAME, '*' == '**' */
+		match_slash = flags & WM_PATHNAME ? 0 : 1;
+	}
 } else
 	/* without WM_PATHNAME, '*' == '**' */
 	match_slash = flags & WM_PATHNAME ? 0 : 1;
-- 8< --
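
The effect of that change on a '**' run can be modelled as a small decision function. This is a sketch under stated assumptions, not the code in wildmatch.c: SKETCH_PATHNAME is a local stand-in for WM_PATHNAME with an illustrative value, and the enum, the function names and the "a/foo**bar" example are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for git's WM_PATHNAME flag; the value is illustrative. */
#define SKETCH_PATHNAME 1u

enum star_action {
	STAR_STOPS_AT_SLASH,	/* behaves like a regular '*' in pathname mode */
	STAR_CROSSES_SLASH,	/* may also match '/' */
	STAR_ABORT_MALFORMED	/* the whole pattern fails to match */
};

/*
 * Models what happens to a run of two or more asterisks. "well_formed"
 * says whether the run is bounded by slashes; "proposed" selects the
 * behavior after the wildmatch.c change instead of the current one.
 */
static enum star_action double_star_action(unsigned flags, bool well_formed,
					   bool proposed)
{
	if (!(flags & SKETCH_PATHNAME))
		return STAR_CROSSES_SLASH;	/* without pathname mode, '*' == '**' */
	if (well_formed)
		return STAR_CROSSES_SLASH;
	return proposed ? STAR_STOPS_AT_SLASH : STAR_ABORT_MALFORMED;
}

/* Example checks for a hypothetical pattern such as "a/foo**bar". */
static void double_star_action_demo(void)
{
	assert(double_star_action(SKETCH_PATHNAME, false, false) ==
	       STAR_ABORT_MALFORMED);
	assert(double_star_action(SKETCH_PATHNAME, false, true) ==
	       STAR_STOPS_AT_SLASH);
}
```

In this model the only case that changes is a badly placed '**' in pathname mode: instead of aborting the match, it falls back to the behavior of a regular '*'.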
Which way should we go? I'm leaning towards the second one...
--
Duy