git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Johannes Sixt <j6t@kdbg.org>,
	Danial Alihosseini <danial.alihosseini@gmail.com>,
	Jeff King <peff@peff.net>, Derrick Stolee <dstolee@microsoft.com>,
	git@vger.kernel.org
Subject: Re: git 2.34.0: Behavior of `**` in gitignore is different from previous versions.
Date: Fri, 19 Nov 2021 21:57:25 +0100
Message-ID: <211119.86zgpz272g.gmgdl@evledraar.gmail.com>
In-Reply-To: <429375f7-ec3e-596f-5f79-c724570c8397@gmail.com>


On Fri, Nov 19 2021, Derrick Stolee wrote:

> On 11/19/2021 3:05 PM, Johannes Sixt wrote:
>> Am 19.11.21 um 15:51 schrieb Derrick Stolee:
>>> What is unclear to me is what exactly "match a directory" means.
>>> If we ignore a directory, then we ignore everything inside it (until
>>> another pattern says we should care about it), but the converse
>>> should also hold: if we have a pattern like "!data/**/", then that
>>> should mean "include everything inside data/<A>/ where <A> is any
>>> directory name".
>>>
>>> My inability to form a mental model where the existing behavior
>>> matches the documented specification is an indicator that this was
>>> changed erroneously. A revert patch is included at the end of this
>>> message.
>>>
>>> If anyone could help clarify my understanding here, then maybe
>>> there is room for improving the documentation.
>> 
>> You form a wrong mental model when you start with the grand picture of a
>> working tree. That is, when you say
>> 
>> - here I have theeeeeese many files and directories,
>> - and I want to ignore some: foo/**/,
>> - but I don't want to ignore others: !bar/**/.
>> 
>> This forms the wrong mental model because that is not how Git sees the
>> working tree: it never has a grand picture of all of its contents.
>> 
>> Git only ever sees the contents of one directory. When Git determines
>> that a sub-directory is ignored, then that one's contents are never
>> inspected, and there is no opportunity to un-ignore some of the
>> sub-directory's contents.
>
> So the problem is this: I want to know "I have a file named <X>, and
> a certain pattern set, does <X> match the patterns or not?" but in
> fact it's not just "check <X> against the patterns in order" but
> actually "check every parent directory of <X> in order to see if
> any directory is unmatched, which would preclude any later matches
> to other parents of <X>"
>
> So really, to check a path, we really want to first iterate on the
> parent directories. If we get a match on a positive pattern on level
> i, then we check level (i+1) for a match on a negative pattern. If
> we find that negative pattern match, then continue. If we do not see
> a negative match, then we terminate by matching the entire path <X>.
>
> I'm still not seeing a clear way of describing the matching procedure
> here for a single path, and that's fine. My understanding is not a
> necessary condition for fixing this bug.
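
As an aside for readers of the archive, the procedure discussed in the
quoted text can be sketched in a few lines of Python. This is only a toy
model, not Git's implementation: the function names last_match() and
is_ignored() are hypothetical, and fnmatch stands in for wildmatch (its
"*" also matches "/", and it has no special handling of "**"). The one
rule it does model is documented in gitignore(5): a file cannot be
re-included if a parent directory of that file is excluded.

```python
from fnmatch import fnmatchcase


def last_match(path, is_dir, patterns):
    """Check one path string against the patterns in order; last match wins.

    Returns True (ignored), False (re-included) or None (no pattern matched).
    Hypothetical helper: fnmatchcase() is only a stand-in for wildmatch.
    """
    verdict = None
    for pat in patterns:
        negated = pat.startswith("!")
        body = pat[1:] if negated else pat
        dir_only = body.endswith("/")
        body = body.rstrip("/")
        if dir_only and not is_dir:
            continue  # a trailing "/" restricts the pattern to directories
        if fnmatchcase(path, body):
            verdict = not negated
    return verdict


def is_ignored(path, patterns):
    """Walk the parent directories top-down before looking at the path."""
    parts = path.split("/")
    for i in range(1, len(parts)):
        parent = "/".join(parts[:i])
        if last_match(parent, True, patterns):
            # An ignored directory is never descended into, so no later
            # negative pattern gets a chance to re-include its contents.
            return True
    return bool(last_match(path, False, patterns))


# Matching the string alone says "re-included" ...
print(last_match("data/keep.txt", False, ["data/", "!data/keep.txt"]))
# ... but the traversal stops at the ignored directory "data".
print(is_ignored("data/keep.txt", ["data/", "!data/keep.txt"]))
# Ignoring the directory's contents instead leaves room for the negation.
print(is_ignored("data/keep.txt", ["data/*", "!data/keep.txt"]))
```

The difference between the first two calls is the gap between "check <X>
against the patterns in order" and what the directory walk actually does.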

Just watching this thread from the sidelines, I think it would help if
this could be distilled down to a wildmatch() test that doesn't involve
the pathspec matching code.

I.e. can you stick the "should this match?" case into t3070 and see the
same behavior? Or does this have to do with the pathspec-specific sugar
on top: that it splits paths and then matches them, that some
information about the path type is added on top, or the specifics of the
exclude/include gitignore matching?

FWIW I have some old WIP patches somewhere where I made this matching
much faster by compiling the glob syntax into PCRE patterns (using a
mode PCREv2 has), which are then JIT'ed and matched.

To do that I had to unpeel this whole truncation of the pattern thing,
and IIRC it didn't matter for speed (or maybe it did just with the
wildmatch code?).
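
To illustrate the general idea of compiling a glob into a regular
expression once and then matching many paths against it, here is a small
Python sketch. It is not the WIP patches mentioned above and does not use
PCRE: compile_glob() is a hypothetical helper built on the standard
fnmatch.translate(), Python's re module has no JIT, and fnmatch's glob
dialect differs from wildmatch (its "*" also matches "/").

```python
import fnmatch
import re


def compile_glob(pattern):
    """Translate a glob to a regex once, so repeated matches reuse it.

    Hypothetical helper: fnmatch.translate() implements Python's glob
    dialect, not the one used by Git's wildmatch.
    """
    return re.compile(fnmatch.translate(pattern))


rx = compile_glob("data/*.log")
for path in ["data/build.log", "data/build.txt", "data/sub/build.log"]:
    # The last path matches too, because fnmatch's "*" crosses "/".
    print(path, bool(rx.match(path)))
```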

Maybe all of this is irrelevant, sorry. I haven't looked into this issue
at all, just skimmed this growing thread over the past day, maybe some
of the above helps, or not...

Thread overview: 13+ messages
2021-11-18 16:41 git 2.34.0: Behavior of `**` in gitignore is different from previous versions Danial Alihosseini
2021-11-18 17:04 ` Jeff King
2021-11-18 22:09   ` Derrick Stolee
2021-11-19  4:01     ` Danial Alihosseini
2021-11-19 14:51       ` Derrick Stolee
2021-11-19 17:06         ` Danial Alihosseini
2021-11-19 20:05         ` Johannes Sixt
2021-11-19 20:33           ` Derrick Stolee
2021-11-19 20:57             ` Ævar Arnfjörð Bjarmason [this message]
2021-11-20 22:41             ` Chris Torek
2021-11-21  0:46               ` Junio C Hamano
2021-11-23 12:21                 ` Philip Oakley
2021-11-23 21:13                   ` Danial Alihosseini
