git@vger.kernel.org mailing list mirror (one of many)
From: Elijah Newren <newren@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Git Mailing List <git@vger.kernel.org>, Samuel Lijin <sxlijin@gmail.com>
Subject: Re: [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item
Date: Thu, 5 Apr 2018 13:06:30 -0700
Message-ID: <CABPp-BF0GaYMubkckqwRSo_2J_JNtd+xOtSUOLUZH3xhE0rRHQ@mail.gmail.com>
In-Reply-To: <20180405190446.GB21164@sigill.intra.peff.net>

On Thu, Apr 5, 2018 at 12:04 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Apr 05, 2018 at 11:36:45AM -0700, Elijah Newren wrote:
>
>> > Do we care about matching the name "foo" against the pathspec_item "foo/"?
>> >
>> > That matches now, but wouldn't after your patch.
>>
<snip>
>> So I should probably make the check handle both cases:
>>
>> @@ -383,8 +383,9 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
>>         /* Perform checks to see if "name" is a super set of the pathspec */
>>         if (flags & DO_MATCH_LEADING_PATHSPEC) {
>>                 /* name is a literal prefix of the pathspec */
>> +               int offset = name[namelen-1] == '/' ? 1 : 0;
>>                 if ((namelen < matchlen) &&
>> -                   (match[namelen] == '/') &&
>> +                   (match[namelen-offset] == '/') &&
>>                     !ps_strncmp(item, match, name, namelen))
>>                         return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
>
> That seems reasonable to me, and your "offset" trick here should prevent
> us from getting confused. Can namelen ever be zero here? I guess
> probably not (I could see an empty pathspec, but an empty path does not
> make sense).

Right, I don't see how an empty path would make sense.
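
For what it's worth, if an empty name ever did reach this check,
name[namelen-1] would read before the start of the buffer.  A minimal
stand-alone sketch of the check with a defensive guard follows.  The
helper name is hypothetical, plain strncmp() stands in for
ps_strncmp(), and the namelen > 0 guard is an assumption that is not
part of the patch:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not a function in dir.c: returns 1 when "name"
 * is a leading directory of the literal pathspec "match", whether or
 * not "name" carries a trailing slash.
 */
static int is_leading_dir_of_pathspec(const char *name, size_t namelen,
                                      const char *match, size_t matchlen)
{
	/* Assumed guard: without it, namelen == 0 would index name[-1]. */
	size_t offset = (namelen > 0 && name[namelen - 1] == '/') ? 1 : 0;

	return namelen < matchlen &&
	       match[namelen - offset] == '/' &&
	       !strncmp(match, name, namelen);
}
```

With the guard in place, an empty name compared against "a/b/c" is
simply not a match, and both "a/b" and "a/b/" are reported as leading
directories of "a/b/c".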

> There are other similar trailing-slash matches in that function, but I'm
> not sure of all the cases in which they're used. I don't know if any of
> those would need similar treatment (sorry for being vague; I expect I'd
> need a few hours to dig into how the pathspec code actually works, and I
> don't have that today).

If it'd only take you a few hours, then you're a lot faster than me.
It took me a while to start wrapping my head around it.

The other trailing-slash matches in the function are all correct,
according to the test suite.  (I'm not sure I like the
DO_MATCH_DIRECTORY stuff, but it is encoded in tests and backward
compatibility is important.)  In particular, changing the earlier code
to have the same offset trick would make it claim that e.g. either
"a/b" or "a/b/" as names match unconditionally against "a/b/c" as a
pathspec.  We need it to be conditional: we only want that to be
considered a match when checking whether we want to recurse into the
directory for other matches, not when checking whether the directory
itself matches the pathspec.  Thus, it should be behind a separate
flag, in a subsequent check, which is what this series does (namely
with DO_MATCH_LEADING_PATHSPEC).

To be more precise, here is how a matrix of pathnames and pathspecs
would be treated by match_pathspec_item() (where I am abbreviating
names like MATCHED_RECURSIVELY_LEADING_PATHSPEC to LEADING):

                               Pathspecs
                |    a/b    |    a/b/    |   a/b/c
          ------+-----------+------------+---------------
          a/b   |  EXACT    |  EXACT[1]  | LEADING[2][3]
  Names   a/b/  | RECURSIVE |  EXACT     | LEADING[2]
          a/b/c | RECURSIVE | RECURSIVE  |  EXACT

[1] Only if DO_MATCH_DIRECTORY is passed.  Otherwise,
    this is NOT a match at all.
[2] Only if DO_MATCH_LEADING_PATHSPEC is passed,
    after applying this series.  Otherwise, not a match
    at all.
[3] Without the fix in this thread that you highlighted,
    and assuming we apply patch 7, this would actually
    mistakenly return RECURSIVE.
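
The checks discussed here can be modeled roughly as below, for literal
pathspecs only.  This is a sketch under assumptions rather than the
real function: model_match() and the flag values are made up,
strncmp() stands in for ps_strncmp(), prefix and wildcard handling are
omitted, both strings are assumed non-empty, and the exact/recursive
branches paraphrase the existing code in dir.c rather than copy it.

```c
#include <stddef.h>
#include <string.h>

/* Names mirror the ones in this thread; the numeric values are made up. */
#define DO_MATCH_DIRECTORY        (1u << 0)
#define DO_MATCH_LEADING_PATHSPEC (1u << 1)

enum match_result {
	NO_MATCH = 0,
	RECURSIVE,	/* pathspec is a leading directory of name */
	LEADING,	/* name is a leading directory of pathspec */
	EXACT
};

/* Simplified model: literal, non-empty pathspec and name only. */
static enum match_result model_match(const char *match, const char *name,
				     unsigned flags)
{
	size_t matchlen = strlen(match);
	size_t namelen = strlen(name);

	if (matchlen <= namelen && !strncmp(match, name, matchlen)) {
		if (matchlen == namelen)
			return EXACT;
		if (match[matchlen - 1] == '/' || name[matchlen] == '/')
			return RECURSIVE;
	} else if ((flags & DO_MATCH_DIRECTORY) &&
		   match[matchlen - 1] == '/' &&
		   namelen == matchlen - 1 &&
		   !strncmp(match, name, namelen)) {
		return EXACT;
	}

	/* Only consulted when the caller asks whether to recurse. */
	if (flags & DO_MATCH_LEADING_PATHSPEC) {
		size_t offset = name[namelen - 1] == '/' ? 1 : 0;

		if (namelen < matchlen &&
		    match[namelen - offset] == '/' &&
		    !strncmp(match, name, namelen))
			return LEADING;
	}
	return NO_MATCH;
}
```

For example, model_match("a/b/c", "a/b/", DO_MATCH_LEADING_PATHSPEC)
yields LEADING, while the same call with flags set to 0 yields
NO_MATCH, which is the conditional behavior described above.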


Now for a separate question: How much of the above would you like
added to the commit message...or even as a comment in the code to make
it clearer to other folks trying to make sense of it?


Elijah

Thread overview: 17+ messages
2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 1/7] dir.c: Fix typo in comment Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item Elijah Newren
2018-04-05 17:49   ` Jeff King
2018-04-05 18:36     ` Elijah Newren
2018-04-05 19:04       ` Jeff King
2018-04-05 20:06         ` Elijah Newren [this message]
2018-04-06 17:53           ` Jeff King
2018-04-05 17:34 ` [RFC PATCH 3/7] t7300: Add some testcases showing failure to clean specified pathspecs Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too Elijah Newren
2018-04-05 18:58   ` Jeff King
2018-04-05 19:15     ` Elijah Newren
2018-04-05 19:31       ` Jeff King
2018-04-09  2:07         ` Junio C Hamano
2018-04-05 17:34 ` [RFC PATCH 5/7] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 6/7] dir: If our pathspec might match files under a dir, recurse into it Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 7/7] If we do not want globs to recurse into subdirs without -d Elijah Newren
