git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCH] submodule operations: tighten pathspec errors
Date: Mon, 6 Jun 2016 12:31:22 -0700	[thread overview]
Message-ID: <CAGZ79kY+K4-hbzL-KGgsyTHvM0BoXBLawvj2GmWJ9tbTuiSGbg@mail.gmail.com> (raw)
In-Reply-To: <xmqqy46owr0n.fsf@gitster.mtv.corp.google.com>

On Wed, Jun 1, 2016 at 2:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Thu, May 26, 2016 at 1:00 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>>> @@ -36,10 +37,9 @@ static int module_list_compute(int argc, const char **argv,
>>>>
>>>>       for (i = 0; i < active_nr; i++) {
>>>>               const struct cache_entry *ce = active_cache[i];
>>>> -
>>>> -             if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
>>>> -                                 0, ps_matched, 1) ||
>>>> -                 !S_ISGITLINK(ce->ce_mode))
>>>> +             if (!S_ISGITLINK(ce->ce_mode) ||
>>>> +                 !match_pathspec(pathspec, ce->name, ce_namelen(ce),
>>>> +                                 0, ps_matched, 1))
>>>>                       continue;
>>>
>>> OK, this is the crucial bit in this patch. pathspec matches are now
>>> done only against gitlinks, so any unmatch is "pattern or path
>>> given, but there was no such submodule".
>>
>> right.
>
> That changes the semantics, and its user visible effect may deserve
> to be in the documentation, no?
>
> It used to be that "git submodule--helper list COPYING" did not
> complain, but with this change, it would.  Which may be a good
> change, but "git submodule--helper list sub*" where most but not all
> of glob expansion done by the shell are submodule directories may be
> a common thing people may do, and it may be irritating to see the
> unmatch complaints.  I dunno.

I fixed all the variable names and was confident with what I had,
but this is an issue indeed as the error message is plain wrong:
    git$ git submodule--helper list sub*
    error: pathspec 'submodule.c' did not match any file(s) known to git.
    error: pathspec 'submodule-config.c' did not match any file(s) known to git.
    error: pathspec 'submodule-config.h' did not match any file(s) known to git.
    error: pathspec 'submodule-config.o' did not match any file(s) known to git.
    error: pathspec 'submodule.h' did not match any file(s) known to git.
    error: pathspec 'submodule_multiple_references' did not match any
file(s) known to git.
    error: pathspec 'submodule.o' did not match any file(s) known to git.
    error: pathspec 'submodulespec.o' did not match any file(s) known to git.

(I realize I have some stray object files laying around)

So I do not think we can do

-               if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
-                                   0, ps_matched, 1) ||
-                   !S_ISGITLINK(ce->ce_mode))
+               if (!S_ISGITLINK(ce->ce_mode) ||
+                   !match_pathspec(pathspec, ce->name, ce_namelen(ce),
+                                   0, ps_matched, 1))

or we need to have custom error reporting, which checks if any file
still matches the pathspec (and ignore non gitlinks)

>
> When we know we are not going to complain (i.e. --unmatch-ok option

I'd rather go with ignore-unmatch as git-rm does use that.

> is given from the command line), I think it is perfectly fine (and
> it is even preferrable) to swap the order of the check.  The mode
> check done with S_ISGITLINK() is much cheaper and it is likely to
> yield true much less often, which in turn would allow us to make
> fewer calls to more expensive match_pathspec().

As said, in that case we would then need a
    pathspec_mark_ps_matched(ce->name, ce_namelen(ce), ps_matched)

>
> But when we want to diagnose typo (i.e. --unmatch-ok was not given),
> we may want to preserve the current order, so that the "sub*"
> example in a few paragraphs ago would not irritate the users.

I see.

     if (!S_ISGITLINK(ce->ce_mode)) {
        pathspec_mark_ps_matched(...);
        continue;
    } else {
        if ( !match_pathspec(pathspec, ce->name, ce_namelen(ce),
                                     0, ps_matched, 1))
        continue;
    }
    ...

doesn't look very appealing, so I guess we'd just keep the current behavior
of
-               if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
-                                   0, ps_matched, 1) ||
-                   !S_ISGITLINK(ce->ce_mode))


>
>>> It is tempting to update report_path_error() return "OK" when its
>>> first parameter is NULL.
>>
>> such that we can do a
>>
>>     if (report_path_error(unmatch_ok ? NULL : ps_matched, pathspec, prefix)))
>>         result = -1;
>
> I actually was thinking about setting ps_matched to NULL so that
> both match_pathspec() and report_path_error() would get NULL.
> match_pathspec() has to do _more_ work when ps_matched[] aka seen[]
> is given.

Yes for one call; No from the birds eye view;
the first lines that use the seen[]:

    if (seen && seen[i] == MATCHED_EXACTLY)
        continue;

so if we have  `ps->nr > 0` then it is beneficial to have a
seen array, I think?

>
>>>> @@ -407,7 +410,7 @@ cmd_foreach()
>>>>       # command in the subshell (and a recursive call to this function)
>>>>       exec 3<&0
>>>>
>>>> -     git submodule--helper list --prefix "$wt_prefix"|
>>>> +     git submodule--helper list ${unmatch:+--unmatch} --prefix "$wt_prefix"|
>>>
>>> For this to work, somebody must ensure that the variable unmatch is
>>> either unset or set to empty unless the user gave --error-unmatch to
>>> us.  There is a block of empty assignments hear the beginning of
>>> this file for that very purpose, i.e. resetting a stray environment
>>> variable that could be in user's environment.

done.

>>>
>>> The patch itself does not look too bad.  I do not have an opinion on
>>> which one should be the default, and I certainly would understand if
>>> you want to keep the default loose (i.e. not complaining) with an
>>> optional error checking, but whichever default you choose, the
>>> option and variable names need to be clarified to avoid confusion.
>>
>> Ok I'll fix the variable names; I think for consistency with e.g.
>> ls-files --error-unmatch
>> we would want to be loose by default and strict on that option.
>
> I do not think the "typo-prevention" safety measure should suddenly
> be turned into opt-in; I'd suggest "--unmatch-ok" instead.

--ignore-unmatch has the same meaning and is taken by git-rm already?

      parent reply	other threads:[~2016-06-06 19:31 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-21  1:21 [PATCH] submodule operations: tighten pathspec errors Stefan Beller
2016-05-26 20:00 ` Junio C Hamano
2016-06-01 20:55   ` Stefan Beller
2016-06-01 21:14     ` Junio C Hamano
2016-06-01 21:20       ` Junio C Hamano
2016-06-06 19:31       ` Stefan Beller [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGZ79kY+K4-hbzL-KGgsyTHvM0BoXBLawvj2GmWJ9tbTuiSGbg@mail.gmail.com \
    --to=sbeller@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).