git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Σταύρος Ντέντος" <stdedos@gmail.com>
Cc: git <git@vger.kernel.org>,
	"Σταύρος Ντέντος" <stdedos+git@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Bagas Sanjaya" <bagasdotme@gmail.com>
Subject: Re: [PATCH v2 1/1] pathspec: warn for a no-glob entry that contains `**`
Date: Thu, 25 Mar 2021 17:41:45 -0700	[thread overview]
Message-ID: <xmqq35wiu4om.fsf@gitster.g> (raw)
In-Reply-To: <20210325233648.31162-2-stdedos+git@gmail.com> ("Σταύρος	Ντέντος"'s message of "Fri, 26 Mar 2021 01:36:49 +0200")

"Σταύρος Ντέντος"  <stdedos@gmail.com> writes:

> From: Stavros Ntentos <133706+stdedos@users.noreply.github.com>
>
> If a pathspec given that contains `**`, chances are that someone is
> naively expecting that it will do what the manual has told him
> (i.e. that `**` will match 0-or-more directories).

OK. this is a well written introduction of the problem being solved.

But chances are that the user may have meant to give double-asterisk
just out of habit, very well knowing that it would not be an error
to give ** when a single * suffices.

Which leads us to suspect that it is a bad change to make this a
warning that unconditionally fires and cannot squelch.  It may be
better to implement it as an advice, where the user who knowingly
uses that construct can say "yes, I know what I am doing" and
squelch it.

> However, without an explicit `:(glob)` magic, that will fall out the sky:
> the two `**` will merge into one star, which surrounded by slashes, will
> match any directory name.

I am not sure everything after the "the sky:" you wrote is what you
meant to write.  Without being marked with a "glob" magic, a
wildcard character in a pattern matches even a slash, so these two

	git ls-files 'Documentation**v2.txt'
	git ls-files 'Documentation*v2.txt'

give the identical result and there is nothing about "surrounded by
slashes" involved in it.

So, perhaps taking the first two paragraphs together and rewriting:

	When '/**/' appears in the pathspec, the user may be
	expecting that it would be matched using the "wildmatch"
	semantics, matching 0 or more directories.  But that is
	not what happens without ":(glob)" magic.

or did you want to warn about "foo**bar" as if it were "foo/**/bar"?

The output from these commands:

	git ls-files ':(glob)Documentation/**/*stash.txt'
	git ls-files ':(glob)Documentation***stash.txt'
	git ls-files ':(glob)Documentation**stash.txt'

seems to tell me that the "zero-or-more directories" magic happens
only when /**/ form is used, not when two asterisks are placed next
to each other in a random context.

> These changes attempt to bring awareness to this issue.

In any case, after presenting what problem we are trying to address,
we present our approach / solution in a form as if we are giving
orders to the codebase to "become like so".

	Teach the pathspec parser to emit an advice message when a
	substring "/**/" appears in a pathspec element that does not
	have a ":(glob)" magic.  Make sure we don't disturb users
	who use ":(literal)" magic with such a substring, as it is
	clear they want to find these strings literally.

perhaps.

> Signed-off-by: Stavros Ntentos <133706+stdedos@users.noreply.github.com>
> ---
>  pathspec.c                 | 13 +++++++++++++
>  pathspec.h                 |  1 +
>  t/t6130-pathspec-noglob.sh | 13 +++++++++++++
>  3 files changed, 27 insertions(+)
>
> diff --git a/pathspec.c b/pathspec.c
> index 7a229d8d22..d5b9c0d792 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -1,3 +1,4 @@
> +#include <string.h>

Never include any header file before including <git-compat-util.h>.
Including <git-compat-util.h> indirectly by including <cache.h> or
<builtin.h> counts as including it as the first thing.

As all necessary standard system headers are supposed to be given by
including <git-compat-util.h>, if you needed to explicitly include
the above, that may mean <git-compat-util.h>, which should cause the
header that lists necessary function, is not working properly on
your platform and needs to be adjusted. Are you on a minority
platform we haven't adjusted the header inclusion order for, and
what function are you trying to use that does not work without
adding this extra #include here?  We already use strstr() in many
places, so that is not the function we are missing system includes
for.

Or perhaps this is a remnant of what you added while you were
experimenting, without realizing that "#include <cache.h>" that is
already there already gives you anything you need.  If that is the
case, iow, if the code works without the extra include, please
remove that line.

> @@ -588,6 +589,8 @@ void parse_pathspec(struct pathspec *pathspec,
>  
>  		init_pathspec_item(item + i, flags, prefix, prefixlen, entry);
>  
> +		check_missing_glob(entry, item[i].magic);
> +

We have only one caller of the helper (i.e. this one) and nowhere
else, and the helper is small enough ...

> @@ -739,3 +742,13 @@ int match_pathspec_attrs(const struct index_state *istate,
>  
>  	return 1;
>  }
> +
> +void check_missing_glob(const char *pathspec_entry, int flags) {
> +	if (flags & (PATHSPEC_GLOB | PATHSPEC_LITERAL)) {
> +		return;
> +	}
> +
> +	if (strstr(pathspec_entry, "**")) {
> +		warning(_("Pathspec provided contains `**`, but no :(glob) magic.\nIt will not match 0 or more directories!"));
> +	}
> +}

... hence, there is no reason why this helper should exist, let
alone be a public function.  Also, there is no reason to split the
conditional into two.  Just "it is OK if we have GLOB or LITERAL,
otherwise see if there is /**/" is sufficient.

It would be more sensible to just add the check to parse_pathspec()
directly.

Also, our warning messages do not begin with an uppercase.

See attached patch for all of the above, but it is for illustration
purposes only; not even compile tested.  I am not illustrating how
to turn this into an advice message that the user can squelch with
the illustration patch.

> diff --git a/t/t6130-pathspec-noglob.sh b/t/t6130-pathspec-noglob.sh
> index ba7902c9cd..af6cd16f76 100755
> --- a/t/t6130-pathspec-noglob.sh
> +++ b/t/t6130-pathspec-noglob.sh
> @@ -157,4 +157,17 @@ test_expect_success '**/ does not work with :(literal) and --glob-pathspecs' '
>  	test_must_be_empty actual
>  '
>  

> +cat > expected <<"EOF"
> +warning: Pathspec provided contains `**`, but no :(glob) magic.
> +EOF

Please don't imitate ancient tests.  These days, all preparations
for individual tests, including the expected outcome, are to be done
inside the test itself.  Study nearby tests in the same script for
better examples.

> +test_expect_success '** without :(glob) warns of lacking glob magic' '
> +	test_might_fail git stash -- "**/bar" 2>warns &&
> +	grep -Ff expected warns

Same comment.  Nearby examples all set up expeced output and never
uses "grep" to see if the expectation is satisfied.  Imitate them.

> +'
> +
> +test_expect_success '** with :(literal) does not warn of lacking glob magic' '
> +	test_might_fail git stash -- ":(literal)**/bar" 2>warns &&
> +	! grep -Ff expected warns

Ditto.

Thanks.


 pathspec.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git i/pathspec.c w/pathspec.c
index 18b3be362a..5b0eed5a65 100644
--- i/pathspec.c
+++ w/pathspec.c
@@ -598,6 +598,10 @@ void parse_pathspec(struct pathspec *pathspec,
 			die(_("pathspec '%s' is beyond a symbolic link"), entry);
 		}
 
+		if (!(item[i].magic & (PATHSPEC_GLOB | PATHSPEC_LITERAL)) &&
+		    strstr(entry, "**"))
+			warning(_("** in '%s'without :(glob) magic:"), entry);
+
 		if (item[i].nowildcard_len < item[i].len)
 			pathspec->has_wildcard = 1;
 		pathspec->magic |= item[i].magic;

  reply	other threads:[~2021-03-26  0:42 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-26 20:44 Pathspec does not accept / does not warn for specific pathspecs Σταύρος Ντέντος
2021-02-26 23:27 ` Junio C Hamano
2021-03-25 10:22   ` [PATCH v1 0/1] pathspec: warn for a no-glob entry that contains `**` Σταύρος Ντέντος
2021-03-25 10:22     ` [PATCH v1 1/1] " Σταύρος Ντέντος
2021-03-25 11:00       ` Bagas Sanjaya
2021-03-25 11:04       ` Bagas Sanjaya
2021-03-25 16:09         ` Stavros Ntentos
2021-03-25 20:09           ` Junio C Hamano
2021-03-25 23:36   ` [PATCH v2 0/1] " Σταύρος Ντέντος
2021-03-25 23:36   ` [PATCH v2 1/1] " Σταύρος Ντέντος
2021-03-26  0:41     ` Junio C Hamano [this message]
2021-03-26  1:32       ` Eric Sunshine
2021-03-26  3:02         ` Junio C Hamano
2021-03-26  5:13           ` Jeff King
2021-03-26 15:43             ` Stavros Ntentos
2021-03-26 15:48     ` [PATCH v3 1/2] " Stavros Ntentos
2021-03-26 15:48       ` [PATCH v3 2/2] pathspec: convert no-glob warn to advice Stavros Ntentos
2021-03-26  2:40   ` [RFC PATCH v1 0/1] pathspec: warn: long and short forms are incompatible Σταύρος Ντέντος
2021-03-26  2:40     ` [RFC PATCH v1 1/2] " Σταύρος Ντέντος
2021-03-26  5:28       ` Jeff King
2021-03-26 16:16         ` Stavros Ntentos
2021-03-27  9:41           ` Jeff King
2021-03-27 18:36             ` Junio C Hamano
2021-03-28 15:44               ` Stavros Ntentos
2021-03-26  2:40     ` [RFC PATCH v1 2/2] fixup! " Σταύρος Ντέντος
2021-03-26  8:14       ` Bagas Sanjaya
2021-03-26 15:55         ` Stavros Ntentos
2021-03-28 15:26       ` [RFC PATCH v1 3/3] squash! " Stavros Ntentos
2021-03-26  2:44   ` [RFC PATCH v1 0/1] " Σταύρος Ντέντος
2021-03-26  2:44     ` [RFC PATCH v1] " Σταύρος Ντέντος
2021-04-03 12:26     ` [PATCH v3] pathspec: advice: " Stavros Ntentos
2021-04-04  7:19       ` Junio C Hamano
2021-04-11 15:07         ` Σταύρος Ντέντος
2021-04-11 19:10           ` Junio C Hamano
2021-04-11 20:53             ` Σταύρος Ντέντος
2021-03-28 15:45   ` [PATCH v2] " Stavros Ntentos
2021-03-28 19:12     ` Junio C Hamano
2021-03-30 17:37       ` Junio C Hamano
2021-03-30 19:05       ` Stavros Ntentos
2021-03-30 19:17         ` Stavros Ntentos
2021-03-30 20:36         ` Junio C Hamano
2021-04-03 12:48   ` [PATCH v3] " Stavros Ntentos
2021-04-03 12:51   ` [PATCH v4] " Stavros Ntentos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq35wiu4om.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=bagasdotme@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=stdedos+git@gmail.com \
    --cc=stdedos@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).