git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] doc: ls-tree paths do not support wildcards
@ 2020-05-28 21:23 Steven Willis via GitGitGadget
  2020-05-28 21:51 ` Jeff King
  2020-05-28 22:04 ` Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: Steven Willis via GitGitGadget @ 2020-05-28 21:23 UTC (permalink / raw)
  To: git; +Cc: Steven Willis, Steven Willis

From: Steven Willis <onlynone@gmail.com>

Signed-off-by: Steven Willis <onlynone@gmail.com>
---
    doc: ls-tree paths do not support wildcards
    
    The documentation for ls-tree says that paths can be wildcards, but this
    appears to be incorrect, only raw paths seem to work.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-796%2Fonlynone%2Fls-tree-paths-do-not-support-wildcards-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-796/onlynone/ls-tree-paths-do-not-support-wildcards-v1
Pull-Request: https://github.com/git/git/pull/796

 Documentation/git-ls-tree.txt | 6 +++---
 t/t3102-ls-tree-wildcards.sh  | 6 ++++++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
index a7515714da1..8a8ce20cf51 100644
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -19,7 +19,7 @@ Lists the contents of a given tree object, like what "/bin/ls -a" does
 in the current working directory.  Note that:
 
  - the behaviour is slightly different from that of "/bin/ls" in that the
-   '<path>' denotes just a list of patterns to match, e.g. so specifying
+   '<path>' denotes just a list of files to match, e.g. so specifying
    directory name (without `-r`) will behave differently, and order of the
    arguments does not matter.
 
@@ -74,8 +74,8 @@ OPTIONS
 	Implies --full-name.
 
 [<path>...]::
-	When paths are given, show them (note that this isn't really raw
-	pathnames, but rather a list of patterns to match).  Otherwise
+	When paths are given, show them (note that this is really raw
+	pathnames, not a list of patterns to match).  Otherwise
 	implicitly uses the root level of the tree as the sole path argument.
 
 
diff --git a/t/t3102-ls-tree-wildcards.sh b/t/t3102-ls-tree-wildcards.sh
index 1e16c6b8ea6..6c0f2af1d04 100755
--- a/t/t3102-ls-tree-wildcards.sh
+++ b/t/t3102-ls-tree-wildcards.sh
@@ -33,4 +33,10 @@ test_expect_failure 'ls-tree does not yet support negated pathspec' '
 	test_cmp expect actual
 '
 
+test_expect_failure 'ls-tree does not yet support wildcard pathspec' '
+	git ls-files "a*" >expect &&
+	git ls-tree --name-only -r HEAD "a*" >actual &&
+	test_cmp expect actual
+'
+
 test_done

base-commit: 2d5e9f31ac46017895ce6a183467037d29ceb9d3
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] doc: ls-tree paths do not support wildcards
  2020-05-28 21:23 [PATCH] doc: ls-tree paths do not support wildcards Steven Willis via GitGitGadget
@ 2020-05-28 21:51 ` Jeff King
  2020-05-28 22:21   ` Junio C Hamano
  2020-05-28 22:04 ` Junio C Hamano
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff King @ 2020-05-28 21:51 UTC (permalink / raw)
  To: Steven Willis via GitGitGadget; +Cc: git, Steven Willis

On Thu, May 28, 2020 at 09:23:16PM +0000, Steven Willis via GitGitGadget wrote:

> From: Steven Willis <onlynone@gmail.com>
> 
> Signed-off-by: Steven Willis <onlynone@gmail.com>
> ---
>     doc: ls-tree paths do not support wildcards
>     
>     The documentation for ls-tree says that paths can be wildcards, but this
>     appears to be incorrect, only raw paths seem to work.

This part probably should be in the commit message. :)

>  [<path>...]::
> -	When paths are given, show them (note that this isn't really raw
> -	pathnames, but rather a list of patterns to match).  Otherwise
> +	When paths are given, show them (note that this is really raw
> +	pathnames, not a list of patterns to match).  Otherwise
>  	implicitly uses the root level of the tree as the sole path argument.

You're right that we don't match globs, but I don't think it's accurate
to say that these aren't patterns, or that they are raw pathnames. We
_do_ parse them as pathspecs, include magic prefixes. E.g.:

  $ git -C Documentation ls-tree HEAD -- :/Makefile
  100644 blob 90aa329eb7836824a7a45383e4b5b157124d815c	../Makefile

And we complain about pathspec magic that isn't supported in the
matching code:

  $ git ls-tree HEAD -- :^Makefile
  fatal: :^Makefile: pathspec magic not supported by this command: 'exclude' (mnemonic: '!')

But we don't complain about an attempt to use glob characters (which
_could_ really be an attempt to specify a funny-named file, though I
guess you could argue the same for ":" magic).

So I think for now we ought to explain the situation a bit more clearly:
leave this language as-is, but add a new section describing what
patterns we do support.

In the long run it would be nice to actually match regular pathspecs.
That would be a backwards-incompatibility, which I think is why nobody
has pursued it further (and ls-tree is meant to be plumbing that should
stay consistent, so we need to be extra careful). So we'd need a
transition plan. Perhaps:

  1. Deprecate the current behavior in the documentation and release
     notes, encouraging people who want literal matching to use
     --literal-pathspecs or the ":(literal)" magic. AFAICT we've
     supported these since at least 2013 for this command, so it should
     be safe to use unconditionally.

  2. Add a new option, "--use-pathspecs" or similar, that switches the
     matching code to use match_pathspec(). That lets people use the new
     feature immediately if they want to.

  3. When --use-pathspecs is not in use, warn to stderr about any
     wildcard characters in the input. That reinforces the deprecation
     notice in (1) and is likely to get more people's attention.

  4. After several releases, flip the default to --use-pathspecs,
     leaving --no-use-pathspecs as an escape hatch for people who still
     haven't switched their scripts.

  5. After several more releases, eventually remove the old-style
     matching (perhaps leaving --use-pathspecs as a noop).

To be honest, that may be more careful than we absolutely need to be.
We'd only do the wrong thing if you really do have files with glob
metacharacters in their names. And if you're expecting to match names
literally and not using ":(literal)" or similar, your script is
_already_ wrong, since it would be barfing on names starting with a
colon. I have a feeling we made that backwards-incompatible change when
this was converted to pathspecs years ago, and nobody noticed either
way.

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] doc: ls-tree paths do not support wildcards
  2020-05-28 21:23 [PATCH] doc: ls-tree paths do not support wildcards Steven Willis via GitGitGadget
  2020-05-28 21:51 ` Jeff King
@ 2020-05-28 22:04 ` Junio C Hamano
  2020-05-28 23:16   ` Jeff King
  1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2020-05-28 22:04 UTC (permalink / raw)
  To: Steven Willis via GitGitGadget; +Cc: git, Steven Willis

"Steven Willis via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Steven Willis <onlynone@gmail.com>
>
> Signed-off-by: Steven Willis <onlynone@gmail.com>
> ---
>     doc: ls-tree paths do not support wildcards
>     
>     The documentation for ls-tree says that paths can be wildcards, but this
>     appears to be incorrect, only raw paths seem to work.

This is not something you would write after the three-dash line.

The documentation does not say paths can be wildcards.  It allows a
list of "patterns to match" and never says they are wildcards.

I think they take the traditional "leading paths" (i.e. by saying
"git ls-tree HEAD t/", you can show all paths that are under t/,
and is different from the "raw path" i.e. "git ls-tree HEAD t").


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] doc: ls-tree paths do not support wildcards
  2020-05-28 21:51 ` Jeff King
@ 2020-05-28 22:21   ` Junio C Hamano
  2020-05-28 23:04     ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2020-05-28 22:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Steven Willis via GitGitGadget, git, Steven Willis

Jeff King <peff@peff.net> writes:

> So I think for now we ought to explain the situation a bit more clearly:
> leave this language as-is, but add a new section describing what
> patterns we do support.

Thanks; you said a lot better than I could ;-)

> In the long run it would be nice to actually match regular pathspecs.
> That would be a backwards-incompatibility, which I think is why nobody
> has pursued it further (and ls-tree is meant to be plumbing that should
> stay consistent, so we need to be extra careful). So we'd need a
> transition plan. Perhaps:
>
>   1. Deprecate the current behavior in the documentation and release
>      notes, encouraging people who want literal matching to use
>      --literal-pathspecs or the ":(literal)" magic. AFAICT we've
>      supported these since at least 2013 for this command, so it should
>      be safe to use unconditionally.
>
>   2. Add a new option, "--use-pathspecs" or similar, that switches the
>      matching code to use match_pathspec(). That lets people use the new
>      feature immediately if they want to.
>
>   3. When --use-pathspecs is not in use, warn to stderr about any
>      wildcard characters in the input. That reinforces the deprecation
>      notice in (1) and is likely to get more people's attention.

Hmph, if we are serious about deprecation and migration, I would
image that in stage #1, we should do this check already.  When
"--literal-pathspecs" is NOT in use, if a pathspec would change its
meaning if not taken literally (e.g. has glob letters, begins with
:-magic introducer, etc.), we warn and do so from the very beginning
of the migration process.

>   4. After several releases, flip the default to --use-pathspecs,
>      leaving --no-use-pathspecs as an escape hatch for people who still
>      haven't switched their scripts.

Wouldn't --literal-pathspecs be the accepted escape hatch that will
always be accepted, even after --use-pathspecs becomes a no-op?

>   5. After several more releases, eventually remove the old-style
>      matching (perhaps leaving --use-pathspecs as a noop).

> To be honest, that may be more careful than we absolutely need to be.

Yeah, there seem to be some room for optimization, but I think the
key steps are about right if we wanted to do this migration.

Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] doc: ls-tree paths do not support wildcards
  2020-05-28 22:21   ` Junio C Hamano
@ 2020-05-28 23:04     ` Jeff King
  2020-05-28 23:17       ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2020-05-28 23:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Steven Willis via GitGitGadget, git, Steven Willis

On Thu, May 28, 2020 at 03:21:34PM -0700, Junio C Hamano wrote:

> >   1. Deprecate the current behavior in the documentation and release
> >      notes, encouraging people who want literal matching to use
> >      --literal-pathspecs or the ":(literal)" magic. AFAICT we've
> >      supported these since at least 2013 for this command, so it should
> >      be safe to use unconditionally.
> >
> >   2. Add a new option, "--use-pathspecs" or similar, that switches the
> >      matching code to use match_pathspec(). That lets people use the new
> >      feature immediately if they want to.
> >
> >   3. When --use-pathspecs is not in use, warn to stderr about any
> >      wildcard characters in the input. That reinforces the deprecation
> >      notice in (1) and is likely to get more people's attention.
> 
> Hmph, if we are serious about deprecation and migration, I would
> image that in stage #1, we should do this check already.  When
> "--literal-pathspecs" is NOT in use, if a pathspec would change its
> meaning if not taken literally (e.g. has glob letters, begins with
> :-magic introducer, etc.), we warn and do so from the very beginning
> of the migration process.

Yeah, sorry, I meant these three steps to all happen at once.

Technically we don't need step (2) in there for the deprecation, but I
think it lets people adjust to the new world order as their solution to
avoid the warning (though I guess literal-pathspecs would also prevent
the warning; we wouldn't be looking for "*" in the input so much as
checking whether the parsed pathspec contains a wildcard).

> >   4. After several releases, flip the default to --use-pathspecs,
> >      leaving --no-use-pathspecs as an escape hatch for people who still
> >      haven't switched their scripts.
> 
> Wouldn't --literal-pathspecs be the accepted escape hatch that will
> always be accepted, even after --use-pathspecs becomes a no-op?

Hmm, I guess so. That wouldn't restore the behavior as it is _now_, but
the current behavior is sort of pointless. It treats pathspec magic as
special, but not globs. I doubt anybody actually wants that; they'd
either want pathspecs or literal paths.

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] doc: ls-tree paths do not support wildcards
  2020-05-28 22:04 ` Junio C Hamano
@ 2020-05-28 23:16   ` Jeff King
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff King @ 2020-05-28 23:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Steven Willis via GitGitGadget, git, Steven Willis

On Thu, May 28, 2020 at 03:04:33PM -0700, Junio C Hamano wrote:

> The documentation does not say paths can be wildcards.  It allows a
> list of "patterns to match" and never says they are wildcards.
> 
> I think they take the traditional "leading paths" (i.e. by saying
> "git ls-tree HEAD t/", you can show all paths that are under t/,
> and is different from the "raw path" i.e. "git ls-tree HEAD t").

Hmm, that might be weird if we switch to the pathspec matching code,
because we're sort-of recursive and sort-of not. E.g.:

  $ mkdir -p one/two/three
  $ echo content >one/two/three/file
  $ git init && git add . && git commit -m foo

  $ git ls-tree HEAD
  040000 tree 8f303030860565e9c51101f993e1db4feb2792a7	one

  $ git ls-tree -r HEAD
  100644 blob d95f3ad14dee633a758d2e331151e950dd13e4ed	one/two/three/file

  git ls-tree HEAD one
  040000 tree 8f303030860565e9c51101f993e1db4feb2792a7	one

  $ git ls-tree HEAD one/
  040000 tree c89b49675f4ccf1d03f7c95b5f072566ce0d7f83	one/two

So the pathspec match triggers one level of recursion, which is unlike a
pathspec used with non-recursive diff:

  $ git diff-tree --root --abbrev HEAD
  6f4ed388a766b500bcf3bbd9877cc7213c76a959
  :000000 040000 0000000 8f30303 A	one
  
  $ git diff-tree --root --abbrev -r HEAD
  6f4ed388a766b500bcf3bbd9877cc7213c76a959
  :000000 100644 0000000 d95f3ad A	one/two/three/file
  
  $ git diff-tree --root --abbrev HEAD one
  6f4ed388a766b500bcf3bbd9877cc7213c76a959
  :000000 040000 0000000 8f30303 A	one
  
  $ git diff-tree --root --abbrev HEAD one/
  6f4ed388a766b500bcf3bbd9877cc7213c76a959
  :000000 040000 0000000 8f30303 A	one

I think we should be able to reuse the pathspec matching code, but we
may not be able to reuse other tree-walking bits. Or maybe we can. It
looks like we already use read_tree_recursive(), so I guess our custom
show_tree() callback is able to handle that. But definitely something to
watch out for during any transition.

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] doc: ls-tree paths do not support wildcards
  2020-05-28 23:04     ` Jeff King
@ 2020-05-28 23:17       ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2020-05-28 23:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Steven Willis via GitGitGadget, git, Steven Willis

Jeff King <peff@peff.net> writes:

> On Thu, May 28, 2020 at 03:21:34PM -0700, Junio C Hamano wrote:
>
>> >   1. Deprecate the current behavior in the documentation and release
>> >      notes, encouraging people who want literal matching to use
>> >      --literal-pathspecs or the ":(literal)" magic. AFAICT we've
>> >      supported these since at least 2013 for this command, so it should
>> >      be safe to use unconditionally.
>> >
>> >   2. Add a new option, "--use-pathspecs" or similar, that switches the
>> >      matching code to use match_pathspec(). That lets people use the new
>> >      feature immediately if they want to.
>> >
>> >   3. When --use-pathspecs is not in use, warn to stderr about any
>> >      wildcard characters in the input. That reinforces the deprecation
>> >      notice in (1) and is likely to get more people's attention.
>> 
>> Hmph, if we are serious about deprecation and migration, I would
>> image that in stage #1, we should do this check already.  When
>> "--literal-pathspecs" is NOT in use, if a pathspec would change its
>> meaning if not taken literally (e.g. has glob letters, begins with
>> :-magic introducer, etc.), we warn and do so from the very beginning
>> of the migration process.
>
> Yeah, sorry, I meant these three steps to all happen at once.
>
> Technically we don't need step (2) in there for the deprecation, but I
> think it lets people adjust to the new world order as their solution to
> avoid the warning (though I guess literal-pathspecs would also prevent
> the warning; we wouldn't be looking for "*" in the input so much as
> checking whether the parsed pathspec contains a wildcard).

I do agree that letting early adopters experiemtn is a great and
necessary step.  And I think I misread the condition you wrote in
your 3.; I do agree that we should warn when --use-pathspecs is not
in use, but if --literal-pathspecs is in use, then the user wants
to match literally even when the pattern has globs and :-magic
introducer sequence, so we shouldn't warn.

Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-05-28 23:17 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-28 21:23 [PATCH] doc: ls-tree paths do not support wildcards Steven Willis via GitGitGadget
2020-05-28 21:51 ` Jeff King
2020-05-28 22:21   ` Junio C Hamano
2020-05-28 23:04     ` Jeff King
2020-05-28 23:17       ` Junio C Hamano
2020-05-28 22:04 ` Junio C Hamano
2020-05-28 23:16   ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).