git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC/PATCH] recognize pathspec magic without "--" disambiguation
@ 2017-05-25 15:27 Jeff King
  2017-05-26  2:13 ` Junio C Hamano
  2017-05-27  9:54 ` [RFC/PATCH] recognize pathspec magic without "--" disambiguation Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff King @ 2017-05-25 15:27 UTC (permalink / raw)
  To: git

For commands that take revisions and pathspecs, magic
pathspecs like ":^Documentation" or ":/Documentation" have
to appear on the right-hand side of the disambiguating "--",
like:

  git log -- :^Documentation

This makes them more annoying to use than they need to be.
We loosened the rules for wildcards in 28fcc0b71 (pathspec:
avoid the need of "--" when wildcard is used, 2015-05-02).
Let's do the same for arguments that look like pathspec
magic (colon followed by any punctuation).

The obvious and desired impact is that you can now do:

  git log :^Documentation

But let's consider related invocations and whether we're
making them better or worse:

   - git log :/foo
      (when "foo" matches a commit message)

      This one should remain the same. Like the existing
      wildcard rule, we're touching only verify_filename(),
      not verify_non_filename(). So cases that _do_ resolve
      as a rev will continue to do so.

   - git log :^foo
      (when "^foo" exists in your index)

      The same logic applies; it would continue to work. And
      ditto for any other weird filenames in your index like
      "(attr)foo".

   - git log :/foo
      (when "foo" does _not_ match a commit message)

      We won't treat this as a revision, because it doesn't
      match anything. So prior to this patch, we'd either
      treat it as a path (if "foo" does exist at the root of
      the project) or complain "this isn't a rev, and nor is
      it a path".  Whereas after this patch, we'd always
      treat it like a path, whether it exists or not (so in
      the second case instead of an error, you're likely to
      get an empty "log", unless "foo" really did exist
      somewhere in your history). So this is a potential
      downside; if the user intended a search of the commit
      messages, they may prefer the error message to the
      attempted pathspec match.

      This same downside actually exists currently when you
      have an asterisk in your regex. E.g.,

        git log :/foo.*bar

      will be treated as a pathspec (if it doesn't match a
      commit message) due to the wildcard matching in
      28fcc0b71.

   - git log :^foo
       (when "^foo" isn't in your index)

      The same outcomes apply here as above, but this
      "downside" is very unlikely to occur, as "^foo" is
      such a funny name (and again, things like "(attr)foo"
      as a filename are sufficiently uncommon not to worry
      about).

   - git log :%foo
      (or any other pathspec magic char that doesn't exist)

      The new check doesn't actually parse the pathspec
      magic, but allows any punctuation (which includes the
      long-form ":(magic)"). At first glance this seems
      overly permissive, but it actually yields a better
      error message here: instead of complaining that
      ":%foo" is neither a rev nor a path, we treat it as a
      pathspec and complain that "%" is not a known magic
      (the same as we would if the "--" were given).

      Of course, the flip side here is when you really
      meant a file named "%foo" in your index, but it didn't
      exist. That seems reasonably unlikely (and the error
      message is clear enough to point the user in the right
      direction).

So the collateral damage doesn't seem too bad (it's really
just the case where :/foo doesn't match anything). There are
two possibilities for reducing that:

  1. Instead of accepting all pathspec magic, just allow
     certain ones like ":^" which are unlikely to be used to
     specify a revision (or alternatively, just disallow
     potentially confusing ones like ":/").

     That works, but it makes the rules inconsistent and
     confusing for the user (i.e., "--" is sometimes needed
     for magic and sometimes not).

  2. Instead of recognizing pathspec magic, teach
     check_filename() to parse off the filename bit and see
     if that exists (e.g., for "^foo" see if "foo" exists).
     We already do this for ":/", but it's done in a very
     ad-hoc way. We parse it ourselves, rather than relying
     on the pathspec code, and handle only "/" magic, and
     not even its ":(top)" long form.

     That could be fixed by asking the pathspec code to
     parse it for us (including all magic), and then trying
     to match the resulting name against the working tree.
     But not all pathspec magic actually results in a
     pathname. For instance, when should ":(attr)foo" be
     valid? We can't just look for "foo" in the filesystem
     (it's an attribute, not a pathname).

     By comparison, recognizing things that look like
     pathspec magic is a simple and easy to understand rule.

Signed-off-by: Jeff King <peff@peff.net>
---
I wrote all the above to try to convince myself that this
doesn't have any serious regression cases. And I think I
did.

But I actually we could make the rules in alternative (2)
above work. check_filename() would ask the pathspec code to
parse each argument and get one of three results:

  1. it's not pathspec magic; treat it like a filename
     (e.g., just "foo", or even bogus stuff like ":%foo")

  2. it is pathspec magic, and here is the matching filename
     that ought to exist (e.g., "foo" from ":^foo" or
     ":(exclude)foo")

  3. it is pathspec magic, but there's no matching filename.
     Assume it's a pathspec (e.g., "(attr)foo").

I'm on the fence on whether it's worth the trouble versus
the simple rule implemented by this patch.

 setup.c                       | 19 ++++++++++++++++++-
 t/t4208-log-magic-pathspec.sh | 20 ++++++++++++++++++++
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index 0309c2782..1ca286e82 100644
--- a/setup.c
+++ b/setup.c
@@ -181,6 +181,23 @@ static void NORETURN die_verify_filename(const char *prefix,
 
 }
 
+/*
+ * Check for arguments that don't resolve as actual files,
+ * but which look sufficiently like pathspecs that we'll consider
+ * them such for the purposes of rev/pathspec DWIM parsing.
+ */
+static int looks_like_pathspec(const char *arg)
+{
+	if (!no_wildcard(arg))
+		return 1;
+
+	/* pathspec magic like ":^subdir" */
+	if (arg[0] == ':' && ispunct(arg[1]))
+		return 1;
+
+	return 0;
+}
+
 /*
  * Verify a filename that we got as an argument for a pathspec
  * entry. Note that a filename that begins with "-" never verifies
@@ -207,7 +224,7 @@ void verify_filename(const char *prefix,
 {
 	if (*arg == '-')
 		die("bad flag '%s' used after filename", arg);
-	if (check_filename(prefix, arg) || !no_wildcard(arg))
+	if (check_filename(prefix, arg) || looks_like_pathspec(arg))
 		return;
 	die_verify_filename(prefix, arg, diagnose_misspelt_rev);
 }
diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh
index 001343e2f..c87dda53c 100755
--- a/t/t4208-log-magic-pathspec.sh
+++ b/t/t4208-log-magic-pathspec.sh
@@ -61,4 +61,24 @@ test_expect_success 'command line pathspec parsing for "git log"' '
 	git log --merge -- a
 '
 
+# This differs from the ":/a" check above in that :/in only _looks_
+# like a pathspec, but doesn't match an actual file. So we prefer
+# it as a rev.
+test_expect_success '"git log :/in" should not be ambiguous' '
+	git log :/in
+'
+
+test_expect_success '"git log :^sub" should not be ambiguous' '
+	git log :^sub
+'
+
+test_expect_success '"git log :(exclude)sub" should not be ambiguous' '
+	git log ":(exclude)sub"
+'
+
+test_expect_success '"git log :%foo" should complain of bad magic' '
+	test_must_fail git log :%foo 2>error &&
+	test_i18ngrep pathspec.magic error
+'
+
 test_done
-- 
2.13.0.496.ge44ba89db

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] recognize pathspec magic without "--" disambiguation
  2017-05-25 15:27 [RFC/PATCH] recognize pathspec magic without "--" disambiguation Jeff King
@ 2017-05-26  2:13 ` Junio C Hamano
  2017-05-26 13:24   ` Jeff King
  2017-05-27  9:54 ` [RFC/PATCH] recognize pathspec magic without "--" disambiguation Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-05-26  2:13 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> But let's consider related invocations and whether we're
> making them better or worse:
>
>    - git log :/foo
>       (when "foo" matches a commit message)
>
>       This one should remain the same. Like the existing
>       wildcard rule, we're touching only verify_filename(),
>       not verify_non_filename(). So cases that _do_ resolve
>       as a rev will continue to do so.
>
>    - git log :^foo
>       (when "^foo" exists in your index)
>
>       The same logic applies; it would continue to work. And
>       ditto for any other weird filenames in your index like
>       "(attr)foo".

"git show :$path" where $path happens to be "^foo" would grab the
contents of the $path out of the index and I think that is what you
meant, but use of "git log" in the example made me scratch my head
greatly.

>    - git log :/foo
>       (when "foo" does _not_ match a commit message)
>	...
>       This same downside actually exists currently when you
>       have an asterisk in your regex. E.g.,
>
>         git log :/foo.*bar
>
>       will be treated as a pathspec (if it doesn't match a
>       commit message) due to the wildcard matching in
>       28fcc0b71.

In other words, we are not making things worse?

> I wrote all the above to try to convince myself that this
> doesn't have any serious regression cases. And I think I
> did.
>
> But I actually we could make the rules in alternative (2)
> above work. check_filename() would ask the pathspec code to
> parse each argument and get one of three results:
>
>   1. it's not pathspec magic; treat it like a filename
>      (e.g., just "foo", or even bogus stuff like ":%foo")
>
>   2. it is pathspec magic, and here is the matching filename
>      that ought to exist (e.g., "foo" from ":^foo" or
>      ":(exclude)foo")
>
>   3. it is pathspec magic, but there's no matching filename.
>      Assume it's a pathspec (e.g., "(attr)foo").
>
> I'm on the fence on whether it's worth the trouble versus
> the simple rule implemented by this patch.

Unlike "git log builtin-checkout.c" that errors out (only because
there is no such file in the checkout of the current version) and
makes its solution obvious to the users, this change has the risk of
silently accepting an ambiguous input and computing result that is
different from what the user intended to.  So I am not sure.  

As you pointedout, ":/" especially does look like a likely point of
failure, in that both "this is path at the top" pathspec magic and
"the commit with this string" are not something that we can say with
confidence that are rarely used because they are so esoteric.

As to "is it OK to build a rule that we cannot explain easily?", I
think it is OK to say "if it is not a rev, and if it is not a
pathname in the current working tree, you must disambiguate, but Git
helps by guessing in some cases---if you want to have more control
(e.g. you are a script), explicitly disambiguate and you'd be OK",
and leave the "some cases" vague, as long as we are only making
reasonably conservative guesses.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] recognize pathspec magic without "--" disambiguation
  2017-05-26  2:13 ` Junio C Hamano
@ 2017-05-26 13:24   ` Jeff King
  2017-05-26 19:06     ` [PATCH 0/6] make "^:foo" work without disambiguating "--" Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2017-05-26 13:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, May 26, 2017 at 11:13:55AM +0900, Junio C Hamano wrote:

> >    - git log :^foo
> >       (when "^foo" exists in your index)
> >
> >       The same logic applies; it would continue to work. And
> >       ditto for any other weird filenames in your index like
> >       "(attr)foo".
> 
> "git show :$path" where $path happens to be "^foo" would grab the
> contents of the $path out of the index and I think that is what you
> meant, but use of "git log" in the example made me scratch my head
> greatly.

Yeah, sorry about that confusion. My original motivation for this was
for use with git-grep, but since the existing tests all used "git log",
I tried to switch to that for consistency. But index paths obviously
make no sense with "git log" (we _do_ still parse it the same as we
would for git-show, etc, even though git-log doesn't do anything with
it).

As an aside, I wonder if git-log should issue a warning when there are
pending objects that it doesn't handle.

> >    - git log :/foo
> >       (when "foo" does _not_ match a commit message)
> >	...
> >       This same downside actually exists currently when you
> >       have an asterisk in your regex. E.g.,
> >
> >         git log :/foo.*bar
> >
> >       will be treated as a pathspec (if it doesn't match a
> >       commit message) due to the wildcard matching in
> >       28fcc0b71.
> 
> In other words, we are not making things worse?

We're making them slightly worse. The "problem" used to trigger with
":/foo.*bar", and now it would trigger with ":/foobar", too. I don't
know if that's a meaningful distinction, though.

> Unlike "git log builtin-checkout.c" that errors out (only because
> there is no such file in the checkout of the current version) and
> makes its solution obvious to the users, this change has the risk of
> silently accepting an ambiguous input and computing result that is
> different from what the user intended to.  So I am not sure.  

Right, that's what I was trying to catalogue in the commit message.

> As you pointedout, ":/" especially does look like a likely point of
> failure, in that both "this is path at the top" pathspec magic and
> "the commit with this string" are not something that we can say with
> confidence that are rarely used because they are so esoteric.
> 
> As to "is it OK to build a rule that we cannot explain easily?", I
> think it is OK to say "if it is not a rev, and if it is not a
> pathname in the current working tree, you must disambiguate, but Git
> helps by guessing in some cases---if you want to have more control
> (e.g. you are a script), explicitly disambiguate and you'd be OK",
> and leave the "some cases" vague, as long as we are only making
> reasonably conservative guesses.

I don't know. That is punting on the issue by not making any promises.
But that's a small consolation to somebody who does:

  $ git log :^foo
  [ok, works great...]
  $ git log :/foo
  fatal: ambiguous argument ':/foo': unknown revision or path not in the working tree.
  [stupid git! why don't you work?]

An explanation like "it's complicated, and the rules are meant to be
conservative and avoid confusion" does not really help Git's reputation
for being arcane.

I dunno. The one saving grace of ":/" is that we actually do handle
the ":/foo" case completely in check_filename(). So going back to my
:/foobar example: if you have a file "foobar" in your root directory, it
_is_ considered ambiguous.

So if that was the one exception, it would probably be OK in practice.

Which again makes me feel a bit like the right solution is doing the
existence checks on the non-magic portion of the pathspec for more magic
besides ":/".

Unfortunately, since writing my last message I did look into asking the
pathspec code to parse the arguments for us, but I think it would take
quite a bit of refactoring. It's very ready to die() or emit warnings on
bogus pathspecs, so it's not a good match for this kind of speculative
"is it a pathspec" test.

The middle ground would be to replicate a simple subset of the rules in
verify_filename(). If we assume that all arguments with ":(" are
pathspecs (which seem rather unlikely to have false positives), then
that leaves only a few short-magic patterns: :/, :!, and :^.

We already specially handle :/ here. So it would really just be adding
the other two (which are just aliases of each other). It's not that much
code. The dirty thing is just that we're replicating some of the
pathspec logic, and any new magic would have to be updated here, too.

I'll see if I can work up a patch.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 0/6] make "^:foo" work without disambiguating "--"
  2017-05-26 13:24   ` Jeff King
@ 2017-05-26 19:06     ` Jeff King
  2017-05-26 19:06       ` [PATCH 1/6] t4208: add check for ":/" without matching file Jeff King
                         ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Jeff King @ 2017-05-26 19:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, May 26, 2017 at 09:24:32AM -0400, Jeff King wrote:

> The middle ground would be to replicate a simple subset of the rules in
> verify_filename(). If we assume that all arguments with ":(" are
> pathspecs (which seem rather unlikely to have false positives), then
> that leaves only a few short-magic patterns: :/, :!, and :^.
> 
> We already specially handle :/ here. So it would really just be adding
> the other two (which are just aliases of each other). It's not that much
> code. The dirty thing is just that we're replicating some of the
> pathspec logic, and any new magic would have to be updated here, too.
> 
> I'll see if I can work up a patch.

So here's what I came up with. TBH, I could live without patch 5. What I
care most about is making the ":^" work. But I don't really see a
downside to it.

  [1/6]: t4208: add check for ":/" without matching file
  [2/6]: check_filename(): refactor ":/" handling
  [3/6]: check_filename(): use skip_prefix
  [4/6]: check_filename(): handle ":^" path magic
  [5/6]: verify_filename(): treat ":(magic)" as a pathspec
  [6/6]: verify_filename(): flip order of checks

 setup.c                       | 42 ++++++++++++++++++++++++++++++++----------
 t/t4208-log-magic-pathspec.sh | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+), 10 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/6] t4208: add check for ":/" without matching file
  2017-05-26 19:06     ` [PATCH 0/6] make "^:foo" work without disambiguating "--" Jeff King
@ 2017-05-26 19:06       ` Jeff King
  2017-05-26 19:07       ` [PATCH 2/6] check_filename(): refactor ":/" handling Jeff King
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-05-26 19:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

The DWIM magic in check_filename() doesn't just recognize
":/". It actually makes sure that the file it points to
exists. t4208 checks only the case where the path is
present, not the opposite. Since the next patches will be
touching this area, let's add a test to make sure it
continues working.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t4208-log-magic-pathspec.sh | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh
index 001343e2f..70bc64100 100755
--- a/t/t4208-log-magic-pathspec.sh
+++ b/t/t4208-log-magic-pathspec.sh
@@ -29,6 +29,12 @@ test_expect_success '"git log -- :/a" should not be ambiguous' '
 	git log -- :/a
 '
 
+# This differs from the ":/a" check above in that :/in looks like a pathspec,
+# but doesn't match an actual file.
+test_expect_success '"git log :/in" should not be ambiguous' '
+	git log :/in
+'
+
 test_expect_success '"git log :" should be ambiguous' '
 	test_must_fail git log : 2>error &&
 	test_i18ngrep ambiguous error
-- 
2.13.0.496.ge44ba89db


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/6] check_filename(): refactor ":/" handling
  2017-05-26 19:06     ` [PATCH 0/6] make "^:foo" work without disambiguating "--" Jeff King
  2017-05-26 19:06       ` [PATCH 1/6] t4208: add check for ":/" without matching file Jeff King
@ 2017-05-26 19:07       ` Jeff King
  2017-05-26 19:07       ` [PATCH 3/6] check_filename(): use skip_prefix Jeff King
                         ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-05-26 19:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

We handle arguments with the ":/" pathspec magic specially,
making sure the name exists at the top-level.  We'll want to
handle more pathspec magic in future patches, so let's do a
little rearranging to make that easier.

Instead of relying on an if/else cascade to avoid the
prefix_filename() call, we'll just set prefix to NULL.
Likewise, we'll get rid of the "name" variable entirely, and
just push the "arg" pointer forward to skip past the magic.
That means by the time we get to the prefix-handling, we're
set up appropriately whether we saw ":/" or not.

Note that this does impact the final error message we
produce when stat() fails, as it shows "arg" (which we'll
have modified to skip magic and include the prefix). This is
a good thing; the original message would say something like
"failed to stat ':/foo'", which is confusing (we tried to
stat "foo").

Signed-off-by: Jeff King <peff@peff.net>
---
 setup.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/setup.c b/setup.c
index 0309c2782..000ffa810 100644
--- a/setup.c
+++ b/setup.c
@@ -134,19 +134,20 @@ int path_inside_repo(const char *prefix, const char *path)
 
 int check_filename(const char *prefix, const char *arg)
 {
-	const char *name;
 	char *to_free = NULL;
 	struct stat st;
 
 	if (starts_with(arg, ":/")) {
 		if (arg[2] == '\0') /* ":/" is root dir, always exists */
 			return 1;
-		name = arg + 2;
-	} else if (prefix)
-		name = to_free = prefix_filename(prefix, arg);
-	else
-		name = arg;
-	if (!lstat(name, &st)) {
+		arg += 2;
+		prefix = NULL;
+	}
+
+	if (prefix)
+		arg = to_free = prefix_filename(prefix, arg);
+
+	if (!lstat(arg, &st)) {
 		free(to_free);
 		return 1; /* file exists */
 	}
-- 
2.13.0.496.ge44ba89db


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/6] check_filename(): use skip_prefix
  2017-05-26 19:06     ` [PATCH 0/6] make "^:foo" work without disambiguating "--" Jeff King
  2017-05-26 19:06       ` [PATCH 1/6] t4208: add check for ":/" without matching file Jeff King
  2017-05-26 19:07       ` [PATCH 2/6] check_filename(): refactor ":/" handling Jeff King
@ 2017-05-26 19:07       ` Jeff King
  2017-05-26 19:08       ` [PATCH 4/6] check_filename(): handle ":^" path magic Jeff King
                         ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-05-26 19:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

This avoids some magic numbers (and we'll be adding more
similar calls in a minute).

Signed-off-by: Jeff King <peff@peff.net>
---
 setup.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/setup.c b/setup.c
index 000ffa810..f2a8070bd 100644
--- a/setup.c
+++ b/setup.c
@@ -137,10 +137,9 @@ int check_filename(const char *prefix, const char *arg)
 	char *to_free = NULL;
 	struct stat st;
 
-	if (starts_with(arg, ":/")) {
-		if (arg[2] == '\0') /* ":/" is root dir, always exists */
+	if (skip_prefix(arg, ":/", &arg)) {
+		if (!*arg) /* ":/" is root dir, always exists */
 			return 1;
-		arg += 2;
 		prefix = NULL;
 	}
 
-- 
2.13.0.496.ge44ba89db


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/6] check_filename(): handle ":^" path magic
  2017-05-26 19:06     ` [PATCH 0/6] make "^:foo" work without disambiguating "--" Jeff King
                         ` (2 preceding siblings ...)
  2017-05-26 19:07       ` [PATCH 3/6] check_filename(): use skip_prefix Jeff King
@ 2017-05-26 19:08       ` Jeff King
  2017-05-26 19:10       ` [PATCH 5/6] verify_filename(): treat ":(magic)" as a pathspec Jeff King
  2017-05-26 19:10       ` [PATCH 6/6] verify_filename(): flip order of checks Jeff King
  5 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-05-26 19:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

We special-case "git log :/foo" to work when "foo" exists in
the working tree. But :^ (and its alias :!) do not get the
same treatment, requiring the user to supply a
disambiguating "--". Let's make them work without requiring
the user to type the "--".

Signed-off-by: Jeff King <peff@peff.net>
---
 setup.c                       |  4 ++++
 t/t4208-log-magic-pathspec.sh | 13 +++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/setup.c b/setup.c
index f2a8070bd..86bb7c9a9 100644
--- a/setup.c
+++ b/setup.c
@@ -141,6 +141,10 @@ int check_filename(const char *prefix, const char *arg)
 		if (!*arg) /* ":/" is root dir, always exists */
 			return 1;
 		prefix = NULL;
+	} else if (skip_prefix(arg, ":!", &arg) ||
+		   skip_prefix(arg, ":^", &arg)) {
+		if (!*arg) /* excluding everything is silly, but allowed */
+			return 1;
 	}
 
 	if (prefix)
diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh
index 70bc64100..67250ebdc 100755
--- a/t/t4208-log-magic-pathspec.sh
+++ b/t/t4208-log-magic-pathspec.sh
@@ -52,6 +52,19 @@ test_expect_success 'git log HEAD -- :/' '
 	test_cmp expected actual
 '
 
+test_expect_success '"git log :^sub" is not ambiguous' '
+	git log :^sub
+'
+
+test_expect_success '"git log :^does-not-exist" does not match anything' '
+	test_must_fail git log :^does-not-exist
+'
+
+test_expect_success  '"git log :!" behaves the same as :^' '
+	git log :!sub &&
+	test_must_fail git log :!does-not-exist
+'
+
 test_expect_success 'command line pathspec parsing for "git log"' '
 	git reset --hard &&
 	>a &&
-- 
2.13.0.496.ge44ba89db


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/6] verify_filename(): treat ":(magic)" as a pathspec
  2017-05-26 19:06     ` [PATCH 0/6] make "^:foo" work without disambiguating "--" Jeff King
                         ` (3 preceding siblings ...)
  2017-05-26 19:08       ` [PATCH 4/6] check_filename(): handle ":^" path magic Jeff King
@ 2017-05-26 19:10       ` Jeff King
  2017-05-26 19:10       ` [PATCH 6/6] verify_filename(): flip order of checks Jeff King
  5 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-05-26 19:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

For commands that take revisions and pathspecs, magic
pathspecs like ":(exclude)foo" require the user to specify
a disambiguating "--", since they do not match a file in the
filesystem, like:

  git grep foo -- :(exclude)bar

This makes them more annoying to use than they need to be.
We loosened the rules for wildcards in 28fcc0b71 (pathspec:
avoid the need of "--" when wildcard is used, 2015-05-02).
Let's do the same for pathspecs with long-form magic.

We already handle the short-forms ":/" and ":^" specially in
check_filename(), so we don't need to handle them here. And
in fact, we could do the same with long-form magic, parsing
out the actual filename and making sure it exists. But there
are a few reasons not to do it that way:

  - the parsing gets much more complicated, and we'd want to
    hand it off to the pathspec code. But that code isn't
    ready to do this kind of speculative parsing (it's happy
    to die() when it sees a syntactically invalid pathspec).

  - not all pathspec magic maps to a filesystem path. E.g.,
    :(attr) should be treated as a pathspec regardless of
    what is in the filesystem

  - we can be a bit looser with ":(" than with the
    short-form ":/", because it is much less likely to have
    a false positive. Whereas ":/" also means "search for a
    commit with this regex".

Note that because the change is in verify_filename() and not
in its helper check_filename(), this doesn't affect the
verify_non_filename() case. I.e., if an item that matches
our new rule doesn't resolve as an object, we may fallback
to treating it as a pathspec (rather than complaining it
doesn't exist). But if it does resolve (e.g., as a file in
the index that starts with an open-paren), we won't then
complain that it's also a valid pathspec. This matches the
wildcard-exception behavior.

And of course in either case, one can always insert the "--"
to get more precise results.

Signed-off-by: Jeff King <peff@peff.net>
---
 setup.c                       | 20 +++++++++++++++++++-
 t/t4208-log-magic-pathspec.sh | 13 +++++++++++++
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index 86bb7c9a9..89fcc12ab 100644
--- a/setup.c
+++ b/setup.c
@@ -185,6 +185,24 @@ static void NORETURN die_verify_filename(const char *prefix,
 
 }
 
+/*
+ * Check for arguments that don't resolve as actual files,
+ * but which look sufficiently like pathspecs that we'll consider
+ * them such for the purposes of rev/pathspec DWIM parsing.
+ */
+static int looks_like_pathspec(const char *arg)
+{
+	/* anything with a wildcard character */
+	if (!no_wildcard(arg))
+		return 1;
+
+	/* long-form pathspec magic */
+	if (starts_with(arg, ":("))
+		return 1;
+
+	return 0;
+}
+
 /*
  * Verify a filename that we got as an argument for a pathspec
  * entry. Note that a filename that begins with "-" never verifies
@@ -211,7 +229,7 @@ void verify_filename(const char *prefix,
 {
 	if (*arg == '-')
 		die("bad flag '%s' used after filename", arg);
-	if (check_filename(prefix, arg) || !no_wildcard(arg))
+	if (check_filename(prefix, arg) || looks_like_pathspec(arg))
 		return;
 	die_verify_filename(prefix, arg, diagnose_misspelt_rev);
 }
diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh
index 67250ebdc..25129dfa0 100755
--- a/t/t4208-log-magic-pathspec.sh
+++ b/t/t4208-log-magic-pathspec.sh
@@ -65,6 +65,19 @@ test_expect_success  '"git log :!sub" behaves the same as :^' '
 	test_must_fail git log :!does-not-exist
 '
 
+test_expect_success '"git log :(exclude)sub" is not ambiguous' '
+	git log ":(exclude)sub"
+'
+
+test_expect_success '"git log :(exclude)sub --" must resolve as an object' '
+	test_must_fail git log ":(exclude)sub" --
+'
+
+test_expect_success '"git log :(unknown-magic) complains of bogus magic' '
+	test_must_fail git log ":(unknown-magic)" 2>error &&
+	test_i18ngrep pathspec.magic error
+'
+
 test_expect_success 'command line pathspec parsing for "git log"' '
 	git reset --hard &&
 	>a &&
-- 
2.13.0.496.ge44ba89db


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/6] verify_filename(): flip order of checks
  2017-05-26 19:06     ` [PATCH 0/6] make "^:foo" work without disambiguating "--" Jeff King
                         ` (4 preceding siblings ...)
  2017-05-26 19:10       ` [PATCH 5/6] verify_filename(): treat ":(magic)" as a pathspec Jeff King
@ 2017-05-26 19:10       ` Jeff King
  5 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-05-26 19:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

The looks_like_pathspec() check is much cheaper than
check_filename(), which actually stats the file. Since
either is sufficient for our return value, we should do the
cheaper one first, potentially short-circuiting the other.

Signed-off-by: Jeff King <peff@peff.net>
---
Probably doesn't matter, but it bugged me and it was subtle enough that
I pulled it out into its own commit.

 setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index 89fcc12ab..1de87ed84 100644
--- a/setup.c
+++ b/setup.c
@@ -229,7 +229,7 @@ void verify_filename(const char *prefix,
 {
 	if (*arg == '-')
 		die("bad flag '%s' used after filename", arg);
-	if (check_filename(prefix, arg) || looks_like_pathspec(arg))
+	if (looks_like_pathspec(arg) || check_filename(prefix, arg))
 		return;
 	die_verify_filename(prefix, arg, diagnose_misspelt_rev);
 }
-- 
2.13.0.496.ge44ba89db

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] recognize pathspec magic without "--" disambiguation
  2017-05-25 15:27 [RFC/PATCH] recognize pathspec magic without "--" disambiguation Jeff King
  2017-05-26  2:13 ` Junio C Hamano
@ 2017-05-27  9:54 ` Ævar Arnfjörð Bjarmason
  2017-05-27 20:39   ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-27  9:54 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Thu, May 25, 2017 at 5:27 PM, Jeff King <peff@peff.net> wrote:
>         git log :/foo.*bar

Another option would be to deprecate the :/rx syntax over some period
in favor of ^{/rx}.

I think it's too ugly to live, and really useless. It's equivalent to
"--grep=<rx> --all". Does anyone use this and not really mean to use
^{/rx}? E.g. "git show :/fix" might show a fix on some unrelated
branch you recently rebased.

>       will be treated as a pathspec (if it doesn't match a
>       commit message) due to the wildcard matching in
>       28fcc0b71.

So it might DWYM after hanging there looking at your entire history
for a commit message matching foo.*bar? And if you make such a commit
it'll start meaning something else entirely?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] recognize pathspec magic without "--" disambiguation
  2017-05-27  9:54 ` [RFC/PATCH] recognize pathspec magic without "--" disambiguation Ævar Arnfjörð Bjarmason
@ 2017-05-27 20:39   ` Jeff King
  2017-05-28  0:04     ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2017-05-27 20:39 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

On Sat, May 27, 2017 at 11:54:07AM +0200, Ævar Arnfjörð Bjarmason wrote:

> On Thu, May 25, 2017 at 5:27 PM, Jeff King <peff@peff.net> wrote:
> >         git log :/foo.*bar
> 
> Another option would be to deprecate the :/rx syntax over some period
> in favor of ^{/rx}.

Yeah, the latter is more flexible (can start at a tip of your choosing)
and syntactically matches other object selectors much better.

> I think it's too ugly to live, and really useless. It's equivalent to
> "--grep=<rx> --all". Does anyone use this and not really mean to use
> ^{/rx}? E.g. "git show :/fix" might show a fix on some unrelated
> branch you recently rebased.

It's actually _worse_ than that --grep because it only picks one result.
So not only do we look at unrelated branches, but they may take
precedence (based on commit timestamp) over the current branch.

It is shorter than "HEAD^{/re}", though. So I suspect people do use it
and would be slightly annoyed if it went away.

> >       will be treated as a pathspec (if it doesn't match a
> >       commit message) due to the wildcard matching in
> >       28fcc0b71.
> 
> So it might DWYM after hanging there looking at your entire history
> for a commit message matching foo.*bar? And if you make such a commit
> it'll start meaning something else entirely?

Yeah. That's a good reason not to use ":/" without a disambiguating "--"
(which _does_ work, even without my series, because check_filename()
does specific path-matching). At best, you pay for a complete useless
history traversal before the command actually starts running. But much
more likely is that Git just complains of ambiguity, because people tend
to mention top-level paths in their commit messages. E.g.:

  $ cd t
  $ git grep foo :/Documentation
  fatal: ambiguous argument ':/Documentation': both revision and filename

So it really is a pretty horrible syntax.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] recognize pathspec magic without "--" disambiguation
  2017-05-27 20:39   ` Jeff King
@ 2017-05-28  0:04     ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2017-05-28  0:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

Jeff King <peff@peff.net> writes:

> Yeah. That's a good reason not to use ":/" without a disambiguating "--"
> (which _does_ work, even without my series, because check_filename()
> does specific path-matching). At best, you pay for a complete useless
> history traversal before the command actually starts running. But much
> more likely is that Git just complains of ambiguity, because people tend
> to mention top-level paths in their commit messages. E.g.:
>
>   $ cd t
>   $ git grep foo :/Documentation
>   fatal: ambiguous argument ':/Documentation': both revision and filename
>
> So it really is a pretty horrible syntax.

It does not allow it to be further annotated like "HEAD^{/^Merge
branch 'foo'}^2".  Yes, I agree that ":/string" was a pretty poor
design that was done without thinking things through.

But back then in 2007, it is a bit unfair to blame the initial
design for not thinking things through.  There weren't as many
precedents, good or bad, to learn from than we have today ;-)


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-05-28  0:04 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-25 15:27 [RFC/PATCH] recognize pathspec magic without "--" disambiguation Jeff King
2017-05-26  2:13 ` Junio C Hamano
2017-05-26 13:24   ` Jeff King
2017-05-26 19:06     ` [PATCH 0/6] make "^:foo" work without disambiguating "--" Jeff King
2017-05-26 19:06       ` [PATCH 1/6] t4208: add check for ":/" without matching file Jeff King
2017-05-26 19:07       ` [PATCH 2/6] check_filename(): refactor ":/" handling Jeff King
2017-05-26 19:07       ` [PATCH 3/6] check_filename(): use skip_prefix Jeff King
2017-05-26 19:08       ` [PATCH 4/6] check_filename(): handle ":^" path magic Jeff King
2017-05-26 19:10       ` [PATCH 5/6] verify_filename(): treat ":(magic)" as a pathspec Jeff King
2017-05-26 19:10       ` [PATCH 6/6] verify_filename(): flip order of checks Jeff King
2017-05-27  9:54 ` [RFC/PATCH] recognize pathspec magic without "--" disambiguation Ævar Arnfjörð Bjarmason
2017-05-27 20:39   ` Jeff King
2017-05-28  0:04     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).