git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2 0/2] grep: make output consistent with revision syntax
@ 2017-01-20 17:11 Stefan Hajnoczi
  2017-01-20 17:11 ` [PATCH v2 1/2] grep: only add delimiter if there isn't one already Stefan Hajnoczi
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-01-20 17:11 UTC (permalink / raw)
  To: git; +Cc: Jeff King, gitster, Brandon Williams, Stefan Hajnoczi

v2:
 * Use obj->type instead of re-parsing name for delimiter
   (Followed Brandon's suggestion but named the variable 'delim' since that
   name is used in other places and 'del' is used for deletion.)
 * Add tests
 * Update Patch 1 commit description with a more relevant example
 * PATCH instead of RFC, now works with all documented git-rev-parse(1) syntax

git-grep(1)'s output is not consistent with git-rev-parse(1) revision syntax.

This means you cannot take "rev:path/to/file.c: foo();" output from git-grep(1)
and expect "git show rev:path/to/file.c" to work.  See the individual patches
for examples of command-lines that produce invalid output.

Stefan Hajnoczi (2):
  grep: only add delimiter if there isn't one already
  grep: use '/' delimiter for paths

 builtin/grep.c  |  8 +++++++-
 t/t7810-grep.sh | 26 ++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 1 deletion(-)

-- 
2.9.3


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/2] grep: only add delimiter if there isn't one already
  2017-01-20 17:11 [PATCH v2 0/2] grep: make output consistent with revision syntax Stefan Hajnoczi
@ 2017-01-20 17:11 ` Stefan Hajnoczi
  2017-01-20 22:19   ` Junio C Hamano
  2017-01-20 17:11 ` [PATCH v2 2/2] grep: use '/' delimiter for paths Stefan Hajnoczi
  2017-01-20 23:11 ` [PATCH v2 0/2] grep: make output consistent with revision syntax Junio C Hamano
  2 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-01-20 17:11 UTC (permalink / raw)
  To: git; +Cc: Jeff King, gitster, Brandon Williams, Stefan Hajnoczi

git-grep(1) output does not follow git's own syntax:

  $ git grep malloc v2.9.3:t/
  v2.9.3:t/:test-lib.sh:  setup_malloc_check () {
  $ git show v2.9.3:t/:test-lib.sh
  fatal: Path 't/:test-lib.sh' does not exist in 'v2.9.3'

This patch avoids emitting the unnecessary ':' delimiter if the name
already ends with ':' or '/':

  $ git grep malloc v2.9.3:
  v2.9.3:t/test-lib.sh:   setup_malloc_check () {
  $ git show v2.9.3:t/test-lib.sh
  (succeeds)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 builtin/grep.c  |  6 +++++-
 t/t7810-grep.sh | 21 +++++++++++++++++++++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 8887b6a..90a4f3d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -491,7 +491,11 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		strbuf_init(&base, PATH_MAX + len + 1);
 		if (len) {
 			strbuf_add(&base, name, len);
-			strbuf_addch(&base, ':');
+
+			/* Add a delimiter if there isn't one already */
+			if (name[len - 1] != '/' && name[len - 1] != ':') {
+				strbuf_addch(&base, ':');
+			}
 		}
 		init_tree_desc(&tree, data, size);
 		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index de2405c..e804a3f 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1435,4 +1435,25 @@ test_expect_success 'grep does not report i-t-a and assume unchanged with -L' '
 	test_cmp expected actual
 '
 
+cat >expected <<EOF
+HEAD:t/a/v:vvv
+HEAD:t/v:vvv
+EOF
+
+test_expect_success 'grep outputs valid <rev>:<path> for HEAD:t/' '
+	git grep vvv HEAD:t/ >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<EOF
+HEAD:t/a/v:vvv
+HEAD:t/v:vvv
+HEAD:v:vvv
+EOF
+
+test_expect_success 'grep outputs valid <rev>:<path> for HEAD:' '
+	git grep vvv HEAD: >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-01-20 17:11 [PATCH v2 0/2] grep: make output consistent with revision syntax Stefan Hajnoczi
  2017-01-20 17:11 ` [PATCH v2 1/2] grep: only add delimiter if there isn't one already Stefan Hajnoczi
@ 2017-01-20 17:11 ` Stefan Hajnoczi
  2017-01-20 22:18   ` Junio C Hamano
  2017-01-20 23:11 ` [PATCH v2 0/2] grep: make output consistent with revision syntax Junio C Hamano
  2 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-01-20 17:11 UTC (permalink / raw)
  To: git; +Cc: Jeff King, gitster, Brandon Williams, Stefan Hajnoczi

If the tree contains a sub-directory then git-grep(1) output contains a
colon character instead of a path separator:

  $ git grep malloc v2.9.3:t
  v2.9.3:t:test-lib.sh:	setup_malloc_check () {
  $ git show v2.9.3:t:test-lib.sh
  fatal: Path 't:test-lib.sh' does not exist in 'v2.9.3'

This patch attempts to use the correct delimiter:

  $ git grep malloc v2.9.3:t
  v2.9.3:t/test-lib.sh:	setup_malloc_check () {
  $ git show v2.9.3:t/test-lib.sh
  (success)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 builtin/grep.c  | 4 +++-
 t/t7810-grep.sh | 5 +++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 90a4f3d..7a7aab9 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -494,7 +494,9 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 
 			/* Add a delimiter if there isn't one already */
 			if (name[len - 1] != '/' && name[len - 1] != ':') {
-				strbuf_addch(&base, ':');
+				/* rev: or rev:path/ */
+				char delim = obj->type == OBJ_COMMIT ? ':' : '/';
+				strbuf_addch(&base, delim);
 			}
 		}
 		init_tree_desc(&tree, data, size);
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index e804a3f..8a58d5e 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1445,6 +1445,11 @@ test_expect_success 'grep outputs valid <rev>:<path> for HEAD:t/' '
 	test_cmp expected actual
 '
 
+test_expect_success 'grep outputs valid <rev>:<path> for HEAD:t' '
+	git grep vvv HEAD:t >actual &&
+	test_cmp expected actual
+'
+
 cat >expected <<EOF
 HEAD:t/a/v:vvv
 HEAD:t/v:vvv
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-01-20 17:11 ` [PATCH v2 2/2] grep: use '/' delimiter for paths Stefan Hajnoczi
@ 2017-01-20 22:18   ` Junio C Hamano
  2017-01-20 23:51     ` Brandon Williams
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2017-01-20 22:18 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: git, Jeff King, Brandon Williams

Stefan Hajnoczi <stefanha@redhat.com> writes:

> If the tree contains a sub-directory then git-grep(1) output contains a
> colon character instead of a path separator:
>
>   $ git grep malloc v2.9.3:t
>   v2.9.3:t:test-lib.sh:	setup_malloc_check () {
>   $ git show v2.9.3:t:test-lib.sh
>   fatal: Path 't:test-lib.sh' does not exist in 'v2.9.3'
>
> This patch attempts to use the correct delimiter:
>
>   $ git grep malloc v2.9.3:t
>   v2.9.3:t/test-lib.sh:	setup_malloc_check () {
>   $ git show v2.9.3:t/test-lib.sh
>   (success)
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  builtin/grep.c  | 4 +++-
>  t/t7810-grep.sh | 5 +++++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 90a4f3d..7a7aab9 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -494,7 +494,9 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
>  
>  			/* Add a delimiter if there isn't one already */
>  			if (name[len - 1] != '/' && name[len - 1] != ':') {
> -				strbuf_addch(&base, ':');
> +				/* rev: or rev:path/ */
> +				char delim = obj->type == OBJ_COMMIT ? ':' : '/';

Why check the equality with commit, rather than un-equality with
tree?  Wouldn't you want to treat $commit:path and $tag:path the
same way?

I also think the two patches should be squashed together, as it is
only after this patch that the code does the "right thing" by
changing the delimiter depending on obj->type.

> +				strbuf_addch(&base, delim);
>  			}
>  		}
>  		init_tree_desc(&tree, data, size);
> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index e804a3f..8a58d5e 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -1445,6 +1445,11 @@ test_expect_success 'grep outputs valid <rev>:<path> for HEAD:t/' '
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'grep outputs valid <rev>:<path> for HEAD:t' '
> +	git grep vvv HEAD:t >actual &&
> +	test_cmp expected actual
> +'
> +

Perhaps you want an annotated tag object refs/tags/v1.0 that points
at the commit at the HEAD, and then run the same grep on v1.0:t, in
addition to the above test.

>  cat >expected <<EOF
>  HEAD:t/a/v:vvv
>  HEAD:t/v:vvv

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/2] grep: only add delimiter if there isn't one already
  2017-01-20 17:11 ` [PATCH v2 1/2] grep: only add delimiter if there isn't one already Stefan Hajnoczi
@ 2017-01-20 22:19   ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2017-01-20 22:19 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: git, Jeff King, Brandon Williams

Stefan Hajnoczi <stefanha@redhat.com> writes:

> git-grep(1) output does not follow git's own syntax:
>
>   $ git grep malloc v2.9.3:t/
>   v2.9.3:t/:test-lib.sh:  setup_malloc_check () {
>   $ git show v2.9.3:t/:test-lib.sh
>   fatal: Path 't/:test-lib.sh' does not exist in 'v2.9.3'
>
> This patch avoids emitting the unnecessary ':' delimiter if the name
> already ends with ':' or '/':
>
>   $ git grep malloc v2.9.3:
>   v2.9.3:t/test-lib.sh:   setup_malloc_check () {
>   $ git show v2.9.3:t/test-lib.sh
>   (succeeds)

>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  builtin/grep.c  |  6 +++++-
>  t/t7810-grep.sh | 21 +++++++++++++++++++++
>  2 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 8887b6a..90a4f3d 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -491,7 +491,11 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
>  		strbuf_init(&base, PATH_MAX + len + 1);
>  		if (len) {
>  			strbuf_add(&base, name, len);
> -			strbuf_addch(&base, ':');
> +
> +			/* Add a delimiter if there isn't one already */
> +			if (name[len - 1] != '/' && name[len - 1] != ':') {
> +				strbuf_addch(&base, ':');
> +			}

I agree with peff and Brandon that checking treeness is the right
way to make the decision, and you seem to have done that in the
updated 2/2, but why doesn't it apply here as well?


>  		}
>  		init_tree_desc(&tree, data, size);
>  		hit = grep_tree(opt, pathspec, &tree, &base, base.len,

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/2] grep: make output consistent with revision syntax
  2017-01-20 17:11 [PATCH v2 0/2] grep: make output consistent with revision syntax Stefan Hajnoczi
  2017-01-20 17:11 ` [PATCH v2 1/2] grep: only add delimiter if there isn't one already Stefan Hajnoczi
  2017-01-20 17:11 ` [PATCH v2 2/2] grep: use '/' delimiter for paths Stefan Hajnoczi
@ 2017-01-20 23:11 ` Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2017-01-20 23:11 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: git, Jeff King, Brandon Williams

Stefan Hajnoczi <stefanha@redhat.com> writes:

> v2:
>  * Use obj->type instead of re-parsing name for delimiter
>    (Followed Brandon's suggestion but named the variable 'delim' since that
>    name is used in other places and 'del' is used for deletion.)
>  * Add tests
>  * Update Patch 1 commit description with a more relevant example
>  * PATCH instead of RFC, now works with all documented git-rev-parse(1) syntax
>
> git-grep(1)'s output is not consistent with git-rev-parse(1) revision syntax.

While I was queuing this series (which I think should become a
single patch in the final version), I was trying to see how it
should be described in the release note (aka an entry in the
periodicall "What's cooking" report).  Here is how I explained it.
You may want to borrow some parts of the description, especially the
part that talks about <tree-ish>:<path> being a way to name a blob,
when updating the commit log message.

     "git grep", when fed a tree-ish as an input, shows each hit
     prefixed with "<tree-ish>:<path>:<lineno>:".  As <tree-ish> is
     almost always either a commit or a tag that points at a commit,
     the early part of the output "<tree-ish>:<path>" can be used as
     the name of the blob and given to "git show".  

     When <tree-ish> is a tree given in the extended SHA-1 syntax
     (e.g. "<commit>:", or "<commit>:<dir>"), however, this results
     in a string that does not name a blob (e.g. "<commit>::<path>"
     or "<commit>:<dir>:<path>").  "git grep" has been taught to be
     a bit more intelligent about these cases and omit a colon (in
     the former case) or use slash (in the latter case) to produce
     "<commit>:<path>" and "<commit>:<dir>/<path>" that can be used
     as the name of a blob.

As a release note entry is written in a style different from the
commit log message, you would need to adjust the voice of the last
sentence (i.e. "Teach 'git grep' to ..." to give an order to the
codebase), but otherwise the above would make an understandable
justification for the change suitable in a log message.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-01-20 22:18   ` Junio C Hamano
@ 2017-01-20 23:51     ` Brandon Williams
  2017-02-07 15:04       ` Stefan Hajnoczi
  0 siblings, 1 reply; 15+ messages in thread
From: Brandon Williams @ 2017-01-20 23:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Hajnoczi, git, Jeff King

On 01/20, Junio C Hamano wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > If the tree contains a sub-directory then git-grep(1) output contains a
> > colon character instead of a path separator:
> >
> >   $ git grep malloc v2.9.3:t
> >   v2.9.3:t:test-lib.sh:	setup_malloc_check () {
> >   $ git show v2.9.3:t:test-lib.sh
> >   fatal: Path 't:test-lib.sh' does not exist in 'v2.9.3'
> >
> > This patch attempts to use the correct delimiter:
> >
> >   $ git grep malloc v2.9.3:t
> >   v2.9.3:t/test-lib.sh:	setup_malloc_check () {
> >   $ git show v2.9.3:t/test-lib.sh
> >   (success)
> >
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  builtin/grep.c  | 4 +++-
> >  t/t7810-grep.sh | 5 +++++
> >  2 files changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/builtin/grep.c b/builtin/grep.c
> > index 90a4f3d..7a7aab9 100644
> > --- a/builtin/grep.c
> > +++ b/builtin/grep.c
> > @@ -494,7 +494,9 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
> >  
> >  			/* Add a delimiter if there isn't one already */
> >  			if (name[len - 1] != '/' && name[len - 1] != ':') {
> > -				strbuf_addch(&base, ':');
> > +				/* rev: or rev:path/ */
> > +				char delim = obj->type == OBJ_COMMIT ? ':' : '/';
> 
> Why check the equality with commit, rather than un-equality with
> tree?  Wouldn't you want to treat $commit:path and $tag:path the
> same way?

I assume Stefan just grabbed my naive suggestion hence why it checks
equality with a commit.  So that's my fault :)  Either of these may
not be enough though, since if you do 'git grep malloc v2.9.3^{tree}'
with this change the output prefix is 'v2.9.3^{tree}/' instead of the
correct prefix 'v2.9.3^{tree}:'

> 
> I also think the two patches should be squashed together, as it is
> only after this patch that the code does the "right thing" by
> changing the delimiter depending on obj->type.
> 
> > +				strbuf_addch(&base, delim);
> >  			}
> >  		}
> >  		init_tree_desc(&tree, data, size);
> > diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> > index e804a3f..8a58d5e 100755
> > --- a/t/t7810-grep.sh
> > +++ b/t/t7810-grep.sh
> > @@ -1445,6 +1445,11 @@ test_expect_success 'grep outputs valid <rev>:<path> for HEAD:t/' '
> >  	test_cmp expected actual
> >  '
> >  
> > +test_expect_success 'grep outputs valid <rev>:<path> for HEAD:t' '
> > +	git grep vvv HEAD:t >actual &&
> > +	test_cmp expected actual
> > +'
> > +
> 
> Perhaps you want an annotated tag object refs/tags/v1.0 that points
> at the commit at the HEAD, and then run the same grep on v1.0:t, in
> addition to the above test.
> 
> >  cat >expected <<EOF
> >  HEAD:t/a/v:vvv
> >  HEAD:t/v:vvv

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-01-20 23:51     ` Brandon Williams
@ 2017-02-07 15:04       ` Stefan Hajnoczi
  2017-02-07 19:50         ` Jeff King
  2017-02-07 20:24         ` Junio C Hamano
  0 siblings, 2 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-02-07 15:04 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Junio C Hamano, git, Jeff King

[-- Attachment #1: Type: text/plain, Size: 2262 bytes --]

On Fri, Jan 20, 2017 at 03:51:33PM -0800, Brandon Williams wrote:
> On 01/20, Junio C Hamano wrote:
> > Stefan Hajnoczi <stefanha@redhat.com> writes:
> > 
> > > If the tree contains a sub-directory then git-grep(1) output contains a
> > > colon character instead of a path separator:
> > >
> > >   $ git grep malloc v2.9.3:t
> > >   v2.9.3:t:test-lib.sh:	setup_malloc_check () {
> > >   $ git show v2.9.3:t:test-lib.sh
> > >   fatal: Path 't:test-lib.sh' does not exist in 'v2.9.3'
> > >
> > > This patch attempts to use the correct delimiter:
> > >
> > >   $ git grep malloc v2.9.3:t
> > >   v2.9.3:t/test-lib.sh:	setup_malloc_check () {
> > >   $ git show v2.9.3:t/test-lib.sh
> > >   (success)
> > >
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > ---
> > >  builtin/grep.c  | 4 +++-
> > >  t/t7810-grep.sh | 5 +++++
> > >  2 files changed, 8 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/builtin/grep.c b/builtin/grep.c
> > > index 90a4f3d..7a7aab9 100644
> > > --- a/builtin/grep.c
> > > +++ b/builtin/grep.c
> > > @@ -494,7 +494,9 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
> > >  
> > >  			/* Add a delimiter if there isn't one already */
> > >  			if (name[len - 1] != '/' && name[len - 1] != ':') {
> > > -				strbuf_addch(&base, ':');
> > > +				/* rev: or rev:path/ */
> > > +				char delim = obj->type == OBJ_COMMIT ? ':' : '/';
> > 
> > Why check the equality with commit, rather than un-equality with
> > tree?  Wouldn't you want to treat $commit:path and $tag:path the
> > same way?
> 
> I assume Stefan just grabbed my naive suggestion hence why it checks
> equality with a commit.  So that's my fault :)  Either of these may
> not be enough though, since if you do 'git grep malloc v2.9.3^{tree}'
> with this change the output prefix is 'v2.9.3^{tree}/' instead of the
> correct prefix 'v2.9.3^{tree}:'

I revisited this series again today and am coming to the conclusion that
forming output based on the user's rev is really hard to get right in
all cases.  I don't have a good solution to the v2.9.3^{tree} problem.

Perhaps it's better to leave this than to merge code that doesn't work
correctly 100% of the time.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-02-07 15:04       ` Stefan Hajnoczi
@ 2017-02-07 19:50         ` Jeff King
  2017-02-07 20:24         ` Junio C Hamano
  1 sibling, 0 replies; 15+ messages in thread
From: Jeff King @ 2017-02-07 19:50 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Brandon Williams, Junio C Hamano, git

On Tue, Feb 07, 2017 at 03:04:14PM +0000, Stefan Hajnoczi wrote:

> > I assume Stefan just grabbed my naive suggestion hence why it checks
> > equality with a commit.  So that's my fault :)  Either of these may
> > not be enough though, since if you do 'git grep malloc v2.9.3^{tree}'
> > with this change the output prefix is 'v2.9.3^{tree}/' instead of the
> > correct prefix 'v2.9.3^{tree}:'
> 
> I revisited this series again today and am coming to the conclusion that
> forming output based on the user's rev is really hard to get right in
> all cases.  I don't have a good solution to the v2.9.3^{tree} problem.

I think the rule you need is not "are we at a tree", but rather "did we
traverse a path while resolving the name?". Only the get_sha1() parser
can tell you that. I think:

  char delim = ':';
  struct object_context oc;
  if (get_sha1_with_context(name, 0, sha1, &oc))
          die("...");
  if (oc.path[0])
          delim = '/'; /* name had a partial path */

would work. Root trees via "v2.9.3^{tree}" or "v2.9.3:" would have no
path, but "v2.9.3:Documentation" would. I think you'd still need to
avoid duplicating a trailing delimiter, but I can't think of a case
where it is wrong to do that based purely on the name.

-Peff

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-02-07 15:04       ` Stefan Hajnoczi
  2017-02-07 19:50         ` Jeff King
@ 2017-02-07 20:24         ` Junio C Hamano
  2017-02-07 20:37           ` Junio C Hamano
  2017-02-09  3:58           ` Jeff King
  1 sibling, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2017-02-07 20:24 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Brandon Williams, git, Jeff King

Stefan Hajnoczi <stefanha@redhat.com> writes:

> Perhaps it's better to leave this than to merge code that doesn't work
> correctly 100% of the time.

I am not sure if you are shooting for is "work correctly" to begin
with, to be honest.  The current code always shows the "correct"
output which is "the tree-ish object name (expressed in a way easier
to understand by the humans), followed by a colon, followed by the
path in the tree-ish the hit lies".  You are making it "incorrect
but often more convenient", and sometimes that is a worth goal, but
for the particular use cases you presented, i.e.

    $ git grep -e "$pattern" "$commit:path"

a more natural way to express "I want to find this pattern in the
commit under that path" exists:

    $ git grep -e "$pattern" "$commit" -- path

and because of that, I do not think the former form of the query
should happen _less_ often in the first place, which would make it
"incorrect but more convenient if the user gives an unusual query".

So I am not sure if the change to "grep" is worth it.

Having said that, I actually think "make it more convenient" without
making anything incorrect would be to teach the revision parser to
understand

    <any-expression-to-name-a-tree-ish:<path>

as an extended SHA-1 expression to name the blob or the tree at that
path in the tree-ish, e.g. if we can make the revision parser to
take this

    master:Documentation:git.txt

as the name of the blob object, then the current output is both
correct and more convenient.  After all, this sample string starts
at "master:Documentation" (which is an extended SHA-1 expression to
name a tree-ish), followed by a colon, then followed by the path
"git.txt" in it, and "grep -e pattern master:Documentation" would
show hits in that blob prefixed with it.

I.e.

	T=$(git rev-parse master:Documentation)
	git cat-file blob $T:git.txt

would give you the contents of the source to the Git manual.  It is
not all that unreasonable to expect

	git cat-file blob master:Documentation:git.txt

to be able to show the same thing as well.  You'd need to backtrack
the parsing (e.g. attempt to find "Documentation:git.txt" in
"master", fail to find any, then fall back to find "git.txt" in
"master:Documentation", find one, and be happy, or something like
that), and define how to resolve potential ambiguity (e.g. there may
indeed be "Documentation:git.txt" and "Documentation/git.txt" in the
tree-ish "master"), though.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-02-07 20:24         ` Junio C Hamano
@ 2017-02-07 20:37           ` Junio C Hamano
  2017-02-09  3:58           ` Jeff King
  1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2017-02-07 20:37 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Brandon Williams, git, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

Sorry, one shouldn't type while being sick and in bed X-<.

> I am not sure if you are shooting for is "work correctly" to begin
> with, to be honest.  The current code always shows the "correct"
> output which is "the tree-ish object name (expressed in a way easier
> to understand by the humans), followed by a colon, followed by the
> path in the tree-ish the hit lies".  You are making it "incorrect
> but often more convenient", and sometimes that is a worth goal, but

s/worth/&y/;

> for the particular use cases you presented, i.e.
>
>     $ git grep -e "$pattern" "$commit:path"
>
> a more natural way to express "I want to find this pattern in the
> commit under that path" exists:
>
>     $ git grep -e "$pattern" "$commit" -- path
>
> and because of that, I do not think the former form of the query

s/do not think/do think/

> should happen _less_ often in the first place, which would make it
> "incorrect but more convenient if the user gives an unusual query".
>
> So I am not sure if the change to "grep" is worth it.

Also, it may be fairer to do s/incorrect/inconsistent/.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-02-07 20:24         ` Junio C Hamano
  2017-02-07 20:37           ` Junio C Hamano
@ 2017-02-09  3:58           ` Jeff King
  2017-02-09  5:14             ` Junio C Hamano
  1 sibling, 1 reply; 15+ messages in thread
From: Jeff King @ 2017-02-09  3:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Hajnoczi, Brandon Williams, git

On Tue, Feb 07, 2017 at 12:24:30PM -0800, Junio C Hamano wrote:

> Having said that, I actually think "make it more convenient" without
> making anything incorrect would be to teach the revision parser to
> understand
> 
>     <any-expression-to-name-a-tree-ish:<path>
> 
> as an extended SHA-1 expression to name the blob or the tree at that
> path in the tree-ish, e.g. if we can make the revision parser to
> take this
> 
>     master:Documentation:git.txt

So here I was wondering what happens when you have a filename with a
colon in it, but then...

> to be able to show the same thing as well.  You'd need to backtrack
> the parsing (e.g. attempt to find "Documentation:git.txt" in
> "master", fail to find any, then fall back to find "git.txt" in
> "master:Documentation", find one, and be happy, or something like
> that), and define how to resolve potential ambiguity (e.g. there may
> indeed be "Documentation:git.txt" and "Documentation/git.txt" in the
> tree-ish "master"), though.

...you obviously did think of that. Backtracking sounds pretty nasty,
though. What's the time complexity of parsing:

  master:a:a:a:a:a:a:a:a:a:a:a

I think there are 2^(n-1) possible paths (each colon can be a real colon
or a slash). Though I guess if you walk the trees as you go, you only
have to examine at most "n" paths to find the first-level tree, and then
at most "n-1" paths at the second level, and so on.

Unless you really do have ambiguous trees, in which case you have to
walk down multiple paths.

It certainly would not be the first combinatoric explosion you can
convince Git to perform. But it does seem like a lot of complication for
something as simple as path lookups.

-Peff

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-02-09  3:58           ` Jeff King
@ 2017-02-09  5:14             ` Junio C Hamano
  2017-02-09  5:20               ` Jeff King
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2017-02-09  5:14 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Hajnoczi, Brandon Williams, git

Jeff King <peff@peff.net> writes:

>   master:a:a:a:a:a:a:a:a:a:a:a
>
> I think there are 2^(n-1) possible paths (each colon can be a real colon
> or a slash). Though I guess if you walk the trees as you go, you only
> have to examine at most "n" paths to find the first-level tree, and then
> at most "n-1" paths at the second level, and so on.
>
> Unless you really do have ambiguous trees, in which case you have to
> walk down multiple paths.
>
> It certainly would not be the first combinatoric explosion you can
> convince Git to perform. But it does seem like a lot of complication for
> something as simple as path lookups.

That is true, and we may want to avoid the implementation complexity
of the backtracking name resolution.  If you are on the other hand
worried about the runtime cost, it will be an issue to begin with
only for those who do "git grep -e pattern HEAD:t/perf", which is an
unnatural way to do "git grep -e pattern HEAD -- t/perf", and the
output from the latter won't have such an issue, so...



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-02-09  5:14             ` Junio C Hamano
@ 2017-02-09  5:20               ` Jeff King
  2017-02-14  9:43                 ` Stefan Hajnoczi
  0 siblings, 1 reply; 15+ messages in thread
From: Jeff King @ 2017-02-09  5:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Hajnoczi, Brandon Williams, git

On Wed, Feb 08, 2017 at 09:14:17PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >   master:a:a:a:a:a:a:a:a:a:a:a
> >
> > I think there are 2^(n-1) possible paths (each colon can be a real colon
> > or a slash). Though I guess if you walk the trees as you go, you only
> > have to examine at most "n" paths to find the first-level tree, and then
> > at most "n-1" paths at the second level, and so on.
> >
> > Unless you really do have ambiguous trees, in which case you have to
> > walk down multiple paths.
> >
> > It certainly would not be the first combinatoric explosion you can
> > convince Git to perform. But it does seem like a lot of complication for
> > something as simple as path lookups.
> 
> That is true, and we may want to avoid the implementation complexity
> of the backtracking name resolution.  If you are on the other hand
> worried about the runtime cost, it will be an issue to begin with
> only for those who do "git grep -e pattern HEAD:t/perf", which is an
> unnatural way to do "git grep -e pattern HEAD -- t/perf", and the
> output from the latter won't have such an issue, so...

I thought your point was to move it into the get_sha1() parser (so that
while the form is only generated by "git grep", it can be accepted by
any git command). That exposes it in a lot of places, including ones
which are network accessible to things like gitweb (or GitHub, of
course, which is my concern).

Even without the runtime cost, though, I think the general complexity
makes it an ugly path to go down (e.g., handling ambiguous cases). I
wouldn't want to have to write the documentation for it. :)

(I _do_ think Stefan's proposed direction is worth it simply because the
result is easier to read, but I agree the whole thing can be avoided by
using pathspecs, as you've noted).

-Peff

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
  2017-02-09  5:20               ` Jeff King
@ 2017-02-14  9:43                 ` Stefan Hajnoczi
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2017-02-14  9:43 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Brandon Williams, git

[-- Attachment #1: Type: text/plain, Size: 418 bytes --]

On Thu, Feb 09, 2017 at 12:20:34AM -0500, Jeff King wrote:
> On Wed, Feb 08, 2017 at 09:14:17PM -0800, Junio C Hamano wrote:
> > Jeff King <peff@peff.net> writes:
> (I _do_ think Stefan's proposed direction is worth it simply because the
> result is easier to read, but I agree the whole thing can be avoided by
> using pathspecs, as you've noted).

I won't be pushing this series further due to limited time.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-02-14 13:38 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-20 17:11 [PATCH v2 0/2] grep: make output consistent with revision syntax Stefan Hajnoczi
2017-01-20 17:11 ` [PATCH v2 1/2] grep: only add delimiter if there isn't one already Stefan Hajnoczi
2017-01-20 22:19   ` Junio C Hamano
2017-01-20 17:11 ` [PATCH v2 2/2] grep: use '/' delimiter for paths Stefan Hajnoczi
2017-01-20 22:18   ` Junio C Hamano
2017-01-20 23:51     ` Brandon Williams
2017-02-07 15:04       ` Stefan Hajnoczi
2017-02-07 19:50         ` Jeff King
2017-02-07 20:24         ` Junio C Hamano
2017-02-07 20:37           ` Junio C Hamano
2017-02-09  3:58           ` Jeff King
2017-02-09  5:14             ` Junio C Hamano
2017-02-09  5:20               ` Jeff King
2017-02-14  9:43                 ` Stefan Hajnoczi
2017-01-20 23:11 ` [PATCH v2 0/2] grep: make output consistent with revision syntax Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).