git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Bug in pathspec handling (in conjunction with submodules)
@ 2017-11-26  2:15 Johannes Schindelin
  2017-11-28 23:06 ` Brandon Williams
  2017-11-28 23:22 ` [PATCH] pathspec: only match across submodule boundaries when requested Brandon Williams
  0 siblings, 2 replies; 8+ messages in thread
From: Johannes Schindelin @ 2017-11-26  2:15 UTC (permalink / raw)
  To: Brandon Williams, Nguyễn Thái Ngọc Duy; +Cc: git

Hi Duy & Brandon,

in 74ed43711fd (grep: enable recurse-submodules to work on <tree> objects,
2016-12-16), the do_match() function in tree-walk.c was changed so that it
can recurse across submodule boundaries.

However, there is a bug, and I *think* there may be two bugs actually. Or
even three.

First of all, here is an MCVE that I distilled from
https://github.com/git-for-windows/git/issues/1371:

	git init repo
	cd repo

	git init submodule
	git -C submodule commit -m initial --allow-empty

	touch "[bracket]"
	git add "[bracket]"
	git commit -m bracket
	git add submodule
	git commit -m submodule

	git rev-list HEAD -- "[bracket]"

Nothing fancy, just adding a file with brackets in the name, then a
submodule, then showing the commit history filtered by the funny file
name.

However, the log prints *both* commits. Clearly the submodule commit
should *not* be shown.

Now, how does this all happen?

Since the pathspec contains brackets, parse_pathspec() marks it as
containing wildcards and sets nowildcard_len to 0.

Now, note that [bracket] *is* a wildcard expression: it should only match
a single character that is one of  a, b, c, e, k, r or t.

I think this is the first bug: `git rev-list` should not even match the
commit that adds the file [bracket] because its file name does not match
that expression. From where I sit, it would appear that f1a2ddbbc2d
(tree_entry_interesting(): optimize wildcard matching when base is
matched, 2010-12-15) simply added the fnmatch() code without disabling the
literal match_entry() code when the pathspec contains a pattern.

But it does not stop there: there is *another* bug which causes the
pattern to somehow match the submodule. I *guess* the idea of
https://github.com/git/git/commit/74ed43711#diff-7a08243175f2cae66aedf53f7dce3bdfR1015
was to allow a pattern like *.c to match files in a submodule, but the
pattern [bracket] should not match any file in submodule/. I think that
that code needs to be a little bit more careful to try to match the
submodule's name against the pattern (it seems to interpret nowildcard_len
== 0 to mean that the wildcard is `*`).

However, the commit introducing that code wanted to teach *grep* (not
*rev-list*) a new trick, and it relies on the `recursive` flag of the
pathspec to be set.

And now it gets really interesting. Or confusing, depending on your mental
condition. This recursive flag of the pathspec is set in
ll_diff_tree_paths() (yep, changing the flag in the passed-in opt
structure... which I found a bit... unexpected, given the function name, I
would have been less surprised if that function only diff'ed the trees and
used the options without changing the options). That flag-change was
introduced in
https://github.com/git/git/commit/bc96cc87dbb2#diff-15203e8cd8ee9191113894de9d97a8a6R149
which is another patch that changed the tree diff machinery to accommodate
`git grep` (but maybe not really paying a lot of attention to the fact
that the same machinery is called repeatedly by the revision machinery,
too).

I am really confused by this code mainly due to the fact that the term
"recursive" is pretty ambiguous in that context: does it refer to
directories/tree objects, or to submodules? I guess it is used for both
when there should be two flags so that rev-list can recurse over tree
objects but not submodules (unless told to do so).

The problem, of course, is that `git rev-list HEAD -- '[bracket]'` never
recurses into the submodule. And therefore, the promised "more accurate
matching [...] in the submodule" is never performed. And the commit adding
the submodule is never pruned.

Since I am not really familiar with all that tree diff code (and as a
general rule to protect my mental health, I try my best to stay away from
submodules, too), but you two are, may I ask you gentle people to have a
closer look to fix those bugs?

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Bug in pathspec handling (in conjunction with submodules)
  2017-11-26  2:15 Bug in pathspec handling (in conjunction with submodules) Johannes Schindelin
@ 2017-11-28 23:06 ` Brandon Williams
  2017-11-28 23:22 ` [PATCH] pathspec: only match across submodule boundaries when requested Brandon Williams
  1 sibling, 0 replies; 8+ messages in thread
From: Brandon Williams @ 2017-11-28 23:06 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nguyễn Thái Ngọc Duy, git

On 11/26, Johannes Schindelin wrote:
> Hi Duy & Brandon,
> 
> in 74ed43711fd (grep: enable recurse-submodules to work on <tree> objects,
> 2016-12-16), the do_match() function in tree-walk.c was changed so that it
> can recurse across submodule boundaries.
> 
> However, there is a bug, and I *think* there may be two bugs actually. Or
> even three.
> 
> First of all, here is an MCVE that I distilled from
> https://github.com/git-for-windows/git/issues/1371:
> 
> 	git init repo
> 	cd repo
> 
> 	git init submodule
> 	git -C submodule commit -m initial --allow-empty
> 
> 	touch "[bracket]"
> 	git add "[bracket]"
> 	git commit -m bracket
> 	git add submodule
> 	git commit -m submodule
> 
> 	git rev-list HEAD -- "[bracket]"
> 
> Nothing fancy, just adding a file with brackets in the name, then a
> submodule, then showing the commit history filtered by the funny file
> name.
> 
> However, the log prints *both* commits. Clearly the submodule commit
> should *not* be shown.
> 
> Now, how does this all happen?
> 
> Since the pathspec contains brackets, parse_pathspec() marks it as
> containing wildcards and sets nowildcard_len to 0.
> 
> Now, note that [bracket] *is* a wildcard expression: it should only match
> a single character that is one of  a, b, c, e, k, r or t.
> 
> I think this is the first bug: `git rev-list` should not even match the
> commit that adds the file [bracket] because its file name does not match
> that expression. From where I sit, it would appear that f1a2ddbbc2d
> (tree_entry_interesting(): optimize wildcard matching when base is
> matched, 2010-12-15) simply added the fnmatch() code without disabling the
> literal match_entry() code when the pathspec contains a pattern.

I can see both sides to this, wanting to try matching literally first
and then trying the wildcards, so I don't really have an opinion on
how/if we should fix that.

> 
> But it does not stop there: there is *another* bug which causes the
> pattern to somehow match the submodule. I *guess* the idea of
> https://github.com/git/git/commit/74ed43711#diff-7a08243175f2cae66aedf53f7dce3bdfR1015
> was to allow a pattern like *.c to match files in a submodule, but the
> pattern [bracket] should not match any file in submodule/. I think that
> that code needs to be a little bit more careful to try to match the
> submodule's name against the pattern (it seems to interpret nowildcard_len
> == 0 to mean that the wildcard is `*`).

This is a much bigger issue and I'm surprised it took this long to find
this bug.  And of course its due to one of my earlier contributions to
the project :)

> 
> However, the commit introducing that code wanted to teach *grep* (not
> *rev-list*) a new trick, and it relies on the `recursive` flag of the
> pathspec to be set.

This is the root cause of the bug.  The added code to match against
submodules was intended to allow for matching past submodule boundaries
for those commands (like grep) which are recursing submodules.  So
really there should be an additional flag which is passed in to trigger
this logic instead of relying on the recursive flag of the pathspec.  Or
we can add a recurse_submodules flag to the pathspec struct and respect
that flag instead of the 'recursive' flag.

I have a quick patch to do just that which I'll send shortly.

> 
> And now it gets really interesting. Or confusing, depending on your mental
> condition. This recursive flag of the pathspec is set in
> ll_diff_tree_paths() (yep, changing the flag in the passed-in opt
> structure... which I found a bit... unexpected, given the function name, I
> would have been less surprised if that function only diff'ed the trees and
> used the options without changing the options). That flag-change was
> introduced in
> https://github.com/git/git/commit/bc96cc87dbb2#diff-15203e8cd8ee9191113894de9d97a8a6R149
> which is another patch that changed the tree diff machinery to accommodate
> `git grep` (but maybe not really paying a lot of attention to the fact
> that the same machinery is called repeatedly by the revision machinery,
> too).
> 
> I am really confused by this code mainly due to the fact that the term
> "recursive" is pretty ambiguous in that context: does it refer to
> directories/tree objects, or to submodules? I guess it is used for both
> when there should be two flags so that rev-list can recurse over tree
> objects but not submodules (unless told to do so).
> 
> The problem, of course, is that `git rev-list HEAD -- '[bracket]'` never
> recurses into the submodule. And therefore, the promised "more accurate
> matching [...] in the submodule" is never performed. And the commit adding
> the submodule is never pruned.
> 
> Since I am not really familiar with all that tree diff code (and as a
> general rule to protect my mental health, I try my best to stay away from
> submodules, too), but you two are, may I ask you gentle people to have a
> closer look to fix those bugs?
> 
> Thanks,
> Dscho

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] pathspec: only match across submodule boundaries when requested
  2017-11-26  2:15 Bug in pathspec handling (in conjunction with submodules) Johannes Schindelin
  2017-11-28 23:06 ` Brandon Williams
@ 2017-11-28 23:22 ` Brandon Williams
  2017-11-29 21:29   ` Johannes Schindelin
  2017-12-05  0:07   ` [PATCH v2] " Brandon Williams
  1 sibling, 2 replies; 8+ messages in thread
From: Brandon Williams @ 2017-11-28 23:22 UTC (permalink / raw)
  To: git; +Cc: Johannes.Schindelin, pclouds, Brandon Williams

Commit 74ed43711fd (grep: enable recurse-submodules to work on <tree>
objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to
match across submodule boundaries in the presence of wildcards.  This is
done by performing literal matching up to the first wildcard and then
punting to the submodule itself to perform more accurate pattern
matching.  Instead of introducing a new flag to request this behavior,
commit 74ed43711fd overloaded the already existing 'recursive' flag in
'struct pathspec' to request this behavior.

This leads to a bug where whenever any other caller has the 'recursive'
flag set as well as a pathspec with wildcards that all submodules will
be indicated as matches.  One simple example of this is:

	git init repo
	cd repo

	git init submodule
	git -C submodule commit -m initial --allow-empty

	touch "[bracket]"
	git add "[bracket]"
	git commit -m bracket
	git add submodule
	git commit -m submodule

	git rev-list HEAD -- "[bracket]"

Fix this by introducing the new flag 'recurse_submodules' in 'struct
pathspec' and using this flag to determine if matches should be allowed
to cross submodule boundaries.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/grep.c                |  1 +
 pathspec.h                    |  1 +
 t/t4208-log-magic-pathspec.sh | 17 +++++++++++++++++
 tree-walk.c                   |  5 +++--
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5a6cfe6b4..3ca4ac80d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1015,6 +1015,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		       prefix, argv + i);
 	pathspec.max_depth = opt.max_depth;
 	pathspec.recursive = 1;
+	pathspec.recurse_submodules = !!recurse_submodules;
 
 #ifndef NO_PTHREADS
 	if (list.nr || cached || show_in_pager)
diff --git a/pathspec.h b/pathspec.h
index 6420d1080..099a170c2 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -24,6 +24,7 @@ struct pathspec {
 	int nr;
 	unsigned int has_wildcard:1;
 	unsigned int recursive:1;
+	unsigned int recurse_submodules:1;
 	unsigned magic;
 	int max_depth;
 	struct pathspec_item {
diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh
index 935df6a65..bd583af0e 100755
--- a/t/t4208-log-magic-pathspec.sh
+++ b/t/t4208-log-magic-pathspec.sh
@@ -93,4 +93,21 @@ test_expect_success 'command line pathspec parsing for "git log"' '
 	git log --merge -- a
 '
 
+test_expect_success 'tree_entry_interesting does not match past submodule boundaries' '
+	test_when_finished "rm -rf repo submodule" &&
+	git init submodule &&
+	test_commit -C submodule initial &&
+	git init repo &&
+	>"repo/[bracket]" &&
+	git -C repo add "[bracket]" &&
+	git -C repo commit -m bracket &&
+	git -C repo rev-list HEAD -- "[bracket]" >expect &&
+
+	git -C repo submodule add ../submodule &&
+	git -C repo commit -m submodule &&
+
+	git -C repo rev-list HEAD -- "[bracket]" >actual &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/tree-walk.c b/tree-walk.c
index 684f0e337..63a87ed66 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -1011,7 +1011,8 @@ static enum interesting do_match(const struct name_entry *entry,
 				 * character.  More accurate matching can then
 				 * be performed in the submodule itself.
 				 */
-				if (ps->recursive && S_ISGITLINK(entry->mode) &&
+				if (ps->recurse_submodules &&
+				    S_ISGITLINK(entry->mode) &&
 				    !ps_strncmp(item, match + baselen,
 						entry->path,
 						item->nowildcard_len - baselen))
@@ -1060,7 +1061,7 @@ static enum interesting do_match(const struct name_entry *entry,
 		 * character.  More accurate matching can then
 		 * be performed in the submodule itself.
 		 */
-		if (ps->recursive && S_ISGITLINK(entry->mode) &&
+		if (ps->recurse_submodules && S_ISGITLINK(entry->mode) &&
 		    !ps_strncmp(item, match, base->buf + base_offset,
 				item->nowildcard_len)) {
 			strbuf_setlen(base, base_offset + baselen);
-- 
2.15.0.531.g2ccb3012c9-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] pathspec: only match across submodule boundaries when requested
  2017-11-28 23:22 ` [PATCH] pathspec: only match across submodule boundaries when requested Brandon Williams
@ 2017-11-29 21:29   ` Johannes Schindelin
  2017-12-05  0:09     ` Brandon Williams
  2017-12-05  0:07   ` [PATCH v2] " Brandon Williams
  1 sibling, 1 reply; 8+ messages in thread
From: Johannes Schindelin @ 2017-11-29 21:29 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, pclouds

Hi Brandon,

On Tue, 28 Nov 2017, Brandon Williams wrote:

> Commit 74ed43711fd (grep: enable recurse-submodules to work on <tree>
> objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to
> match across submodule boundaries in the presence of wildcards.  This is
> done by performing literal matching up to the first wildcard and then
> punting to the submodule itself to perform more accurate pattern
> matching.  Instead of introducing a new flag to request this behavior,
> commit 74ed43711fd overloaded the already existing 'recursive' flag in
> 'struct pathspec' to request this behavior.
> 
> This leads to a bug where whenever any other caller has the 'recursive'
> flag set as well as a pathspec with wildcards that all submodules will
> be indicated as matches.  One simple example of this is:
> 
> 	git init repo
> 	cd repo
> 
> 	git init submodule
> 	git -C submodule commit -m initial --allow-empty
> 
> 	touch "[bracket]"
> 	git add "[bracket]"
> 	git commit -m bracket
> 	git add submodule
> 	git commit -m submodule
> 
> 	git rev-list HEAD -- "[bracket]"
> 
> Fix this by introducing the new flag 'recurse_submodules' in 'struct
> pathspec' and using this flag to determine if matches should be allowed
> to cross submodule boundaries.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

Could you also add something like

	This fixes https://github.com/git-for-windows/git/issues/1371

at the end of the commit message, to keep a reference to the original bug
report?

>  4 files changed, 22 insertions(+), 2 deletions(-)

Phew. That was much smaller than I expected.

> +test_expect_success 'tree_entry_interesting does not match past submodule boundaries' '
> +	test_when_finished "rm -rf repo submodule" &&
> +	git init submodule &&
> +	test_commit -C submodule initial &&
> +	git init repo &&
> +	>"repo/[bracket]" &&
> +	git -C repo add "[bracket]" &&
> +	git -C repo commit -m bracket &&
> +	git -C repo rev-list HEAD -- "[bracket]" >expect &&
> +
> +	git -C repo submodule add ../submodule &&
> +	git -C repo commit -m submodule &&
> +
> +	git -C repo rev-list HEAD -- "[bracket]" >actual &&
> +	test_cmp expect actual
> +'

Nicely prepared for a new hash function, too (no explicit SHA-1).

I wonder, however, why we can't `git checkout -b bracket` and
`test_when_finished "git checkout master"` and void those many `-C repo`
options. But then, it is actually one of the shorter test cases, and
pretty easy to understand.

However, I would still like to see `test_tick`s before those `git commit`
calls, to make the commit names reproducible.

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] pathspec: only match across submodule boundaries when requested
  2017-11-28 23:22 ` [PATCH] pathspec: only match across submodule boundaries when requested Brandon Williams
  2017-11-29 21:29   ` Johannes Schindelin
@ 2017-12-05  0:07   ` Brandon Williams
  2017-12-05 19:19     ` Junio C Hamano
  2017-12-06 21:20     ` Johannes Schindelin
  1 sibling, 2 replies; 8+ messages in thread
From: Brandon Williams @ 2017-12-05  0:07 UTC (permalink / raw)
  To: git; +Cc: Johannes.Schindelin, Brandon Williams

Commit 74ed43711fd (grep: enable recurse-submodules to work on <tree>
objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to
match across submodule boundaries in the presence of wildcards.  This is
done by performing literal matching up to the first wildcard and then
punting to the submodule itself to perform more accurate pattern
matching.  Instead of introducing a new flag to request this behavior,
commit 74ed43711fd overloaded the already existing 'recursive' flag in
'struct pathspec' to request this behavior.

This leads to a bug where whenever any other caller has the 'recursive'
flag set as well as a pathspec with wildcards that all submodules will
be indicated as matches.  One simple example of this is:

	git init repo
	cd repo

	git init submodule
	git -C submodule commit -m initial --allow-empty

	touch "[bracket]"
	git add "[bracket]"
	git commit -m bracket
	git add submodule
	git commit -m submodule

	git rev-list HEAD -- "[bracket]"

Fix this by introducing the new flag 'recurse_submodules' in 'struct
pathspec' and using this flag to determine if matches should be allowed
to cross submodule boundaries.

This fixes https://github.com/git-for-windows/git/issues/1371.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/grep.c                |  1 +
 pathspec.h                    |  1 +
 t/t4208-log-magic-pathspec.sh | 19 +++++++++++++++++++
 tree-walk.c                   |  5 +++--
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5a6cfe6b4..3ca4ac80d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1015,6 +1015,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		       prefix, argv + i);
 	pathspec.max_depth = opt.max_depth;
 	pathspec.recursive = 1;
+	pathspec.recurse_submodules = !!recurse_submodules;
 
 #ifndef NO_PTHREADS
 	if (list.nr || cached || show_in_pager)
diff --git a/pathspec.h b/pathspec.h
index 6420d1080..099a170c2 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -24,6 +24,7 @@ struct pathspec {
 	int nr;
 	unsigned int has_wildcard:1;
 	unsigned int recursive:1;
+	unsigned int recurse_submodules:1;
 	unsigned magic;
 	int max_depth;
 	struct pathspec_item {
diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh
index 935df6a65..a1705f70c 100755
--- a/t/t4208-log-magic-pathspec.sh
+++ b/t/t4208-log-magic-pathspec.sh
@@ -93,4 +93,23 @@ test_expect_success 'command line pathspec parsing for "git log"' '
 	git log --merge -- a
 '
 
+test_expect_success 'tree_entry_interesting does not match past submodule boundaries' '
+	test_when_finished "rm -rf repo submodule" &&
+	git init submodule &&
+	test_commit -C submodule initial &&
+	git init repo &&
+	>"repo/[bracket]" &&
+	git -C repo add "[bracket]" &&
+	test_tick &&
+	git -C repo commit -m bracket &&
+	git -C repo rev-list HEAD -- "[bracket]" >expect &&
+
+	git -C repo submodule add ../submodule &&
+	test_tick &&
+	git -C repo commit -m submodule &&
+
+	git -C repo rev-list HEAD -- "[bracket]" >actual &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/tree-walk.c b/tree-walk.c
index 684f0e337..63a87ed66 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -1011,7 +1011,8 @@ static enum interesting do_match(const struct name_entry *entry,
 				 * character.  More accurate matching can then
 				 * be performed in the submodule itself.
 				 */
-				if (ps->recursive && S_ISGITLINK(entry->mode) &&
+				if (ps->recurse_submodules &&
+				    S_ISGITLINK(entry->mode) &&
 				    !ps_strncmp(item, match + baselen,
 						entry->path,
 						item->nowildcard_len - baselen))
@@ -1060,7 +1061,7 @@ static enum interesting do_match(const struct name_entry *entry,
 		 * character.  More accurate matching can then
 		 * be performed in the submodule itself.
 		 */
-		if (ps->recursive && S_ISGITLINK(entry->mode) &&
+		if (ps->recurse_submodules && S_ISGITLINK(entry->mode) &&
 		    !ps_strncmp(item, match, base->buf + base_offset,
 				item->nowildcard_len)) {
 			strbuf_setlen(base, base_offset + baselen);
-- 
2.15.1.424.g9478a66081-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] pathspec: only match across submodule boundaries when requested
  2017-11-29 21:29   ` Johannes Schindelin
@ 2017-12-05  0:09     ` Brandon Williams
  0 siblings, 0 replies; 8+ messages in thread
From: Brandon Williams @ 2017-12-05  0:09 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, pclouds

On 11/29, Johannes Schindelin wrote:
> Hi Brandon,
> 
> On Tue, 28 Nov 2017, Brandon Williams wrote:
> 
> > Commit 74ed43711fd (grep: enable recurse-submodules to work on <tree>
> > objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to
> > match across submodule boundaries in the presence of wildcards.  This is
> > done by performing literal matching up to the first wildcard and then
> > punting to the submodule itself to perform more accurate pattern
> > matching.  Instead of introducing a new flag to request this behavior,
> > commit 74ed43711fd overloaded the already existing 'recursive' flag in
> > 'struct pathspec' to request this behavior.
> > 
> > This leads to a bug where whenever any other caller has the 'recursive'
> > flag set as well as a pathspec with wildcards that all submodules will
> > be indicated as matches.  One simple example of this is:
> > 
> > 	git init repo
> > 	cd repo
> > 
> > 	git init submodule
> > 	git -C submodule commit -m initial --allow-empty
> > 
> > 	touch "[bracket]"
> > 	git add "[bracket]"
> > 	git commit -m bracket
> > 	git add submodule
> > 	git commit -m submodule
> > 
> > 	git rev-list HEAD -- "[bracket]"
> > 
> > Fix this by introducing the new flag 'recurse_submodules' in 'struct
> > pathspec' and using this flag to determine if matches should be allowed
> > to cross submodule boundaries.
> > 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> Could you also add something like
> 
> 	This fixes https://github.com/git-for-windows/git/issues/1371
> 
> at the end of the commit message, to keep a reference to the original bug
> report?

Yep! I can do that.

> 
> >  4 files changed, 22 insertions(+), 2 deletions(-)
> 
> Phew. That was much smaller than I expected.
> 
> > +test_expect_success 'tree_entry_interesting does not match past submodule boundaries' '
> > +	test_when_finished "rm -rf repo submodule" &&
> > +	git init submodule &&
> > +	test_commit -C submodule initial &&
> > +	git init repo &&
> > +	>"repo/[bracket]" &&
> > +	git -C repo add "[bracket]" &&
> > +	git -C repo commit -m bracket &&
> > +	git -C repo rev-list HEAD -- "[bracket]" >expect &&
> > +
> > +	git -C repo submodule add ../submodule &&
> > +	git -C repo commit -m submodule &&
> > +
> > +	git -C repo rev-list HEAD -- "[bracket]" >actual &&
> > +	test_cmp expect actual
> > +'
> 
> Nicely prepared for a new hash function, too (no explicit SHA-1).
> 
> I wonder, however, why we can't `git checkout -b bracket` and
> `test_when_finished "git checkout master"` and void those many `-C repo`
> options. But then, it is actually one of the shorter test cases, and
> pretty easy to understand.
> 
> However, I would still like to see `test_tick`s before those `git commit`
> calls, to make the commit names reproducible.

In v2 I added the calls to test_tick.  I've never used the function
myself so hopefully I used it correctly! :)

> 
> Thanks,
> Dscho

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] pathspec: only match across submodule boundaries when requested
  2017-12-05  0:07   ` [PATCH v2] " Brandon Williams
@ 2017-12-05 19:19     ` Junio C Hamano
  2017-12-06 21:20     ` Johannes Schindelin
  1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2017-12-05 19:19 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, Johannes.Schindelin

Brandon Williams <bmwill@google.com> writes:

> Commit 74ed43711fd (grep: enable recurse-submodules to work on <tree>
> objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to
> match across submodule boundaries in the presence of wildcards.  This is
> done by performing literal matching up to the first wildcard and then
> punting to the submodule itself to perform more accurate pattern
> matching.  Instead of introducing a new flag to request this behavior,
> commit 74ed43711fd overloaded the already existing 'recursive' flag in
> 'struct pathspec' to request this behavior.
>
> This leads to a bug where whenever any other caller has the 'recursive'
> flag set as well as a pathspec with wildcards that all submodules will
> be indicated as matches.  One simple example of this is:
>
> 	git init repo
> 	cd repo
>
> 	git init submodule
> 	git -C submodule commit -m initial --allow-empty
>
> 	touch "[bracket]"
> 	git add "[bracket]"
> 	git commit -m bracket
> 	git add submodule
> 	git commit -m submodule
>
> 	git rev-list HEAD -- "[bracket]"
>
> Fix this by introducing the new flag 'recurse_submodules' in 'struct
> pathspec' and using this flag to determine if matches should be allowed
> to cross submodule boundaries.

Makes sense.

I initially misread the title

    "pathspec: only match across submodule boundaries when requested"

to be saying that the match happens only at the boundary, but that
"only" is not about where the match happens.

    "pathspec: match across submodule boundaries only when requested"

would have avoided such a confusion.

> diff --git a/builtin/grep.c b/builtin/grep.c
> index 5a6cfe6b4..3ca4ac80d 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -1015,6 +1015,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  		       prefix, argv + i);
>  	pathspec.max_depth = opt.max_depth;
>  	pathspec.recursive = 1;
> +	pathspec.recurse_submodules = !!recurse_submodules;

With the current code, recurse_submodules can only be 0 or 1 (the
only assignment is from the return value of git_config_bool()), so
the force-boolean !! is not strictly needed, but it may be a good
future-proofing measure.

Will queue; thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] pathspec: only match across submodule boundaries when requested
  2017-12-05  0:07   ` [PATCH v2] " Brandon Williams
  2017-12-05 19:19     ` Junio C Hamano
@ 2017-12-06 21:20     ` Johannes Schindelin
  1 sibling, 0 replies; 8+ messages in thread
From: Johannes Schindelin @ 2017-12-06 21:20 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git

Hi Brandon,

On Mon, 4 Dec 2017, Brandon Williams wrote:

> Commit 74ed43711fd (grep: enable recurse-submodules to work on <tree>
> objects, 2016-12-16) taught 'tree_entry_interesting()' to be able to
> match across submodule boundaries in the presence of wildcards.  This is
> done by performing literal matching up to the first wildcard and then
> punting to the submodule itself to perform more accurate pattern
> matching.  Instead of introducing a new flag to request this behavior,
> commit 74ed43711fd overloaded the already existing 'recursive' flag in
> 'struct pathspec' to request this behavior.
> 
> This leads to a bug where whenever any other caller has the 'recursive'
> flag set as well as a pathspec with wildcards that all submodules will
> be indicated as matches.  One simple example of this is:
> 
> 	git init repo
> 	cd repo
> 
> 	git init submodule
> 	git -C submodule commit -m initial --allow-empty
> 
> 	touch "[bracket]"
> 	git add "[bracket]"
> 	git commit -m bracket
> 	git add submodule
> 	git commit -m submodule
> 
> 	git rev-list HEAD -- "[bracket]"
> 
> Fix this by introducing the new flag 'recurse_submodules' in 'struct
> pathspec' and using this flag to determine if matches should be allowed
> to cross submodule boundaries.
> 
> This fixes https://github.com/git-for-windows/git/issues/1371.

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-12-06 21:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-26  2:15 Bug in pathspec handling (in conjunction with submodules) Johannes Schindelin
2017-11-28 23:06 ` Brandon Williams
2017-11-28 23:22 ` [PATCH] pathspec: only match across submodule boundaries when requested Brandon Williams
2017-11-29 21:29   ` Johannes Schindelin
2017-12-05  0:09     ` Brandon Williams
2017-12-05  0:07   ` [PATCH v2] " Brandon Williams
2017-12-05 19:19     ` Junio C Hamano
2017-12-06 21:20     ` Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).