git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] grep: demonstrate bug with textconv attributes and submodules
@ 2021-09-28 17:08 Matheus Tavares
  2021-09-28 17:15 ` Eric Sunshine
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Matheus Tavares @ 2021-09-28 17:08 UTC (permalink / raw)
  To: git; +Cc: avarab, gitster, jonathantanmy

In some circumstances, "git grep --textconv --recurse-submodules"
ignores the textconv attributes from the submodules and erroneuosly
apply the attributes defined in the superproject on the submodules'
files. The textconv cache is also saved on the superproject, even for
submodule objects.

A fix for these problems will probably require at least three changes:

- Some textconv and attributes functions (as well as their callees) will
  have to be adjusted to work with arbitrary repositories. Note that
  "fill_textconv()", for example, already receives a "struct repository"
  but it writes the textconv cache using "write_loose_object()", which
  implicitly works on "the_repository".

- grep.c functions will have to call textconv/userdiff routines passing
  the "repo" field from "struct grep_source" instead of the one from
  "struct grep_opt". The latter always points to "the_repository" on
  "git grep" executions (see its initialization in builtin/grep.c), but
  the former points to the correct repository that each source (an
  object, file, or buffer) comes from.

- "userdiff_find_by_path()" might need to use a different attributes
  stack for each repository it works on or reset its internal static
  stack when the repository is changed throughout the calls.

For now, let's add some tests to demonstrate these problems, and also
update a NEEDSWORK comment in grep.h that mentions this bug to reference
the added tests.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 grep.h                             |   6 +-
 t/t7814-grep-recurse-submodules.sh | 103 +++++++++++++++++++++++++++++
 2 files changed, 106 insertions(+), 3 deletions(-)

diff --git a/grep.h b/grep.h
index 128007db65..3b63bd0253 100644
--- a/grep.h
+++ b/grep.h
@@ -128,9 +128,9 @@ struct grep_opt {
 	 * instead.
 	 *
 	 * This is potentially the cause of at least one bug - "git grep"
-	 * ignoring the textconv attributes from submodules. See [1] for more
-	 * information.
-	 * [1] https://lore.kernel.org/git/CAHd-oW5iEQarYVxEXoTG-ua2zdoybTrSjCBKtO0YT292fm0NQQ@mail.gmail.com/
+	 * using the textconv attributes from the superproject on the
+	 * submodules. See the failing "git grep --textconv" tests in
+	 * t7814-grep-recurse-submodules.sh for more information.
 	 */
 	struct repository *repo;
 
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 3172f5b936..cfbaee3851 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -441,4 +441,107 @@ test_expect_success 'grep --recurse-submodules with --cached ignores worktree mo
 	test_must_fail git grep --recurse-submodules --cached "A modified line in submodule" >actual 2>&1 &&
 	test_must_be_empty actual
 '
+
+test_expect_failure 'grep --textconv: superproject .gitattributes does not affect submodules' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	echo "a diff=d2x" >.gitattributes &&
+
+	cat >expect <<-\EOF &&
+	a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep --textconv: superproject .gitattributes (from index) does not affect submodules' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	echo "a diff=d2x" >.gitattributes &&
+	git add .gitattributes &&
+	rm .gitattributes &&
+
+	cat >expect <<-\EOF &&
+	a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep --textconv: superproject .git/info/attributes does not affect submodules' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	super_attr="$(git rev-parse --git-path info/attributes)" &&
+	test_when_finished "rm -f \"$super_attr\"" &&
+	echo "a diff=d2x" >"$super_attr" &&
+
+	cat >expect <<-\EOF &&
+	a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+# Note: what currently prevents this test from passing is not that the
+# .gitattributes file from "./submodule" is being ignored, but that it is being
+# propagated to the nested "./submodule/sub" files.
+#
+test_expect_failure 'grep --textconv corectly reads submodule .gitattributes' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	echo "a diff=d2x" >submodule/.gitattributes &&
+
+	cat >expect <<-\EOF &&
+	submodule/a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep --textconv corectly reads submodule .gitattributes (from index)' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	echo "a diff=d2x" >submodule/.gitattributes &&
+	git -C submodule add .gitattributes &&
+	rm submodule/.gitattributes &&
+
+	cat >expect <<-\EOF &&
+	submodule/a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep --textconv corectly reads submodule .git/info/attributes' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+
+	submodule_attr="$(git -C submodule rev-parse --path-format=absolute --git-path info/attributes)" &&
+	test_when_finished "rm -f \"$submodule_attr\"" &&
+	echo "a diff=d2x" >"$submodule_attr" &&
+
+	cat >expect <<-\EOF &&
+	submodule/a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep saves textconv cache in the appropriated repository' '
+	reset_and_clean &&
+	test_config_global diff.d2x_cached.textconv "sed -e \"s/d/x/\"" &&
+	test_config_global diff.d2x_cached.cachetextconv true &&
+	echo "a diff=d2x_cached" >submodule/.gitattributes &&
+
+	# We only read/write to the textconv cache when grepping from an OID,
+	# as the working tree file might have modifications.
+	git grep --textconv --cached --recurse-submodules x &&
+
+	super_textconv_cache="$(git rev-parse --git-path refs/notes/textconv/d2x_cached)" &&
+	sub_textconv_cache="$(git -C submodule rev-parse \
+			--path-format=absolute --git-path refs/notes/textconv/d2x_cached)" &&
+	test_path_is_missing "$super_textconv_cache" &&
+	test_path_is_file "$sub_textconv_cache"
+'
+
 test_done
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] grep: demonstrate bug with textconv attributes and submodules
  2021-09-28 17:08 [PATCH] grep: demonstrate bug with textconv attributes and submodules Matheus Tavares
@ 2021-09-28 17:15 ` Eric Sunshine
  2021-09-28 17:18   ` Matheus Tavares
  2021-09-28 17:16 ` Matheus Tavares
  2021-09-29 12:24 ` [PATCH v2] " Matheus Tavares
  2 siblings, 1 reply; 5+ messages in thread
From: Eric Sunshine @ 2021-09-28 17:15 UTC (permalink / raw)
  To: Matheus Tavares
  Cc: Git List, Ævar Arnfjörð Bjarmason, Junio C Hamano,
	Jonathan Tan

On Tue, Sep 28, 2021 at 1:08 PM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
> In some circumstances, "git grep --textconv --recurse-submodules"
> ignores the textconv attributes from the submodules and erroneuosly
> apply the attributes defined in the superproject on the submodules'
> files. The textconv cache is also saved on the superproject, even for
> submodule objects.

s/erroneuosly/erroneously/

Also, perhaps: s/apply/applies/

> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
> @@ -441,4 +441,107 @@ test_expect_success 'grep --recurse-submodules with --cached ignores worktree mo
> +test_expect_failure 'grep --textconv corectly reads submodule .gitattributes' '

Here and in remaining newly added tests: s/corectly/correctly/

> +test_expect_failure 'grep saves textconv cache in the appropriated repository' '

s/appropriated/appropriate/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] grep: demonstrate bug with textconv attributes and submodules
  2021-09-28 17:08 [PATCH] grep: demonstrate bug with textconv attributes and submodules Matheus Tavares
  2021-09-28 17:15 ` Eric Sunshine
@ 2021-09-28 17:16 ` Matheus Tavares
  2021-09-29 12:24 ` [PATCH v2] " Matheus Tavares
  2 siblings, 0 replies; 5+ messages in thread
From: Matheus Tavares @ 2021-09-28 17:16 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason, Junio C Hamano,
	Jonathan Tan

On Tue, Sep 28, 2021 at 2:08 PM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> Subject: [PATCH] grep: demonstrate bug with textconv attributes and submodules

I probably should have referenced (or linked to) the thread this came
from: https://lore.kernel.org/git/87zgryylfx.fsf@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] grep: demonstrate bug with textconv attributes and submodules
  2021-09-28 17:15 ` Eric Sunshine
@ 2021-09-28 17:18   ` Matheus Tavares
  0 siblings, 0 replies; 5+ messages in thread
From: Matheus Tavares @ 2021-09-28 17:18 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Git List, Ævar Arnfjörð Bjarmason, Junio C Hamano,
	Jonathan Tan

On Tue, Sep 28, 2021 at 2:16 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Tue, Sep 28, 2021 at 1:08 PM Matheus Tavares
> <matheus.bernardino@usp.br> wrote:
> > In some circumstances, "git grep --textconv --recurse-submodules"
> > ignores the textconv attributes from the submodules and erroneuosly
> > apply the attributes defined in the superproject on the submodules'
> > files. The textconv cache is also saved on the superproject, even for
> > submodule objects.
>
> s/erroneuosly/erroneously/
>
> Also, perhaps: s/apply/applies/
>
> > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> > diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
> > @@ -441,4 +441,107 @@ test_expect_success 'grep --recurse-submodules with --cached ignores worktree mo
> > +test_expect_failure 'grep --textconv corectly reads submodule .gitattributes' '
>
> Here and in remaining newly added tests: s/corectly/correctly/
>
> > +test_expect_failure 'grep saves textconv cache in the appropriated repository' '
>
> s/appropriated/appropriate/

Oops, I should have been more careful with the writing. Thanks for
catching those!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2] grep: demonstrate bug with textconv attributes and submodules
  2021-09-28 17:08 [PATCH] grep: demonstrate bug with textconv attributes and submodules Matheus Tavares
  2021-09-28 17:15 ` Eric Sunshine
  2021-09-28 17:16 ` Matheus Tavares
@ 2021-09-29 12:24 ` Matheus Tavares
  2 siblings, 0 replies; 5+ messages in thread
From: Matheus Tavares @ 2021-09-29 12:24 UTC (permalink / raw)
  To: git; +Cc: avarab, gitster, jonathantanmy, sunshine

In some circumstances, "git grep --textconv --recurse-submodules"
ignores the textconv attributes from the submodules and erroneously
applies the attributes defined in the superproject on the submodules'
files. The textconv cache is also saved on the superproject, even for
submodule objects.

A fix for these problems will probably require at least three changes:

- Some textconv and attributes functions (as well as their callees) will
  have to be adjusted to work with arbitrary repositories. Note that
  "fill_textconv()", for example, already receives a "struct repository"
  but it writes the textconv cache using "write_loose_object()", which
  implicitly works on "the_repository".

- grep.c functions will have to call textconv/userdiff routines passing
  the "repo" field from "struct grep_source" instead of the one from
  "struct grep_opt". The latter always points to "the_repository" on
  "git grep" executions (see its initialization in builtin/grep.c), but
  the former points to the correct repository that each source (an
  object, file, or buffer) comes from.

- "userdiff_find_by_path()" might need to use a different attributes
  stack for each repository it works on or reset its internal static
  stack when the repository is changed throughout the calls.

For now, let's add some tests to demonstrate these problems, and also
update a NEEDSWORK comment in grep.h that mentions this bug to reference
the added tests.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---

Changed in v2: fixed typos in commit message and test names

 grep.h                             |   6 +-
 t/t7814-grep-recurse-submodules.sh | 103 +++++++++++++++++++++++++++++
 2 files changed, 106 insertions(+), 3 deletions(-)

diff --git a/grep.h b/grep.h
index 128007db65..3b63bd0253 100644
--- a/grep.h
+++ b/grep.h
@@ -128,9 +128,9 @@ struct grep_opt {
 	 * instead.
 	 *
 	 * This is potentially the cause of at least one bug - "git grep"
-	 * ignoring the textconv attributes from submodules. See [1] for more
-	 * information.
-	 * [1] https://lore.kernel.org/git/CAHd-oW5iEQarYVxEXoTG-ua2zdoybTrSjCBKtO0YT292fm0NQQ@mail.gmail.com/
+	 * using the textconv attributes from the superproject on the
+	 * submodules. See the failing "git grep --textconv" tests in
+	 * t7814-grep-recurse-submodules.sh for more information.
 	 */
 	struct repository *repo;
 
diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 3172f5b936..058e5d0c96 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -441,4 +441,107 @@ test_expect_success 'grep --recurse-submodules with --cached ignores worktree mo
 	test_must_fail git grep --recurse-submodules --cached "A modified line in submodule" >actual 2>&1 &&
 	test_must_be_empty actual
 '
+
+test_expect_failure 'grep --textconv: superproject .gitattributes does not affect submodules' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	echo "a diff=d2x" >.gitattributes &&
+
+	cat >expect <<-\EOF &&
+	a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep --textconv: superproject .gitattributes (from index) does not affect submodules' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	echo "a diff=d2x" >.gitattributes &&
+	git add .gitattributes &&
+	rm .gitattributes &&
+
+	cat >expect <<-\EOF &&
+	a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep --textconv: superproject .git/info/attributes does not affect submodules' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	super_attr="$(git rev-parse --git-path info/attributes)" &&
+	test_when_finished "rm -f \"$super_attr\"" &&
+	echo "a diff=d2x" >"$super_attr" &&
+
+	cat >expect <<-\EOF &&
+	a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+# Note: what currently prevents this test from passing is not that the
+# .gitattributes file from "./submodule" is being ignored, but that it is being
+# propagated to the nested "./submodule/sub" files.
+#
+test_expect_failure 'grep --textconv correctly reads submodule .gitattributes' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	echo "a diff=d2x" >submodule/.gitattributes &&
+
+	cat >expect <<-\EOF &&
+	submodule/a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep --textconv correctly reads submodule .gitattributes (from index)' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+	echo "a diff=d2x" >submodule/.gitattributes &&
+	git -C submodule add .gitattributes &&
+	rm submodule/.gitattributes &&
+
+	cat >expect <<-\EOF &&
+	submodule/a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep --textconv correctly reads submodule .git/info/attributes' '
+	reset_and_clean &&
+	test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
+
+	submodule_attr="$(git -C submodule rev-parse --path-format=absolute --git-path info/attributes)" &&
+	test_when_finished "rm -f \"$submodule_attr\"" &&
+	echo "a diff=d2x" >"$submodule_attr" &&
+
+	cat >expect <<-\EOF &&
+	submodule/a:(1|2)x(3|4)
+	EOF
+	git grep --textconv --recurse-submodules x >actual &&
+	test_cmp expect actual
+'
+
+test_expect_failure 'grep saves textconv cache in the appropriate repository' '
+	reset_and_clean &&
+	test_config_global diff.d2x_cached.textconv "sed -e \"s/d/x/\"" &&
+	test_config_global diff.d2x_cached.cachetextconv true &&
+	echo "a diff=d2x_cached" >submodule/.gitattributes &&
+
+	# We only read/write to the textconv cache when grepping from an OID,
+	# as the working tree file might have modifications.
+	git grep --textconv --cached --recurse-submodules x &&
+
+	super_textconv_cache="$(git rev-parse --git-path refs/notes/textconv/d2x_cached)" &&
+	sub_textconv_cache="$(git -C submodule rev-parse \
+			--path-format=absolute --git-path refs/notes/textconv/d2x_cached)" &&
+	test_path_is_missing "$super_textconv_cache" &&
+	test_path_is_file "$sub_textconv_cache"
+'
+
 test_done
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-09-29 12:24 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-28 17:08 [PATCH] grep: demonstrate bug with textconv attributes and submodules Matheus Tavares
2021-09-28 17:15 ` Eric Sunshine
2021-09-28 17:18   ` Matheus Tavares
2021-09-28 17:16 ` Matheus Tavares
2021-09-29 12:24 ` [PATCH v2] " Matheus Tavares

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).