From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org, gitster@pobox.com, jonathantanmy@google.com
Subject: Re: [RFC PATCH 0/3] grep: don't add subrepos to in-memory alternates
Date: Mon, 27 Sep 2021 19:30:56 +0200
Message-ID: <87zgryylfx.fsf@evledraar.gmail.com>
In-Reply-To: <296230071f2cabda213b2a5f6f435f9308718569.1632754555.git.matheus.bernardino@usp.br>
On Mon, Sep 27 2021, Matheus Tavares wrote:
> On Mon, Sep 27, 2021 at 9:09 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>
>>
>> On Mon, Aug 16 2021, Jonathan Tan wrote:
>>
>> > Record the repository whenever an OID grep source is created, and teach
>> > the worker threads to explicitly provide the repository when accessing
>> > objects.
>> > [...]
>> > diff --git a/grep.h b/grep.h
>> > index 480b3f5bba..128007db65 100644
>> > --- a/grep.h
>> > +++ b/grep.h
>> > @@ -120,7 +120,20 @@ struct grep_opt {
>> > struct grep_pat *header_list;
>> > struct grep_pat **header_tail;
>> > struct grep_expr *pattern_expression;
>> > +
>> > + /*
>> > + * NEEDSWORK: See if we can remove this field, because the repository
>> > + * should probably be per-source. That is, grep.c functions using this
>> > + * field should probably start using "repo" in "struct grep_source"
>> > + * instead.
>> > + *
>> > + * This is potentially the cause of at least one bug - "git grep"
>> > + * ignoring the textconv attributes from submodules. See [1] for more
>> > + * information.
>> > + * [1] https://lore.kernel.org/git/CAHd-oW5iEQarYVxEXoTG-ua2zdoybTrSjCBKtO0YT292fm0NQQ@mail.gmail.com/
>> > + */
>> > struct repository *repo;
>> > +
>>
>> I ran into this comment and read the linked E-Mail, and then the
>> downthread
>> https://lore.kernel.org/git/CAHd-oW6uG1fap-T4UF17bJmjoHAqWCDq9KbY+_8a3cEnnfATxg@mail.gmail.com/;
>>
>> Given Matheus's "I've somehow missed this guard and the..." there I'm
>> not quite sure what/if we should be doing here & what this comment is
>> recommending? I.e. do we still need to adjust the call chains as noted
>> in the E-Mail the comment links to, or not?
>
> I think we should still adjust the call chains, yes. The downthread
> message you mentioned is kind of a tangent about performance, where
> Junio helped me understand something I had previously missed in the
> code, regarding the persistence of the attributes stack.
>
> But the issue that started the thread was about a correctness problem:
> the superproject textconv attributes are being used on submodules'
> files when running `git grep` with `--recurse-submodules --textconv`.
> The three cases to consider are:
>
> - .gitattributes from the working tree
> - .gitattributes from the index
> - .git/info/attributes
>
> In all these cases, the superproject attributes are being used on the
> submodule. Additionally, if the superproject does not define any
> attribute, the submodule attributes are being ignored in all cases
> except the first one. That exception exists only because the code sees
> the .gitattributes file in the submodule as if it were in a "regular"
> subdirectory of the superproject, so the submodule's .gitattributes
> takes higher precedence when evaluating the attributes for files in
> that directory.
>
> Another issue is that the textconv cache is always saved to (and read
> from) the superproject gitdir, even for submodules' files.
>
> Here are some test cases that demonstrate these issues:
>
> -- snipsnap --
> diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
> index 3172f5b936..d01a3bc5d8 100755
> --- a/t/t7814-grep-recurse-submodules.sh
> +++ b/t/t7814-grep-recurse-submodules.sh
> @@ -441,4 +441,104 @@ test_expect_success 'grep --recurse-submodules with --cached ignores worktree mo
> test_must_fail git grep --recurse-submodules --cached "A modified line in submodule" >actual 2>&1 &&
> test_must_be_empty actual
> '
> +
> +test_expect_failure 'grep --textconv: superproject .gitattributes does not affect submodules' '
> + reset_and_clean &&
> + test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
> + echo "a diff=d2x" >.gitattributes &&
> +
> + cat >expect <<-\EOF &&
> + a:(1|2)x(3|4)
> + EOF
> + git grep --textconv --recurse-submodules x >actual &&
> + test_cmp expect actual
> +'
> +
> +test_expect_failure 'grep --textconv: superproject .gitattributes (from index) does not affect submodules' '
> + reset_and_clean &&
> + test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
> + echo "a diff=d2x" >.gitattributes &&
> + git add .gitattributes &&
> + rm .gitattributes &&
> +
> + cat >expect <<-\EOF &&
> + a:(1|2)x(3|4)
> + EOF
> + git grep --textconv --recurse-submodules x >actual &&
> + test_cmp expect actual
> +'
> +
> +test_expect_failure 'grep --textconv: superproject .git/info/attributes does not affect submodules' '
> + reset_and_clean &&
> + test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
> + super_attr="$(git rev-parse --path-format=relative --git-path info/attributes)" &&
> + test_when_finished rm -f "$super_attr" &&
> + echo "a diff=d2x" >"$super_attr" &&
> +
> + cat >expect <<-\EOF &&
> + a:(1|2)x(3|4)
> + EOF
> + git grep --textconv --recurse-submodules x >actual &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'grep --textconv correctly reads submodule .gitattributes' '
> + reset_and_clean &&
> + test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
> + echo "a diff=d2x" >submodule/.gitattributes &&
> +
> + cat >expect <<-\EOF &&
> + submodule/a:(1|2)x(3|4)
> + EOF
> + git grep --textconv --recurse-submodules x >actual &&
> + test_cmp expect actual
> +'
> +
> +test_expect_failure 'grep --textconv correctly reads submodule .gitattributes (from index)' '
> + reset_and_clean &&
> + test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
> + echo "a diff=d2x" >submodule/.gitattributes &&
> + git -C submodule add .gitattributes &&
> + rm submodule/.gitattributes &&
> +
> + cat >expect <<-\EOF &&
> + submodule/a:(1|2)x(3|4)
> + EOF
> + git grep --textconv --recurse-submodules x >actual &&
> + test_cmp expect actual
> +'
> +
> +test_expect_failure 'grep --textconv correctly reads submodule .git/info/attributes' '
> + reset_and_clean &&
> + test_config_global diff.d2x.textconv "sed -e \"s/d/x/\"" &&
> +
> + # Workaround: we use --path-format=relative because the absolute path
> + # contains whitespace and that seems to confuse test_when_finished
> + #
> + submodule_attr="submodule/$(git -C submodule rev-parse --path-format=relative --git-path info/attributes)" &&
> + test_when_finished rm -f "$submodule_attr" &&
> + echo "a diff=d2x" >"$submodule_attr" &&
> +
> + cat >expect <<-\EOF &&
> + submodule/a:(1|2)x(3|4)
> + EOF
> + git grep --textconv --recurse-submodules x >actual &&
> + test_cmp expect actual
> +'
> +
> +test_expect_failure 'grep saves textconv cache in the appropriate repository' '
> + reset_and_clean &&
> + test_config_global diff.d2x_cached.textconv "sed -e \"s/d/x/\"" &&
> + test_config_global diff.d2x_cached.cachetextconv true &&
> + echo "a diff=d2x_cached" >submodule/.gitattributes &&
> +
> + # Note: we only read/write to the textconv cache when grepping from an
> + # OID as the working tree file might have modifications. That is why
> + # we use --cached here.
> + #
> + git grep --textconv --cached --recurse-submodules x &&
> + test_path_is_missing "$(git rev-parse --git-path refs/notes/textconv/d2x_cached)" &&
> + test_path_is_file "$(git -C submodule rev-parse --git-path refs/notes/textconv/d2x_cached)"
> +'
> +
> test_done
Thanks! I think it would be very good to have these tests in-tree along
with an updated comment pointing to them.
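
For anyone reading the expected output in those tests without the fixtures
at hand, here is a small stand-alone sketch of what the d2x filter does to
a matching line. It is not part of the proposed tests. The file content
"(1|2)d(3|4)" is an assumption inferred from the expected output, and the
textconv shell function and the temporary directory are hypothetical
stand-ins for Git running the configured command on a blob.

```shell
#!/bin/sh
# Hypothetical stand-alone illustration; not part of the test suite.
tmp=$(mktemp -d) || exit 1

# Assumption: the fixture file "a" holds this single line.
printf '%s\n' '(1|2)d(3|4)' >"$tmp/a"

# Same filter the tests register as diff.d2x.textconv. Git invokes a
# textconv command with the file path as its final argument.
textconv() {
	sed -e "s/d/x/" "$1"
}

# Without conversion, the pattern "x" matches nothing.
raw_match=$(grep x "$tmp/a" || true)

# With conversion, the first "d" on the line becomes "x" and matches.
converted_match=$(textconv "$tmp/a" | grep x || true)

printf 'raw: [%s]\n' "$raw_match"
printf 'converted: [%s]\n' "$converted_match"

rm -rf "$tmp"
```

So a line such as "a:(1|2)x(3|4)" in the output of "git grep --textconv x"
indicates that the d2x filter was applied to that path. That is the signal
the tests use to tell whose attributes were in effect.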