git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Victoria Dye <vdye@github.com>
To: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Cc: derrickstolee@github.com, git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH v5 3/3] builtin/grep.c: walking tree instead of expanding index with --sparse
Date: Fri, 9 Sep 2022 19:04:40 -0700	[thread overview]
Message-ID: <5d4a3cc5-d68e-0e8d-0792-e1e8d60bcfd1@github.com> (raw)
In-Reply-To: <20220908001854.206789-4-shaoxuan.yuan02@gmail.com>

Shaoxuan Yuan wrote:
> +
> +	/*
> +	 * NEEDSWORK: when reading a submodule, the sparsity settings in the
> +	 * superproject are incorrectly forgotten or misused. For example:
> +	 *
> +	 * 1. "command_requires_full_index"
> +	 * 	When this setting is turned on for `grep`, only the superproject
> +	 *	knows it. All the submodules are read with their own configs
> +	 *	and get prepare_repo_settings()'d. Therefore, these submodules
> +	 *	"forget" the sparse-index feature switch. As a result, the index
> +	 *	of these submodules are expanded unexpectedly.
> +	 *
> +	 * 2. "core_apply_sparse_checkout"
> +	 *	When running `grep` in the superproject, this setting is
> +	 *	populated using the superproject's configs. However, once
> +	 *	initialized, this config is globally accessible and is read by
> +	 *	prepare_repo_settings() for the submodules. For instance, if a
> +	 *	submodule is using a sparse-checkout, however, the superproject
> +	 *	is not, the result is that the config from the superproject will
> +	 *	dictate the behavior for the submodule, making it "forget" its
> +	 *	sparse-checkout state.
> +	 *
> +	 * 3. "core_sparse_checkout_cone"
> +	 *	ditto.

These are interesting observations, thank you for describing the behavior in
detail.

- #1 might seem like an easy fix - since 'command_requires_full_index' is
  tied to the command (not properties of the repo), the logical thing to do
  would be to propagate the value from the superproject to the subproject.
  However, that fix will undoubtedly expose lots of places where we're not
  handling the sparse index correctly in submodules. Since this isn't a
  problem introduced by your patch series, I'm content leaving this for a
  later series.
- #2 is an odd situation, but I'm guessing that the effect here will be
  minimal (since, regardless of the 'core_*' sparse-checkout globals,
  'SKIP_WORKTREE' will still be applied to - and respected on - entries in
  the index). It's more worrisome for commands that recurse submodules and
  *write* the index (e.g., 'git read-tree'), but that's also outside the
  scope of this series.

Given this information, I think your approach is (for the time being) a safe
one. Beyond the submodule issues, I'm happy with the rest of your
'grep_tree()' updates.

> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 63becc3138..fda05faadf 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -1987,7 +2000,48 @@ test_expect_success 'grep is not expanded' '
>  
>  	# All files within the folder1/* pathspec are sparse,
>  	# so this command does not find any matches
> -	ensure_not_expanded ! grep a -- folder1/*
> +	ensure_not_expanded ! grep a -- folder1/* &&
> +
> +	# test out-of-cone pathspec with or without wildcard
> +	ensure_not_expanded grep --sparse --cached a -- "folder1/a" &&
> +	ensure_not_expanded grep --sparse --cached a -- "folder1/*" &&
> +
> +	# test in-cone pathspec with or without wildcard
> +	ensure_not_expanded grep --sparse --cached a -- "deep/a" &&
> +	ensure_not_expanded grep --sparse --cached a -- "deep/*"

Thanks for the new tests (re: [1])! 

[1] https://lore.kernel.org/git/4b65d7dc-e711-43a6-8763-62be79a3e4a9@github.com/

> +'
> +
> +# NEEDSWORK: when running `grep` in the superproject with --recurse-submodules,
> +# Git expands the index of the submodules unexpectedly. Even though `grep`
> +# builtin is marked as "command_requires_full_index = 0", this config is only
> +# useful for the superproject. Namely, the submodules have their own configs,
> +# which are _not_ populated by the one-time sparse-index feature switch.
> +test_expect_failure 'grep within submodules is not expanded' '
> +	init_repos_as_submodules &&
> +
> +	# do not use ensure_not_expanded() here, becasue `grep` should be
> +	# run in the superproject, not in "./sparse-index"
> +	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
> +	git grep --sparse --cached --recurse-submodules a -- "*/folder1/*" &&
> +	test_region ! index ensure_full_index trace2.txt
> +'

So this test is *only* demonstrating that the submodules' indexes are
expanded (incorrectly, hence the 'test_expect_failure'); it doesn't show
that 'git grep' returns the correct results...

> +
> +# NEEDSWORK: this test is not actually testing the code. The design purpose
> +# of this test is to verify the grep result when the submodules are using a
> +# sparse-index. Namely, we want "folder1/" as a tree (a sparse directory); but
> +# because of the index expansion, we are now grepping the "folder1/a" blob.
> +# Because of the problem stated above 'grep within submodules is not expanded',
> +# we don't have the ideal test environment yet.
> +test_expect_success 'grep sparse directory within submodules' '
> +	init_repos_as_submodules &&
> +
> +	cat >expect <<-\EOF &&
> +	full-checkout/folder1/a:a
> +	sparse-checkout/folder1/a:a
> +	sparse-index/folder1/a:a
> +	EOF
> +	git grep --sparse --cached --recurse-submodules a -- "*/folder1/*" >actual &&
> +	test_cmp actual expect
>  '

...but this test *does* show that those results are correct. I think it was
a good decision to keep the two separate, since only the index expansion
behavior is wrong (thus warranting the 'test_expect_failure'). The output of
'git grep' is still what we want it to be, so it gets a
'test_expect_success'.

>  
>  test_done


  parent reply	other threads:[~2022-09-10  2:04 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-17  7:56 [PATCH v1 0/2] grep: integrate with sparse index Shaoxuan Yuan
2022-08-17  7:56 ` [PATCH v1 1/2] builtin/grep.c: add --sparse option Shaoxuan Yuan
2022-08-17 14:12   ` Derrick Stolee
2022-08-17 17:13     ` Junio C Hamano
2022-08-17 17:34       ` Victoria Dye
2022-08-17 17:43         ` Derrick Stolee
2022-08-17 18:47           ` Junio C Hamano
2022-08-17 17:37     ` Elijah Newren
2022-08-24 18:20     ` Shaoxuan Yuan
2022-08-24 19:08       ` Derrick Stolee
2022-08-17  7:56 ` [PATCH v1 2/2] builtin/grep.c: integrate with sparse index Shaoxuan Yuan
2022-08-17 14:23   ` Derrick Stolee
2022-08-24 21:06     ` Shaoxuan Yuan
2022-08-25  0:39       ` Derrick Stolee
2022-08-17 13:46 ` [PATCH v1 0/2] grep: " Derrick Stolee
2022-08-29 23:28 ` [PATCH v2 " Shaoxuan Yuan
2022-08-29 23:28   ` [PATCH v2 1/2] builtin/grep.c: add --sparse option Shaoxuan Yuan
2022-08-29 23:28   ` [PATCH v2 2/2] builtin/grep.c: integrate with sparse index Shaoxuan Yuan
2022-08-30 13:45     ` Derrick Stolee
2022-09-01  4:57 ` [PATCH v3 0/3] grep: " Shaoxuan Yuan
2022-09-01  4:57   ` [PATCH v3 1/3] builtin/grep.c: add --sparse option Shaoxuan Yuan
2022-09-01  4:57   ` [PATCH v3 2/3] builtin/grep.c: integrate with sparse index Shaoxuan Yuan
2022-09-01  4:57   ` [PATCH v3 3/3] builtin/grep.c: walking tree instead of expanding index with --sparse Shaoxuan Yuan
2022-09-01 17:03     ` Derrick Stolee
2022-09-01 18:31       ` Shaoxuan Yuan
2022-09-01 17:17     ` Junio C Hamano
2022-09-01 17:27       ` Junio C Hamano
2022-09-01 22:49         ` Shaoxuan Yuan
2022-09-01 22:36       ` Shaoxuan Yuan
2022-09-02  3:28     ` Victoria Dye
2022-09-02 18:47       ` Shaoxuan Yuan
2022-09-03  0:36 ` [PATCH v4 0/3] grep: integrate with sparse index Shaoxuan Yuan
2022-09-03  0:36   ` [PATCH v4 1/3] builtin/grep.c: add --sparse option Shaoxuan Yuan
2022-09-03  0:36   ` [PATCH v4 2/3] builtin/grep.c: integrate with sparse index Shaoxuan Yuan
2022-09-03  0:36   ` [PATCH v4 3/3] builtin/grep.c: walking tree instead of expanding index with --sparse Shaoxuan Yuan
2022-09-03  4:39     ` Junio C Hamano
2022-09-08  0:24       ` Shaoxuan Yuan
2022-09-08  0:18 ` [PATCH v5 0/3] grep: integrate with sparse index Shaoxuan Yuan
2022-09-08  0:18   ` [PATCH v5 1/3] builtin/grep.c: add --sparse option Shaoxuan Yuan
2022-09-10  1:07     ` Victoria Dye
2022-09-14  6:08     ` Elijah Newren
2022-09-15  2:57       ` Junio C Hamano
2022-09-18  2:14         ` Elijah Newren
2022-09-18 19:52           ` Victoria Dye
2022-09-19  1:23             ` Junio C Hamano
2022-09-19  4:27             ` Shaoxuan Yuan
2022-09-19 11:03             ` Ævar Arnfjörð Bjarmason
2022-09-20  7:13             ` Elijah Newren
2022-09-17  3:34       ` Shaoxuan Yuan
2022-09-18  4:24         ` Elijah Newren
2022-09-19  4:13           ` Shaoxuan Yuan
2022-09-17  3:45       ` Shaoxuan Yuan
2022-09-08  0:18   ` [PATCH v5 2/3] builtin/grep.c: integrate with sparse index Shaoxuan Yuan
2022-09-08  0:18   ` [PATCH v5 3/3] builtin/grep.c: walking tree instead of expanding index with --sparse Shaoxuan Yuan
2022-09-08 17:59     ` Junio C Hamano
2022-09-08 20:46       ` Derrick Stolee
2022-09-08 20:56         ` Junio C Hamano
2022-09-08 21:06           ` Shaoxuan Yuan
2022-09-09 12:49           ` Derrick Stolee
2022-09-13 17:23         ` Junio C Hamano
2022-09-10  2:04     ` Victoria Dye [this message]
2022-09-23  4:18 ` [PATCH v6 0/1] grep: integrate with sparse index Shaoxuan Yuan
2022-09-23  4:18   ` [PATCH v6 1/1] builtin/grep.c: " Shaoxuan Yuan
2022-09-23 16:40     ` Junio C Hamano
2022-09-23 16:58     ` Junio C Hamano
2022-09-26 17:28       ` Junio C Hamano
2022-09-23 14:13   ` [PATCH v6 0/1] grep: " Derrick Stolee
2022-09-23 16:01   ` Victoria Dye
2022-09-23 17:08     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5d4a3cc5-d68e-0e8d-0792-e1e8d60bcfd1@github.com \
    --to=vdye@github.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=shaoxuan.yuan02@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).