git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] sparse-index: expand/collapse based on 'index.sparse'
@ 2021-10-15 19:54 Victoria Dye via GitGitGadget
  2021-10-15 19:54 ` [PATCH 1/2] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-15 19:54 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye

This series updates do_read_index to use the index.sparse config setting
when determining whether the index should be expanded or collapsed. If the
command & repo allow use of a sparse index, index.sparse is enabled, and a
full index is read from disk, the index is collapsed before returning to the
caller. Conversely, if index.sparse is disabled but the index read from disk
is sparse, the index is expanded before returning. This allows index.sparse
to control use of the sparse index in addition to its existing control over
how the index is written to disk. It also introduces the ability to
enable/disable the sparse index on a command-by-command basis (e.g.,
allowing a user to troubleshoot a sparse-aware command with '-c
index.sparse=false' [1]).

While testing this change, a bug was found in 'test-tool read-cache' in
which config settings for the repository were not initialized before
preparing the repo settings. This caused index.sparse to always be 'false'
when using the test helper in a cone-mode sparse checkout, breaking tests in
t1091 and t1092. The issue is fixed by moving prepare_repo_settings after
config setup.

[1]
https://lore.kernel.org/git/cc60c6e7-ecef-ae22-8ec7-ab290ff2b830@gmail.com/

Thanks! -Victoria

Victoria Dye (2):
  test-read-cache.c: prepare_repo_settings after config init
  sparse-index: update index read to consider index.sparse config

 read-cache.c                             |  5 +++-
 t/helper/test-read-cache.c               |  5 ++--
 t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++
 3 files changed, 38 insertions(+), 3 deletions(-)


base-commit: f443b226ca681d87a3a31e245a70e6bc2769123c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1059%2Fvdye%2Fsparse%2Findex-sparse-config-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1059/vdye/sparse/index-sparse-config-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1059
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 1/2] test-read-cache.c: prepare_repo_settings after config init
  2021-10-15 19:54 [PATCH 0/2] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
@ 2021-10-15 19:54 ` Victoria Dye via GitGitGadget
  2021-10-15 19:54 ` [PATCH 2/2] sparse-index: update index read to consider index.sparse config Victoria Dye via GitGitGadget
  2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  2 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-15 19:54 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Move `prepare_repo_settings` after the git directory has been set up in
`test-read-cache.c`. The git directory settings must be initialized to
properly assign repo settings using the worktree-level git config.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/helper/test-read-cache.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c
index b52c174acc7..0d9f08931a1 100644
--- a/t/helper/test-read-cache.c
+++ b/t/helper/test-read-cache.c
@@ -39,8 +39,6 @@ int cmd__read_cache(int argc, const char **argv)
 	int table = 0, expand = 0;
 
 	initialize_the_repository();
-	prepare_repo_settings(r);
-	r->settings.command_requires_full_index = 0;
 
 	for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) {
 		if (skip_prefix(*argv, "--print-and-refresh=", &name))
@@ -56,6 +54,9 @@ int cmd__read_cache(int argc, const char **argv)
 	setup_git_directory();
 	git_config(git_default_config, NULL);
 
+	prepare_repo_settings(r);
+	r->settings.command_requires_full_index = 0;
+
 	for (i = 0; i < cnt; i++) {
 		repo_read_index(r);
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-15 19:54 [PATCH 0/2] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  2021-10-15 19:54 ` [PATCH 1/2] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
@ 2021-10-15 19:54 ` Victoria Dye via GitGitGadget
  2021-10-15 21:53   ` Junio C Hamano
  2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  2 siblings, 1 reply; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-15 19:54 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Use the index.sparse config setting to expand or collapse the index when
read. Previously, index.sparse would determine how the index would be
written to disk, but would not enforce whether the index is read into memory
as full or sparse. Now, the index is expanded when a sparse index is read
with `index.sparse=false` and is collapsed to sparse when a full index is
read with `index.sparse=true` (and the command does not require a full
index).

This makes the behavior of `index.sparse` more intuitive, as it now clearly
enables/disables usage of a sparse index. It also provides users with a way
to disable the sparse index per-command (e.g., for troubleshooting purposes)
without needing to re-initialize the sparse-checkout.

Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 read-cache.c                             |  5 +++-
 t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index a78b88a41bf..c3f31718b19 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2338,8 +2338,11 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	if (!istate->repo)
 		istate->repo = the_repository;
 	prepare_repo_settings(istate->repo);
-	if (istate->repo->settings.command_requires_full_index)
+	if (!istate->repo->settings.sparse_index ||
+	    istate->repo->settings.command_requires_full_index)
 		ensure_full_index(istate);
+	else if (!istate->sparse_index)
+		convert_to_sparse(istate, 0);
 
 	return istate->cache_nr;
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index ca91c6a67f8..59accde1fa3 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' '
 	test_region index ensure_full_index trace2.txt
 '
 
+test_expect_success 'index.sparse disabled inline uses full index' '
+	init_repos &&
+
+	# When index.sparse is disabled inline with `git status`, the
+	# index is expanded at the beginning of the execution then never
+	# converted back to sparse. It is then written to disk as a full index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index -c index.sparse=false status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	test_region index ensure_full_index trace2.txt &&
+
+	# Since index.sparse is set to true at a repo level, the index
+	# is converted from full to sparse when read, then never expanded
+	# over the course of `git status`. It is written to disk as a sparse
+	# index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt &&
+
+	# Now that the index has been written to disk as sparse, it is not
+	# converted to sparse (or expanded to full) when read by `git status`.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt
+'
+
 ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-15 19:54 ` [PATCH 2/2] sparse-index: update index read to consider index.sparse config Victoria Dye via GitGitGadget
@ 2021-10-15 21:53   ` Junio C Hamano
  2021-10-17  1:21     ` Derrick Stolee
  0 siblings, 1 reply; 44+ messages in thread
From: Junio C Hamano @ 2021-10-15 21:53 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget; +Cc: git, stolee, Victoria Dye

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Victoria Dye <vdye@github.com>
>
> Use the index.sparse config setting to expand or collapse the index when
> read. Previously, index.sparse would determine how the index would be
> written to disk, but would not enforce whether the index is read into memory
> as full or sparse. Now, the index is expanded when a sparse index is read
> with `index.sparse=false` and is collapsed to sparse when a full index is
> read with `index.sparse=true` (and the command does not require a full
> index).

Instead of calling both in-core index and on-disk index, perhaps use
"the in-core index" as appropriately in the above description and
the result would become much less ambigous.

My knee-jerk reaction was that it is of dubious value to spend
cycles to make the in-core index sparse after reading a non-sparse
index from the disk to give to the caller, but this hurts only the
commands that are not yet sparse-aware and call ensure_full_index()
as the first thing they do.  To them, we are wasting cycles to
shrink and expand for no good reason, and after they are done, the
final writeout would create a sparse on-disk index.

Besides, the on-disk index is expected to be sparse most of the time
when index.sparse is true, so it is hopefully a one-time waste that
corrects by itself.

For all commands that are sparse-aware, especially when asked to
perform their operation on the paths that are not hidden by a
tree-like index entry, it may or may not be a win, but the downside
would be much smaller.  The cost to shrink a full in-core index
before writing out as a sparse one should be comparable to the cost
to shrink a full in-core index just read from the disk before the
sparse-index-aware caller works on it and leaves a still mostly
sparse in-core index to be written out without much extra work to
re-shrink it to the disk.

> This makes the behavior of `index.sparse` more intuitive, as it now clearly
> enables/disables usage of a sparse index.

It is a minor thing, so I am willing to let it pass, but I am not
sure about this claim.  The write-out codepath ensures, independent
of this change, that a full on-disk index is corrected to become
sparse when the configuration is set to true, and a sparse on-disk
index is corrected to become full when the configuration is set to
false, no?

So the only "intuitive"-ness we may be gaining is that the codepaths
that are sparse-aware would work in their "sparse" (non-"sparse")
mode when index.sparse is set to true (false), respectively, no
matter how sparse (or not sparse) the on-disk index they work on is
initially.  That might help debuggability (assuming that converting
between the full and sparse forms are working correctly), but I am
not sure if that is something end users would even care about.

> -	if (istate->repo->settings.command_requires_full_index)
> +	if (!istate->repo->settings.sparse_index ||
> +	    istate->repo->settings.command_requires_full_index)
>  		ensure_full_index(istate);
> +	else if (!istate->sparse_index)
> +		convert_to_sparse(istate, 0);
>  
>  	return istate->cache_nr;

Quite straight-forward.  Looking good.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-15 21:53   ` Junio C Hamano
@ 2021-10-17  1:21     ` Derrick Stolee
  2021-10-17  5:58       ` Junio C Hamano
  0 siblings, 1 reply; 44+ messages in thread
From: Derrick Stolee @ 2021-10-17  1:21 UTC (permalink / raw)
  To: Junio C Hamano, Victoria Dye via GitGitGadget; +Cc: git, Victoria Dye

On 10/15/2021 5:53 PM, Junio C Hamano wrote:
> "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Victoria Dye <vdye@github.com>
>>
>> Use the index.sparse config setting to expand or collapse the index when
>> read. Previously, index.sparse would determine how the index would be
>> written to disk, but would not enforce whether the index is read into memory
>> as full or sparse. Now, the index is expanded when a sparse index is read
>> with `index.sparse=false` and is collapsed to sparse when a full index is
>> read with `index.sparse=true` (and the command does not require a full
>> index).
> 
> Instead of calling both in-core index and on-disk index, perhaps use
> "the in-core index" as appropriately in the above description and
> the result would become much less ambigous.
> 
> My knee-jerk reaction was that it is of dubious value to spend
> cycles to make the in-core index sparse after reading a non-sparse
> index from the disk to give to the caller, but this hurts only the
> commands that are not yet sparse-aware and call ensure_full_index()
> as the first thing they do.  To them, we are wasting cycles to
> shrink and expand for no good reason, and after they are done, the
> final writeout would create a sparse on-disk index.

I think you are slightly mistaken here: If index.sparse=true, then a
full index will be converted to one on write, but not immediately upon
read. This means that subsequent commands will read a sparse index, and
they will either benefit from that or not depending on whether they are
integrated with the sparse index or not.

The new behavior here is that if index.sparse=false, then we convert
a sparse index to a full one upon read, avoiding any chance that a
Git command is operating on a sparse index in-memory.

The simplest summary I can say is here:

* If index.sparse=false, then a sparse index will be converted to
  full upon read.

* If index.sparse=true, then a full index will be converted to sparse
  on write.

> Besides, the on-disk index is expected to be sparse most of the time
> when index.sparse is true, so it is hopefully a one-time waste that
> corrects by itself.
> 
> For all commands that are sparse-aware, especially when asked to
> perform their operation on the paths that are not hidden by a
> tree-like index entry, it may or may not be a win, but the downside
> would be much smaller.  The cost to shrink a full in-core index
> before writing out as a sparse one should be comparable to the cost
> to shrink a full in-core index just read from the disk before the
> sparse-index-aware caller works on it and leaves a still mostly
> sparse in-core index to be written out without much extra work to
> re-shrink it to the disk.
> 
>> This makes the behavior of `index.sparse` more intuitive, as it now clearly
>> enables/disables usage of a sparse index.
> 
> It is a minor thing, so I am willing to let it pass, but I am not
> sure about this claim.  The write-out codepath ensures, independent
> of this change, that a full on-disk index is corrected to become
> sparse when the configuration is set to true, and a sparse on-disk
> index is corrected to become full when the configuration is set to
> false, no?

The previous behavior required an index write for the sparse-to-full
transition. After this change, an index write is still required for the
full-to-sparse transition.

> So the only "intuitive"-ness we may be gaining is that the codepaths
> that are sparse-aware would work in their "sparse" (non-"sparse")
> mode when index.sparse is set to true (false), respectively, no
> matter how sparse (or not sparse) the on-disk index they work on is
> initially.  That might help debuggability (assuming that converting
> between the full and sparse forms are working correctly), but I am
> not sure if that is something end users would even care about.

I think you have a case for concern with the word "intuitive". Even
though I agree that the new setup is the best possible interpretation
of the setting, its inherit asymmetry can be confusing.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-17  1:21     ` Derrick Stolee
@ 2021-10-17  5:58       ` Junio C Hamano
  2021-10-17 19:33         ` Derrick Stolee
  0 siblings, 1 reply; 44+ messages in thread
From: Junio C Hamano @ 2021-10-17  5:58 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Victoria Dye via GitGitGadget, git, Victoria Dye

Derrick Stolee <stolee@gmail.com> writes:

> I think you are slightly mistaken here: If index.sparse=true, then a
> full index will be converted to one on write, but not immediately upon
> read. This means that subsequent commands will read a sparse index, and
> they will either benefit from that or not depending on whether they are
> integrated with the sparse index or not.
>
> The new behavior here is that if index.sparse=false, then we convert
> a sparse index to a full one upon read, avoiding any chance that a
> Git command is operating on a sparse index in-memory.

And if index.sparse=true, then we convert a full on-disk index to a
sparse one in-core upon reading, right?  My comment was solely on
that side of the picture, not on the "index.sparse is set to false
so we automatically expand" case.

> The simplest summary I can say is here:
>
> * If index.sparse=false, then a sparse index will be converted to
>   full upon read.
>
> * If index.sparse=true, then a full index will be converted to sparse
>   on write.

Oh, I see, so yes I was very much misunderstanding what you guys are
trying to do.  I somehow thought that sparse-to-full and
full-to-sparse conversions (1) already happen on the write codepath,
and (2) this patch makes them both happen also on the read codepath.

IOW:

    * If index.sparse=false, a sparse index will be written as full,
      and if it is true, a non-sparse index will be written as
      sparse, even before these patches.

    * In addition, with these patches, if index.sparse=false, a
      sparse index will be expaned to full upon reading, and if it
      is true, a non-sparse index will be shrunk to sparse upon
      reading

was what I was expecting.

What your summary above is saying is very much different.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-17  5:58       ` Junio C Hamano
@ 2021-10-17 19:33         ` Derrick Stolee
  2021-10-18  1:15           ` Junio C Hamano
  0 siblings, 1 reply; 44+ messages in thread
From: Derrick Stolee @ 2021-10-17 19:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Victoria Dye via GitGitGadget, git, Victoria Dye

On 10/17/2021 1:58 AM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>> I think you are slightly mistaken here: If index.sparse=true, then a
>> full index will be converted to one on write, but not immediately upon
>> read. This means that subsequent commands will read a sparse index, and
>> they will either benefit from that or not depending on whether they are
>> integrated with the sparse index or not.
>>
>> The new behavior here is that if index.sparse=false, then we convert
>> a sparse index to a full one upon read, avoiding any chance that a
>> Git command is operating on a sparse index in-memory.
> 
> And if index.sparse=true, then we convert a full on-disk index to a
> sparse one in-core upon reading, right?  My comment was solely on
> that side of the picture, not on the "index.sparse is set to false
> so we automatically expand" case.
> 
>> The simplest summary I can say is here:
>>
>> * If index.sparse=false, then a sparse index will be converted to
>>   full upon read.
>>
>> * If index.sparse=true, then a full index will be converted to sparse
>>   on write.
> 
> Oh, I see, so yes I was very much misunderstanding what you guys are
> trying to do.  I somehow thought that sparse-to-full and
> full-to-sparse conversions (1) already happen on the write codepath,
> and (2) this patch makes them both happen also on the read codepath.
> 
> IOW:
> 
>     * If index.sparse=false, a sparse index will be written as full,
>       and if it is true, a non-sparse index will be written as
>       sparse, even before these patches.

This statement is true.

>     * In addition, with these patches, if index.sparse=false, a
>       sparse index will be expaned to full upon reading, and if it
>       is true, a non-sparse index will be shrunk to sparse upon
>       reading

This is only half true. If "index.sparse=false", then a sparse
index will be expanded to full upon reading. If "index.sparse=true"
then nothing special will happen to an index on reading.
> What your summary above is saying is very much different.

Yes, these are different things. It also gets around the concern that
"if we have a non-integrated command, then index.sparse=true would
convert a full index to a sparse one, only to be expanded again".
That doesn't happen if we only convert from full to sparse on write.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-17 19:33         ` Derrick Stolee
@ 2021-10-18  1:15           ` Junio C Hamano
  2021-10-18 13:25             ` Derrick Stolee
  0 siblings, 1 reply; 44+ messages in thread
From: Junio C Hamano @ 2021-10-18  1:15 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Victoria Dye via GitGitGadget, git, Victoria Dye

Derrick Stolee <stolee@gmail.com> writes:

>>     * In addition, with these patches, if index.sparse=false, a
>>       sparse index will be expaned to full upon reading, and if it
>>       is true, a non-sparse index will be shrunk to sparse upon
>>       reading
>
> This is only half true. If "index.sparse=false", then a sparse
> index will be expanded to full upon reading. If "index.sparse=true"
> then nothing special will happen to an index on reading.

OK.  I somehow got the impression that we convert in both ways from
the patch text under discussion, specifically this part in
do_read_index():

> -	if (istate->repo->settings.command_requires_full_index)
> +	if (!istate->repo->settings.sparse_index ||
> +	    istate->repo->settings.command_requires_full_index)
>  		ensure_full_index(istate);
> +	else if (!istate->sparse_index)
> +		convert_to_sparse(istate, 0);
>  
>  	return istate->cache_nr;

We used to say "when we know we are running a command that is not
sparse ready, expand it here" and nothing else.

We now added a bit more condition for expansion, i.e. "when we are
told that the repository does not want sparse index, or when the
command is not sparse ready".

But the patch goes one more step.  "If we didn't find a reason to
expand to full, and if the in-core index we read is not sparse, then
call convert_to_sparse() on it".  So if the repo settings tells us
that the repository wants a sparse index, and the index we read was
not sparse, what does convert_to_sparse() without MEM_ONLY flag bit
do?  Nothing special?

It internally checks the repo->settings.sparse_index again and
refrains from shrinking the entries, but we are talking about the
case where that repo settings is set to "true", so wouldn't we end
up getting the originally full in-core index turned into sparse?

It is a confusing piece of code.

I see many unconditional calls to convert_to_sparse(istate, 0) on
the write code paths, where I instead would expect "if the repo
wants sparse, call convert_to_sparse(), and if the repo does not
want sparse, call ensure_full_index()", before actualy writing the
entries out.  These also are setting traps to confuse readers.

Perhaps we want a helper function "ensure_right_sparsity(istate)"
that would be called where we have

	if (condition)
		convert_to_sparse();
	else
		ensure_full_index();

or an unconditonal call to convert_to_sparse() to unconfuse readers?

Still puzzled...

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-18  1:15           ` Junio C Hamano
@ 2021-10-18 13:25             ` Derrick Stolee
  2021-10-18 14:14               ` Victoria Dye
  2021-10-18 16:09               ` Junio C Hamano
  0 siblings, 2 replies; 44+ messages in thread
From: Derrick Stolee @ 2021-10-18 13:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Victoria Dye via GitGitGadget, git, Victoria Dye

On 10/17/2021 9:15 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>>>     * In addition, with these patches, if index.sparse=false, a
>>>       sparse index will be expaned to full upon reading, and if it
>>>       is true, a non-sparse index will be shrunk to sparse upon
>>>       reading
>>
>> This is only half true. If "index.sparse=false", then a sparse
>> index will be expanded to full upon reading. If "index.sparse=true"
>> then nothing special will happen to an index on reading.
> 
> OK.  I somehow got the impression that we convert in both ways from
> the patch text under discussion, specifically this part in
> do_read_index():
> 
>> -	if (istate->repo->settings.command_requires_full_index)
>> +	if (!istate->repo->settings.sparse_index ||
>> +	    istate->repo->settings.command_requires_full_index)
>>  		ensure_full_index(istate);
>> +	else if (!istate->sparse_index)
>> +		convert_to_sparse(istate, 0);
>>  
>>  	return istate->cache_nr;
> 
> We used to say "when we know we are running a command that is not
> sparse ready, expand it here" and nothing else.
> 
> We now added a bit more condition for expansion, i.e. "when we are
> told that the repository does not want sparse index, or when the
> command is not sparse ready".
> 
> But the patch goes one more step.  "If we didn't find a reason to
> expand to full, and if the in-core index we read is not sparse, then
> call convert_to_sparse() on it".  So if the repo settings tells us
> that the repository wants a sparse index, and the index we read was
> not sparse, what does convert_to_sparse() without MEM_ONLY flag bit
> do?  Nothing special?

You are absolutely right. I've been talking about what I _thought_
the code does (and should do) but I missed this 'else if' which is
in fact doing what you have been claiming the entire time. I should
have done a better job double-checking the code before doubling
down on my claims.

I think the 'else if' should be removed, which would match my
expectations.

> I see many unconditional calls to convert_to_sparse(istate, 0) on
> the write code paths, where I instead would expect "if the repo
> wants sparse, call convert_to_sparse(), and if the repo does not
> want sparse, call ensure_full_index()", before actualy writing the
> entries out.  These also are setting traps to confuse readers.
> 
> Perhaps we want a helper function "ensure_right_sparsity(istate)"
> that would be called where we have
> 
> 	if (condition)
> 		convert_to_sparse();
> 	else
> 		ensure_full_index();
> 
> or an unconditonal call to convert_to_sparse() to unconfuse readers?

convert_to_sparse() has several checks that prevent the conversion
from happening, such as having a split index. In particular, it will
skip the conversion if the_repository->settings.sparse_index is false.
Thus, calling it unconditionally in the write code is correct.

Doing these conditional checks within convert_to_sparse() make sense
to avoid repeating the conditionals in all callers. ensure_full_index()
doesn't do the same because we absolutely want a full index at the end
of that method.

Perhaps a rename to something like "convert_to_sparse_if_able()" would
make sense? But it might be best to leave that one lie.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-18 13:25             ` Derrick Stolee
@ 2021-10-18 14:14               ` Victoria Dye
  2021-10-21 13:26                 ` Derrick Stolee
  2021-10-18 16:09               ` Junio C Hamano
  1 sibling, 1 reply; 44+ messages in thread
From: Victoria Dye @ 2021-10-18 14:14 UTC (permalink / raw)
  To: Derrick Stolee, Junio C Hamano; +Cc: Victoria Dye via GitGitGadget, git

Derrick Stolee wrote:
> On 10/17/2021 9:15 PM, Junio C Hamano wrote:
>> Derrick Stolee <stolee@gmail.com> writes:
>>
>>>>     * In addition, with these patches, if index.sparse=false, a
>>>>       sparse index will be expaned to full upon reading, and if it
>>>>       is true, a non-sparse index will be shrunk to sparse upon
>>>>       reading
>>>
>>> This is only half true. If "index.sparse=false", then a sparse
>>> index will be expanded to full upon reading. If "index.sparse=true"
>>> then nothing special will happen to an index on reading.
>>
>> OK.  I somehow got the impression that we convert in both ways from
>> the patch text under discussion, specifically this part in
>> do_read_index():
>>
>>> -	if (istate->repo->settings.command_requires_full_index)
>>> +	if (!istate->repo->settings.sparse_index ||
>>> +	    istate->repo->settings.command_requires_full_index)
>>>  		ensure_full_index(istate);
>>> +	else if (!istate->sparse_index)
>>> +		convert_to_sparse(istate, 0);
>>>  
>>>  	return istate->cache_nr;
>>
>> We used to say "when we know we are running a command that is not
>> sparse ready, expand it here" and nothing else.
>>
>> We now added a bit more condition for expansion, i.e. "when we are
>> told that the repository does not want sparse index, or when the
>> command is not sparse ready".
>>
>> But the patch goes one more step.  "If we didn't find a reason to
>> expand to full, and if the in-core index we read is not sparse, then
>> call convert_to_sparse() on it".  So if the repo settings tells us
>> that the repository wants a sparse index, and the index we read was
>> not sparse, what does convert_to_sparse() without MEM_ONLY flag bit
>> do?  Nothing special?
> 
> You are absolutely right. I've been talking about what I _thought_
> the code does (and should do) but I missed this 'else if' which is
> in fact doing what you have been claiming the entire time. I should
> have done a better job double-checking the code before doubling
> down on my claims.
> 
> I think the 'else if' should be removed, which would match my
> expectations.
> 

By leaving that part out, wouldn't you only solve half of the "mismatch"
between in-core index and repo setting? Earlier, you described your
expectation as:

> * If index.sparse=false, then a sparse index will be converted to
>   full upon read.
> 
> * If index.sparse=true, then a full index will be converted to sparse
>   on write.

Why should the direction of change to the setting value (false->true vs
true->false) cause the index to convert at different times? Consider the
scenario:

# In a cone-mode, sparse index-enabled sparse checkout repo
$ git -c index.sparse=false status    # 1
$ git status                          # 2
$ git status                          # 3

Before this patch, the index has the following states per command:

(1) the index is sparse in-core, writes full on-disk
(2) the index is full in-core, writes sparse on-disk
(3) the index is sparse in-core, writes sparse on-disk

Here, I see two mismatches in my expectations: (1) operates on an in-core
sparse index, despite `index.sparse=false`, and (2) operates on an in-core
full index, despite `index.sparse=true`. 

What you're suggesting solves only the first mismatch. However, the second
mismatch incurs the performance hit of a supposedly-sparse command actually
operating on an in-core full index. It also creates an inconsistency between
(2) and (3) in their use of the sparse index. What I'd like this patch to do
is:

(1) the index is full in-core, writes full on-disk
(2) the index is sparse in-core, writes sparse on-disk
(3) the index is sparse in-core, writes sparse on-disk

Here, there are no more mismatches between in-core index usage and what is
written to disk, and (2) and (3) use the same index sparsity.

>> I see many unconditional calls to convert_to_sparse(istate, 0) on
>> the write code paths, where I instead would expect "if the repo
>> wants sparse, call convert_to_sparse(), and if the repo does not
>> want sparse, call ensure_full_index()", before actualy writing the
>> entries out.  These also are setting traps to confuse readers.
>>
>> Perhaps we want a helper function "ensure_right_sparsity(istate)"
>> that would be called where we have
>>
>> 	if (condition)
>> 		convert_to_sparse();
>> 	else
>> 		ensure_full_index();
>>
>> or an unconditonal call to convert_to_sparse() to unconfuse readers?
> 
> convert_to_sparse() has several checks that prevent the conversion
> from happening, such as having a split index. In particular, it will
> skip the conversion if the_repository->settings.sparse_index is false.
> Thus, calling it unconditionally in the write code is correct.
> 

I may have introduced some confusion by redundantly checking
`!istate->sparse_index` before converting to sparse (my goal was readability
- showing the index does not need to be converted to sparse if it is already
sparse - but I think it ended up doing the opposite). The condition could be
removed entirely, thanks to an equivalent check inside `convert_to_sparse`:

-       if (istate->repo->settings.command_requires_full_index)
+       if (!istate->repo->settings.sparse_index ||
+           istate->repo->settings.command_requires_full_index)
                ensure_full_index(istate);
+       else
+               convert_to_sparse(istate, 0);

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-18 13:25             ` Derrick Stolee
  2021-10-18 14:14               ` Victoria Dye
@ 2021-10-18 16:09               ` Junio C Hamano
  1 sibling, 0 replies; 44+ messages in thread
From: Junio C Hamano @ 2021-10-18 16:09 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Victoria Dye via GitGitGadget, git, Victoria Dye

Derrick Stolee <stolee@gmail.com> writes:

> I think the 'else if' should be removed, which would match my
> expectations.
> ...
> convert_to_sparse() has several checks that prevent the conversion
> from happening, such as having a split index. In particular, it will
> skip the conversion if the_repository->settings.sparse_index is false.
> Thus, calling it unconditionally in the write code is correct.

Heh ;-) It was exactly the fact that convert_to_sparse() has many
checks inside that confused me into thinking that the new "else" may
turn into a no-op and the behaviour of the new code matches your
description.  It is confusing, and it being one way street does not
help readers, either.

Which in turn led me into worrying that our unconditional calls to
convert_to_sparse(), which are not accompanied by corresponding
calls to ensure_full_index() on the write code paths, were
problematic.  With the patch under discussion, in a repository that
does not want to see a sparse index, if we did not have "otherwise,
do sparsify the in-core index", lack of ensure_full_index() may not
be a problem, because we are guaranteeing that the in-core index is
fully expanded by the time the control reaches the write code path.

But if we by default shrunk the in-core index upon reading
(i.e. with the "else" clause), its correctness becomes muddy.
Convert_to_sparse() may not turn a non-sparse in-core index into
sparse with these checks, but it would not expand a sparse in-core
index before writing out.

Which in turn led me to suggest, instead of unconditional calls to
convert_to_sparse(), make unconditional calls to a helper function
that fixes the sparseness.  I knew convert_to_sparse() refrains from
touching the in-core index when the repository does not want a
sparse index, but I also am reasonably sure that it lacks support to
expanding the sparse index into full, so it cannot be the "fix to
the correct sparseness" function.

If our promise to a repository that does not want a sparse index is
that the read code path ensures that the in-core index is fully
expanded, and that nobody (including convert_to_sparse()) places a
sparse entry in the in-core index, then the promise would not be
broken, but I still wonder if the whole code is human-reader
friendly.  It wasn't to me ;-)

> Perhaps a rename to something like "convert_to_sparse_if_able()" would
> make sense? But it might be best to leave that one lie.

"-if-able" does not make much sense to me, but "-as-needed" may.  By
not calling it "fix-sparseness", but saying "convert-to-sparse", we
are making it clear that it is a one-way street that may make it
sparse but never removes sparseness.

I do not know if that is the guarantee we want to keep giving the
callers (not saying that I have a hunch that it would be better to
allow us to optionally expand to full in that function---I genuinely
do not know), or we may want to call it a neutral "fix sparseness"
to keep the door open for a future where we may expand an already
sparse in-core index into full based on the setting.

Thanks.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
  2021-10-18 14:14               ` Victoria Dye
@ 2021-10-21 13:26                 ` Derrick Stolee
  0 siblings, 0 replies; 44+ messages in thread
From: Derrick Stolee @ 2021-10-21 13:26 UTC (permalink / raw)
  To: Victoria Dye, Junio C Hamano; +Cc: Victoria Dye via GitGitGadget, git

On 10/18/2021 10:14 AM, Victoria Dye wrote:
> Derrick Stolee wrote:
>> On 10/17/2021 9:15 PM, Junio C Hamano wrote:
>>> OK.  I somehow got the impression that we convert in both ways from
>>> the patch text under discussion, specifically this part in
>>> do_read_index():
>>>
>>>> -	if (istate->repo->settings.command_requires_full_index)
>>>> +	if (!istate->repo->settings.sparse_index ||
>>>> +	    istate->repo->settings.command_requires_full_index)
>>>>  		ensure_full_index(istate);
>>>> +	else if (!istate->sparse_index)
>>>> +		convert_to_sparse(istate, 0);
>>>>  
>>>>  	return istate->cache_nr;
>>>
>>> We used to say "when we know we are running a command that is not
>>> sparse ready, expand it here" and nothing else.
>>>
>>> We now added a bit more condition for expansion, i.e. "when we are
>>> told that the repository does not want sparse index, or when the
>>> command is not sparse ready".
>>>
>>> But the patch goes one more step.  "If we didn't find a reason to
>>> expand to full, and if the in-core index we read is not sparse, then
>>> call convert_to_sparse() on it".  So if the repo settings tells us
>>> that the repository wants a sparse index, and the index we read was
>>> not sparse, what does convert_to_sparse() without MEM_ONLY flag bit
>>> do?  Nothing special?
>>
>> You are absolutely right. I've been talking about what I _thought_
>> the code does (and should do) but I missed this 'else if' which is
>> in fact doing what you have been claiming the entire time. I should
>> have done a better job double-checking the code before doubling
>> down on my claims.
>>
>> I think the 'else if' should be removed, which would match my
>> expectations.
>>
> 
> By leaving that part out, wouldn't you only solve half of the "mismatch"
> between in-core index and repo setting? Earlier, you described your
> expectation as:
> 
>> * If index.sparse=false, then a sparse index will be converted to
>>   full upon read.
>>
>> * If index.sparse=true, then a full index will be converted to sparse
>>   on write.
> 
> Why should the direction of change to the setting value (false->true vs
> true->false) cause the index to convert at different times? Consider the
> scenario:
> 
> # In a cone-mode, sparse index-enabled sparse checkout repo
> $ git -c index.sparse=false status    # 1
> $ git status                          # 2
> $ git status                          # 3
> 
> Before this patch, the index has the following states per command:
> 
> (1) the index is sparse in-core, writes full on-disk
> (2) the index is full in-core, writes sparse on-disk
> (3) the index is sparse in-core, writes sparse on-disk
> 
> Here, I see two mismatches in my expectations: (1) operates on an in-core
> sparse index, despite `index.sparse=false`, and (2) operates on an in-core
> full index, despite `index.sparse=true`. 
> 
> What you're suggesting solves only the first mismatch. However, the second
> mismatch incurs the performance hit of a supposedly-sparse command actually
> operating on an in-core full index. It also creates an inconsistency between
> (2) and (3) in their use of the sparse index. What I'd like this patch to do
> is:
> 
> (1) the index is full in-core, writes full on-disk
> (2) the index is sparse in-core, writes sparse on-disk
> (3) the index is sparse in-core, writes sparse on-disk
> 
> Here, there are no more mismatches between in-core index usage and what is
> written to disk, and (2) and (3) use the same index sparsity.

I suppose that my perspective is that we always need to handle a full
index in-core, because it is a subset of the capabilities of a sparse
index. There is not much value in requiring the in-core index be sparse.

But also, I'm now less concerned about commands that convert from
full-to-sparse on read and expand back to full because of
command_requires_full_index. This should be a short-lived issue because
the index.sparse config is unlikely to be changing frequently, so once
the index is converted to sparse on write, we no longer need to do any
work to convert the in-core index to sparse on read.

The thing to keep in mind is that not every command that reads the index
also writes it. For example, the two 'git status' commands you list might
not write the index if there is no new information in the filesystem that
would trigger an index write.

In short: I've shifted my view and no longer oppose this conversion
immediately upon reading.

>>> I see many unconditional calls to convert_to_sparse(istate, 0) on
>>> the write code paths, where I instead would expect "if the repo
>>> wants sparse, call convert_to_sparse(), and if the repo does not
>>> want sparse, call ensure_full_index()", before actualy writing the
>>> entries out.  These also are setting traps to confuse readers.
>>>
>>> Perhaps we want a helper function "ensure_right_sparsity(istate)"
>>> that would be called where we have
>>>
>>> 	if (condition)
>>> 		convert_to_sparse();
>>> 	else
>>> 		ensure_full_index();
>>>
>>> or an unconditonal call to convert_to_sparse() to unconfuse readers?
>>
>> convert_to_sparse() has several checks that prevent the conversion
>> from happening, such as having a split index. In particular, it will
>> skip the conversion if the_repository->settings.sparse_index is false.
>> Thus, calling it unconditionally in the write code is correct.
>>
> 
> I may have introduced some confusion by redundantly checking
> `!istate->sparse_index` before converting to sparse (my goal was readability
> - showing the index does not need to be converted to sparse if it is already
> sparse - but I think it ended up doing the opposite). The condition could be
> removed entirely, thanks to an equivalent check inside `convert_to_sparse`:
> 
> -       if (istate->repo->settings.command_requires_full_index)
> +       if (!istate->repo->settings.sparse_index ||
> +           istate->repo->settings.command_requires_full_index)
>                 ensure_full_index(istate);
> +       else
> +               convert_to_sparse(istate, 0);
 
This is simpler.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse'
  2021-10-15 19:54 [PATCH 0/2] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  2021-10-15 19:54 ` [PATCH 1/2] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
  2021-10-15 19:54 ` [PATCH 2/2] sparse-index: update index read to consider index.sparse config Victoria Dye via GitGitGadget
@ 2021-10-21 20:48 ` Victoria Dye via GitGitGadget
  2021-10-21 20:48   ` [PATCH v2 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
                     ` (3 more replies)
  2 siblings, 4 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-21 20:48 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye

This series updates do_read_index to use the index.sparse config setting
when determining whether the index should be expanded or collapsed. If the
command & repo allow use of a sparse index, index.sparse is enabled, and a
full index is read from disk, the index is collapsed before returning to the
caller. Conversely, if index.sparse is disabled but the index read from disk
is sparse, the index is expanded before returning. This allows index.sparse
to control use of the sparse index in addition to its existing control over
how the index is written to disk. It also introduces the ability to
enable/disable the sparse index on a command-by-command basis (e.g.,
allowing a user to troubleshoot a sparse-aware command with '-c
index.sparse=false' [1]).

While testing this change, a bug was found in 'test-tool read-cache' in
which config settings for the repository were not initialized before
preparing the repo settings. This caused index.sparse to always be 'false'
when using the test helper in a cone-mode sparse checkout, breaking tests in
t1091 and t1092. The issue is fixed by moving prepare_repo_settings after
config setup.


Changes since V1
================

 * Add ensure_correct_sparsity function that ensures the index is sparse if
   the repository settings (including index.sparse) allow it, otherwise
   ensuring the index is expanded to full.
 * Restructure condition in do_read_index to, rather than check specifically
   for the index.sparse config setting, call ensure_correct_sparsity
   unconditionally when command_requires_full_index is false.

[1]
https://lore.kernel.org/git/cc60c6e7-ecef-ae22-8ec7-ab290ff2b830@gmail.com/

Thanks! -Victoria

Victoria Dye (3):
  test-read-cache.c: prepare_repo_settings after config init
  sparse-index: add ensure_correct_sparsity function
  sparse-index: update do_read_index to ensure correct sparsity

 read-cache.c                             |  8 +++++
 sparse-index.c                           | 42 ++++++++++++++++++++++--
 sparse-index.h                           |  2 ++
 t/helper/test-read-cache.c               |  5 +--
 t/t1092-sparse-checkout-compatibility.sh | 31 +++++++++++++++++
 5 files changed, 83 insertions(+), 5 deletions(-)


base-commit: f443b226ca681d87a3a31e245a70e6bc2769123c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1059%2Fvdye%2Fsparse%2Findex-sparse-config-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1059/vdye/sparse/index-sparse-config-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1059

Range-diff vs v1:

 1:  6974ce7e7f5 = 1:  6974ce7e7f5 test-read-cache.c: prepare_repo_settings after config init
 -:  ----------- > 2:  0b6e6633bb2 sparse-index: add ensure_correct_sparsity function
 2:  c6279b22545 ! 3:  437cf398256 sparse-index: update index read to consider index.sparse config
     @@ Metadata
      Author: Victoria Dye <vdye@github.com>
      
       ## Commit message ##
     -    sparse-index: update index read to consider index.sparse config
     +    sparse-index: update do_read_index to ensure correct sparsity
      
     -    Use the index.sparse config setting to expand or collapse the index when
     -    read. Previously, index.sparse would determine how the index would be
     -    written to disk, but would not enforce whether the index is read into memory
     -    as full or sparse. Now, the index is expanded when a sparse index is read
     -    with `index.sparse=false` and is collapsed to sparse when a full index is
     -    read with `index.sparse=true` (and the command does not require a full
     -    index).
     +    If `command_requires_full_index` is false, ensure correct in-core index
     +    sparsity on read by calling `ensure_correct_sparsity`. This change is meant
     +    to update the how the index is read in a command after sparse index-related
     +    repository settings are modified. Previously, for example, if `index.sparse`
     +    were changed from `true` to `false`, the in-core index on the next command
     +    would be sparse. The index would only be expanded to full when it was next
     +    written to disk.
      
     -    This makes the behavior of `index.sparse` more intuitive, as it now clearly
     -    enables/disables usage of a sparse index. It also provides users with a way
     -    to disable the sparse index per-command (e.g., for troubleshooting purposes)
     -    without needing to re-initialize the sparse-checkout.
     +    By adding a call to `ensure_correct_sparsity`, the in-core index now matches
     +    the sparsity dictated by the relevant repository settings as soon as it is
     +    read into memory, rather than when it is later written to disk.
      
     +    Helped-by: Junio C Hamano <gitster@pobox.com>
          Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
          Signed-off-by: Victoria Dye <vdye@github.com>
      
       ## read-cache.c ##
      @@ read-cache.c: int do_read_index(struct index_state *istate, const char *path, int must_exist)
     + 
       	if (!istate->repo)
       		istate->repo = the_repository;
     ++
     ++	/*
     ++	 * If the command explicitly requires a full index, force it
     ++	 * to be full. Otherwise, correct the sparsity based on repository
     ++	 * settings and other properties of the index (if necessary).
     ++	 */
       	prepare_repo_settings(istate->repo);
     --	if (istate->repo->settings.command_requires_full_index)
     -+	if (!istate->repo->settings.sparse_index ||
     -+	    istate->repo->settings.command_requires_full_index)
     + 	if (istate->repo->settings.command_requires_full_index)
       		ensure_full_index(istate);
     -+	else if (!istate->sparse_index)
     -+		convert_to_sparse(istate, 0);
     ++	else
     ++		ensure_correct_sparsity(istate);
       
       	return istate->cache_nr;
       

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v2 1/3] test-read-cache.c: prepare_repo_settings after config init
  2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
@ 2021-10-21 20:48   ` Victoria Dye via GitGitGadget
  2021-10-21 20:48   ` [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-21 20:48 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Move `prepare_repo_settings` after the git directory has been set up in
`test-read-cache.c`. The git directory settings must be initialized to
properly assign repo settings using the worktree-level git config.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/helper/test-read-cache.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c
index b52c174acc7..0d9f08931a1 100644
--- a/t/helper/test-read-cache.c
+++ b/t/helper/test-read-cache.c
@@ -39,8 +39,6 @@ int cmd__read_cache(int argc, const char **argv)
 	int table = 0, expand = 0;
 
 	initialize_the_repository();
-	prepare_repo_settings(r);
-	r->settings.command_requires_full_index = 0;
 
 	for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) {
 		if (skip_prefix(*argv, "--print-and-refresh=", &name))
@@ -56,6 +54,9 @@ int cmd__read_cache(int argc, const char **argv)
 	setup_git_directory();
 	git_config(git_default_config, NULL);
 
+	prepare_repo_settings(r);
+	r->settings.command_requires_full_index = 0;
+
 	for (i = 0; i < cnt; i++) {
 		repo_read_index(r);
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function
  2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  2021-10-21 20:48   ` [PATCH v2 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
@ 2021-10-21 20:48   ` Victoria Dye via GitGitGadget
  2021-10-21 22:20     ` Junio C Hamano
  2021-10-21 20:48   ` [PATCH v2 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
  2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  3 siblings, 1 reply; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-21 20:48 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

The purpose of the `ensure_correct_sparsity` function is to provide a means
of aligning the in-core index with the sparsity required by the repository
settings and other properties of the index. The function will first attempt
to convert the index to sparse, now with a "SPARSE_INDEX_VERIFY_ALLOWED"
flag that forces the function to return a nonzero value if repository
settings do not allow use of a sparse index. If a nonzero value is returned,
the index is expanded to full with `ensure_full_index`.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 sparse-index.c | 42 +++++++++++++++++++++++++++++++++++++++---
 sparse-index.h |  2 ++
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 7b7ff79e044..4273453e078 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -122,11 +122,10 @@ static int index_has_unmerged_entries(struct index_state *istate)
 	return 0;
 }
 
-int convert_to_sparse(struct index_state *istate, int flags)
+static int can_convert_to_sparse(struct index_state *istate, int flags)
 {
 	int test_env;
-	if (istate->sparse_index || !istate->cache_nr ||
-	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
+	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
 
 	if (!istate->repo)
@@ -187,6 +186,30 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
 		return 0;
 
+	return 1;
+}
+
+int convert_to_sparse(struct index_state *istate, int flags)
+{
+	int verify = flags & SPARSE_INDEX_VERIFY_ALLOWED;
+
+	/*
+	 * If validating with strict checks against whether the sparse index is
+	 * allowed, we want to check `can_convert_to_sparse` *before* exiting
+	 * early due to an already sparse or empty index.
+	 *
+	 * If not performing strict validation, the order is reversed to avoid
+	 * the more expensive checks in `can_convert_to_sparse` whenver possible.
+	 */
+	if (verify) {
+		if (!can_convert_to_sparse(istate, flags))
+			return -1;
+		else if (istate->sparse_index || !istate->cache_nr)
+			return 0;
+	} else if (istate->sparse_index || !istate->cache_nr ||
+		   !can_convert_to_sparse(istate, flags))
+		return 0;
+
 	remove_fsmonitor(istate);
 
 	trace2_region_enter("index", "convert_to_sparse", istate->repo);
@@ -313,6 +336,19 @@ void ensure_full_index(struct index_state *istate)
 	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }
 
+void ensure_correct_sparsity(struct index_state *istate)
+{
+	/*
+	 * First check whether the index can be converted to sparse by attempting
+	 * to convert it with the SPARSE_INDEX_VERIFY_ALLOWED flag. If the
+	 * SPARSE_INDEX_VERIFY_ALLOWED checks indicate that the index cannot
+	 * be converted because repository settings and/or index properties
+	 * do not allow it, expand the index to full.
+	 */
+	if (convert_to_sparse(istate, SPARSE_INDEX_VERIFY_ALLOWED))
+		ensure_full_index(istate);
+}
+
 /*
  * This static global helps avoid infinite recursion between
  * expand_to_path() and index_file_exists().
diff --git a/sparse-index.h b/sparse-index.h
index 9f3d7bc7faf..b61754f1f76 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -3,7 +3,9 @@
 
 struct index_state;
 #define SPARSE_INDEX_MEMORY_ONLY (1 << 0)
+#define SPARSE_INDEX_VERIFY_ALLOWED (1 << 1)
 int convert_to_sparse(struct index_state *istate, int flags);
+void ensure_correct_sparsity(struct index_state *istate);
 
 /*
  * Some places in the codebase expect to search for a specific path.
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v2 3/3] sparse-index: update do_read_index to ensure correct sparsity
  2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  2021-10-21 20:48   ` [PATCH v2 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
  2021-10-21 20:48   ` [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
@ 2021-10-21 20:48   ` Victoria Dye via GitGitGadget
  2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  3 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-21 20:48 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

If `command_requires_full_index` is false, ensure correct in-core index
sparsity on read by calling `ensure_correct_sparsity`. This change is meant
to update the how the index is read in a command after sparse index-related
repository settings are modified. Previously, for example, if `index.sparse`
were changed from `true` to `false`, the in-core index on the next command
would be sparse. The index would only be expanded to full when it was next
written to disk.

By adding a call to `ensure_correct_sparsity`, the in-core index now matches
the sparsity dictated by the relevant repository settings as soon as it is
read into memory, rather than when it is later written to disk.

Helped-by: Junio C Hamano <gitster@pobox.com>
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 read-cache.c                             |  8 ++++++
 t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index a78b88a41bf..b3772ba70a1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 
 	if (!istate->repo)
 		istate->repo = the_repository;
+
+	/*
+	 * If the command explicitly requires a full index, force it
+	 * to be full. Otherwise, correct the sparsity based on repository
+	 * settings and other properties of the index (if necessary).
+	 */
 	prepare_repo_settings(istate->repo);
 	if (istate->repo->settings.command_requires_full_index)
 		ensure_full_index(istate);
+	else
+		ensure_correct_sparsity(istate);
 
 	return istate->cache_nr;
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index ca91c6a67f8..59accde1fa3 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' '
 	test_region index ensure_full_index trace2.txt
 '
 
+test_expect_success 'index.sparse disabled inline uses full index' '
+	init_repos &&
+
+	# When index.sparse is disabled inline with `git status`, the
+	# index is expanded at the beginning of the execution then never
+	# converted back to sparse. It is then written to disk as a full index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index -c index.sparse=false status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	test_region index ensure_full_index trace2.txt &&
+
+	# Since index.sparse is set to true at a repo level, the index
+	# is converted from full to sparse when read, then never expanded
+	# over the course of `git status`. It is written to disk as a sparse
+	# index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt &&
+
+	# Now that the index has been written to disk as sparse, it is not
+	# converted to sparse (or expanded to full) when read by `git status`.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt
+'
+
 ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function
  2021-10-21 20:48   ` [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
@ 2021-10-21 22:20     ` Junio C Hamano
  2021-10-27 17:21       ` Victoria Dye
  0 siblings, 1 reply; 44+ messages in thread
From: Junio C Hamano @ 2021-10-21 22:20 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget; +Cc: git, stolee, Victoria Dye

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> -int convert_to_sparse(struct index_state *istate, int flags)
> +static int can_convert_to_sparse(struct index_state *istate, int flags)
>  {
>  	int test_env;

May not be a problem with this patch, but this variable does not
need to be in this large a scope.

> -	if (istate->sparse_index || !istate->cache_nr ||
> -	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
> +	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
>  		return 0;
>  
>  	if (!istate->repo)
> @@ -187,6 +186,30 @@ int convert_to_sparse(struct index_state *istate, int flags)
>  	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
>  		return 0;
>  
> +	return 1;
> +}
> +
> +int convert_to_sparse(struct index_state *istate, int flags)
> +{
> +	int verify = flags & SPARSE_INDEX_VERIFY_ALLOWED;
> +
> +	/*
> +	 * If validating with strict checks against whether the sparse index is
> +	 * allowed, we want to check `can_convert_to_sparse` *before* exiting
> +	 * early due to an already sparse or empty index.
> +	 *
> +	 * If not performing strict validation, the order is reversed to avoid
> +	 * the more expensive checks in `can_convert_to_sparse` whenver possible.
> +	 */
> +	if (verify) {
> +		if (!can_convert_to_sparse(istate, flags))
> +			return -1;
> +		else if (istate->sparse_index || !istate->cache_nr)
> +			return 0;
> +	} else if (istate->sparse_index || !istate->cache_nr ||
> +		   !can_convert_to_sparse(istate, flags))
> +		return 0;
> +
>  	remove_fsmonitor(istate);
>  
>  	trace2_region_enter("index", "convert_to_sparse", istate->repo);
> @@ -313,6 +336,19 @@ void ensure_full_index(struct index_state *istate)
>  	trace2_region_leave("index", "ensure_full_index", istate->repo);
>  }
>  
> +void ensure_correct_sparsity(struct index_state *istate)
> +{
> +	/*
> +	 * First check whether the index can be converted to sparse by attempting
> +	 * to convert it with the SPARSE_INDEX_VERIFY_ALLOWED flag. If the
> +	 * SPARSE_INDEX_VERIFY_ALLOWED checks indicate that the index cannot
> +	 * be converted because repository settings and/or index properties
> +	 * do not allow it, expand the index to full.
> +	 */

The logic may be OK, but the need to give this long description is a
sign that the meaning of the value returned from the function is not
clear from the name of the function.

> +	if (convert_to_sparse(istate, SPARSE_INDEX_VERIFY_ALLOWED))
> +		ensure_full_index(istate);

It might make it more straight-forward to 

 (1) drop the "if (verify)" part in convert_to_sparse(), which would
     mean that for all callers convert_to_sparse() will retain the
     same behaviour as before;

 (2) make a call to can_convert_to_sparse() here, and if that
     returns true, make a call to ensure_full_index(); this would
     behave identically to what this patch does when can_convert is
     false; and

 (3) correct the can_convert_to_sparse() function to not blow away
     the cache_tree unconditionally and recompute, so that calling
     it twice in a row will not be costly.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function
  2021-10-21 22:20     ` Junio C Hamano
@ 2021-10-27 17:21       ` Victoria Dye
  0 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye @ 2021-10-27 17:21 UTC (permalink / raw)
  To: Junio C Hamano, Victoria Dye via GitGitGadget; +Cc: git, stolee

Junio C Hamano wrote:
> "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >> @@ -313,6 +336,19 @@ void ensure_full_index(struct index_state *istate)
>>  	trace2_region_leave("index", "ensure_full_index", istate->repo);
>>  }
>>  
>> +void ensure_correct_sparsity(struct index_state *istate)
>> +{
>> +	/*
>> +	 * First check whether the index can be converted to sparse by attempting
>> +	 * to convert it with the SPARSE_INDEX_VERIFY_ALLOWED flag. If the
>> +	 * SPARSE_INDEX_VERIFY_ALLOWED checks indicate that the index cannot
>> +	 * be converted because repository settings and/or index properties
>> +	 * do not allow it, expand the index to full.
>> +	 */
> 
> The logic may be OK, but the need to give this long description is a
> sign that the meaning of the value returned from the function is not
> clear from the name of the function.
> 
>> +	if (convert_to_sparse(istate, SPARSE_INDEX_VERIFY_ALLOWED))
>> +		ensure_full_index(istate);
> 
> It might make it more straight-forward to 
> 
>  (1) drop the "if (verify)" part in convert_to_sparse(), which would
>      mean that for all callers convert_to_sparse() will retain the
>      same behaviour as before;
> 
>  (2) make a call to can_convert_to_sparse() here, and if that
>      returns true, make a call to ensure_full_index(); this would
>      behave identically to what this patch does when can_convert is
>      false; and
> 
>  (3) correct the can_convert_to_sparse() function to not blow away
>      the cache_tree unconditionally and recompute, so that calling
>      it twice in a row will not be costly.
> 

Agreed, this approach is a lot easier to follow. I went a bit overboard
trying to handle *all* possible cases where `convert_to_sparse` returns
before converting, but the primary goal of this series is updating the
behavior when config settings change. I'll include the suggested
restructuring in my next re-roll. Thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse'
  2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
                     ` (2 preceding siblings ...)
  2021-10-21 20:48   ` [PATCH v2 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
@ 2021-10-27 18:20   ` Victoria Dye via GitGitGadget
  2021-10-27 18:20     ` [PATCH v3 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
                       ` (4 more replies)
  3 siblings, 5 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-27 18:20 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye

This series updates do_read_index to use the index.sparse config setting
when determining whether the index should be expanded or collapsed. If the
command & repo allow use of a sparse index, index.sparse is enabled, and a
full index is read from disk, the index is collapsed before returning to the
caller. Conversely, if index.sparse is disabled but the index read from disk
is sparse, the index is expanded before returning. This allows index.sparse
to control use of the sparse index in addition to its existing control over
how the index is written to disk. It also introduces the ability to
enable/disable the sparse index on a command-by-command basis (e.g.,
allowing a user to troubleshoot a sparse-aware command with '-c
index.sparse=false' [1]).

While testing this change, a bug was found in 'test-tool read-cache' in
which config settings for the repository were not initialized before
preparing the repo settings. This caused index.sparse to always be 'false'
when using the test helper in a cone-mode sparse checkout, breaking tests in
t1091 and t1092. The issue is fixed by moving prepare_repo_settings after
config setup.


Changes since V1
================

 * Add ensure_correct_sparsity function that ensures the index is sparse if
   the repository settings (including index.sparse) allow it, otherwise
   ensuring the index is expanded to full.
 * Restructure condition in do_read_index to, rather than check specifically
   for the index.sparse config setting, call ensure_correct_sparsity
   unconditionally when command_requires_full_index is false.


Changes since V2
================

 * Rename can_convert_to_sparse to is_sparse_index_allowed to more
   accurately reflect what the function returns.
 * Remove index-iterating checks from is_sparse_index_allowed, leaving only
   inexpensive checks on config settings & sparse checkout patterns. Checks
   are still part of convert_to_sparse to ensure it behaves exactly as it
   did before this series.
 * Restructure ensure_correct_sparsity for better readability.
 * Fix test_env variable scope.

[1]
https://lore.kernel.org/git/cc60c6e7-ecef-ae22-8ec7-ab290ff2b830@gmail.com/

Thanks! -Victoria

Victoria Dye (3):
  test-read-cache.c: prepare_repo_settings after config init
  sparse-index: add ensure_correct_sparsity function
  sparse-index: update do_read_index to ensure correct sparsity

 read-cache.c                             |  8 ++++++
 sparse-index.c                           | 33 +++++++++++++++++++++---
 sparse-index.h                           |  1 +
 t/helper/test-read-cache.c               |  5 ++--
 t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++
 5 files changed, 72 insertions(+), 6 deletions(-)


base-commit: f443b226ca681d87a3a31e245a70e6bc2769123c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1059%2Fvdye%2Fsparse%2Findex-sparse-config-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1059/vdye/sparse/index-sparse-config-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1059

Range-diff vs v2:

 1:  6974ce7e7f5 = 1:  6974ce7e7f5 test-read-cache.c: prepare_repo_settings after config init
 2:  0b6e6633bb2 ! 2:  9d6511db072 sparse-index: add ensure_correct_sparsity function
     @@ Metadata
       ## Commit message ##
          sparse-index: add ensure_correct_sparsity function
      
     -    The purpose of the `ensure_correct_sparsity` function is to provide a means
     -    of aligning the in-core index with the sparsity required by the repository
     -    settings and other properties of the index. The function will first attempt
     -    to convert the index to sparse, now with a "SPARSE_INDEX_VERIFY_ALLOWED"
     -    flag that forces the function to return a nonzero value if repository
     -    settings do not allow use of a sparse index. If a nonzero value is returned,
     -    the index is expanded to full with `ensure_full_index`.
     +    The `ensure_correct_sparsity` function is intended to provide a means of
     +    aligning the in-core index with the sparsity required by the repository
     +    settings and other properties of the index. The function first checks
     +    whether a sparse index is allowed (per repository & sparse checkout pattern
     +    settings). If the sparse index may be used, the index is converted to
     +    sparse; otherwise, it is explicitly expanded with `ensure_full_index`.
      
          Helped-by: Junio C Hamano <gitster@pobox.com>
          Signed-off-by: Victoria Dye <vdye@github.com>
     @@ sparse-index.c: static int index_has_unmerged_entries(struct index_state *istate
       }
       
      -int convert_to_sparse(struct index_state *istate, int flags)
     -+static int can_convert_to_sparse(struct index_state *istate, int flags)
     ++static int is_sparse_index_allowed(struct index_state *istate, int flags)
       {
     - 	int test_env;
     +-	int test_env;
      -	if (istate->sparse_index || !istate->cache_nr ||
      -	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
      +	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
       		return 0;
       
       	if (!istate->repo)
     + 		istate->repo = the_repository;
     + 
     + 	if (!(flags & SPARSE_INDEX_MEMORY_ONLY)) {
     ++		int test_env;
     ++
     + 		/*
     + 		 * The sparse index is not (yet) integrated with a split index.
     + 		 */
      @@ sparse-index.c: int convert_to_sparse(struct index_state *istate, int flags)
     - 	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
     + 	if (!istate->sparse_checkout_patterns->use_cone_patterns)
       		return 0;
       
      +	return 1;
     @@ sparse-index.c: int convert_to_sparse(struct index_state *istate, int flags)
      +
      +int convert_to_sparse(struct index_state *istate, int flags)
      +{
     -+	int verify = flags & SPARSE_INDEX_VERIFY_ALLOWED;
     -+
      +	/*
     -+	 * If validating with strict checks against whether the sparse index is
     -+	 * allowed, we want to check `can_convert_to_sparse` *before* exiting
     -+	 * early due to an already sparse or empty index.
     -+	 *
     -+	 * If not performing strict validation, the order is reversed to avoid
     -+	 * the more expensive checks in `can_convert_to_sparse` whenver possible.
     ++	 * If the index is already sparse, empty, or otherwise
     ++	 * cannot be converted to sparse, do not convert.
      +	 */
     -+	if (verify) {
     -+		if (!can_convert_to_sparse(istate, flags))
     -+			return -1;
     -+		else if (istate->sparse_index || !istate->cache_nr)
     -+			return 0;
     -+	} else if (istate->sparse_index || !istate->cache_nr ||
     -+		   !can_convert_to_sparse(istate, flags))
     ++	if (istate->sparse_index || !istate->cache_nr ||
     ++	    !is_sparse_index_allowed(istate, flags))
      +		return 0;
      +
     - 	remove_fsmonitor(istate);
     - 
     - 	trace2_region_enter("index", "convert_to_sparse", istate->repo);
     + 	/*
     + 	 * NEEDSWORK: If we have unmerged entries, then stay full.
     + 	 * Unmerged entries prevent the cache-tree extension from working.
      @@ sparse-index.c: void ensure_full_index(struct index_state *istate)
       	trace2_region_leave("index", "ensure_full_index", istate->repo);
       }
     @@ sparse-index.c: void ensure_full_index(struct index_state *istate)
      +void ensure_correct_sparsity(struct index_state *istate)
      +{
      +	/*
     -+	 * First check whether the index can be converted to sparse by attempting
     -+	 * to convert it with the SPARSE_INDEX_VERIFY_ALLOWED flag. If the
     -+	 * SPARSE_INDEX_VERIFY_ALLOWED checks indicate that the index cannot
     -+	 * be converted because repository settings and/or index properties
     -+	 * do not allow it, expand the index to full.
     ++	 * If the index can be sparse, make it sparse. Otherwise,
     ++	 * ensure the index is full.
      +	 */
     -+	if (convert_to_sparse(istate, SPARSE_INDEX_VERIFY_ALLOWED))
     ++	if (is_sparse_index_allowed(istate, 0))
     ++		convert_to_sparse(istate, 0);
     ++	else
      +		ensure_full_index(istate);
      +}
      +
     @@ sparse-index.c: void ensure_full_index(struct index_state *istate)
      
       ## sparse-index.h ##
      @@
     - 
       struct index_state;
       #define SPARSE_INDEX_MEMORY_ONLY (1 << 0)
     -+#define SPARSE_INDEX_VERIFY_ALLOWED (1 << 1)
       int convert_to_sparse(struct index_state *istate, int flags);
      +void ensure_correct_sparsity(struct index_state *istate);
       
 3:  437cf398256 = 3:  d6c3c694e1e sparse-index: update do_read_index to ensure correct sparsity

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v3 1/3] test-read-cache.c: prepare_repo_settings after config init
  2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
@ 2021-10-27 18:20     ` Victoria Dye via GitGitGadget
  2021-10-27 18:20     ` [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-27 18:20 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Move `prepare_repo_settings` after the git directory has been set up in
`test-read-cache.c`. The git directory settings must be initialized to
properly assign repo settings using the worktree-level git config.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/helper/test-read-cache.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c
index b52c174acc7..0d9f08931a1 100644
--- a/t/helper/test-read-cache.c
+++ b/t/helper/test-read-cache.c
@@ -39,8 +39,6 @@ int cmd__read_cache(int argc, const char **argv)
 	int table = 0, expand = 0;
 
 	initialize_the_repository();
-	prepare_repo_settings(r);
-	r->settings.command_requires_full_index = 0;
 
 	for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) {
 		if (skip_prefix(*argv, "--print-and-refresh=", &name))
@@ -56,6 +54,9 @@ int cmd__read_cache(int argc, const char **argv)
 	setup_git_directory();
 	git_config(git_default_config, NULL);
 
+	prepare_repo_settings(r);
+	r->settings.command_requires_full_index = 0;
+
 	for (i = 0; i < cnt; i++) {
 		repo_read_index(r);
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function
  2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  2021-10-27 18:20     ` [PATCH v3 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
@ 2021-10-27 18:20     ` Victoria Dye via GitGitGadget
  2021-10-27 20:07       ` Derrick Stolee
  2021-10-27 18:20     ` [PATCH v3 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-27 18:20 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

The `ensure_correct_sparsity` function is intended to provide a means of
aligning the in-core index with the sparsity required by the repository
settings and other properties of the index. The function first checks
whether a sparse index is allowed (per repository & sparse checkout pattern
settings). If the sparse index may be used, the index is converted to
sparse; otherwise, it is explicitly expanded with `ensure_full_index`.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 sparse-index.c | 33 +++++++++++++++++++++++++++++----
 sparse-index.h |  1 +
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 7b7ff79e044..bc3ee358c63 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -122,17 +122,17 @@ static int index_has_unmerged_entries(struct index_state *istate)
 	return 0;
 }
 
-int convert_to_sparse(struct index_state *istate, int flags)
+static int is_sparse_index_allowed(struct index_state *istate, int flags)
 {
-	int test_env;
-	if (istate->sparse_index || !istate->cache_nr ||
-	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
+	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
 
 	if (!istate->repo)
 		istate->repo = the_repository;
 
 	if (!(flags & SPARSE_INDEX_MEMORY_ONLY)) {
+		int test_env;
+
 		/*
 		 * The sparse index is not (yet) integrated with a split index.
 		 */
@@ -168,6 +168,19 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	if (!istate->sparse_checkout_patterns->use_cone_patterns)
 		return 0;
 
+	return 1;
+}
+
+int convert_to_sparse(struct index_state *istate, int flags)
+{
+	/*
+	 * If the index is already sparse, empty, or otherwise
+	 * cannot be converted to sparse, do not convert.
+	 */
+	if (istate->sparse_index || !istate->cache_nr ||
+	    !is_sparse_index_allowed(istate, flags))
+		return 0;
+
 	/*
 	 * NEEDSWORK: If we have unmerged entries, then stay full.
 	 * Unmerged entries prevent the cache-tree extension from working.
@@ -313,6 +326,18 @@ void ensure_full_index(struct index_state *istate)
 	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }
 
+void ensure_correct_sparsity(struct index_state *istate)
+{
+	/*
+	 * If the index can be sparse, make it sparse. Otherwise,
+	 * ensure the index is full.
+	 */
+	if (is_sparse_index_allowed(istate, 0))
+		convert_to_sparse(istate, 0);
+	else
+		ensure_full_index(istate);
+}
+
 /*
  * This static global helps avoid infinite recursion between
  * expand_to_path() and index_file_exists().
diff --git a/sparse-index.h b/sparse-index.h
index 9f3d7bc7faf..656bd835b25 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -4,6 +4,7 @@
 struct index_state;
 #define SPARSE_INDEX_MEMORY_ONLY (1 << 0)
 int convert_to_sparse(struct index_state *istate, int flags);
+void ensure_correct_sparsity(struct index_state *istate);
 
 /*
  * Some places in the codebase expect to search for a specific path.
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v3 3/3] sparse-index: update do_read_index to ensure correct sparsity
  2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
  2021-10-27 18:20     ` [PATCH v3 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
  2021-10-27 18:20     ` [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
@ 2021-10-27 18:20     ` Victoria Dye via GitGitGadget
  2021-10-27 20:08     ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Derrick Stolee
  2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
  4 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-27 18:20 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

If `command_requires_full_index` is false, ensure correct in-core index
sparsity on read by calling `ensure_correct_sparsity`. This change is meant
to update the how the index is read in a command after sparse index-related
repository settings are modified. Previously, for example, if `index.sparse`
were changed from `true` to `false`, the in-core index on the next command
would be sparse. The index would only be expanded to full when it was next
written to disk.

By adding a call to `ensure_correct_sparsity`, the in-core index now matches
the sparsity dictated by the relevant repository settings as soon as it is
read into memory, rather than when it is later written to disk.

Helped-by: Junio C Hamano <gitster@pobox.com>
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 read-cache.c                             |  8 ++++++
 t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index a78b88a41bf..b3772ba70a1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 
 	if (!istate->repo)
 		istate->repo = the_repository;
+
+	/*
+	 * If the command explicitly requires a full index, force it
+	 * to be full. Otherwise, correct the sparsity based on repository
+	 * settings and other properties of the index (if necessary).
+	 */
 	prepare_repo_settings(istate->repo);
 	if (istate->repo->settings.command_requires_full_index)
 		ensure_full_index(istate);
+	else
+		ensure_correct_sparsity(istate);
 
 	return istate->cache_nr;
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index ca91c6a67f8..59accde1fa3 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' '
 	test_region index ensure_full_index trace2.txt
 '
 
+test_expect_success 'index.sparse disabled inline uses full index' '
+	init_repos &&
+
+	# When index.sparse is disabled inline with `git status`, the
+	# index is expanded at the beginning of the execution then never
+	# converted back to sparse. It is then written to disk as a full index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index -c index.sparse=false status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	test_region index ensure_full_index trace2.txt &&
+
+	# Since index.sparse is set to true at a repo level, the index
+	# is converted from full to sparse when read, then never expanded
+	# over the course of `git status`. It is written to disk as a sparse
+	# index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt &&
+
+	# Now that the index has been written to disk as sparse, it is not
+	# converted to sparse (or expanded to full) when read by `git status`.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt
+'
+
 ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function
  2021-10-27 18:20     ` [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
@ 2021-10-27 20:07       ` Derrick Stolee
  2021-10-27 21:32         ` Junio C Hamano
  0 siblings, 1 reply; 44+ messages in thread
From: Derrick Stolee @ 2021-10-27 20:07 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget, git; +Cc: gitster, Victoria Dye

On 10/27/2021 2:20 PM, Victoria Dye via GitGitGadget wrote:
> From: Victoria Dye <vdye@github.com>

> +static int is_sparse_index_allowed(struct index_state *istate, int flags)

I agree this name is better.

>  {
> -	int test_env;
> -	if (istate->sparse_index || !istate->cache_nr ||
> -	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
> +	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
>  		return 0;
>  
>  	if (!istate->repo)
>  		istate->repo = the_repository;
>  
>  	if (!(flags & SPARSE_INDEX_MEMORY_ONLY)) {
> +		int test_env;
> +
>  		/*
>  		 * The sparse index is not (yet) integrated with a split index.
>  		 */

Nice that most of the implementation comes over without showing
up in the diff.

>  	if (!istate->sparse_checkout_patterns->use_cone_patterns)
>  		return 0;
>  
> +	return 1;
> +}
> +
> +int convert_to_sparse(struct index_state *istate, int flags)
> +{
> +	/*
> +	 * If the index is already sparse, empty, or otherwise
> +	 * cannot be converted to sparse, do not convert.
> +	 */
> +	if (istate->sparse_index || !istate->cache_nr ||
> +	    !is_sparse_index_allowed(istate, flags))
> +		return 0;

> +void ensure_correct_sparsity(struct index_state *istate)
> +{
> +	/*
> +	 * If the index can be sparse, make it sparse. Otherwise,
> +	 * ensure the index is full.
> +	 */
> +	if (is_sparse_index_allowed(istate, 0))
> +		convert_to_sparse(istate, 0);
> +	else
> +		ensure_full_index(istate);
> +}

These two methods become very simple. Excellent.

-Stolee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse'
  2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
                       ` (2 preceding siblings ...)
  2021-10-27 18:20     ` [PATCH v3 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
@ 2021-10-27 20:08     ` Derrick Stolee
  2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
  4 siblings, 0 replies; 44+ messages in thread
From: Derrick Stolee @ 2021-10-27 20:08 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget, git; +Cc: gitster, Victoria Dye

On 10/27/2021 2:20 PM, Victoria Dye via GitGitGadget wrote:

> Changes since V2
> ================
> 
>  * Rename can_convert_to_sparse to is_sparse_index_allowed to more
>    accurately reflect what the function returns.
>  * Remove index-iterating checks from is_sparse_index_allowed, leaving only
>    inexpensive checks on config settings & sparse checkout patterns. Checks
>    are still part of convert_to_sparse to ensure it behaves exactly as it
>    did before this series.
>  * Restructure ensure_correct_sparsity for better readability.
>  * Fix test_env variable scope.

I took a full read through this version, taking special look at the
changes since v2. I think this version is good to go!

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function
  2021-10-27 20:07       ` Derrick Stolee
@ 2021-10-27 21:32         ` Junio C Hamano
  2021-10-28  1:24           ` Derrick Stolee
  0 siblings, 1 reply; 44+ messages in thread
From: Junio C Hamano @ 2021-10-27 21:32 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Victoria Dye via GitGitGadget, git, Victoria Dye

Derrick Stolee <stolee@gmail.com> writes:

>> +int convert_to_sparse(struct index_state *istate, int flags)
>> +{
>> +	/*
>> +	 * If the index is already sparse, empty, or otherwise
>> +	 * cannot be converted to sparse, do not convert.
>> +	 */
>> +	if (istate->sparse_index || !istate->cache_nr ||
>> +	    !is_sparse_index_allowed(istate, flags))
>> +		return 0;

Shouldn't we also at least do this?  Blindly blowing away the entire
cache-tree and rebuilding it from scratch may be hiding a latent bug
somewhere else, but is never supposed to be needed, and is a huge
waste of computational resources.

I say "at least" here, because a cache tree that is partially valid
should be safely salvageable---at least that was the intention back
when I designed the subsystem.

 sparse-index.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git c/sparse-index.c w/sparse-index.c
index bc3ee358c6..a95c3386f3 100644
--- c/sparse-index.c
+++ w/sparse-index.c
@@ -188,17 +188,19 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	if (index_has_unmerged_entries(istate))
 		return 0;
 
-	/* Clear and recompute the cache-tree */
-	cache_tree_free(&istate->cache_tree);
-	/*
-	 * Silently return if there is a problem with the cache tree update,
-	 * which might just be due to a conflict state in some entry.
-	 *
-	 * This might create new tree objects, so be sure to use
-	 * WRITE_TREE_MISSING_OK.
-	 */
-	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
-		return 0;
+	if (!cache_tree_fully_valid(&istate->cache_tree)) {
+		/* Clear and recompute the cache-tree */
+		cache_tree_free(&istate->cache_tree);
+		/*
+		 * Silently return if there is a problem with the cache tree update,
+		 * which might just be due to a conflict state in some entry.
+		 *
+		 * This might create new tree objects, so be sure to use
+		 * WRITE_TREE_MISSING_OK.
+		 */
+		if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
+			return 0;
+	}
 
 	remove_fsmonitor(istate);
 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function
  2021-10-27 21:32         ` Junio C Hamano
@ 2021-10-28  1:24           ` Derrick Stolee
  2021-10-29 13:43             ` Victoria Dye
  0 siblings, 1 reply; 44+ messages in thread
From: Derrick Stolee @ 2021-10-28  1:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Victoria Dye via GitGitGadget, git, Victoria Dye

On 10/27/2021 5:32 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>>> +int convert_to_sparse(struct index_state *istate, int flags)
>>> +{
>>> +	/*
>>> +	 * If the index is already sparse, empty, or otherwise
>>> +	 * cannot be converted to sparse, do not convert.
>>> +	 */
>>> +	if (istate->sparse_index || !istate->cache_nr ||
>>> +	    !is_sparse_index_allowed(istate, flags))
>>> +		return 0;
> 
> Shouldn't we also at least do this?  Blindly blowing away the entire
> cache-tree and rebuilding it from scratch may be hiding a latent bug
> somewhere else, but is never supposed to be needed, and is a huge
> waste of computational resources.
> 
> I say "at least" here, because a cache tree that is partially valid
> should be safely salvageable---at least that was the intention back
> when I designed the subsystem.

I think you are right, what you propose below. It certainly seems
like it would work, and even speed up the conversion from full to
sparse. I think I erred on the side of extreme caution and used
a hope that converting to sparse would be rare.

>  sparse-index.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git c/sparse-index.c w/sparse-index.c
> index bc3ee358c6..a95c3386f3 100644
> --- c/sparse-index.c
> +++ w/sparse-index.c
> @@ -188,17 +188,19 @@ int convert_to_sparse(struct index_state *istate, int flags)
>  	if (index_has_unmerged_entries(istate))
>  		return 0;
>  
> -	/* Clear and recompute the cache-tree */
> -	cache_tree_free(&istate->cache_tree);
> -	/*
> -	 * Silently return if there is a problem with the cache tree update,
> -	 * which might just be due to a conflict state in some entry.
> -	 *
> -	 * This might create new tree objects, so be sure to use
> -	 * WRITE_TREE_MISSING_OK.
> -	 */
> -	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
> -		return 0;
> +	if (!cache_tree_fully_valid(&istate->cache_tree)) {
> +		/* Clear and recompute the cache-tree */
> +		cache_tree_free(&istate->cache_tree);
> +		/*
> +		 * Silently return if there is a problem with the cache tree update,
> +		 * which might just be due to a conflict state in some entry.
> +		 *
> +		 * This might create new tree objects, so be sure to use
> +		 * WRITE_TREE_MISSING_OK.
> +		 */
> +		if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
> +			return 0;
> +	}

I think at this point we have enough tests that check the sparse index
and its different conversion points that the test suite might catch if
this is a bad idea. Note that this is only a change of behavior if the
cache-tree is valid, which I expect to be the case most of the time in
the tests.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function
  2021-10-28  1:24           ` Derrick Stolee
@ 2021-10-29 13:43             ` Victoria Dye
  0 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye @ 2021-10-29 13:43 UTC (permalink / raw)
  To: Derrick Stolee, Junio C Hamano; +Cc: Victoria Dye via GitGitGadget, git

Derrick Stolee wrote:
> On 10/27/2021 5:32 PM, Junio C Hamano wrote:
>> Derrick Stolee <stolee@gmail.com> writes:
>>
>>>> +int convert_to_sparse(struct index_state *istate, int flags)
>>>> +{
>>>> +	/*
>>>> +	 * If the index is already sparse, empty, or otherwise
>>>> +	 * cannot be converted to sparse, do not convert.
>>>> +	 */
>>>> +	if (istate->sparse_index || !istate->cache_nr ||
>>>> +	    !is_sparse_index_allowed(istate, flags))
>>>> +		return 0;
>>
>> Shouldn't we also at least do this?  Blindly blowing away the entire
>> cache-tree and rebuilding it from scratch may be hiding a latent bug
>> somewhere else, but is never supposed to be needed, and is a huge
>> waste of computational resources.
>>
>> I say "at least" here, because a cache tree that is partially valid
>> should be safely salvageable---at least that was the intention back
>> when I designed the subsystem.
> 
> I think you are right, what you propose below. It certainly seems
> like it would work, and even speed up the conversion from full to
> sparse. I think I erred on the side of extreme caution and used
> a hope that converting to sparse would be rare.
> 
>>  sparse-index.c | 24 +++++++++++++-----------
>>  1 file changed, 13 insertions(+), 11 deletions(-)
>>
>> diff --git c/sparse-index.c w/sparse-index.c
>> index bc3ee358c6..a95c3386f3 100644
>> --- c/sparse-index.c
>> +++ w/sparse-index.c
>> @@ -188,17 +188,19 @@ int convert_to_sparse(struct index_state *istate, int flags)
>>  	if (index_has_unmerged_entries(istate))
>>  		return 0;
>>  
>> -	/* Clear and recompute the cache-tree */
>> -	cache_tree_free(&istate->cache_tree);
>> -	/*
>> -	 * Silently return if there is a problem with the cache tree update,
>> -	 * which might just be due to a conflict state in some entry.
>> -	 *
>> -	 * This might create new tree objects, so be sure to use
>> -	 * WRITE_TREE_MISSING_OK.
>> -	 */
>> -	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
>> -		return 0;
>> +	if (!cache_tree_fully_valid(&istate->cache_tree)) {
>> +		/* Clear and recompute the cache-tree */
>> +		cache_tree_free(&istate->cache_tree);
>> +		/*
>> +		 * Silently return if there is a problem with the cache tree update,
>> +		 * which might just be due to a conflict state in some entry.
>> +		 *
>> +		 * This might create new tree objects, so be sure to use
>> +		 * WRITE_TREE_MISSING_OK.
>> +		 */
>> +		if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
>> +			return 0;
>> +	}
> 
> I think at this point we have enough tests that check the sparse index
> and its different conversion points that the test suite might catch if
> this is a bad idea. Note that this is only a change of behavior if the
> cache-tree is valid, which I expect to be the case most of the time in
> the tests.
> 
> Thanks,
> -Stolee
> 

This change doesn't appear to introduce any test failures or other unwanted
behavior, so I don't see a reason not to include it. I'll add it in a
re-roll - thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v4 0/4] sparse-index: expand/collapse based on 'index.sparse'
  2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
                       ` (3 preceding siblings ...)
  2021-10-27 20:08     ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Derrick Stolee
@ 2021-10-29 13:51     ` Victoria Dye via GitGitGadget
  2021-10-29 13:51       ` [PATCH v4 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
                         ` (5 more replies)
  4 siblings, 6 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-29 13:51 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye

This series updates do_read_index to use the index.sparse config setting
when determining whether the index should be expanded or collapsed. If the
command & repo allow use of a sparse index, index.sparse is enabled, and a
full index is read from disk, the index is collapsed before returning to the
caller. Conversely, if index.sparse is disabled but the index read from disk
is sparse, the index is expanded before returning. This allows index.sparse
to control use of the sparse index in addition to its existing control over
how the index is written to disk. It also introduces the ability to
enable/disable the sparse index on a command-by-command basis (e.g.,
allowing a user to troubleshoot a sparse-aware command with '-c
index.sparse=false' [1]).

While testing this change, a bug was found in 'test-tool read-cache' in
which config settings for the repository were not initialized before
preparing the repo settings. This caused index.sparse to always be 'false'
when using the test helper in a cone-mode sparse checkout, breaking tests in
t1091 and t1092. The issue is fixed by moving prepare_repo_settings after
config setup.


Changes since V1
================

 * Add ensure_correct_sparsity function that ensures the index is sparse if
   the repository settings (including index.sparse) allow it, otherwise
   ensuring the index is expanded to full.
 * Restructure condition in do_read_index to, rather than check specifically
   for the index.sparse config setting, call ensure_correct_sparsity
   unconditionally when command_requires_full_index is false.


Changes since V2
================

 * Rename can_convert_to_sparse to is_sparse_index_allowed to more
   accurately reflect what the function returns.
 * Remove index-iterating checks from is_sparse_index_allowed, leaving only
   inexpensive checks on config settings & sparse checkout patterns. Checks
   are still part of convert_to_sparse to ensure it behaves exactly as it
   did before this series.
 * Restructure ensure_correct_sparsity for better readability.
 * Fix test_env variable scope.


Changes since V3
================

 * Add a new patch to avoid unnecessary cache tree free/recreation when
   possible in convert_to_sparse.

[1]
https://lore.kernel.org/git/cc60c6e7-ecef-ae22-8ec7-ab290ff2b830@gmail.com/

Thanks! -Victoria

Victoria Dye (4):
  test-read-cache.c: prepare_repo_settings after config init
  sparse-index: avoid unnecessary cache tree clearing
  sparse-index: add ensure_correct_sparsity function
  sparse-index: update do_read_index to ensure correct sparsity

 read-cache.c                             |  8 ++++
 sparse-index.c                           | 58 ++++++++++++++++++------
 sparse-index.h                           |  1 +
 t/helper/test-read-cache.c               |  5 +-
 t/t1092-sparse-checkout-compatibility.sh | 31 +++++++++++++
 5 files changed, 86 insertions(+), 17 deletions(-)


base-commit: f443b226ca681d87a3a31e245a70e6bc2769123c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1059%2Fvdye%2Fsparse%2Findex-sparse-config-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1059/vdye/sparse/index-sparse-config-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1059

Range-diff vs v3:

 1:  6974ce7e7f5 = 1:  6974ce7e7f5 test-read-cache.c: prepare_repo_settings after config init
 -:  ----------- > 2:  91351ac4bde sparse-index: avoid unnecessary cache tree clearing
 2:  9d6511db072 = 3:  d2133ca1724 sparse-index: add ensure_correct_sparsity function
 3:  d6c3c694e1e = 4:  b67cd9d07f8 sparse-index: update do_read_index to ensure correct sparsity

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v4 1/4] test-read-cache.c: prepare_repo_settings after config init
  2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
@ 2021-10-29 13:51       ` Victoria Dye via GitGitGadget
  2021-10-29 13:51       ` [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-29 13:51 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Move `prepare_repo_settings` after the git directory has been set up in
`test-read-cache.c`. The git directory settings must be initialized to
properly assign repo settings using the worktree-level git config.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/helper/test-read-cache.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c
index b52c174acc7..0d9f08931a1 100644
--- a/t/helper/test-read-cache.c
+++ b/t/helper/test-read-cache.c
@@ -39,8 +39,6 @@ int cmd__read_cache(int argc, const char **argv)
 	int table = 0, expand = 0;
 
 	initialize_the_repository();
-	prepare_repo_settings(r);
-	r->settings.command_requires_full_index = 0;
 
 	for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) {
 		if (skip_prefix(*argv, "--print-and-refresh=", &name))
@@ -56,6 +54,9 @@ int cmd__read_cache(int argc, const char **argv)
 	setup_git_directory();
 	git_config(git_default_config, NULL);
 
+	prepare_repo_settings(r);
+	r->settings.command_requires_full_index = 0;
+
 	for (i = 0; i < cnt; i++) {
 		repo_read_index(r);
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing
  2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
  2021-10-29 13:51       ` [PATCH v4 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
@ 2021-10-29 13:51       ` Victoria Dye via GitGitGadget
  2021-10-29 18:45         ` Junio C Hamano
  2021-10-29 13:51       ` [PATCH v4 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
                         ` (3 subsequent siblings)
  5 siblings, 1 reply; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-29 13:51 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

When converting a full index to sparse, clear and recreate the cache tree
only if the cache tree is not fully valid. The convert_to_sparse operation
should exit silently if a cache tree update cannot be successfully completed
(e.g., due to a conflicted entry state). However, because this failure
scenario only occurs when at least a portion of the cache tree is invalid,
we can save ourselves the cost of clearing and recreating the cache tree by
skipping the check when the cache tree is fully valid.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 sparse-index.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 7b7ff79e044..85613cd8a3a 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -175,17 +175,20 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	if (index_has_unmerged_entries(istate))
 		return 0;
 
-	/* Clear and recompute the cache-tree */
-	cache_tree_free(&istate->cache_tree);
-	/*
-	 * Silently return if there is a problem with the cache tree update,
-	 * which might just be due to a conflict state in some entry.
-	 *
-	 * This might create new tree objects, so be sure to use
-	 * WRITE_TREE_MISSING_OK.
-	 */
-	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
-		return 0;
+	if (!cache_tree_fully_valid(istate->cache_tree)) {
+		/* Clear and recompute the cache-tree */
+		cache_tree_free(&istate->cache_tree);
+
+		/*
+		 * Silently return if there is a problem with the cache tree update,
+		 * which might just be due to a conflict state in some entry.
+		 *
+		 * This might create new tree objects, so be sure to use
+		 * WRITE_TREE_MISSING_OK.
+		 */
+		if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
+			return 0;
+	}
 
 	remove_fsmonitor(istate);
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v4 3/4] sparse-index: add ensure_correct_sparsity function
  2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
  2021-10-29 13:51       ` [PATCH v4 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
  2021-10-29 13:51       ` [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
@ 2021-10-29 13:51       ` Victoria Dye via GitGitGadget
  2021-10-29 13:51       ` [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
                         ` (2 subsequent siblings)
  5 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-29 13:51 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

The `ensure_correct_sparsity` function is intended to provide a means of
aligning the in-core index with the sparsity required by the repository
settings and other properties of the index. The function first checks
whether a sparse index is allowed (per repository & sparse checkout pattern
settings). If the sparse index may be used, the index is converted to
sparse; otherwise, it is explicitly expanded with `ensure_full_index`.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 sparse-index.c | 33 +++++++++++++++++++++++++++++----
 sparse-index.h |  1 +
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 85613cd8a3a..a1d505d50e9 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -122,17 +122,17 @@ static int index_has_unmerged_entries(struct index_state *istate)
 	return 0;
 }
 
-int convert_to_sparse(struct index_state *istate, int flags)
+static int is_sparse_index_allowed(struct index_state *istate, int flags)
 {
-	int test_env;
-	if (istate->sparse_index || !istate->cache_nr ||
-	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
+	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
 
 	if (!istate->repo)
 		istate->repo = the_repository;
 
 	if (!(flags & SPARSE_INDEX_MEMORY_ONLY)) {
+		int test_env;
+
 		/*
 		 * The sparse index is not (yet) integrated with a split index.
 		 */
@@ -168,6 +168,19 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	if (!istate->sparse_checkout_patterns->use_cone_patterns)
 		return 0;
 
+	return 1;
+}
+
+int convert_to_sparse(struct index_state *istate, int flags)
+{
+	/*
+	 * If the index is already sparse, empty, or otherwise
+	 * cannot be converted to sparse, do not convert.
+	 */
+	if (istate->sparse_index || !istate->cache_nr ||
+	    !is_sparse_index_allowed(istate, flags))
+		return 0;
+
 	/*
 	 * NEEDSWORK: If we have unmerged entries, then stay full.
 	 * Unmerged entries prevent the cache-tree extension from working.
@@ -316,6 +329,18 @@ void ensure_full_index(struct index_state *istate)
 	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }
 
+void ensure_correct_sparsity(struct index_state *istate)
+{
+	/*
+	 * If the index can be sparse, make it sparse. Otherwise,
+	 * ensure the index is full.
+	 */
+	if (is_sparse_index_allowed(istate, 0))
+		convert_to_sparse(istate, 0);
+	else
+		ensure_full_index(istate);
+}
+
 /*
  * This static global helps avoid infinite recursion between
  * expand_to_path() and index_file_exists().
diff --git a/sparse-index.h b/sparse-index.h
index 9f3d7bc7faf..656bd835b25 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -4,6 +4,7 @@
 struct index_state;
 #define SPARSE_INDEX_MEMORY_ONLY (1 << 0)
 int convert_to_sparse(struct index_state *istate, int flags);
+void ensure_correct_sparsity(struct index_state *istate);
 
 /*
  * Some places in the codebase expect to search for a specific path.
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity
  2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
                         ` (2 preceding siblings ...)
  2021-10-29 13:51       ` [PATCH v4 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
@ 2021-10-29 13:51       ` Victoria Dye via GitGitGadget
  2021-11-22 17:36         ` Elijah Newren
  2021-11-22 17:40       ` [PATCH v4 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren
  2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
  5 siblings, 1 reply; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-10-29 13:51 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

If `command_requires_full_index` is false, ensure correct in-core index
sparsity on read by calling `ensure_correct_sparsity`. This change is meant
to update the how the index is read in a command after sparse index-related
repository settings are modified. Previously, for example, if `index.sparse`
were changed from `true` to `false`, the in-core index on the next command
would be sparse. The index would only be expanded to full when it was next
written to disk.

By adding a call to `ensure_correct_sparsity`, the in-core index now matches
the sparsity dictated by the relevant repository settings as soon as it is
read into memory, rather than when it is later written to disk.

Helped-by: Junio C Hamano <gitster@pobox.com>
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 read-cache.c                             |  8 ++++++
 t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index a78b88a41bf..b3772ba70a1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 
 	if (!istate->repo)
 		istate->repo = the_repository;
+
+	/*
+	 * If the command explicitly requires a full index, force it
+	 * to be full. Otherwise, correct the sparsity based on repository
+	 * settings and other properties of the index (if necessary).
+	 */
 	prepare_repo_settings(istate->repo);
 	if (istate->repo->settings.command_requires_full_index)
 		ensure_full_index(istate);
+	else
+		ensure_correct_sparsity(istate);
 
 	return istate->cache_nr;
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index ca91c6a67f8..59accde1fa3 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' '
 	test_region index ensure_full_index trace2.txt
 '
 
+test_expect_success 'index.sparse disabled inline uses full index' '
+	init_repos &&
+
+	# When index.sparse is disabled inline with `git status`, the
+	# index is expanded at the beginning of the execution then never
+	# converted back to sparse. It is then written to disk as a full index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index -c index.sparse=false status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	test_region index ensure_full_index trace2.txt &&
+
+	# Since index.sparse is set to true at a repo level, the index
+	# is converted from full to sparse when read, then never expanded
+	# over the course of `git status`. It is written to disk as a sparse
+	# index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt &&
+
+	# Now that the index has been written to disk as sparse, it is not
+	# converted to sparse (or expanded to full) when read by `git status`.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt
+'
+
 ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing
  2021-10-29 13:51       ` [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
@ 2021-10-29 18:45         ` Junio C Hamano
  2021-10-29 19:00           ` Derrick Stolee
  0 siblings, 1 reply; 44+ messages in thread
From: Junio C Hamano @ 2021-10-29 18:45 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget; +Cc: git, stolee, Victoria Dye

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Victoria Dye <vdye@github.com>
>
> When converting a full index to sparse, clear and recreate the cache tree
> only if the cache tree is not fully valid. The convert_to_sparse operation
> should exit silently if a cache tree update cannot be successfully completed
> (e.g., due to a conflicted entry state). However, because this failure
> scenario only occurs when at least a portion of the cache tree is invalid,
> we can save ourselves the cost of clearing and recreating the cache tree by
> skipping the check when the cache tree is fully valid.

I see in cache-tree.c::update_one() this snippet of code.

	/*
	 * If the first entry of this region is a sparse directory
	 * entry corresponding exactly to 'base', then this cache_tree
	 * struct is a "leaf" in the data structure, pointing to the
	 * tree OID specified in the entry.
	 */
	if (entries > 0) {
		const struct cache_entry *ce = cache[0];

		if (S_ISSPARSEDIR(ce->ce_mode) &&
		    ce->ce_namelen == baselen &&
		    !strncmp(ce->name, base, baselen)) {
			it->entry_count = 1;
			oidcpy(&it->oid, &ce->oid);
			return 1;
		}
	}

Sorry for not noticing it earlier, but does this mean that the
content of a cache-tree changes shape when sparse-ness of the index
changes?  Is a cache-tree that knows about all of the
subdirectories, even the ones that are descendants of a directory
that is represented as a tree-ish entry in the main index array,
still valid in a sparse index?

If not, then I do not think of a quick and sure way to ensure that
the cache-tree is valid when the sparse-ness changes.

The earlier suggestion was based on my assumption that even when the
main index array becomes sparse, the cache tree is still populated
and valid, so that after writing a tree and writing an on-disk index,
and then reading the on-disk index back (possibly in another process),
would not have to incur the recomputation cost of the full tree when
the reading codepath needs to flip the sparseness.

But the above code snippet makes me worried a lot.  A cache-tree
that used to be valid when the corresponding in-core index array was
not sparse will become invalid immediately when we decide to make it
sparse, right?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing
  2021-10-29 18:45         ` Junio C Hamano
@ 2021-10-29 19:00           ` Derrick Stolee
  2021-10-29 20:01             ` Junio C Hamano
  0 siblings, 1 reply; 44+ messages in thread
From: Derrick Stolee @ 2021-10-29 19:00 UTC (permalink / raw)
  To: Junio C Hamano, Victoria Dye via GitGitGadget; +Cc: git, Victoria Dye

On 10/29/2021 2:45 PM, Junio C Hamano wrote:
> "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Victoria Dye <vdye@github.com>
>>
>> When converting a full index to sparse, clear and recreate the cache tree
>> only if the cache tree is not fully valid. The convert_to_sparse operation
>> should exit silently if a cache tree update cannot be successfully completed
>> (e.g., due to a conflicted entry state). However, because this failure
>> scenario only occurs when at least a portion of the cache tree is invalid,
>> we can save ourselves the cost of clearing and recreating the cache tree by
>> skipping the check when the cache tree is fully valid.
> 
> I see in cache-tree.c::update_one() this snippet of code.
> 
> 	/*
> 	 * If the first entry of this region is a sparse directory
> 	 * entry corresponding exactly to 'base', then this cache_tree
> 	 * struct is a "leaf" in the data structure, pointing to the
> 	 * tree OID specified in the entry.
> 	 */
> 	if (entries > 0) {
> 		const struct cache_entry *ce = cache[0];
> 
> 		if (S_ISSPARSEDIR(ce->ce_mode) &&
> 		    ce->ce_namelen == baselen &&
> 		    !strncmp(ce->name, base, baselen)) {
> 			it->entry_count = 1;
> 			oidcpy(&it->oid, &ce->oid);
> 			return 1;
> 		}
> 	}
> 
> Sorry for not noticing it earlier, but does this mean that the
> content of a cache-tree changes shape when sparse-ness of the index
> changes?  Is a cache-tree that knows about all of the
> subdirectories, even the ones that are descendants of a directory
> that is represented as a tree-ish entry in the main index array,
> still valid in a sparse index?
> 
> If not, then I do not think of a quick and sure way to ensure that
> the cache-tree is valid when the sparse-ness changes.
> 
> The earlier suggestion was based on my assumption that even when the
> main index array becomes sparse, the cache tree is still populated
> and valid, so that after writing a tree and writing an on-disk index,
> and then reading the on-disk index back (possibly in another process),
> would not have to incur the recomputation cost of the full tree when
> the reading codepath needs to flip the sparseness.
>
> But the above code snippet makes me worried a lot.  A cache-tree
> that used to be valid when the corresponding in-core index array was
> not sparse will become invalid immediately when we decide to make it
> sparse, right?

I think I would argue that the cache-tree is valid in this case. Each
node represents the tree that would be created from that region of the
index if we called 'git commit'. We can reconstruct the contained
subtrees by walking trees if necessary, but that isn't necessary at
this point.

I am writing a blog post about the sparse index, and I go into detail
about how the cache tree is modified. I use the
derrickstolee/trace-flamechart repo [1] as an example, so I will use
it as a concrete discussion of what is happening here.

Here is the cache-tree without sparse-index:

 [ root (left: 0, size: 16) ]
  |
  +---- [ bin/ (left: 2, size: 1) ] 
  |
  +---- [ examples/ (left: 3, size: 11) ]
        |
        +---- [ fetch/ (left: 3, size: 8) ]
        |
        +---- [ maintenance/ (left: 11, size: 3) ]

Then, I run

	git sparse-checkout init --cone --sparse-index
	git sparse-checkout set bin

the cache-tree looks like this:

 [ root (left: 0, size: 6) ]
  |
  +---- [ bin/ (left: 2, size: 1) ] 
  |
  +---- [ examples/ (left: 3, size: 1) ]

The tree OIDs at each of "root", "bin" and "examples" match the trees
for the nodes in the non-sparse cache tree.

By contrast, including any subtrees of "examples" would cause a few
things to happen:

1. The size of the cache-tree would increase (significantly in large
   repos).

2. The contained cache-tree nodes would have size _zero_, since there
   are no cache entries defined within their directories.

I think in this pruned form, the cache-tree is valid and serves all of
the roles it was designed for. We can quickly compute the root tree in
'git commit'. We can navigate the cache-tree in traverse_by_cache_tree()
to accurately describe the trees in the index. These things are tested
in t1092 and the p2000 performance test.

Does this satisfy your concern, or is there a larger expectation of the
cache-tree extension that I am not taking into account?

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing
  2021-10-29 19:00           ` Derrick Stolee
@ 2021-10-29 20:01             ` Junio C Hamano
  0 siblings, 0 replies; 44+ messages in thread
From: Junio C Hamano @ 2021-10-29 20:01 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Victoria Dye via GitGitGadget, git, Victoria Dye

Derrick Stolee <stolee@gmail.com> writes:

> I think in this pruned form, the cache-tree is valid and serves all of
> the roles it was designed for. We can quickly compute the root tree in
> 'git commit'. We can navigate the cache-tree in traverse_by_cache_tree()
> to accurately describe the trees in the index. These things are tested
> in t1092 and the p2000 performance test.

OK, then I was worried too much.  Thanks for clarifying.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity
  2021-10-29 13:51       ` [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
@ 2021-11-22 17:36         ` Elijah Newren
  2021-11-22 18:59           ` Victoria Dye
  0 siblings, 1 reply; 44+ messages in thread
From: Elijah Newren @ 2021-11-22 17:36 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Victoria Dye

On Fri, Oct 29, 2021 at 6:56 AM Victoria Dye via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Victoria Dye <vdye@github.com>
>
> If `command_requires_full_index` is false, ensure correct in-core index
> sparsity on read by calling `ensure_correct_sparsity`. This change is meant
> to update the how the index is read in a command after sparse index-related

s/update the how/update how/ ?

> repository settings are modified. Previously, for example, if `index.sparse`
> were changed from `true` to `false`, the in-core index on the next command
> would be sparse. The index would only be expanded to full when it was next
> written to disk.
>
> By adding a call to `ensure_correct_sparsity`, the in-core index now matches
> the sparsity dictated by the relevant repository settings as soon as it is
> read into memory, rather than when it is later written to disk.

I split up reading this series across different days, and when I got
here, my first reaction was "Okay, but why would you want that?
Sounds like extra work for no gain."  I went and looked up the cover
letter and saw you mentioned that this "introduces the ability to
enable/disable the sparse index on a command-by-command basis (e.g.,
allowing a user to troubleshoot a sparse-aware command with '-c
index.sparse=false' [1]).  That seems like a good reason to me, and
sounds like it belongs in this commit message.  But it sounds like you
had other reasons in mind.  If so, could you share them; I'm having
difficulty understanding how this would make a difference other than
in the troubleshooting scenario.

>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
>  read-cache.c                             |  8 ++++++
>  t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++
>  2 files changed, 39 insertions(+)
>
> diff --git a/read-cache.c b/read-cache.c
> index a78b88a41bf..b3772ba70a1 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
>
>         if (!istate->repo)
>                 istate->repo = the_repository;
> +
> +       /*
> +        * If the command explicitly requires a full index, force it
> +        * to be full. Otherwise, correct the sparsity based on repository
> +        * settings and other properties of the index (if necessary).
> +        */
>         prepare_repo_settings(istate->repo);
>         if (istate->repo->settings.command_requires_full_index)
>                 ensure_full_index(istate);
> +       else
> +               ensure_correct_sparsity(istate);
>
>         return istate->cache_nr;
>
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index ca91c6a67f8..59accde1fa3 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' '
>         test_region index ensure_full_index trace2.txt
>  '
>
> +test_expect_success 'index.sparse disabled inline uses full index' '
> +       init_repos &&
> +
> +       # When index.sparse is disabled inline with `git status`, the
> +       # index is expanded at the beginning of the execution then never
> +       # converted back to sparse. It is then written to disk as a full index.
> +       rm -f trace2.txt &&
> +       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> +               git -C sparse-index -c index.sparse=false status &&
> +       ! test_region index convert_to_sparse trace2.txt &&
> +       test_region index ensure_full_index trace2.txt &&
> +
> +       # Since index.sparse is set to true at a repo level, the index
> +       # is converted from full to sparse when read, then never expanded
> +       # over the course of `git status`. It is written to disk as a sparse
> +       # index.
> +       rm -f trace2.txt &&
> +       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> +               git -C sparse-index status &&
> +       test_region index convert_to_sparse trace2.txt &&
> +       ! test_region index ensure_full_index trace2.txt &&
> +
> +       # Now that the index has been written to disk as sparse, it is not
> +       # converted to sparse (or expanded to full) when read by `git status`.
> +       rm -f trace2.txt &&
> +       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> +               git -C sparse-index status &&
> +       ! test_region index convert_to_sparse trace2.txt &&
> +       ! test_region index ensure_full_index trace2.txt
> +'
> +
>  ensure_not_expanded () {
>         rm -f trace2.txt &&
>         echo >>sparse-index/untracked.txt &&
> --
> gitgitgadget

The rest of the patch looks fine.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 0/4] sparse-index: expand/collapse based on 'index.sparse'
  2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
                         ` (3 preceding siblings ...)
  2021-10-29 13:51       ` [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
@ 2021-11-22 17:40       ` Elijah Newren
  2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
  5 siblings, 0 replies; 44+ messages in thread
From: Elijah Newren @ 2021-11-22 17:40 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Victoria Dye

On Fri, Oct 29, 2021 at 6:53 AM Victoria Dye via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This series updates do_read_index to use the index.sparse config setting
> when determining whether the index should be expanded or collapsed. If the
> command & repo allow use of a sparse index, index.sparse is enabled, and a
> full index is read from disk, the index is collapsed before returning to the
> caller. Conversely, if index.sparse is disabled but the index read from disk
> is sparse, the index is expanded before returning. This allows index.sparse
> to control use of the sparse index in addition to its existing control over
> how the index is written to disk. It also introduces the ability to
> enable/disable the sparse index on a command-by-command basis (e.g.,
> allowing a user to troubleshoot a sparse-aware command with '-c
> index.sparse=false' [1]).
>
> While testing this change, a bug was found in 'test-tool read-cache' in
> which config settings for the repository were not initialized before
> preparing the repo settings. This caused index.sparse to always be 'false'
> when using the test helper in a cone-mode sparse checkout, breaking tests in
> t1091 and t1092. The issue is fixed by moving prepare_repo_settings after
> config setup.
>
>
> Changes since V1
> ================
>
>  * Add ensure_correct_sparsity function that ensures the index is sparse if
>    the repository settings (including index.sparse) allow it, otherwise
>    ensuring the index is expanded to full.
>  * Restructure condition in do_read_index to, rather than check specifically
>    for the index.sparse config setting, call ensure_correct_sparsity
>    unconditionally when command_requires_full_index is false.
>
>
> Changes since V2
> ================
>
>  * Rename can_convert_to_sparse to is_sparse_index_allowed to more
>    accurately reflect what the function returns.
>  * Remove index-iterating checks from is_sparse_index_allowed, leaving only
>    inexpensive checks on config settings & sparse checkout patterns. Checks
>    are still part of convert_to_sparse to ensure it behaves exactly as it
>    did before this series.
>  * Restructure ensure_correct_sparsity for better readability.
>  * Fix test_env variable scope.
>
>
> Changes since V3
> ================
>
>  * Add a new patch to avoid unnecessary cache tree free/recreation when
>    possible in convert_to_sparse.

I read over this series.  I only spotted two minor things, both with
the commit message of the final patch.

Reviewed-by: Elijah Newren <newren@gmail.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity
  2021-11-22 17:36         ` Elijah Newren
@ 2021-11-22 18:59           ` Victoria Dye
  0 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye @ 2021-11-22 18:59 UTC (permalink / raw)
  To: Elijah Newren, Victoria Dye via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano

Elijah Newren wrote:
> On Fri, Oct 29, 2021 at 6:56 AM Victoria Dye via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Victoria Dye <vdye@github.com>
>>
>> If `command_requires_full_index` is false, ensure correct in-core index
>> sparsity on read by calling `ensure_correct_sparsity`. This change is meant
>> to update the how the index is read in a command after sparse index-related
> 
> s/update the how/update how/ ?
> 

This was probably the result of mixing up "update the way" and "update how".
I'll go with the latter.

>> repository settings are modified. Previously, for example, if `index.sparse`
>> were changed from `true` to `false`, the in-core index on the next command
>> would be sparse. The index would only be expanded to full when it was next
>> written to disk.
>>
>> By adding a call to `ensure_correct_sparsity`, the in-core index now matches
>> the sparsity dictated by the relevant repository settings as soon as it is
>> read into memory, rather than when it is later written to disk.
> 
> I split up reading this series across different days, and when I got
> here, my first reaction was "Okay, but why would you want that?
> Sounds like extra work for no gain."  I went and looked up the cover
> letter and saw you mentioned that this "introduces the ability to
> enable/disable the sparse index on a command-by-command basis (e.g.,
> allowing a user to troubleshoot a sparse-aware command with '-c
> index.sparse=false' [1]).  That seems like a good reason to me, and
> sounds like it belongs in this commit message.  But it sounds like you
> had other reasons in mind.  If so, could you share them; I'm having
> difficulty understanding how this would make a difference other than
> in the troubleshooting scenario.
> 

The troubleshooting scenario was what inspired this change in the first
place, but it applies any time someone changes repository settings outside
of `git sparse-checkout init`. One relevant scenario I can imagine is if a
user enables `index.sparse, commands would not use the sparse index until a
command that *writes* the index is executed:

$ git config index.sparse true
$ git diff      # read full index, full in-core, no write
$ git add .     # read full index, full in-core, write sparse
$ git diff      # read sparse index, sparse in-core, no write

Another scenario might be enabling `index.sparse`, then running a command
that turns out to be surprisingly slow because it's operating on a full
index in-core (i.e., much slower than the convert_to_sparse + command with
sparse index would be). 

These are definitely edge cases - they rely on manual config changes, and
they're only really noticeable 1) if the config change is immediately
followed by index read-only commands and 2) if the first index-writing
command after the config change is slow. That said, on the off chance that a
user (or future developer) encounter one of these scenarios, I think it'll
be helpful to have `index.sparse` take effect "immediately" after the config
change.

I'll include these (and the troubleshooting) examples in the updated commit
message.

>>
>> Helped-by: Junio C Hamano <gitster@pobox.com>
>> Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
>> Signed-off-by: Victoria Dye <vdye@github.com>
>> ---
>>  read-cache.c                             |  8 ++++++
>>  t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++
>>  2 files changed, 39 insertions(+)
>>
>> diff --git a/read-cache.c b/read-cache.c
>> index a78b88a41bf..b3772ba70a1 100644
>> --- a/read-cache.c
>> +++ b/read-cache.c
>> @@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
>>
>>         if (!istate->repo)
>>                 istate->repo = the_repository;
>> +
>> +       /*
>> +        * If the command explicitly requires a full index, force it
>> +        * to be full. Otherwise, correct the sparsity based on repository
>> +        * settings and other properties of the index (if necessary).
>> +        */
>>         prepare_repo_settings(istate->repo);
>>         if (istate->repo->settings.command_requires_full_index)
>>                 ensure_full_index(istate);
>> +       else
>> +               ensure_correct_sparsity(istate);
>>
>>         return istate->cache_nr;
>>
>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
>> index ca91c6a67f8..59accde1fa3 100755
>> --- a/t/t1092-sparse-checkout-compatibility.sh
>> +++ b/t/t1092-sparse-checkout-compatibility.sh
>> @@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' '
>>         test_region index ensure_full_index trace2.txt
>>  '
>>
>> +test_expect_success 'index.sparse disabled inline uses full index' '
>> +       init_repos &&
>> +
>> +       # When index.sparse is disabled inline with `git status`, the
>> +       # index is expanded at the beginning of the execution then never
>> +       # converted back to sparse. It is then written to disk as a full index.
>> +       rm -f trace2.txt &&
>> +       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
>> +               git -C sparse-index -c index.sparse=false status &&
>> +       ! test_region index convert_to_sparse trace2.txt &&
>> +       test_region index ensure_full_index trace2.txt &&
>> +
>> +       # Since index.sparse is set to true at a repo level, the index
>> +       # is converted from full to sparse when read, then never expanded
>> +       # over the course of `git status`. It is written to disk as a sparse
>> +       # index.
>> +       rm -f trace2.txt &&
>> +       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
>> +               git -C sparse-index status &&
>> +       test_region index convert_to_sparse trace2.txt &&
>> +       ! test_region index ensure_full_index trace2.txt &&
>> +
>> +       # Now that the index has been written to disk as sparse, it is not
>> +       # converted to sparse (or expanded to full) when read by `git status`.
>> +       rm -f trace2.txt &&
>> +       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
>> +               git -C sparse-index status &&
>> +       ! test_region index convert_to_sparse trace2.txt &&
>> +       ! test_region index ensure_full_index trace2.txt
>> +'
>> +
>>  ensure_not_expanded () {
>>         rm -f trace2.txt &&
>>         echo >>sparse-index/untracked.txt &&
>> --
>> gitgitgadget
> 
> The rest of the patch looks fine.
> 


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v5 0/4] sparse-index: expand/collapse based on 'index.sparse'
  2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
                         ` (4 preceding siblings ...)
  2021-11-22 17:40       ` [PATCH v4 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren
@ 2021-11-23  0:20       ` Victoria Dye via GitGitGadget
  2021-11-23  0:20         ` [PATCH v5 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
                           ` (4 more replies)
  5 siblings, 5 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-11-23  0:20 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Elijah Newren, Victoria Dye

This series updates do_read_index to use the index.sparse config setting
when determining whether the index should be expanded or collapsed. If the
command & repo allow use of a sparse index, index.sparse is enabled, and a
full index is read from disk, the index is collapsed before returning to the
caller. Conversely, if index.sparse is disabled but the index read from disk
is sparse, the index is expanded before returning. This allows index.sparse
to control use of the sparse index in addition to its existing control over
how the index is written to disk. It also introduces the ability to
enable/disable the sparse index on a command-by-command basis (e.g.,
allowing a user to troubleshoot a sparse-aware command with '-c
index.sparse=false' [1]).

While testing this change, a bug was found in 'test-tool read-cache' in
which config settings for the repository were not initialized before
preparing the repo settings. This caused index.sparse to always be 'false'
when using the test helper in a cone-mode sparse checkout, breaking tests in
t1091 and t1092. The issue is fixed by moving prepare_repo_settings after
config setup.


Changes since V1
================

 * Add ensure_correct_sparsity function that ensures the index is sparse if
   the repository settings (including index.sparse) allow it, otherwise
   ensuring the index is expanded to full.
 * Restructure condition in do_read_index to, rather than check specifically
   for the index.sparse config setting, call ensure_correct_sparsity
   unconditionally when command_requires_full_index is false.


Changes since V2
================

 * Rename can_convert_to_sparse to is_sparse_index_allowed to more
   accurately reflect what the function returns.
 * Remove index-iterating checks from is_sparse_index_allowed, leaving only
   inexpensive checks on config settings & sparse checkout patterns. Checks
   are still part of convert_to_sparse to ensure it behaves exactly as it
   did before this series.
 * Restructure ensure_correct_sparsity for better readability.
 * Fix test_env variable scope.


Changes since V3
================

 * Add a new patch to avoid unnecessary cache tree free/recreation when
   possible in convert_to_sparse.


Changes since V4
================

 * Updated patch 4/4 commit message to better explain practical reasons for
   making this change.

[1]
https://lore.kernel.org/git/cc60c6e7-ecef-ae22-8ec7-ab290ff2b830@gmail.com/

Thanks! -Victoria

Victoria Dye (4):
  test-read-cache.c: prepare_repo_settings after config init
  sparse-index: avoid unnecessary cache tree clearing
  sparse-index: add ensure_correct_sparsity function
  sparse-index: update do_read_index to ensure correct sparsity

 read-cache.c                             |  8 ++++
 sparse-index.c                           | 58 ++++++++++++++++++------
 sparse-index.h                           |  1 +
 t/helper/test-read-cache.c               |  5 +-
 t/t1092-sparse-checkout-compatibility.sh | 31 +++++++++++++
 5 files changed, 86 insertions(+), 17 deletions(-)


base-commit: f443b226ca681d87a3a31e245a70e6bc2769123c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1059%2Fvdye%2Fsparse%2Findex-sparse-config-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1059/vdye/sparse/index-sparse-config-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1059

Range-diff vs v4:

 1:  6974ce7e7f5 = 1:  6974ce7e7f5 test-read-cache.c: prepare_repo_settings after config init
 2:  91351ac4bde = 2:  91351ac4bde sparse-index: avoid unnecessary cache tree clearing
 3:  d2133ca1724 = 3:  d2133ca1724 sparse-index: add ensure_correct_sparsity function
 4:  b67cd9d07f8 ! 4:  3237a519c80 sparse-index: update do_read_index to ensure correct sparsity
     @@ Metadata
       ## Commit message ##
          sparse-index: update do_read_index to ensure correct sparsity
      
     -    If `command_requires_full_index` is false, ensure correct in-core index
     -    sparsity on read by calling `ensure_correct_sparsity`. This change is meant
     -    to update the how the index is read in a command after sparse index-related
     -    repository settings are modified. Previously, for example, if `index.sparse`
     -    were changed from `true` to `false`, the in-core index on the next command
     -    would be sparse. The index would only be expanded to full when it was next
     -    written to disk.
     +    Unless `command_requires_full_index` forces index expansion, ensure in-core
     +    index sparsity matches config settings on read by calling
     +    `ensure_correct_sparsity`. This makes the behavior of the in-core index more
     +    consistent between different methods of updating sparsity: manually changing
     +    the `index.sparse` config setting vs. executing
     +    `git sparse-checkout --[no-]sparse-index init`
      
     -    By adding a call to `ensure_correct_sparsity`, the in-core index now matches
     -    the sparsity dictated by the relevant repository settings as soon as it is
     -    read into memory, rather than when it is later written to disk.
     +    Although index sparsity is normally updated with `git sparse-checkout init`,
     +    ensuring correct sparsity after a manual `index.sparse` change has some
     +    practical benefits:
     +
     +    1. It allows for command-by-command sparsity toggling with
     +       `-c index.sparse=<true|false>`, e.g. when troubleshooting issues with the
     +       sparse index.
     +    2. It prevents users from experiencing abnormal slowness after setting
     +       `index.sparse` to `true` due to use of a full index in all commands until
     +       the on-disk index is updated.
      
          Helped-by: Junio C Hamano <gitster@pobox.com>
          Co-authored-by: Derrick Stolee <dstolee@microsoft.com>

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v5 1/4] test-read-cache.c: prepare_repo_settings after config init
  2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
@ 2021-11-23  0:20         ` Victoria Dye via GitGitGadget
  2021-11-23  0:20         ` [PATCH v5 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
                           ` (3 subsequent siblings)
  4 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-11-23  0:20 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Elijah Newren, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Move `prepare_repo_settings` after the git directory has been set up in
`test-read-cache.c`. The git directory settings must be initialized to
properly assign repo settings using the worktree-level git config.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/helper/test-read-cache.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c
index b52c174acc7..0d9f08931a1 100644
--- a/t/helper/test-read-cache.c
+++ b/t/helper/test-read-cache.c
@@ -39,8 +39,6 @@ int cmd__read_cache(int argc, const char **argv)
 	int table = 0, expand = 0;
 
 	initialize_the_repository();
-	prepare_repo_settings(r);
-	r->settings.command_requires_full_index = 0;
 
 	for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) {
 		if (skip_prefix(*argv, "--print-and-refresh=", &name))
@@ -56,6 +54,9 @@ int cmd__read_cache(int argc, const char **argv)
 	setup_git_directory();
 	git_config(git_default_config, NULL);
 
+	prepare_repo_settings(r);
+	r->settings.command_requires_full_index = 0;
+
 	for (i = 0; i < cnt; i++) {
 		repo_read_index(r);
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v5 2/4] sparse-index: avoid unnecessary cache tree clearing
  2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
  2021-11-23  0:20         ` [PATCH v5 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
@ 2021-11-23  0:20         ` Victoria Dye via GitGitGadget
  2021-11-23  0:20         ` [PATCH v5 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
                           ` (2 subsequent siblings)
  4 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-11-23  0:20 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Elijah Newren, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

When converting a full index to sparse, clear and recreate the cache tree
only if the cache tree is not fully valid. The convert_to_sparse operation
should exit silently if a cache tree update cannot be successfully completed
(e.g., due to a conflicted entry state). However, because this failure
scenario only occurs when at least a portion of the cache tree is invalid,
we can save ourselves the cost of clearing and recreating the cache tree by
skipping the check when the cache tree is fully valid.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 sparse-index.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 7b7ff79e044..85613cd8a3a 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -175,17 +175,20 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	if (index_has_unmerged_entries(istate))
 		return 0;
 
-	/* Clear and recompute the cache-tree */
-	cache_tree_free(&istate->cache_tree);
-	/*
-	 * Silently return if there is a problem with the cache tree update,
-	 * which might just be due to a conflict state in some entry.
-	 *
-	 * This might create new tree objects, so be sure to use
-	 * WRITE_TREE_MISSING_OK.
-	 */
-	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
-		return 0;
+	if (!cache_tree_fully_valid(istate->cache_tree)) {
+		/* Clear and recompute the cache-tree */
+		cache_tree_free(&istate->cache_tree);
+
+		/*
+		 * Silently return if there is a problem with the cache tree update,
+		 * which might just be due to a conflict state in some entry.
+		 *
+		 * This might create new tree objects, so be sure to use
+		 * WRITE_TREE_MISSING_OK.
+		 */
+		if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
+			return 0;
+	}
 
 	remove_fsmonitor(istate);
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v5 3/4] sparse-index: add ensure_correct_sparsity function
  2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
  2021-11-23  0:20         ` [PATCH v5 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
  2021-11-23  0:20         ` [PATCH v5 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
@ 2021-11-23  0:20         ` Victoria Dye via GitGitGadget
  2021-11-23  0:20         ` [PATCH v5 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
  2021-11-23 17:21         ` [PATCH v5 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren
  4 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-11-23  0:20 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Elijah Newren, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

The `ensure_correct_sparsity` function is intended to provide a means of
aligning the in-core index with the sparsity required by the repository
settings and other properties of the index. The function first checks
whether a sparse index is allowed (per repository & sparse checkout pattern
settings). If the sparse index may be used, the index is converted to
sparse; otherwise, it is explicitly expanded with `ensure_full_index`.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 sparse-index.c | 33 +++++++++++++++++++++++++++++----
 sparse-index.h |  1 +
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 85613cd8a3a..a1d505d50e9 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -122,17 +122,17 @@ static int index_has_unmerged_entries(struct index_state *istate)
 	return 0;
 }
 
-int convert_to_sparse(struct index_state *istate, int flags)
+static int is_sparse_index_allowed(struct index_state *istate, int flags)
 {
-	int test_env;
-	if (istate->sparse_index || !istate->cache_nr ||
-	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
+	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
 
 	if (!istate->repo)
 		istate->repo = the_repository;
 
 	if (!(flags & SPARSE_INDEX_MEMORY_ONLY)) {
+		int test_env;
+
 		/*
 		 * The sparse index is not (yet) integrated with a split index.
 		 */
@@ -168,6 +168,19 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	if (!istate->sparse_checkout_patterns->use_cone_patterns)
 		return 0;
 
+	return 1;
+}
+
+int convert_to_sparse(struct index_state *istate, int flags)
+{
+	/*
+	 * If the index is already sparse, empty, or otherwise
+	 * cannot be converted to sparse, do not convert.
+	 */
+	if (istate->sparse_index || !istate->cache_nr ||
+	    !is_sparse_index_allowed(istate, flags))
+		return 0;
+
 	/*
 	 * NEEDSWORK: If we have unmerged entries, then stay full.
 	 * Unmerged entries prevent the cache-tree extension from working.
@@ -316,6 +329,18 @@ void ensure_full_index(struct index_state *istate)
 	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }
 
+void ensure_correct_sparsity(struct index_state *istate)
+{
+	/*
+	 * If the index can be sparse, make it sparse. Otherwise,
+	 * ensure the index is full.
+	 */
+	if (is_sparse_index_allowed(istate, 0))
+		convert_to_sparse(istate, 0);
+	else
+		ensure_full_index(istate);
+}
+
 /*
  * This static global helps avoid infinite recursion between
  * expand_to_path() and index_file_exists().
diff --git a/sparse-index.h b/sparse-index.h
index 9f3d7bc7faf..656bd835b25 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -4,6 +4,7 @@
 struct index_state;
 #define SPARSE_INDEX_MEMORY_ONLY (1 << 0)
 int convert_to_sparse(struct index_state *istate, int flags);
+void ensure_correct_sparsity(struct index_state *istate);
 
 /*
  * Some places in the codebase expect to search for a specific path.
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v5 4/4] sparse-index: update do_read_index to ensure correct sparsity
  2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
                           ` (2 preceding siblings ...)
  2021-11-23  0:20         ` [PATCH v5 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
@ 2021-11-23  0:20         ` Victoria Dye via GitGitGadget
  2021-11-23 17:21         ` [PATCH v5 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren
  4 siblings, 0 replies; 44+ messages in thread
From: Victoria Dye via GitGitGadget @ 2021-11-23  0:20 UTC (permalink / raw)
  To: git; +Cc: stolee, gitster, Elijah Newren, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Unless `command_requires_full_index` forces index expansion, ensure in-core
index sparsity matches config settings on read by calling
`ensure_correct_sparsity`. This makes the behavior of the in-core index more
consistent between different methods of updating sparsity: manually changing
the `index.sparse` config setting vs. executing
`git sparse-checkout --[no-]sparse-index init`

Although index sparsity is normally updated with `git sparse-checkout init`,
ensuring correct sparsity after a manual `index.sparse` change has some
practical benefits:

1. It allows for command-by-command sparsity toggling with
   `-c index.sparse=<true|false>`, e.g. when troubleshooting issues with the
   sparse index.
2. It prevents users from experiencing abnormal slowness after setting
   `index.sparse` to `true` due to use of a full index in all commands until
   the on-disk index is updated.

Helped-by: Junio C Hamano <gitster@pobox.com>
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
---
 read-cache.c                             |  8 ++++++
 t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index a78b88a41bf..b3772ba70a1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 
 	if (!istate->repo)
 		istate->repo = the_repository;
+
+	/*
+	 * If the command explicitly requires a full index, force it
+	 * to be full. Otherwise, correct the sparsity based on repository
+	 * settings and other properties of the index (if necessary).
+	 */
 	prepare_repo_settings(istate->repo);
 	if (istate->repo->settings.command_requires_full_index)
 		ensure_full_index(istate);
+	else
+		ensure_correct_sparsity(istate);
 
 	return istate->cache_nr;
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index ca91c6a67f8..59accde1fa3 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' '
 	test_region index ensure_full_index trace2.txt
 '
 
+test_expect_success 'index.sparse disabled inline uses full index' '
+	init_repos &&
+
+	# When index.sparse is disabled inline with `git status`, the
+	# index is expanded at the beginning of the execution then never
+	# converted back to sparse. It is then written to disk as a full index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index -c index.sparse=false status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	test_region index ensure_full_index trace2.txt &&
+
+	# Since index.sparse is set to true at a repo level, the index
+	# is converted from full to sparse when read, then never expanded
+	# over the course of `git status`. It is written to disk as a sparse
+	# index.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt &&
+
+	# Now that the index has been written to disk as sparse, it is not
+	# converted to sparse (or expanded to full) when read by `git status`.
+	rm -f trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index status &&
+	! test_region index convert_to_sparse trace2.txt &&
+	! test_region index ensure_full_index trace2.txt
+'
+
 ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 0/4] sparse-index: expand/collapse based on 'index.sparse'
  2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
                           ` (3 preceding siblings ...)
  2021-11-23  0:20         ` [PATCH v5 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
@ 2021-11-23 17:21         ` Elijah Newren
  4 siblings, 0 replies; 44+ messages in thread
From: Elijah Newren @ 2021-11-23 17:21 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Victoria Dye

On Mon, Nov 22, 2021 at 4:20 PM Victoria Dye via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This series updates do_read_index to use the index.sparse config setting
> when determining whether the index should be expanded or collapsed. If the
> command & repo allow use of a sparse index, index.sparse is enabled, and a
> full index is read from disk, the index is collapsed before returning to the
> caller. Conversely, if index.sparse is disabled but the index read from disk
> is sparse, the index is expanded before returning. This allows index.sparse
> to control use of the sparse index in addition to its existing control over
> how the index is written to disk. It also introduces the ability to
> enable/disable the sparse index on a command-by-command basis (e.g.,
> allowing a user to troubleshoot a sparse-aware command with '-c
> index.sparse=false' [1]).
>
> While testing this change, a bug was found in 'test-tool read-cache' in
> which config settings for the repository were not initialized before
> preparing the repo settings. This caused index.sparse to always be 'false'
> when using the test helper in a cone-mode sparse checkout, breaking tests in
> t1091 and t1092. The issue is fixed by moving prepare_repo_settings after
> config setup.
>
>
> Changes since V1
> ================
>
>  * Add ensure_correct_sparsity function that ensures the index is sparse if
>    the repository settings (including index.sparse) allow it, otherwise
>    ensuring the index is expanded to full.
>  * Restructure condition in do_read_index to, rather than check specifically
>    for the index.sparse config setting, call ensure_correct_sparsity
>    unconditionally when command_requires_full_index is false.
>
>
> Changes since V2
> ================
>
>  * Rename can_convert_to_sparse to is_sparse_index_allowed to more
>    accurately reflect what the function returns.
>  * Remove index-iterating checks from is_sparse_index_allowed, leaving only
>    inexpensive checks on config settings & sparse checkout patterns. Checks
>    are still part of convert_to_sparse to ensure it behaves exactly as it
>    did before this series.
>  * Restructure ensure_correct_sparsity for better readability.
>  * Fix test_env variable scope.
>
>
> Changes since V3
> ================
>
>  * Add a new patch to avoid unnecessary cache tree free/recreation when
>    possible in convert_to_sparse.
>
>
> Changes since V4
> ================
>
>  * Updated patch 4/4 commit message to better explain practical reasons for
>    making this change.

Thanks.  This version looks good to me:

Reviewed-by: Elijah Newren <newren@gmail.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2021-11-23 17:21 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-15 19:54 [PATCH 0/2] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-15 19:54 ` [PATCH 1/2] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-15 19:54 ` [PATCH 2/2] sparse-index: update index read to consider index.sparse config Victoria Dye via GitGitGadget
2021-10-15 21:53   ` Junio C Hamano
2021-10-17  1:21     ` Derrick Stolee
2021-10-17  5:58       ` Junio C Hamano
2021-10-17 19:33         ` Derrick Stolee
2021-10-18  1:15           ` Junio C Hamano
2021-10-18 13:25             ` Derrick Stolee
2021-10-18 14:14               ` Victoria Dye
2021-10-21 13:26                 ` Derrick Stolee
2021-10-18 16:09               ` Junio C Hamano
2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-21 20:48   ` [PATCH v2 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-21 20:48   ` [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-21 22:20     ` Junio C Hamano
2021-10-27 17:21       ` Victoria Dye
2021-10-21 20:48   ` [PATCH v2 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-27 18:20     ` [PATCH v3 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-27 18:20     ` [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-27 20:07       ` Derrick Stolee
2021-10-27 21:32         ` Junio C Hamano
2021-10-28  1:24           ` Derrick Stolee
2021-10-29 13:43             ` Victoria Dye
2021-10-27 18:20     ` [PATCH v3 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-10-27 20:08     ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Derrick Stolee
2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
2021-10-29 18:45         ` Junio C Hamano
2021-10-29 19:00           ` Derrick Stolee
2021-10-29 20:01             ` Junio C Hamano
2021-10-29 13:51       ` [PATCH v4 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-11-22 17:36         ` Elijah Newren
2021-11-22 18:59           ` Victoria Dye
2021-11-22 17:40       ` [PATCH v4 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren
2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-11-23 17:21         ` [PATCH v5 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).