git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Victoria Dye <vdye@github.com>
To: Derrick Stolee <stolee@gmail.com>, Junio C Hamano <gitster@pobox.com>
Cc: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
Date: Mon, 18 Oct 2021 10:14:56 -0400	[thread overview]
Message-ID: <4656a934-5305-fbdf-76ca-17562fca62ef@github.com> (raw)
In-Reply-To: <b51500f2-854a-3006-810a-fb7fb8d8dcfb@gmail.com>

Derrick Stolee wrote:
> On 10/17/2021 9:15 PM, Junio C Hamano wrote:
>> Derrick Stolee <stolee@gmail.com> writes:
>>
>>>>     * In addition, with these patches, if index.sparse=false, a
>>>>       sparse index will be expaned to full upon reading, and if it
>>>>       is true, a non-sparse index will be shrunk to sparse upon
>>>>       reading
>>>
>>> This is only half true. If "index.sparse=false", then a sparse
>>> index will be expanded to full upon reading. If "index.sparse=true"
>>> then nothing special will happen to an index on reading.
>>
>> OK.  I somehow got the impression that we convert in both ways from
>> the patch text under discussion, specifically this part in
>> do_read_index():
>>
>>> -	if (istate->repo->settings.command_requires_full_index)
>>> +	if (!istate->repo->settings.sparse_index ||
>>> +	    istate->repo->settings.command_requires_full_index)
>>>  		ensure_full_index(istate);
>>> +	else if (!istate->sparse_index)
>>> +		convert_to_sparse(istate, 0);
>>>  
>>>  	return istate->cache_nr;
>>
>> We used to say "when we know we are running a command that is not
>> sparse ready, expand it here" and nothing else.
>>
>> We now added a bit more condition for expansion, i.e. "when we are
>> told that the repository does not want sparse index, or when the
>> command is not sparse ready".
>>
>> But the patch goes one more step.  "If we didn't find a reason to
>> expand to full, and if the in-core index we read is not sparse, then
>> call convert_to_sparse() on it".  So if the repo settings tells us
>> that the repository wants a sparse index, and the index we read was
>> not sparse, what does convert_to_sparse() without MEM_ONLY flag bit
>> do?  Nothing special?
> 
> You are absolutely right. I've been talking about what I _thought_
> the code does (and should do) but I missed this 'else if' which is
> in fact doing what you have been claiming the entire time. I should
> have done a better job double-checking the code before doubling
> down on my claims.
> 
> I think the 'else if' should be removed, which would match my
> expectations.
> 

By leaving that part out, wouldn't you only solve half of the "mismatch"
between in-core index and repo setting? Earlier, you described your
expectation as:

> * If index.sparse=false, then a sparse index will be converted to
>   full upon read.
> 
> * If index.sparse=true, then a full index will be converted to sparse
>   on write.

Why should the direction of change to the setting value (false->true vs
true->false) cause the index to convert at different times? Consider the
scenario:

# In a cone-mode, sparse index-enabled sparse checkout repo
$ git -c index.sparse=false status    # 1
$ git status                          # 2
$ git status                          # 3

Before this patch, the index has the following states per command:

(1) the index is sparse in-core, writes full on-disk
(2) the index is full in-core, writes sparse on-disk
(3) the index is sparse in-core, writes sparse on-disk

Here, I see two mismatches in my expectations: (1) operates on an in-core
sparse index, despite `index.sparse=false`, and (2) operates on an in-core
full index, despite `index.sparse=true`. 

What you're suggesting solves only the first mismatch. However, the second
mismatch incurs the performance hit of a supposedly-sparse command actually
operating on an in-core full index. It also creates an inconsistency between
(2) and (3) in their use of the sparse index. What I'd like this patch to do
is:

(1) the index is full in-core, writes full on-disk
(2) the index is sparse in-core, writes sparse on-disk
(3) the index is sparse in-core, writes sparse on-disk

Here, there are no more mismatches between in-core index usage and what is
written to disk, and (2) and (3) use the same index sparsity.

>> I see many unconditional calls to convert_to_sparse(istate, 0) on
>> the write code paths, where I instead would expect "if the repo
>> wants sparse, call convert_to_sparse(), and if the repo does not
>> want sparse, call ensure_full_index()", before actualy writing the
>> entries out.  These also are setting traps to confuse readers.
>>
>> Perhaps we want a helper function "ensure_right_sparsity(istate)"
>> that would be called where we have
>>
>> 	if (condition)
>> 		convert_to_sparse();
>> 	else
>> 		ensure_full_index();
>>
>> or an unconditonal call to convert_to_sparse() to unconfuse readers?
> 
> convert_to_sparse() has several checks that prevent the conversion
> from happening, such as having a split index. In particular, it will
> skip the conversion if the_repository->settings.sparse_index is false.
> Thus, calling it unconditionally in the write code is correct.
> 

I may have introduced some confusion by redundantly checking
`!istate->sparse_index` before converting to sparse (my goal was readability
- showing the index does not need to be converted to sparse if it is already
sparse - but I think it ended up doing the opposite). The condition could be
removed entirely, thanks to an equivalent check inside `convert_to_sparse`:

-       if (istate->repo->settings.command_requires_full_index)
+       if (!istate->repo->settings.sparse_index ||
+           istate->repo->settings.command_requires_full_index)
                ensure_full_index(istate);
+       else
+               convert_to_sparse(istate, 0);

  reply	other threads:[~2021-10-18 14:18 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-15 19:54 [PATCH 0/2] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-15 19:54 ` [PATCH 1/2] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-15 19:54 ` [PATCH 2/2] sparse-index: update index read to consider index.sparse config Victoria Dye via GitGitGadget
2021-10-15 21:53   ` Junio C Hamano
2021-10-17  1:21     ` Derrick Stolee
2021-10-17  5:58       ` Junio C Hamano
2021-10-17 19:33         ` Derrick Stolee
2021-10-18  1:15           ` Junio C Hamano
2021-10-18 13:25             ` Derrick Stolee
2021-10-18 14:14               ` Victoria Dye [this message]
2021-10-21 13:26                 ` Derrick Stolee
2021-10-18 16:09               ` Junio C Hamano
2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-21 20:48   ` [PATCH v2 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-21 20:48   ` [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-21 22:20     ` Junio C Hamano
2021-10-27 17:21       ` Victoria Dye
2021-10-21 20:48   ` [PATCH v2 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-27 18:20     ` [PATCH v3 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-27 18:20     ` [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-27 20:07       ` Derrick Stolee
2021-10-27 21:32         ` Junio C Hamano
2021-10-28  1:24           ` Derrick Stolee
2021-10-29 13:43             ` Victoria Dye
2021-10-27 18:20     ` [PATCH v3 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-10-27 20:08     ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Derrick Stolee
2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
2021-10-29 18:45         ` Junio C Hamano
2021-10-29 19:00           ` Derrick Stolee
2021-10-29 20:01             ` Junio C Hamano
2021-10-29 13:51       ` [PATCH v4 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-11-22 17:36         ` Elijah Newren
2021-11-22 18:59           ` Victoria Dye
2021-11-22 17:40       ` [PATCH v4 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren
2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-11-23 17:21         ` [PATCH v5 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4656a934-5305-fbdf-76ca-17562fca62ef@github.com \
    --to=vdye@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).