git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Junio C Hamano <gitster@pobox.com>,
	Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Victoria Dye <vdye@github.com>
Subject: Re: [PATCH 2/2] sparse-index: update index read to consider index.sparse config
Date: Sat, 16 Oct 2021 21:21:48 -0400	[thread overview]
Message-ID: <e17b7e0f-edf0-0770-9b9b-dd092a8a7a41@gmail.com> (raw)
In-Reply-To: <xmqqh7di3qfu.fsf@gitster.g>

On 10/15/2021 5:53 PM, Junio C Hamano wrote:
> "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Victoria Dye <vdye@github.com>
>>
>> Use the index.sparse config setting to expand or collapse the index when
>> read. Previously, index.sparse would determine how the index would be
>> written to disk, but would not enforce whether the index is read into memory
>> as full or sparse. Now, the index is expanded when a sparse index is read
>> with `index.sparse=false` and is collapsed to sparse when a full index is
>> read with `index.sparse=true` (and the command does not require a full
>> index).
> 
> Instead of calling both in-core index and on-disk index, perhaps use
> "the in-core index" as appropriately in the above description and
> the result would become much less ambigous.
> 
> My knee-jerk reaction was that it is of dubious value to spend
> cycles to make the in-core index sparse after reading a non-sparse
> index from the disk to give to the caller, but this hurts only the
> commands that are not yet sparse-aware and call ensure_full_index()
> as the first thing they do.  To them, we are wasting cycles to
> shrink and expand for no good reason, and after they are done, the
> final writeout would create a sparse on-disk index.

I think you are slightly mistaken here: If index.sparse=true, then a
full index will be converted to one on write, but not immediately upon
read. This means that subsequent commands will read a sparse index, and
they will either benefit from that or not depending on whether they are
integrated with the sparse index or not.

The new behavior here is that if index.sparse=false, then we convert
a sparse index to a full one upon read, avoiding any chance that a
Git command is operating on a sparse index in-memory.

The simplest summary I can say is here:

* If index.sparse=false, then a sparse index will be converted to
  full upon read.

* If index.sparse=true, then a full index will be converted to sparse
  on write.

> Besides, the on-disk index is expected to be sparse most of the time
> when index.sparse is true, so it is hopefully a one-time waste that
> corrects by itself.
> 
> For all commands that are sparse-aware, especially when asked to
> perform their operation on the paths that are not hidden by a
> tree-like index entry, it may or may not be a win, but the downside
> would be much smaller.  The cost to shrink a full in-core index
> before writing out as a sparse one should be comparable to the cost
> to shrink a full in-core index just read from the disk before the
> sparse-index-aware caller works on it and leaves a still mostly
> sparse in-core index to be written out without much extra work to
> re-shrink it to the disk.
> 
>> This makes the behavior of `index.sparse` more intuitive, as it now clearly
>> enables/disables usage of a sparse index.
> 
> It is a minor thing, so I am willing to let it pass, but I am not
> sure about this claim.  The write-out codepath ensures, independent
> of this change, that a full on-disk index is corrected to become
> sparse when the configuration is set to true, and a sparse on-disk
> index is corrected to become full when the configuration is set to
> false, no?

The previous behavior required an index write for the sparse-to-full
transition. After this change, an index write is still required for the
full-to-sparse transition.

> So the only "intuitive"-ness we may be gaining is that the codepaths
> that are sparse-aware would work in their "sparse" (non-"sparse")
> mode when index.sparse is set to true (false), respectively, no
> matter how sparse (or not sparse) the on-disk index they work on is
> initially.  That might help debuggability (assuming that converting
> between the full and sparse forms are working correctly), but I am
> not sure if that is something end users would even care about.

I think you have a case for concern with the word "intuitive". Even
though I agree that the new setup is the best possible interpretation
of the setting, its inherit asymmetry can be confusing.

Thanks,
-Stolee


  reply	other threads:[~2021-10-17  1:21 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-15 19:54 [PATCH 0/2] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-15 19:54 ` [PATCH 1/2] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-15 19:54 ` [PATCH 2/2] sparse-index: update index read to consider index.sparse config Victoria Dye via GitGitGadget
2021-10-15 21:53   ` Junio C Hamano
2021-10-17  1:21     ` Derrick Stolee [this message]
2021-10-17  5:58       ` Junio C Hamano
2021-10-17 19:33         ` Derrick Stolee
2021-10-18  1:15           ` Junio C Hamano
2021-10-18 13:25             ` Derrick Stolee
2021-10-18 14:14               ` Victoria Dye
2021-10-21 13:26                 ` Derrick Stolee
2021-10-18 16:09               ` Junio C Hamano
2021-10-21 20:48 ` [PATCH v2 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-21 20:48   ` [PATCH v2 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-21 20:48   ` [PATCH v2 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-21 22:20     ` Junio C Hamano
2021-10-27 17:21       ` Victoria Dye
2021-10-21 20:48   ` [PATCH v2 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-10-27 18:20   ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Victoria Dye via GitGitGadget
2021-10-27 18:20     ` [PATCH v3 1/3] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-27 18:20     ` [PATCH v3 2/3] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-27 20:07       ` Derrick Stolee
2021-10-27 21:32         ` Junio C Hamano
2021-10-28  1:24           ` Derrick Stolee
2021-10-29 13:43             ` Victoria Dye
2021-10-27 18:20     ` [PATCH v3 3/3] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-10-27 20:08     ` [PATCH v3 0/3] sparse-index: expand/collapse based on 'index.sparse' Derrick Stolee
2021-10-29 13:51     ` [PATCH v4 0/4] " Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
2021-10-29 18:45         ` Junio C Hamano
2021-10-29 19:00           ` Derrick Stolee
2021-10-29 20:01             ` Junio C Hamano
2021-10-29 13:51       ` [PATCH v4 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-10-29 13:51       ` [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-11-22 17:36         ` Elijah Newren
2021-11-22 18:59           ` Victoria Dye
2021-11-22 17:40       ` [PATCH v4 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren
2021-11-23  0:20       ` [PATCH v5 " Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 1/4] test-read-cache.c: prepare_repo_settings after config init Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 2/4] sparse-index: avoid unnecessary cache tree clearing Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 3/4] sparse-index: add ensure_correct_sparsity function Victoria Dye via GitGitGadget
2021-11-23  0:20         ` [PATCH v5 4/4] sparse-index: update do_read_index to ensure correct sparsity Victoria Dye via GitGitGadget
2021-11-23 17:21         ` [PATCH v5 0/4] sparse-index: expand/collapse based on 'index.sparse' Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e17b7e0f-edf0-0770-9b9b-dd092a8a7a41@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).