git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH 00/27] Sparse Index: API protections
Date: Wed, 17 Mar 2021 11:03:24 -0700	[thread overview]
Message-ID: <CABPp-BGfkPscpro=W5vcRHD5cV6pddYpvHLLzMWjL9WLaxyu3w@mail.gmail.com> (raw)
In-Reply-To: <pull.906.git.1615929435.gitgitgadget@gmail.com>

On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Here is the second patch series submission coming out of the sparse-index
> RFC [1].
>
> [1]
> https://lore.kernel.org/git/pull.847.git.1611596533.gitgitgadget@gmail.com/
>
> This is based on v3 of the format series [2].
>
> [2]
> https://lore.kernel.org/git/pull.883.v3.git.1615912983.gitgitgadget@gmail.com/
>
> The point of this series is to insert protections for the consumers of the
> in-memory index. We mark certain regions of code as needing a full index, so
> we call ensure_full_index() to expand a sparse index to a full one, if
> necessary. These protections are inserted file-by-file in every loop over
> all cache entries. Well, "most" loops, because some are going to be handled
> in the very next series so I leave them out.
>
> Many callers use index_name_pos() to find a path by name. In these cases, we
> can check if that position resolves to a sparse directory instance. In those
> cases, we just expand to a full index and run the search again.
>
> The last few patches deal with the name-hash hashtable for doing O(1)
> lookups.
>
> These protections don't do much right now, since the previous series created
> the_repository->settings.command_requires_full_index to guard all index
> reads and writes to ensure the in-memory copy is full for commands that have
> not been tested with the sparse index yet.
>
> However, after this series is complete, we now have a straight-forward plan
> for making commands "sparse aware" one-by-one:
>
>  1. Disable settings.command_requires_full_index to allow an in-memory
>     sparse-index.
>  2. Run versions of that command under a debugger, breaking on
>     ensure_full_index().
>  3. Examine the call stack to determine the context of that expansion, then
>     implement the proper behavior in those locations.
>  4. Add tests to ensure we are checking this logic in the presence of sparse
>     directory entries.

I started reading the series, then noticed I didn't like the first few
additions of ensure_full_index().  The first was because I thought it
just wasn't needed as per a few lines of code later, but the more
important one that stuck out to me was another where if the
ensure_full_index() call is needed to avoid the code blowing up, then
the code has a good chance that it is doing something inherently wrong
in a sparse-checkout/sparse-index anyway.

So I guess that brings me to the question I asked in 07/27 -- is the
presence of ensure_full_index() a note that the code needs to be later
audited and tweaked to work with sparse-indexes?  If so, then good,
this series may make sense.  However, will that always be the case?
If we think some of these will be left in the code, is there a plan to
annotate each one that has been audited and determined that it needs
to stay?  If not, then each ensure_full_index() might or might not
have been audited for correctness and it becomes a pain to know what's
left to fix.  If the plan is that these are to be audited, and to be
marked if they are truly deemed necessary, then the series makes sense
to me.  If not, then I'm confused about something with the series and
need some help with the goals and plans.

If I'm confused about the goals and the plans, then my reviews will
probably be less than helpful, so I'll suspend reading the series
until I understand the plan a little better.

  parent reply	other threads:[~2021-03-17 18:04 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-16 21:16 [PATCH 00/27] Sparse Index: API protections Derrick Stolee via GitGitGadget
2021-03-16 21:16 ` [PATCH 01/27] *: remove 'const' qualifier for struct index_state Derrick Stolee via GitGitGadget
2021-03-19 21:01   ` Junio C Hamano
2021-03-20  1:45     ` Derrick Stolee
2021-03-20  1:52     ` Junio C Hamano
2021-03-30 16:53       ` Derrick Stolee
2021-03-16 21:16 ` [PATCH 02/27] read-cache: expand on query into sparse-directory entry Derrick Stolee via GitGitGadget
2021-03-16 21:16 ` [PATCH 03/27] sparse-index: API protection strategy Derrick Stolee via GitGitGadget
2021-03-16 21:16 ` [PATCH 04/27] cache: move ensure_full_index() to cache.h Derrick Stolee via GitGitGadget
2021-03-16 21:16 ` [PATCH 05/27] add: ensure full index Derrick Stolee via GitGitGadget
2021-03-17 17:35   ` Elijah Newren
2021-03-17 20:35     ` Matheus Tavares Bernardino
2021-03-17 20:55       ` Derrick Stolee
2021-03-16 21:16 ` [PATCH 06/27] checkout-index: " Derrick Stolee via GitGitGadget
2021-03-17 17:50   ` Elijah Newren
2021-03-17 20:05     ` Derrick Stolee
2021-03-17 21:10       ` Elijah Newren
2021-03-17 21:33         ` Derrick Stolee
2021-03-17 22:36           ` Elijah Newren
2021-03-18  1:17             ` Derrick Stolee
2021-03-16 21:16 ` [PATCH 07/27] checkout: " Derrick Stolee via GitGitGadget
2021-03-16 21:16 ` [PATCH 08/27] commit: " Derrick Stolee via GitGitGadget
2021-03-16 21:16 ` [PATCH 09/27] difftool: " Derrick Stolee via GitGitGadget
2021-03-16 21:16 ` [PATCH 10/27] fsck: " Derrick Stolee via GitGitGadget
2021-03-16 21:16 ` [PATCH 11/27] grep: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 12/27] ls-files: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 13/27] merge-index: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 14/27] rm: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 15/27] sparse-checkout: " Derrick Stolee via GitGitGadget
2021-03-18  5:22   ` Elijah Newren
2021-03-23 13:13     ` Derrick Stolee
2021-03-16 21:17 ` [PATCH 16/27] update-index: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 17/27] diff-lib: " Derrick Stolee via GitGitGadget
2021-03-18  5:24   ` Elijah Newren
2021-03-23 13:15     ` Derrick Stolee
2021-03-16 21:17 ` [PATCH 18/27] dir: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 19/27] entry: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 20/27] merge-ort: " Derrick Stolee via GitGitGadget
2021-03-18  5:31   ` Elijah Newren
2021-03-23 13:26     ` Derrick Stolee
2021-03-16 21:17 ` [PATCH 21/27] merge-recursive: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 22/27] pathspec: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 23/27] read-cache: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 24/27] resolve-undo: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 25/27] revision: " Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 26/27] sparse-index: expand_to_path() Derrick Stolee via GitGitGadget
2021-03-16 21:17 ` [PATCH 27/27] name-hash: use expand_to_path() Derrick Stolee via GitGitGadget
2021-03-17 18:03 ` Elijah Newren [this message]
2021-03-18  6:32   ` [PATCH 00/27] Sparse Index: API protections Elijah Newren
2021-04-01  1:49 ` [PATCH v2 00/25] " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 01/25] sparse-index: API protection strategy Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 02/25] *: remove 'const' qualifier for struct index_state Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 03/25] read-cache: expand on query into sparse-directory entry Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 04/25] cache: move ensure_full_index() to cache.h Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 05/25] add: ensure full index Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 06/25] checkout-index: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 07/25] checkout: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 08/25] commit: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 09/25] difftool: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 10/25] fsck: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 11/25] grep: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 12/25] ls-files: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 13/25] merge-index: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 14/25] rm: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 15/25] stash: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 16/25] update-index: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 17/25] dir: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 18/25] entry: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 19/25] merge-recursive: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 20/25] pathspec: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 21/25] read-cache: " Derrick Stolee via GitGitGadget
2021-04-01  1:49   ` [PATCH v2 22/25] resolve-undo: " Derrick Stolee via GitGitGadget
2021-04-01  1:50   ` [PATCH v2 23/25] revision: " Derrick Stolee via GitGitGadget
2021-04-01  1:50   ` [PATCH v2 24/25] sparse-index: expand_to_path() Derrick Stolee via GitGitGadget
2021-04-05 19:32     ` Elijah Newren
2021-04-06 11:46       ` Derrick Stolee
2021-04-01  1:50   ` [PATCH v2 25/25] name-hash: use expand_to_path() Derrick Stolee via GitGitGadget
2021-04-05 19:53     ` Elijah Newren
2021-04-01  7:07   ` [PATCH v2 00/25] Sparse Index: API protections Junio C Hamano
2021-04-01 13:32     ` Derrick Stolee
2021-04-05 19:55   ` Elijah Newren
2021-04-12 21:07   ` [PATCH v3 00/26] " Derrick Stolee via GitGitGadget
2021-04-12 21:07     ` [PATCH v3 01/26] sparse-index: API protection strategy Derrick Stolee via GitGitGadget
2021-04-12 21:07     ` [PATCH v3 02/26] *: remove 'const' qualifier for struct index_state Derrick Stolee via GitGitGadget
2021-04-12 21:07     ` [PATCH v3 03/26] read-cache: expand on query into sparse-directory entry Derrick Stolee via GitGitGadget
2021-04-12 21:07     ` [PATCH v3 04/26] cache: move ensure_full_index() to cache.h Derrick Stolee via GitGitGadget
2021-04-12 21:07     ` [PATCH v3 05/26] add: ensure full index Derrick Stolee via GitGitGadget
2021-04-12 21:07     ` [PATCH v3 06/26] checkout-index: " Derrick Stolee via GitGitGadget
2021-04-12 21:07     ` [PATCH v3 07/26] checkout: " Derrick Stolee via GitGitGadget
2021-04-12 21:07     ` [PATCH v3 08/26] commit: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 09/26] difftool: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 10/26] fsck: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 11/26] grep: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 12/26] ls-files: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 13/26] merge-index: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 14/26] rm: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 15/26] stash: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 16/26] update-index: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 17/26] dir: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 18/26] entry: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 19/26] merge-recursive: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 20/26] pathspec: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 21/26] read-cache: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 22/26] resolve-undo: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 23/26] revision: " Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 24/26] name-hash: don't add directories to name_hash Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 25/26] sparse-index: expand_to_path() Derrick Stolee via GitGitGadget
2021-04-12 21:08     ` [PATCH v3 26/26] name-hash: use expand_to_path() Derrick Stolee via GitGitGadget
2021-04-13 16:02     ` [PATCH v3 00/26] Sparse Index: API protections Elijah Newren
2021-04-14 20:44       ` Junio C Hamano
2021-04-15  2:42         ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABPp-BGfkPscpro=W5vcRHD5cV6pddYpvHLLzMWjL9WLaxyu3w@mail.gmail.com' \
    --to=newren@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).