From: Jonathan Tan <jonathantanmy@google.com> To: Taylor Blau <me@ttaylorr.com> Cc: Jonathan Tan <jonathantanmy@google.com>, git@vger.kernel.org, vdye@github.com, gitster@pobox.com Subject: Re: [PATCH 2/2] builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects Date: Fri, 13 May 2022 16:06:39 -0700 [thread overview] Message-ID: <20220513230639.1099955-1-jonathantanmy@google.com> (raw) In-Reply-To: <bcc48004450769410d7e6d8a6e56f08bfa9a02a3.1652458395.git.me@ttaylorr.com> Taylor Blau <me@ttaylorr.com> writes: > An alternative approach to closing this race would have been to call > `is_pack_valid()` on _all_ packs in a multi-pack bitmap on load. This > has a couple of problems: > > - it is unnecessarily expensive in the cases where we don't actually > need to open any packs (e.g., in `git rev-list --use-bitmap-index > --count`) > > - more importantly, it means any time we would have hit this race, > we'll avoid using bitmaps altogether, leading to significant > slowdowns by forcing a full object traversal This answers a question I had about why we're only opening the preferred pack instead of all packs. (You mention in [1] that it's also answered in that patch message, but I didn't see it.) In any case, it might be clearer to move this part to the 1st commit. [1] https://lore.kernel.org/git/Yn63nDhSBIEa3%2F+%2F@nand.local/ > Work around this by calling `is_pack_valid()` from within > `want_found_object()`, matching the behavior in > `want_object_in_pack_one()` (which has an analogous call). Most calls to > `is_pack_valid()` should be basically no-ops, since only the first call > requires us to open a file (subsequent calls realize the file is already > open, and return immediately). > > This does require us to make a small change in > `want_object_in_pack_one()`, since `want_found_object()` may return `-1` > (indicating "keep searching for other packs containing this object") > when `*found_pack` is non-NULL. Force `want_object_in_pack_one()` to > call `is_pack_valid()` when `p != *found_pack`, not just when > `*found_pack` is non-NULL. It took me a while to realize that the relevant want_found_object() invocation that may return -1 is not the one in want_object_in_pack_one(), but in the latter's caller want_object_in_pack(). But even in this case, couldn't want_found_object() return -1 (see the very end of the function) even before this patch? > @@ -1424,14 +1427,15 @@ static int want_object_in_pack_one(struct packed_git *p, > off_t *found_offset) > { > off_t offset; > + int use_found = p == *found_pack; > > - if (p == *found_pack) > + if (use_found) > offset = *found_offset; > else > offset = find_pack_entry_one(oid->hash, p); > > if (offset) { > - if (!*found_pack) { > + if (!use_found) { > if (!is_pack_valid(p)) > return -1; > *found_offset = offset; My understanding of the purpose of this code change is that if we reach here with a non-NULL *found_pack, it is likely that *found_pack contains an invalid pack, and this part overwrites *found_pack (and *found_offset) if it finds a valid pack. This seems like a good change, but I don't see how this is a result of something that "does require us" (as far as I can tell, *found_pack could have already been invalid before this patch, so the downstream code should already be able to handle it). Maybe it's just that we couldn't tell if the pack is invalid previously, but now we can; but if so, it would be better to say "use this added information to overwrite *found_pack when it makes sense" or something like that. (An alternative to the change in this patch may be to reset *found_pack to NULL when it is found that the pack is invalid, but I haven't investigated all the callers to see if they can tolerate *found_pack moving changing non-NULL to NULL, so the change in this patch is probably more practical.) Other than that (and the "close(fd)" thing in patch 1 that Junio mentioned [2]), this series looks good to me. Thanks for noticing and fixing the bug. [2] https://lore.kernel.org/git/xmqqpmkh9tye.fsf@gitster.g/
next prev parent reply other threads:[~2022-05-13 23:11 UTC|newest] Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-05-13 16:23 [PATCH 0/2] pack-objects: fix a pair of MIDX bitmap-related races Taylor Blau 2022-05-13 16:23 ` [PATCH 1/2] pack-bitmap: check preferred pack validity when opening MIDX bitmap Taylor Blau 2022-05-13 18:19 ` Junio C Hamano 2022-05-13 19:55 ` Taylor Blau 2022-05-13 16:23 ` [PATCH 2/2] builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects Taylor Blau 2022-05-13 23:06 ` Jonathan Tan [this message] 2022-05-14 13:17 ` Taylor Blau 2022-05-16 6:07 ` Jonathan Tan 2022-05-14 13:34 ` Taylor Blau 2022-05-16 6:11 ` Jonathan Tan 2022-05-24 18:54 ` [PATCH v2 0/4] pack-objects: fix a pair of MIDX bitmap-related races Taylor Blau 2022-05-24 18:54 ` [PATCH v2 1/4] pack-bitmap.c: check preferred pack validity when opening MIDX bitmap Taylor Blau 2022-05-24 19:36 ` Ævar Arnfjörð Bjarmason 2022-05-24 21:38 ` Taylor Blau 2022-05-24 21:51 ` Ævar Arnfjörð Bjarmason 2022-05-24 18:54 ` [PATCH v2 2/4] builtin/pack-objects.c: avoid redundant NULL check Taylor Blau 2022-05-24 21:44 ` Junio C Hamano 2022-05-25 0:11 ` Taylor Blau 2022-05-24 18:54 ` [PATCH v2 3/4] builtin/pack-objects.c: ensure included `--stdin-packs` exist Taylor Blau 2022-05-24 19:46 ` Ævar Arnfjörð Bjarmason 2022-05-24 21:33 ` Taylor Blau 2022-05-24 21:49 ` Ævar Arnfjörð Bjarmason 2022-05-24 22:03 ` Junio C Hamano 2022-05-25 0:14 ` Taylor Blau 2022-05-26 19:21 ` Victoria Dye 2022-05-26 20:05 ` Taylor Blau 2022-05-24 18:54 ` [PATCH v2 4/4] builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects Taylor Blau 2022-05-24 21:38 ` [PATCH v2 0/4] pack-objects: fix a pair of MIDX bitmap-related races Junio C Hamano 2022-05-25 0:16 ` Taylor Blau
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style List information: http://vger.kernel.org/majordomo-info.html * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20220513230639.1099955-1-jonathantanmy@google.com \ --to=jonathantanmy@google.com \ --cc=git@vger.kernel.org \ --cc=gitster@pobox.com \ --cc=me@ttaylorr.com \ --cc=vdye@github.com \ --subject='Re: [PATCH 2/2] builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Code repositories for project(s) associated with this inbox: https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).