git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: vdye@github.com, jonathantanmy@google.com, gitster@pobox.com
Subject: [PATCH 2/2] builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects
Date: Fri, 13 May 2022 12:23:20 -0400	[thread overview]
Message-ID: <bcc48004450769410d7e6d8a6e56f08bfa9a02a3.1652458395.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1652458395.git.me@ttaylorr.com>

When using a multi-pack bitmap, pack-objects will try to perform its
traversal using a call to `traverse_bitmap_commit_list()`, which calls
`add_object_entry_from_bitmap()` to add each object it finds to its
packing list.

This path can cause pack-objects to add objects from packs that don't
have open pack_fds on them, by avoiding a call to `is_pack_valid()`.
This is because we only call `is_pack_valid()` on the preferred pack (in
order to do verbatim reuse via `reuse_partial_packfile_from_bitmap()`)
and not others when loading a MIDX bitmap.

In this case, `add_object_entry_from_bitmap()` will check whether it
wants each object entry by calling `want_object_in_pack()`, which will
call `want_found_object` (since its caller already supplied a
`found_pack`). In most cases (particularly without `--local`, and when
`ignored_packed_keep_on_disk` and `ignored_packed_keep_in_core` are
both "0"), we'll take the entry from the pack contained in the MIDX
bitmap, all without an open pack_fd.

When we then try to use that entry later to assemble the actual pack,
we'll be susceptible to any simultaneous writers moving that pack out of
the way (e.g., due to a concurrent repack) without having an open file
descriptor, causing races that result in errors like:

    remote: Enumerating objects: 1498802, done.
    remote: fatal: packfile ./objects/pack/pack-e57d433b5a588daa37fbe946e2b28dfaec03a93e.pack cannot be accessed
    remote: aborting due to possible repository corruption on the remote side.

This race can happen even with multi-pack bitmaps, since we may open a
MIDX bitmap that is being rewritten long before its packs are actually
unlinked.

Work around this by calling `is_pack_valid()` from within
`want_found_object()`, matching the behavior in
`want_object_in_pack_one()` (which has an analogous call). Most calls to
`is_pack_valid()` should be basically no-ops, since only the first call
requires us to open a file (subsequent calls realize the file is already
open, and return immediately).

This does require us to make a small change in
`want_object_in_pack_one()`, since `want_found_object()` may return `-1`
(indicating "keep searching for other packs containing this object")
when `*found_pack` is non-NULL. Force `want_object_in_pack_one()` to
call `is_pack_valid()` when `p != *found_pack`, not just when
`*found_pack` is non-NULL.

An alternative approach to closing this race would have been to call
`is_pack_valid()` on _all_ packs in a multi-pack bitmap on load. This
has a couple of problems:

  - it is unnecessarily expensive in the cases where we don't actually
    need to open any packs (e.g., in `git rev-list --use-bitmap-index
    --count`)

  - more importantly, it means any time we would have hit this race,
    we'll avoid using bitmaps altogether, leading to significant
    slowdowns by forcing a full object traversal

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/pack-objects.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 014dcd4bc9..a243193807 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1357,6 +1357,9 @@ static int want_found_object(const struct object_id *oid, int exclude,
 	if (incremental)
 		return 0;
 
+	if (!is_pack_valid(p))
+		return -1;
+
 	/*
 	 * When asked to do --local (do not include an object that appears in a
 	 * pack we borrow from elsewhere) or --honor-pack-keep (do not include
@@ -1424,14 +1427,15 @@ static int want_object_in_pack_one(struct packed_git *p,
 				   off_t *found_offset)
 {
 	off_t offset;
+	int use_found = p == *found_pack;
 
-	if (p == *found_pack)
+	if (use_found)
 		offset = *found_offset;
 	else
 		offset = find_pack_entry_one(oid->hash, p);
 
 	if (offset) {
-		if (!*found_pack) {
+		if (!use_found) {
 			if (!is_pack_valid(p))
 				return -1;
 			*found_offset = offset;
-- 
2.36.1.76.g80c0bcd80e

  parent reply	other threads:[~2022-05-13 16:23 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-13 16:23 [PATCH 0/2] pack-objects: fix a pair of MIDX bitmap-related races Taylor Blau
2022-05-13 16:23 ` [PATCH 1/2] pack-bitmap: check preferred pack validity when opening MIDX bitmap Taylor Blau
2022-05-13 18:19   ` Junio C Hamano
2022-05-13 19:55     ` Taylor Blau
2022-05-13 16:23 ` Taylor Blau [this message]
2022-05-13 23:06   ` [PATCH 2/2] builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects Jonathan Tan
2022-05-14 13:17     ` Taylor Blau
2022-05-16  6:07       ` Jonathan Tan
2022-05-14 13:34     ` Taylor Blau
2022-05-16  6:11       ` Jonathan Tan
2022-05-24 18:54 ` [PATCH v2 0/4] pack-objects: fix a pair of MIDX bitmap-related races Taylor Blau
2022-05-24 18:54   ` [PATCH v2 1/4] pack-bitmap.c: check preferred pack validity when opening MIDX bitmap Taylor Blau
2022-05-24 19:36     ` Ævar Arnfjörð Bjarmason
2022-05-24 21:38       ` Taylor Blau
2022-05-24 21:51         ` Ævar Arnfjörð Bjarmason
2022-05-24 18:54   ` [PATCH v2 2/4] builtin/pack-objects.c: avoid redundant NULL check Taylor Blau
2022-05-24 21:44     ` Junio C Hamano
2022-05-25  0:11       ` Taylor Blau
2022-05-24 18:54   ` [PATCH v2 3/4] builtin/pack-objects.c: ensure included `--stdin-packs` exist Taylor Blau
2022-05-24 19:46     ` Ævar Arnfjörð Bjarmason
2022-05-24 21:33       ` Taylor Blau
2022-05-24 21:49         ` Ævar Arnfjörð Bjarmason
2022-05-24 22:03     ` Junio C Hamano
2022-05-25  0:14       ` Taylor Blau
2022-05-26 19:21     ` Victoria Dye
2022-05-26 20:05       ` Taylor Blau
2022-05-24 18:54   ` [PATCH v2 4/4] builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects Taylor Blau
2022-05-24 21:38   ` [PATCH v2 0/4] pack-objects: fix a pair of MIDX bitmap-related races Junio C Hamano
2022-05-25  0:16     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bcc48004450769410d7e6d8a6e56f08bfa9a02a3.1652458395.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).