git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, stolee@gmail.com
Subject: [PATCH v3 8/9] midx: read `RIDX` chunk when present
Date: Tue, 4 Jan 2022 13:16:00 -0500	[thread overview]
Message-ID: <55aa69de12c5f82a66836e829f915363cc73b421.1641320129.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1641320129.git.me@ttaylorr.com>

When a MIDX contains the new `RIDX` chunk, ensure that the reverse index
is read from it instead of the on-disk .rev file. Since we need to
encode the object order in the MIDX itself for correctness reasons,
there is no point in storing the same data again outside of the MIDX.

So, this patch stops writing separate .rev files, and reads it out of
the MIDX itself. This is possible to do with relatively little new code,
since the format of the RIDX chunk is identical to the data in the .rev
file. In other words, we can implement this by pointing the
`revindex_data` field at the reverse index chunk of the MIDX instead of
the .rev file without any other changes.

Note that we have two knobs that are adjusted for the new tests:
GIT_TEST_MIDX_WRITE_REV and GIT_TEST_MIDX_READ_RIDX. The former controls
whether the MIDX .rev is written at all, and the latter controls whether
we read the MIDX's RIDX chunk.

Both are necessary to ensure that the test added at the beginning of
this series continues to work. This is because we always need to write
the RIDX chunk in the MIDX in order to change its checksum, but we want
to make sure reading the existing .rev file still works (since the RIDX
chunk takes precedence by default).

Arguably this isn't a very interesting mode to test, because the
precedence rules mean that we'll always read the RIDX chunk over the
.rev file. But it makes it impossible for a user to induce corruption in
their repository by adjusting the test knobs (since if we had an
either/or knob they could stop writing the RIDX chunk, allowing them to
tweak the MIDX's object order without changing its checksum).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 midx.c                            |  6 +++++-
 midx.h                            |  1 +
 pack-revindex.c                   | 17 +++++++++++++++++
 t/lib-bitmap.sh                   |  4 ++--
 t/t5326-multi-pack-bitmaps.sh     |  6 ++++++
 t/t5327-multi-pack-bitmaps-rev.sh | 23 +++++++++++++++++++++++
 t/t7700-repack.sh                 |  4 ----
 7 files changed, 54 insertions(+), 7 deletions(-)
 create mode 100755 t/t5327-multi-pack-bitmaps-rev.sh

diff --git a/midx.c b/midx.c
index d3179e9c02..9aba13b5b1 100644
--- a/midx.c
+++ b/midx.c
@@ -162,6 +162,9 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
 
 	pair_chunk(cf, MIDX_CHUNKID_LARGEOFFSETS, &m->chunk_large_offsets);
 
+	if (git_env_bool("GIT_TEST_MIDX_READ_RIDX", 1))
+		pair_chunk(cf, MIDX_CHUNKID_REVINDEX, &m->chunk_revindex);
+
 	m->num_objects = ntohl(m->chunk_oid_fanout[255]);
 
 	CALLOC_ARRAY(m->pack_names, m->num_packs);
@@ -1429,7 +1432,8 @@ static int write_midx_internal(const char *object_dir,
 	finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM);
 	free_chunkfile(cf);
 
-	if (flags & MIDX_WRITE_REV_INDEX)
+	if (flags & MIDX_WRITE_REV_INDEX &&
+	    git_env_bool("GIT_TEST_MIDX_WRITE_REV", 0))
 		write_midx_reverse_index(midx_name.buf, midx_hash, &ctx);
 	if (flags & MIDX_WRITE_BITMAP) {
 		if (write_midx_bitmap(midx_name.buf, midx_hash, &ctx,
diff --git a/midx.h b/midx.h
index b7d79a515c..22e8e53288 100644
--- a/midx.h
+++ b/midx.h
@@ -36,6 +36,7 @@ struct multi_pack_index {
 	const unsigned char *chunk_oid_lookup;
 	const unsigned char *chunk_object_offsets;
 	const unsigned char *chunk_large_offsets;
+	const unsigned char *chunk_revindex;
 
 	const char **pack_names;
 	struct packed_git **packs;
diff --git a/pack-revindex.c b/pack-revindex.c
index bd15ebad03..08dc160167 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -298,9 +298,26 @@ int load_midx_revindex(struct multi_pack_index *m)
 {
 	struct strbuf revindex_name = STRBUF_INIT;
 	int ret;
+
 	if (m->revindex_data)
 		return 0;
 
+	if (m->chunk_revindex) {
+		/*
+		 * If the MIDX `m` has a `RIDX` chunk, then use its contents for
+		 * the reverse index instead of trying to load a separate `.rev`
+		 * file.
+		 *
+		 * Note that we do *not* set `m->revindex_map` here, since we do
+		 * not want to accidentally call munmap() in the middle of the
+		 * MIDX.
+		 */
+		trace2_data_string("load_midx_revindex", the_repository,
+				   "source", "midx");
+		m->revindex_data = (const uint32_t *)m->chunk_revindex;
+		return 0;
+	}
+
 	trace2_data_string("load_midx_revindex", the_repository,
 			   "source", "rev");
 
diff --git a/t/lib-bitmap.sh b/t/lib-bitmap.sh
index 77b5f46a03..365d990ce3 100644
--- a/t/lib-bitmap.sh
+++ b/t/lib-bitmap.sh
@@ -290,7 +290,7 @@ test_rev_exists () {
 }
 
 midx_bitmap_core () {
-	rev_kind="${1:-rev}"
+	rev_kind="${1:-midx}"
 
 	setup_bitmap_history
 
@@ -434,7 +434,7 @@ midx_bitmap_core () {
 }
 
 midx_bitmap_partial_tests () {
-	rev_kind="${1:-rev}"
+	rev_kind="${1:-midx}"
 
 	test_expect_success 'setup partial bitmaps' '
 		test_commit packed &&
diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh
index 100ac90d15..c0924074c4 100755
--- a/t/t5326-multi-pack-bitmaps.sh
+++ b/t/t5326-multi-pack-bitmaps.sh
@@ -9,6 +9,12 @@ test_description='exercise basic multi-pack bitmap functionality'
 GIT_TEST_MULTI_PACK_INDEX=0
 GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
 
+# This test exercise multi-pack bitmap functionality where the object order is
+# stored and read from a special chunk within the MIDX, so use the default
+# behavior here.
+sane_unset GIT_TEST_MIDX_WRITE_REV
+sane_unset GIT_TEST_MIDX_READ_RIDX
+
 midx_bitmap_core
 
 bitmap_reuse_tests() {
diff --git a/t/t5327-multi-pack-bitmaps-rev.sh b/t/t5327-multi-pack-bitmaps-rev.sh
new file mode 100755
index 0000000000..d30ba632c8
--- /dev/null
+++ b/t/t5327-multi-pack-bitmaps-rev.sh
@@ -0,0 +1,23 @@
+#!/bin/sh
+
+test_description='exercise basic multi-pack bitmap functionality (.rev files)'
+
+. ./test-lib.sh
+. "${TEST_DIRECTORY}/lib-bitmap.sh"
+
+# We'll be writing our own midx and bitmaps, so avoid getting confused by the
+# automatic ones.
+GIT_TEST_MULTI_PACK_INDEX=0
+GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+
+# Unlike t5326, this test exercise multi-pack bitmap functionality where the
+# object order is stored in a separate .rev file.
+GIT_TEST_MIDX_WRITE_REV=1
+GIT_TEST_MIDX_READ_RIDX=0
+export GIT_TEST_MIDX_WRITE_REV
+export GIT_TEST_MIDX_READ_RIDX
+
+midx_bitmap_core rev
+midx_bitmap_partial_tests rev
+
+test_done
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 4693f8dc2b..02a6633a16 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -311,16 +311,13 @@ test_expect_success 'cleans up MIDX when appropriate' '
 		checksum=$(midx_checksum $objdir) &&
 		test_path_is_file $midx &&
 		test_path_is_file $midx-$checksum.bitmap &&
-		test_path_is_file $midx-$checksum.rev &&
 
 		test_commit repack-3 &&
 		GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb --write-midx &&
 
 		test_path_is_file $midx &&
 		test_path_is_missing $midx-$checksum.bitmap &&
-		test_path_is_missing $midx-$checksum.rev &&
 		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
-		test_path_is_file $midx-$(midx_checksum $objdir).rev &&
 
 		test_commit repack-4 &&
 		GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb &&
@@ -353,7 +350,6 @@ test_expect_success '--write-midx with preferred bitmap tips' '
 		test_line_count = 1 before &&
 
 		rm -fr $midx-$(midx_checksum $objdir).bitmap &&
-		rm -fr $midx-$(midx_checksum $objdir).rev &&
 		rm -fr $midx &&
 
 		# instead of constructing the snapshot ourselves (c.f., the test
-- 
2.34.1.25.gb3157a20e6


  parent reply	other threads:[~2022-01-04 18:16 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-08 19:26 [PATCH 0/2] midx: prevent bitmap corruption when permuting pack order Taylor Blau
2021-12-08 19:26 ` [PATCH 1/2] t5326: demonstrate bitmap corruption after permutation Taylor Blau
2021-12-08 19:26 ` [PATCH 2/2] midx.c: make changing the preferred pack safe Taylor Blau
2021-12-08 19:30 ` [PATCH 0/2] midx: prevent bitmap corruption when permuting pack order Derrick Stolee
2021-12-08 19:55   ` Jeff King
2021-12-10 18:36     ` Taylor Blau
2021-12-10 22:31       ` Taylor Blau
2021-12-11  1:39         ` Taylor Blau
2021-12-13 14:00           ` Derrick Stolee
2021-12-13 14:31             ` Taylor Blau
2021-12-14  1:55 ` [PATCH v2 0/8] " Taylor Blau
2021-12-14  1:55   ` [PATCH v2 1/8] t5326: demonstrate bitmap corruption after permutation Taylor Blau
2021-12-14  1:55   ` [PATCH v2 2/8] midx.c: make changing the preferred pack safe Taylor Blau
2021-12-14  1:55   ` [PATCH v2 3/8] pack-revindex.c: instrument loading on-disk reverse index Taylor Blau
2021-12-14  1:55   ` [PATCH v2 4/8] t5326: drop unnecessary setup Taylor Blau
2021-12-14  1:55   ` [PATCH v2 5/8] t5326: extract `test_rev_exists` Taylor Blau
2021-12-20 18:33     ` Derrick Stolee
2022-01-04 15:33       ` Taylor Blau
2021-12-14  1:55   ` [PATCH v2 6/8] t5326: move tests to t/lib-bitmap.sh Taylor Blau
2021-12-14  1:55   ` [PATCH v2 7/8] t/lib-bitmap.sh: parameterize tests over reverse index source Taylor Blau
2021-12-14  1:55   ` [PATCH v2 8/8] midx: read `RIDX` chunk when present Taylor Blau
2021-12-20 18:42     ` Derrick Stolee
2022-01-04 15:21       ` Taylor Blau
2021-12-15 19:46   ` [PATCH v2 0/8] midx: prevent bitmap corruption when permuting pack order Junio C Hamano
2021-12-15 21:37     ` Taylor Blau
2021-12-15 22:17       ` Junio C Hamano
2021-12-15 22:55         ` Junio C Hamano
2021-12-20 18:51     ` Derrick Stolee
2021-12-20 19:52       ` Taylor Blau
2021-12-20 20:09         ` Derrick Stolee
2021-12-15 22:58   ` Junio C Hamano
2021-12-15 23:01     ` Taylor Blau
2022-01-04 18:15 ` [PATCH v3 0/9] " Taylor Blau
2022-01-04 18:15   ` [PATCH v3 1/9] t5326: demonstrate bitmap corruption after permutation Taylor Blau
2022-01-20 17:55     ` Jonathan Tan
2022-01-20 22:11       ` Taylor Blau
2022-01-20 22:41         ` Junio C Hamano
2022-01-20 22:46           ` Taylor Blau
2022-01-24 17:40         ` Jonathan Tan
2022-01-04 18:15   ` [PATCH v3 2/9] midx.c: make changing the preferred pack safe Taylor Blau
2022-01-14 21:35     ` Junio C Hamano
2022-01-14 21:43       ` Junio C Hamano
2022-01-15  0:59         ` Taylor Blau
2022-01-15  6:27           ` Junio C Hamano
2022-01-20 18:08     ` Jonathan Tan
2022-01-20 22:13       ` Taylor Blau
2022-01-04 18:15   ` [PATCH v3 3/9] pack-revindex.c: instrument loading on-disk reverse index Taylor Blau
2022-01-20 18:15     ` Jonathan Tan
2022-01-20 22:18       ` Taylor Blau
2022-01-24 17:53         ` Jonathan Tan
2022-01-04 18:15   ` [PATCH v3 4/9] t5326: drop unnecessary setup Taylor Blau
2022-01-04 18:15   ` [PATCH v3 5/9] t5326: extract `test_rev_exists` Taylor Blau
2022-01-04 18:15   ` [PATCH v3 6/9] t5326: move tests to t/lib-bitmap.sh Taylor Blau
2022-01-04 18:15   ` [PATCH v3 7/9] t/lib-bitmap.sh: parameterize tests over reverse index source Taylor Blau
2022-01-24 19:15     ` Jonathan Tan
2022-01-25 21:40       ` Taylor Blau
2022-01-26 21:00         ` Jonathan Tan
2022-01-04 18:16   ` Taylor Blau [this message]
2022-01-24 19:27     ` [PATCH v3 8/9] midx: read `RIDX` chunk when present Jonathan Tan
2022-01-25 21:45       ` Taylor Blau
2022-01-26 21:28         ` Jonathan Tan
2022-01-04 18:16   ` [PATCH v3 9/9] pack-bitmap.c: gracefully fallback after opening pack/MIDX Taylor Blau
2022-01-24 19:29     ` Jonathan Tan
2022-01-25 21:46       ` Taylor Blau
2022-01-25 22:40 ` [PATCH v4 0/9] midx: prevent bitmap corruption when permuting pack order Taylor Blau
2022-01-25 22:41   ` [PATCH v4 1/9] t5326: demonstrate bitmap corruption after permutation Taylor Blau
2022-01-26 15:01     ` Ævar Arnfjörð Bjarmason
2022-01-26 20:18       ` Taylor Blau
2022-01-25 22:41   ` [PATCH v4 2/9] midx.c: make changing the preferred pack safe Taylor Blau
2022-01-25 22:41   ` [PATCH v4 3/9] pack-revindex.c: instrument loading on-disk reverse index Taylor Blau
2022-01-26 15:03     ` Ævar Arnfjörð Bjarmason
2022-01-25 22:41   ` [PATCH v4 4/9] t5326: drop unnecessary setup Taylor Blau
2022-01-25 22:41   ` [PATCH v4 5/9] t5326: extract `test_rev_exists` Taylor Blau
2022-01-26 15:04     ` Ævar Arnfjörð Bjarmason
2022-01-26 20:19       ` Taylor Blau
2022-01-25 22:41   ` [PATCH v4 6/9] t5326: move tests to t/lib-bitmap.sh Taylor Blau
2022-01-25 22:41   ` [PATCH v4 7/9] t/lib-bitmap.sh: parameterize tests over reverse index source Taylor Blau
2022-01-25 22:41   ` [PATCH v4 8/9] midx: read `RIDX` chunk when present Taylor Blau
2022-01-26 15:10     ` Ævar Arnfjörð Bjarmason
2022-01-26 20:23       ` Taylor Blau
2022-01-25 22:41   ` [PATCH v4 9/9] pack-bitmap.c: gracefully fallback after opening pack/MIDX Taylor Blau
2022-01-26 15:08     ` Ævar Arnfjörð Bjarmason
2022-01-26 17:50   ` [PATCH v4 0/9] midx: prevent bitmap corruption when permuting pack order Ævar Arnfjörð Bjarmason
2022-01-26 20:24     ` Taylor Blau
2022-01-27 17:15       ` Jonathan Tan
2022-02-24 22:50         ` Taylor Blau
2022-01-27 14:13   ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55aa69de12c5f82a66836e829f915363cc73b421.1641320129.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).