git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/4] midx: verify MIDX checksum before reusing
@ 2021-06-23 18:39 Taylor Blau
  2021-06-23 18:39 ` [PATCH 1/4] csum-file: introduce checksum_valid() Taylor Blau
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Taylor Blau @ 2021-06-23 18:39 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee

Here is a small series that came out of a corruption we noticed at GitHub. The
corruption is described in the third patch, but the gist is this:

  - After writing a MIDX, we experience a disk failure that corrupts the
    contents of the repository's MIDX.

  - We later go to write another MIDX, notice that one is already there, and
    then blindly reuse its data without checking its validity, resulting in a
    MIDX that contains corrupt data, but whose trailing checksum matches the
    preceding contents.

This series checks the validity of the trailing checksum when reusing a MIDX
while writing a new one, as well as a few other related fix-ups. That should be
sufficient to avoid propagating corrupt data forward while writing a MIDX.

We could go further here and run 'git multi-pack-index verify' on the existing
MIDX while writing a new one, but the point of this reuse is to avoid having to
re-compute/verify the contents of an existing MIDX. So a more lightweight option
is pursued here, but alternatives are discussed in 3/4.

Thanks in advance for your review.

Taylor Blau (4):
  csum-file: introduce checksum_valid()
  commit-graph: rewrite to use checksum_valid()
  midx: don't reuse corrupt MIDXs when writing
  midx: report checksum mismatches during 'verify'

 commit-graph.c              | 14 ++++++--------
 csum-file.c                 | 16 ++++++++++++++++
 csum-file.h                 |  3 +++
 midx.c                      | 13 +++++++++++++
 pack-check.c                | 11 +----------
 t/t5319-multi-pack-index.sh | 13 +++++++++++++
 6 files changed, 52 insertions(+), 18 deletions(-)

-- 
2.31.1.163.ga65ce7f831

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] csum-file: introduce checksum_valid()
  2021-06-23 18:39 [PATCH 0/4] midx: verify MIDX checksum before reusing Taylor Blau
@ 2021-06-23 18:39 ` Taylor Blau
  2021-06-24 19:42   ` Jeff King
  2021-06-23 18:39 ` [PATCH 2/4] commit-graph: rewrite to use checksum_valid() Taylor Blau
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Taylor Blau @ 2021-06-23 18:39 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee

Introduce a new function which checks the validity of a file's trailing
checksum. This is similar to hashfd_check(), but different since it is
intended to be used by callers who aren't writing the same data (like
`git index-pack --verify`), but who instead want to validate the
integrity of data that they are reading.

Rewrite the first of two callers which could benefit from this new
function in pack-check.c. Subsequent callers will be added in the
following patches.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 csum-file.c  | 16 ++++++++++++++++
 csum-file.h  |  3 +++
 pack-check.c | 11 +----------
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/csum-file.c b/csum-file.c
index 3487d28ed7..c951cf8277 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -217,3 +217,19 @@ uint32_t crc32_end(struct hashfile *f)
 	f->do_crc = 0;
 	return f->crc32;
 }
+
+int hashfile_checksum_valid(const unsigned char *data, size_t total_len)
+{
+	unsigned char got[GIT_MAX_RAWSZ];
+	git_hash_ctx ctx;
+	size_t data_len = total_len - the_hash_algo->rawsz;
+
+	if (total_len < the_hash_algo->rawsz)
+		return 0; /* say "too short"? */
+
+	the_hash_algo->init_fn(&ctx);
+	the_hash_algo->update_fn(&ctx, data, data_len);
+	the_hash_algo->final_fn(got, &ctx);
+
+	return hasheq(got, data + data_len);
+}
diff --git a/csum-file.h b/csum-file.h
index 3044bd19ab..291215b34e 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -44,6 +44,9 @@ void hashflush(struct hashfile *f);
 void crc32_begin(struct hashfile *);
 uint32_t crc32_end(struct hashfile *);
 
+/* Verify checksum validity while reading. Returns non-zero on success. */
+int hashfile_checksum_valid(const unsigned char *data, size_t len);
+
 /*
  * Returns the total number of bytes fed to the hashfile so far (including ones
  * that have not been written out to the descriptor yet).
diff --git a/pack-check.c b/pack-check.c
index 4b089fe8ec..c8e560d71a 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -164,22 +164,13 @@ static int verify_packfile(struct repository *r,
 
 int verify_pack_index(struct packed_git *p)
 {
-	size_t len;
-	const unsigned char *index_base;
-	git_hash_ctx ctx;
-	unsigned char hash[GIT_MAX_RAWSZ];
 	int err = 0;
 
 	if (open_pack_index(p))
 		return error("packfile %s index not opened", p->pack_name);
-	index_base = p->index_data;
-	len = p->index_size - the_hash_algo->rawsz;
 
 	/* Verify SHA1 sum of the index file */
-	the_hash_algo->init_fn(&ctx);
-	the_hash_algo->update_fn(&ctx, index_base, len);
-	the_hash_algo->final_fn(hash, &ctx);
-	if (!hasheq(hash, index_base + len))
+	if (!hashfile_checksum_valid(p->index_data, p->index_size))
 		err = error("Packfile index for %s hash mismatch",
 			    p->pack_name);
 	return err;
-- 
2.31.1.163.ga65ce7f831


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] commit-graph: rewrite to use checksum_valid()
  2021-06-23 18:39 [PATCH 0/4] midx: verify MIDX checksum before reusing Taylor Blau
  2021-06-23 18:39 ` [PATCH 1/4] csum-file: introduce checksum_valid() Taylor Blau
@ 2021-06-23 18:39 ` Taylor Blau
  2021-06-24 19:42   ` Jeff King
  2021-06-23 18:39 ` [PATCH 3/4] midx: don't reuse corrupt MIDXs when writing Taylor Blau
  2021-06-23 18:39 ` [PATCH 4/4] midx: report checksum mismatches during 'verify' Taylor Blau
  3 siblings, 1 reply; 16+ messages in thread
From: Taylor Blau @ 2021-06-23 18:39 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee

Rewrite an existing caller in `git commit-graph verify` to take
advantage of checksum_valid().

Note that the replacement isn't a verbatim cut-and-paste, since the new
function avoids using hashfile at all and instead talks to the_hash_algo
directly, but it is functionally equivalent.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 commit-graph.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 2bcb4e0f89..1a2602da61 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2422,14 +2422,16 @@ static void graph_report(const char *fmt, ...)
 #define GENERATION_ZERO_EXISTS 1
 #define GENERATION_NUMBER_EXISTS 2
 
+static int commit_graph_checksum_valid(struct commit_graph *g)
+{
+	return hashfile_checksum_valid(g->data, g->data_len);
+}
+
 int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
 {
 	uint32_t i, cur_fanout_pos = 0;
 	struct object_id prev_oid, cur_oid;
-	unsigned char checksum[GIT_MAX_HEXSZ];
 	int generation_zero = 0;
-	struct hashfile *f;
-	int devnull;
 	struct progress *progress = NULL;
 	int local_error = 0;
 
@@ -2442,11 +2444,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
 	if (verify_commit_graph_error)
 		return verify_commit_graph_error;
 
-	devnull = open("/dev/null", O_WRONLY);
-	f = hashfd(devnull, NULL);
-	hashwrite(f, g->data, g->data_len - g->hash_len);
-	finalize_hashfile(f, checksum, CSUM_CLOSE);
-	if (!hasheq(checksum, g->data + g->data_len - g->hash_len)) {
+	if (!commit_graph_checksum_valid(g)) {
 		graph_report(_("the commit-graph file has incorrect checksum and is likely corrupt"));
 		verify_commit_graph_error = VERIFY_COMMIT_GRAPH_ERROR_HASH;
 	}
-- 
2.31.1.163.ga65ce7f831


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] midx: don't reuse corrupt MIDXs when writing
  2021-06-23 18:39 [PATCH 0/4] midx: verify MIDX checksum before reusing Taylor Blau
  2021-06-23 18:39 ` [PATCH 1/4] csum-file: introduce checksum_valid() Taylor Blau
  2021-06-23 18:39 ` [PATCH 2/4] commit-graph: rewrite to use checksum_valid() Taylor Blau
@ 2021-06-23 18:39 ` Taylor Blau
  2021-06-24 20:00   ` Jeff King
  2021-06-23 18:39 ` [PATCH 4/4] midx: report checksum mismatches during 'verify' Taylor Blau
  3 siblings, 1 reply; 16+ messages in thread
From: Taylor Blau @ 2021-06-23 18:39 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee

When writing a new multi-pack index, Git tries to reuse as much of the
data from an existing MIDX as possible, like object offsets. This is
done to avoid re-opening a bunch of *.idx files unnecessarily, but can
lead to problems if the data we are reusing is corrupt.

That's because we'll blindly reuse data from an existing MIDX without
checking its trailing checksum for validity. So if there is memory
corruption while writing a MIDX, or disk corruption in the intervening
period between writing and reuse, we'll blindly propagate those bad
values forward.

Suppose we experience a memory corruption while writing a MIDX such that
we write an incorrect object offset (or alternatively, the disk corrupts
the data after being written, but before being reused). Then when we go
to write a new MIDX, we'll reuse the bad object offset without checking
its validity. This means that the MIDX we just wrote is broken, but its
trailing checksum is in-tact, since we never bothered to look at the
values before writing.

In the above, a "git multi-pack-index verify" would have caught the
problem before writing, but writing a new MIDX wouldn't have noticed
anything wrong, blindly carrying forward the corrupt offset.

Individual pack indexes check their validity by verifying the crc32
attached to each entry when carrying data forward during a repack.
We could solve this problem for MIDXs in the same way, but individual
crc32's don't make much sense, since their entries are so small.
Likewise, checking the whole file on every read may be prohibitively
expensive if a repository has a lot of objects, packs, or both.

But we can check the trailing checksum when reusing an existing MIDX
when writing a new one. And a corrupt MIDX need not stop us from writing
a new one, since we can just avoid reusing the existing one at all and
pretend as if we are writing a new MIDX from scratch.

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 midx.c                      | 10 ++++++++++
 t/t5319-multi-pack-index.sh |  8 ++++++++
 2 files changed, 18 insertions(+)

diff --git a/midx.c b/midx.c
index 21d6a05e88..a12cbbf928 100644
--- a/midx.c
+++ b/midx.c
@@ -885,6 +885,11 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
 static void clear_midx_files_ext(struct repository *r, const char *ext,
 				 unsigned char *keep_hash);
 
+static int midx_checksum_valid(struct multi_pack_index *m)
+{
+	return hashfile_checksum_valid(m->data, m->data_len);
+}
+
 static int write_midx_internal(const char *object_dir, struct multi_pack_index *m,
 			       struct string_list *packs_to_drop,
 			       const char *preferred_pack_name,
@@ -911,6 +916,11 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 	else
 		ctx.m = load_multi_pack_index(object_dir, 1);
 
+	if (ctx.m && !midx_checksum_valid(ctx.m)) {
+		warning(_("ignoring existing multi-pack-index; checksum mismatch"));
+		ctx.m = NULL;
+	}
+
 	ctx.nr = 0;
 	ctx.alloc = ctx.m ? ctx.m->num_packs : 16;
 	ctx.info = NULL;
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 5641d158df..d582f370c4 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -410,6 +410,14 @@ test_expect_success 'git-fsck incorrect offset' '
 		"git -c core.multipackindex=true fsck"
 '
 
+test_expect_success 'corrupt MIDX is not reused' '
+	corrupt_midx_and_verify $MIDX_BYTE_OFFSET "\377" $objdir \
+		"incorrect object offset" &&
+	git multi-pack-index write 2>err &&
+	test_i18ngrep checksum.mismatch err &&
+	git multi-pack-index verify
+'
+
 test_expect_success 'repack progress off for redirected stderr' '
 	GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir repack 2>err &&
 	test_line_count = 0 err
-- 
2.31.1.163.ga65ce7f831


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] midx: report checksum mismatches during 'verify'
  2021-06-23 18:39 [PATCH 0/4] midx: verify MIDX checksum before reusing Taylor Blau
                   ` (2 preceding siblings ...)
  2021-06-23 18:39 ` [PATCH 3/4] midx: don't reuse corrupt MIDXs when writing Taylor Blau
@ 2021-06-23 18:39 ` Taylor Blau
  2021-06-24  4:22   ` Bagas Sanjaya
                     ` (2 more replies)
  3 siblings, 3 replies; 16+ messages in thread
From: Taylor Blau @ 2021-06-23 18:39 UTC (permalink / raw)
  To: git; +Cc: peff, dstolee

'git multi-pack-index verify' inspects the data in an existing MIDX for
correctness by checking that the recorded object offsets are correct,
and so on.

But it does not check that the file's trailing checksum matches the data
that it records. So, if an on-disk corruption happened to occur in the
final few bytes (and all other data was recorded correctly), we would:

  - get a clean result from 'git multi-pack-index verify', but
  - be unable to reuse the existing MIDX when writing a new one (since
    we now check for checksum mismatches before reusing a MIDX)

Teach the 'verify' sub-command to recognize corruption in the checksum
by calling midx_checksum_valid().

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 midx.c                      | 3 +++
 t/t5319-multi-pack-index.sh | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/midx.c b/midx.c
index a12cbbf928..9a35b0255d 100644
--- a/midx.c
+++ b/midx.c
@@ -1228,6 +1228,9 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 		return result;
 	}
 
+	if (!midx_checksum_valid(m))
+		midx_report(_("incorrect checksum"));
+
 	if (flags & MIDX_PROGRESS)
 		progress = start_delayed_progress(_("Looking for referenced packfiles"),
 					  m->num_packs);
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index d582f370c4..7609f1ea64 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -418,6 +418,11 @@ test_expect_success 'corrupt MIDX is not reused' '
 	git multi-pack-index verify
 '
 
+test_expect_success 'verify incorrect checksum' '
+	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) &&
+	corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum"
+'
+
 test_expect_success 'repack progress off for redirected stderr' '
 	GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir repack 2>err &&
 	test_line_count = 0 err
-- 
2.31.1.163.ga65ce7f831

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] midx: report checksum mismatches during 'verify'
  2021-06-23 18:39 ` [PATCH 4/4] midx: report checksum mismatches during 'verify' Taylor Blau
@ 2021-06-24  4:22   ` Bagas Sanjaya
  2021-06-24 20:10   ` Jeff King
  2021-11-10 23:11   ` SZEDER Gábor
  2 siblings, 0 replies; 16+ messages in thread
From: Bagas Sanjaya @ 2021-06-24  4:22 UTC (permalink / raw)
  To: Taylor Blau, git; +Cc: peff, dstolee

On 24/06/21 01.39, Taylor Blau wrote:
> But it does not check that the file's trailing checksum matches the data
> that it records. So, if an on-disk corruption happened to occur in the
> final few bytes (and all other data was recorded correctly), we would:
> 
>    - get a clean result from 'git multi-pack-index verify', but
>    - be unable to reuse the existing MIDX when writing a new one (since
>      we now check for checksum mismatches before reusing a MIDX)
> 
> Teach the 'verify' sub-command to recognize corruption in the checksum
> by calling midx_checksum_valid().
...
> +	if (!midx_checksum_valid(m))
> +		midx_report(_("incorrect checksum"));
> +
...
> +test_expect_success 'verify incorrect checksum' '
> +	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) &&
> +	corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum"
> +'
> +

We're in the context of checksum mismatches because of on-disk 
corruption, so why didn't the message say "incorrect checksum, possibly 
corrupt"?

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] csum-file: introduce checksum_valid()
  2021-06-23 18:39 ` [PATCH 1/4] csum-file: introduce checksum_valid() Taylor Blau
@ 2021-06-24 19:42   ` Jeff King
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2021-06-24 19:42 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee

On Wed, Jun 23, 2021 at 02:39:07PM -0400, Taylor Blau wrote:

> +int hashfile_checksum_valid(const unsigned char *data, size_t total_len)
> +{
> +	unsigned char got[GIT_MAX_RAWSZ];
> +	git_hash_ctx ctx;
> +	size_t data_len = total_len - the_hash_algo->rawsz;
> +
> +	if (total_len < the_hash_algo->rawsz)
> +		return 0; /* say "too short"? */
> +
> +	the_hash_algo->init_fn(&ctx);
> +	the_hash_algo->update_fn(&ctx, data, data_len);
> +	the_hash_algo->final_fn(got, &ctx);
> +
> +	return hasheq(got, data + data_len);
> +}

This "too short" comment is a little ugly, but I don't think it's easy
to do better. This function does not otherwise print anything to stderr,
so calling error() here would probably be bad. And having multiple
return types would complicate the callers.

I _suspect_ it's hard to trigger anyway, because the callers would
generally have complained about a too-short file in the first place. So
possibly it could be a BUG(). But I think leaving it as-is is the more
conservative choice, as we don't know what checks callers will have
done.

In the long run, we might want to teach this function to actually write
to stderr (or an error buffer), because it may be useful to say:

  hash mismatch: trailer checksum claims 1234abcd, but we computed 5678cdef

or similar. In which case we could then say "too short" in the same way.
I'm content to leave that for later if somebody cares. In every case of
file corruption I've seen, things are sufficiently complicated and
confusing that I turn to a hex dump of the file anyway. ;)

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] commit-graph: rewrite to use checksum_valid()
  2021-06-23 18:39 ` [PATCH 2/4] commit-graph: rewrite to use checksum_valid() Taylor Blau
@ 2021-06-24 19:42   ` Jeff King
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2021-06-24 19:42 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee

On Wed, Jun 23, 2021 at 02:39:09PM -0400, Taylor Blau wrote:

> Rewrite an existing caller in `git commit-graph verify` to take
> advantage of checksum_valid().
> 
> Note that the replacement isn't a verbatim cut-and-paste, since the new
> function avoids using hashfile at all and instead talks to the_hash_algo
> directly, but it is functionally equivalent.

Yay. IMHO the result is much nicer, as we do not have to wonder about
open() returning an error.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] midx: don't reuse corrupt MIDXs when writing
  2021-06-23 18:39 ` [PATCH 3/4] midx: don't reuse corrupt MIDXs when writing Taylor Blau
@ 2021-06-24 20:00   ` Jeff King
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2021-06-24 20:00 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee

On Wed, Jun 23, 2021 at 02:39:12PM -0400, Taylor Blau wrote:

> Suppose we experience a memory corruption while writing a MIDX such that
> we write an incorrect object offset (or alternatively, the disk corrupts
> the data after being written, but before being reused). Then when we go
> to write a new MIDX, we'll reuse the bad object offset without checking
> its validity. This means that the MIDX we just wrote is broken, but its
> trailing checksum is in-tact, since we never bothered to look at the
> values before writing.
> 
> In the above, a "git multi-pack-index verify" would have caught the
> problem before writing, but writing a new MIDX wouldn't have noticed
> anything wrong, blindly carrying forward the corrupt offset.
> 
> Individual pack indexes check their validity by verifying the crc32
> attached to each entry when carrying data forward during a repack.
> We could solve this problem for MIDXs in the same way, but individual
> crc32's don't make much sense, since their entries are so small.
> Likewise, checking the whole file on every read may be prohibitively
> expensive if a repository has a lot of objects, packs, or both.
> 
> But we can check the trailing checksum when reusing an existing MIDX
> when writing a new one. And a corrupt MIDX need not stop us from writing
> a new one, since we can just avoid reusing the existing one at all and
> pretend as if we are writing a new MIDX from scratch.

Nicely explained. This is not saving us from any corruption, but it is
noticing it sooner and helping us better diagnose the root problem
(e.g., in our real-world case it is still only a guess that the data was
corrupted after write, though there was strong correlating evidence).

I wondered what the performance implications might be.

In my fully-packed clone of linux.git with no midx, I get (before your
patch):

  $ time git multi-pack-index write
  real	0m1.358s
  user	0m1.119s
  sys	0m0.185s

Running again immediately is much faster, because we realize there's
nothing new to pack:

  $ time git multi-pack-index write
  real	0m0.007s
  user	0m0.000s
  sys	0m0.007s

If we create a dummy pack, then we do real work, which seems to take
about the same amount of time as generating from scratch:

  $ git pack-objects .git/objects/pack/pack </dev/null
  029d08823bd8a8eab510ad6ac75c823cfd3ed31e
  Total 0 (delta 0), reused 0 (delta 0), pack-reused 0

  $ time git multi-pack-index write
  real	0m1.399s
  user	0m1.126s
  sys	0m0.209s

which makes sense. Even if we are reusing the midx, there is no speedup
to be gained versus the original single-pack (whereas with many packs,
it would be an improvement to use the midx rather than going back to
each pack idx).

Running the same sequence (after deleting the midx and the dummy pack)
using your patch, the initial write is the same (unsurprising, since
there is no midx to check). The final case (with the dummy pack)
becomes:

  $ time git multi-pack-index write
  real	0m1.947s
  user	0m1.668s
  sys	0m0.216s

So we really do pay a 500-600ms penalty to to look at the checksum. IMHO
that is worth it, given the overall cost of the write operation, but I
was surprised how big it was as a relative cost.

There is one more interesting thing, though. The 7ms "noop" version
where we don't write anything also gets more expensive:

  $ time git multi-pack-index write
  real	0m0.550s
  user	0m0.537s
  sys	0m0.013s

I suspect we could avoid that by pushing our checksum validation down a
bit further in the function, but I'm not sure if that creates further
problems (i.e., if we'd already have started trusting some of the midx's
data by then).

I'm not sure if this "noop" midx write is common enough to worry too
much about. Presumably whatever triggers "multi-pack-index write" is
doing so because it just created or removed some packs. If it knows that
what it did was a noop, it's probably better off skipping the call to
git-multi-pack-index entirely.

All of which is to say this looks just fine to me. But I think we should
make a conscious decision on the performance implications (and seeing
the real numbers, it is quite clear that checking the trailer checksum
on normal reads is _way_ too expensive).

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] midx: report checksum mismatches during 'verify'
  2021-06-23 18:39 ` [PATCH 4/4] midx: report checksum mismatches during 'verify' Taylor Blau
  2021-06-24  4:22   ` Bagas Sanjaya
@ 2021-06-24 20:10   ` Jeff King
  2021-11-10 23:11   ` SZEDER Gábor
  2 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2021-06-24 20:10 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee

On Wed, Jun 23, 2021 at 02:39:15PM -0400, Taylor Blau wrote:

> 'git multi-pack-index verify' inspects the data in an existing MIDX for
> correctness by checking that the recorded object offsets are correct,
> and so on.
> 
> But it does not check that the file's trailing checksum matches the data
> that it records. So, if an on-disk corruption happened to occur in the
> final few bytes (and all other data was recorded correctly), we would:
> 
>   - get a clean result from 'git multi-pack-index verify', but
>   - be unable to reuse the existing MIDX when writing a new one (since
>     we now check for checksum mismatches before reusing a MIDX)
> 
> Teach the 'verify' sub-command to recognize corruption in the checksum
> by calling midx_checksum_valid().

Makes sense. I was a little surprised we didn't do this already, but I
guess it does not do the same "regenerate and make sure hashfile
produces the same checksum" trick that the pack idx verifier does (as an
aside, I think what the midx code is doing here is much _better_,
because it is looking at semantic problems in the file, and is more
robust against irrelevant changes in the format).

> @@ -1228,6 +1228,9 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>  		return result;
>  	}
>  
> +	if (!midx_checksum_valid(m))
> +		midx_report(_("incorrect checksum"));

This "midx_report()" function doesn't provide much context on stderr. I
get:

  $ echo foo >>.git/objects/pack/multi-pack-index
  $ git multi-pack-index verify
  incorrect checksum
  Verifying OID order in multi-pack-index: 100% (282/282), done.
  Sorting objects by packfile: 100% (283/283), done.
  Verifying object offsets: 100% (283/283), done.

I think we should at least say "error:", but something along the lines
of "midx file at %s does not match its trailing checksum (possibly
corruption?)". Or something like that.

I think all of the existing calls to midx_report() share this issue,
though. We probably want to at least say "error:" here, but maybe even
something like:

diff --git a/midx.c b/midx.c
index 9a35b0255d..e464907a7c 100644
--- a/midx.c
+++ b/midx.c
@@ -1172,10 +1172,12 @@ void clear_midx_file(struct repository *r)
 
 static int verify_midx_error;
 
-static void midx_report(const char *fmt, ...)
+static void midx_report(struct multi_pack_index *m, const char *fmt, ...)
 {
 	va_list ap;
 	verify_midx_error = 1;
+	/* do we need to care about the "next" pointer here? */
+	fprintf(stderr, ("error: %s/multi-pack-index: "), m->object_dir);
 	va_start(ap, fmt);
 	vfprintf(stderr, fmt, ap);
 	fprintf(stderr, "\n");

Also, a side note: we should use __attribute__((format)) on this
function to get compile-time checks of our format strings.

-Peff

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] midx: report checksum mismatches during 'verify'
  2021-06-23 18:39 ` [PATCH 4/4] midx: report checksum mismatches during 'verify' Taylor Blau
  2021-06-24  4:22   ` Bagas Sanjaya
  2021-06-24 20:10   ` Jeff King
@ 2021-11-10 23:11   ` SZEDER Gábor
  2021-11-11 10:05     ` Jeff King
  2 siblings, 1 reply; 16+ messages in thread
From: SZEDER Gábor @ 2021-11-10 23:11 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, dstolee

On Wed, Jun 23, 2021 at 02:39:15PM -0400, Taylor Blau wrote:
> 'git multi-pack-index verify' inspects the data in an existing MIDX for
> correctness by checking that the recorded object offsets are correct,
> and so on.
> 
> But it does not check that the file's trailing checksum matches the data
> that it records. So, if an on-disk corruption happened to occur in the
> final few bytes (and all other data was recorded correctly), we would:
> 
>   - get a clean result from 'git multi-pack-index verify', but
>   - be unable to reuse the existing MIDX when writing a new one (since
>     we now check for checksum mismatches before reusing a MIDX)
> 
> Teach the 'verify' sub-command to recognize corruption in the checksum
> by calling midx_checksum_valid().
> 
> Suggested-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  midx.c                      | 3 +++
>  t/t5319-multi-pack-index.sh | 5 +++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/midx.c b/midx.c
> index a12cbbf928..9a35b0255d 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -1228,6 +1228,9 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>  		return result;
>  	}
>  
> +	if (!midx_checksum_valid(m))
> +		midx_report(_("incorrect checksum"));
> +
>  	if (flags & MIDX_PROGRESS)
>  		progress = start_delayed_progress(_("Looking for referenced packfiles"),
>  					  m->num_packs);
> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> index d582f370c4..7609f1ea64 100755
> --- a/t/t5319-multi-pack-index.sh
> +++ b/t/t5319-multi-pack-index.sh
> @@ -418,6 +418,11 @@ test_expect_success 'corrupt MIDX is not reused' '
>  	git multi-pack-index verify
>  '
>  
> +test_expect_success 'verify incorrect checksum' '
> +	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) &&
> +	corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum"

This test is flaky and can fail with:

  ...
  + printf \377
  + dd of=.git/objects/pack/multi-pack-index bs=1 seek=3839 conv=notrunc
  1+0 records in
  1+0 records out
  1 byte copied, 5.0656e-05 s, 19.7 kB/s
  + test_must_fail git multi-pack-index verify --object-dir=.git/objects
  test_must_fail: command succeeded: git multi-pack-index verify --object-dir=.git/objects
  error: last command exited with $?=1
  + mv midx-backup .git/objects/pack/multi-pack-index
  not ok 44 - verify incorrect checksum

So the test corrupts the checksum trailer in the 'multi-pack-index'
file by overwriting its last byte with 0xff, but if that byte were
already 0xff, then the file would be left as is, and 'git
multi-pack-index verify' wouldn't find anything amiss.

Since SHA1 is essentially random, there's a 1:256 chance of that
happening, assuming that the file's content is random.  That't not
really the case, however, because both the test repository's objects
and the resulting packfiles are deterministic, and, consequently, the
content of the MIDX is somewhat deterministic.  Only "somewhat",
though, because several of those objects appear in multiple packfiles,
and the MIDX selects a copy in the most recently modified packfile, so
filesystem mtime resolution and second boundaries become significant,
and cause some variance in the contents of the OOFF chunk.

Recently a laptop crossed my way that is somehow exceptionally good at
generating MIDX files ending with 0xff:

  # ad-hoc checksum statistics:
  $ git diff
  diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
  index f1ee2ce56d..605713b518 100755
  --- a/t/t5319-multi-pack-index.sh
  +++ b/t/t5319-multi-pack-index.sh
  @@ -482,9 +482,13 @@ test_expect_success 'corrupt MIDX is not reused' '
   '
   
   test_expect_success 'verify incorrect checksum' '
  +	skip=$(($(wc -c <$objdir/pack/multi-pack-index) - 20)) &&
  +	printf "checksum: " >&5 &&
  +	od -x -w20 -j$skip --endian=big -An "$objdir/pack/multi-pack-index" >&5 &&
   	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) &&
   	corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum"
   '
  +test_done
   
   test_expect_success 'repack progress off for redirected stderr' '
   	GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir repack 2>err &&
  $ for i in {1..500} ; do ./t5319-multi-pack-index.sh |sed -n -e 's/^checksum:  //p' ; done |sort |uniq -c
       31 1a70 3b1c 8ed3 56a6 5101 2a38 057e 698d 6faf fbaa
      340 5fc0 552f 0ac0 c876 f229 d9e3 ef13 a314 5847 89ff
      129 ce7d 3710 fd21 ef7b 8316 2b99 4e6c e5d5 5e7c 7b08

That's almost 70% failure rate, though I haven't been able to trigger
this failure on any of the other machines that I have access to.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] midx: report checksum mismatches during 'verify'
  2021-11-10 23:11   ` SZEDER Gábor
@ 2021-11-11 10:05     ` Jeff King
  2021-11-16 21:10       ` Taylor Blau
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2021-11-11 10:05 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Taylor Blau, git, dstolee

On Thu, Nov 11, 2021 at 12:11:32AM +0100, SZEDER Gábor wrote:

> So the test corrupts the checksum trailer in the 'multi-pack-index'
> file by overwriting its last byte with 0xff, but if that byte were
> already 0xff, then the file would be left as is, and 'git
> multi-pack-index verify' wouldn't find anything amiss.

Good find. I tried running with the patch you showed:

>   $ for i in {1..500} ; do ./t5319-multi-pack-index.sh |sed -n -e 's/^checksum:  //p' ; done |sort |uniq -c
>        31 1a70 3b1c 8ed3 56a6 5101 2a38 057e 698d 6faf fbaa
>       340 5fc0 552f 0ac0 c876 f229 d9e3 ef13 a314 5847 89ff
>       129 ce7d 3710 fd21 ef7b 8316 2b99 4e6c e5d5 5e7c 7b08

but I only ever got the last one, even under load. However, it's easy
enough to tweak the mtimes to stimulate a failure:

  diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
  index 3f69e43178..b84e96779f 100755
  --- a/t/t5319-multi-pack-index.sh
  +++ b/t/t5319-multi-pack-index.sh
  @@ -481,10 +481,25 @@ test_expect_success 'corrupt MIDX is not reused' '
   	git multi-pack-index verify
   '
   
  +while true; do
   test_expect_success 'verify incorrect checksum' '
  +	t=1234567890 &&
  +	for i in $(ls $objdir/pack/*.pack | shuf)
  +	do
  +		touch -d @$t $i &&
  +		t=$((t+100))
  +	done &&
  +	rm -f $objdir/pack/multi-pack-index &&
  +	git multi-pack-index write &&
  +	skip=$(($(wc -c <$objdir/pack/multi-pack-index) - 20)) &&
  +	printf "checksum: " >&5 &&
  +	od -x -w20 -j$skip --endian=big -An "$objdir/pack/multi-pack-index" >&5 &&
   	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) &&
   	corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum"
   '
  +done
  +
  +test_done
   
   test_expect_success 'repack progress off for redirected stderr' '
   	GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir repack 2>err &&

Brute force, but it fails after a few hundred attempts. :)

> Since SHA1 is essentially random, there's a 1:256 chance of that
> happening, assuming that the file's content is random.

The most exact fix here would be to read the final byte, increment it
mod-256, and write it back, which would ensure it was always changed.
But that's a bit annoying to do in shell. Perhaps just corrupting more
bytes is a good solution? The patch below should reduce your changes to
1 in 2^80.

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3f69e43178..a612e44547 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -482,8 +482,10 @@ test_expect_success 'corrupt MIDX is not reused' '
 '
 
 test_expect_success 'verify incorrect checksum' '
-	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) &&
-	corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum"
+	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 10)) &&
+	corrupt_midx_and_verify $pos \
+		"\377\377\377\377\377\377\377\377\377\377" \
+		$objdir "incorrect checksum"
 '
 
 test_expect_success 'repack progress off for redirected stderr' '

There are other variants, of course. Just appending a single byte to the
file is enough to give you a high probability of failing the checksum
(since it shifts it all by one byte, making it essentially random), but
the corrupt_midx() helper doesn't support that.

-Peff

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] midx: report checksum mismatches during 'verify'
  2021-11-11 10:05     ` Jeff King
@ 2021-11-16 21:10       ` Taylor Blau
  2021-11-16 21:38         ` [PATCH] t5319: corrupt more bytes of the midx checksum Jeff King
  0 siblings, 1 reply; 16+ messages in thread
From: Taylor Blau @ 2021-11-16 21:10 UTC (permalink / raw)
  To: Jeff King; +Cc: SZEDER Gábor, git, dstolee

On Thu, Nov 11, 2021 at 05:05:06AM -0500, Jeff King wrote:
> > Since SHA1 is essentially random, there's a 1:256 chance of that
> > happening, assuming that the file's content is random.
>
> The most exact fix here would be to read the final byte, increment it
> mod-256, and write it back, which would ensure it was always changed.
> But that's a bit annoying to do in shell. Perhaps just corrupting more
> bytes is a good solution? The patch below should reduce your changes to
> 1 in 2^80.
>
> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> index 3f69e43178..a612e44547 100755
> --- a/t/t5319-multi-pack-index.sh
> +++ b/t/t5319-multi-pack-index.sh
> @@ -482,8 +482,10 @@ test_expect_success 'corrupt MIDX is not reused' '
>  '
>
>  test_expect_success 'verify incorrect checksum' '
> -	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) &&
> -	corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum"
> +	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 10)) &&
> +	corrupt_midx_and_verify $pos \
> +		"\377\377\377\377\377\377\377\377\377\377" \
> +		$objdir "incorrect checksum"
>  '
>
>  test_expect_success 'repack progress off for redirected stderr' '
>
> There are other variants, of course. Just appending a single byte to the
> file is enough to give you a high probability of failing the checksum
> (since it shifts it all by one byte, making it essentially random), but
> the corrupt_midx() helper doesn't support that.

Thanks SZEDER for the report, and thanks Peff for the fix :).

I agree that it's annoyingly cumbersome to write back the last byte
incremented mod-256. So I'm content to just make it astronomically
unlikely to run into a collision in practice. (As a matter of fact, I'm
surprised that the current implementation hasn't produced failures for
us more often).

Peff: do you want me to turn this into a patch or were you planning on
it?

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] t5319: corrupt more bytes of the midx checksum
  2021-11-16 21:10       ` Taylor Blau
@ 2021-11-16 21:38         ` Jeff King
  2021-11-16 21:43           ` Taylor Blau
  2021-11-16 22:12           ` Derrick Stolee
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff King @ 2021-11-16 21:38 UTC (permalink / raw)
  To: Taylor Blau; +Cc: SZEDER Gábor, git, dstolee

On Tue, Nov 16, 2021 at 04:10:10PM -0500, Taylor Blau wrote:

> I agree that it's annoyingly cumbersome to write back the last byte
> incremented mod-256. So I'm content to just make it astronomically
> unlikely to run into a collision in practice. (As a matter of fact, I'm
> surprised that the current implementation hasn't produced failures for
> us more often).

It's quite deterministic in practice. The randomness is driven by the
mtimes of the packfiles, so there are only a handful of combinations
you'd see in practice (though I'm still curious why one of Gábor's
machines sees it so frequently).

> Peff: do you want me to turn this into a patch or were you planning on
> it?

I hadn't thought that far ahead. ;)

One thing I did wonder is whether other corruption tests might have the
same problem. But if they're not triggering in practice, I don't think
it's worth spending a bunch of time looking at them.

So here's this fix wrapped up with a commit message.

-- >8 --
Subject: [PATCH] t5319: corrupt more bytes of the midx checksum

One of the tests in t5319 corrupts the checksum of the midx file by
writing a single 0xff over the final byte, and then confirms that we
detect the problem. This usually works fine, but would break if the
actual checksum ended with that same byte already.

It seems like this should happen in 1 out of 256 test runs, but it turns
out to be less often in practice. The contents of the midx are mostly
deterministic because it's based on the objects, and we remove most
sources of randomness by setting GIT_COMMITTER_DATE, etc.  However,
there's still some randomness: some objects are duplicated between
packs, and the midx must decide which to use, which can be based on
timing.

So very occasionally we can end up with a real 0xff byte, and the test
fails. The most robust fix would be to read out the final byte and then
change it to something else (e.g., adding 1 mod 256). But that's awkward
to do in shell. Let's just blindly corrupt 10 bytes instead of 1, which
reduces our chances of an accidental noop to 1 in 2^80.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 t/t5319-multi-pack-index.sh | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3f69e43178..a612e44547 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -482,8 +482,10 @@ test_expect_success 'corrupt MIDX is not reused' '
 '
 
 test_expect_success 'verify incorrect checksum' '
-	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) &&
-	corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum"
+	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 10)) &&
+	corrupt_midx_and_verify $pos \
+		"\377\377\377\377\377\377\377\377\377\377" \
+		$objdir "incorrect checksum"
 '
 
 test_expect_success 'repack progress off for redirected stderr' '
-- 
2.34.0.633.g181affe237


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] t5319: corrupt more bytes of the midx checksum
  2021-11-16 21:38         ` [PATCH] t5319: corrupt more bytes of the midx checksum Jeff King
@ 2021-11-16 21:43           ` Taylor Blau
  2021-11-16 22:12           ` Derrick Stolee
  1 sibling, 0 replies; 16+ messages in thread
From: Taylor Blau @ 2021-11-16 21:43 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, SZEDER Gábor, git, dstolee

On Tue, Nov 16, 2021 at 04:38:50PM -0500, Jeff King wrote:
> On Tue, Nov 16, 2021 at 04:10:10PM -0500, Taylor Blau wrote:
> > Peff: do you want me to turn this into a patch or were you planning on
> > it?
>
> I hadn't thought that far ahead. ;)
>
> One thing I did wonder is whether other corruption tests might have the
> same problem. But if they're not triggering in practice, I don't think
> it's worth spending a bunch of time looking at them.

Definitely agreed. We can apply a similar fix if and when any other
spots become a problem.

> So here's this fix wrapped up with a commit message.

Thanks, what you wrote looks good to me.

    Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] t5319: corrupt more bytes of the midx checksum
  2021-11-16 21:38         ` [PATCH] t5319: corrupt more bytes of the midx checksum Jeff King
  2021-11-16 21:43           ` Taylor Blau
@ 2021-11-16 22:12           ` Derrick Stolee
  1 sibling, 0 replies; 16+ messages in thread
From: Derrick Stolee @ 2021-11-16 22:12 UTC (permalink / raw)
  To: Jeff King, Taylor Blau; +Cc: SZEDER Gábor, git, dstolee

On 11/16/2021 4:38 PM, Jeff King wrote:
> So very occasionally we can end up with a real 0xff byte, and the test
> fails. The most robust fix would be to read out the final byte and then
> change it to something else (e.g., adding 1 mod 256). But that's awkward
> to do in shell. Let's just blindly corrupt 10 bytes instead of 1, which
> reduces our chances of an accidental noop to 1 in 2^80.
> 
> Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  t/t5319-multi-pack-index.sh | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> index 3f69e43178..a612e44547 100755
> --- a/t/t5319-multi-pack-index.sh
> +++ b/t/t5319-multi-pack-index.sh
> @@ -482,8 +482,10 @@ test_expect_success 'corrupt MIDX is not reused' '
>  '
>  
>  test_expect_success 'verify incorrect checksum' '
> -	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) &&
> -	corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum"
> +	pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 10)) &&
> +	corrupt_midx_and_verify $pos \
> +		"\377\377\377\377\377\377\377\377\377\377" \
> +		$objdir "incorrect checksum"
>  '

Thanks for taking the time to make this a patch. This approach looks
good to me.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-11-16 22:13 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-23 18:39 [PATCH 0/4] midx: verify MIDX checksum before reusing Taylor Blau
2021-06-23 18:39 ` [PATCH 1/4] csum-file: introduce checksum_valid() Taylor Blau
2021-06-24 19:42   ` Jeff King
2021-06-23 18:39 ` [PATCH 2/4] commit-graph: rewrite to use checksum_valid() Taylor Blau
2021-06-24 19:42   ` Jeff King
2021-06-23 18:39 ` [PATCH 3/4] midx: don't reuse corrupt MIDXs when writing Taylor Blau
2021-06-24 20:00   ` Jeff King
2021-06-23 18:39 ` [PATCH 4/4] midx: report checksum mismatches during 'verify' Taylor Blau
2021-06-24  4:22   ` Bagas Sanjaya
2021-06-24 20:10   ` Jeff King
2021-11-10 23:11   ` SZEDER Gábor
2021-11-11 10:05     ` Jeff King
2021-11-16 21:10       ` Taylor Blau
2021-11-16 21:38         ` [PATCH] t5319: corrupt more bytes of the midx checksum Jeff King
2021-11-16 21:43           ` Taylor Blau
2021-11-16 22:12           ` Derrick Stolee

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).