git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* commit-graph overflow generation chicken and egg
@ 2022-06-08 19:33 Jeff King
  2022-06-08 20:08 ` Derrick Stolee
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2022-06-08 19:33 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee

I hadn't touched my git.git repository for a while, so I upgraded to the
most recent version of Git (v2.36.1) and was met with this:

  $ git rev-list --all
  fatal: commit-graph requires overflow generation data but has none

Not very friendly, but OK, maybe my commit graph is out of date. Let's
regenerate it:

  $ git gc
  fatal: commit-graph requires overflow generation data but has none
  fatal: failed to run repack

OK, we can't get far enough in the gc to rebuild the commit graph. Let's
try doing it manually:

  $ git commit-graph write --reachable
  fatal: commit-graph requires overflow generation data but has none
  $ git commit-graph write
  fatal: commit-graph requires overflow generation data but has none

Yikes. Here's where it happens within the write process:

  $ GIT_PROGRESS_DELAY=0 git commit-graph write
  Finding commits for commit graph among packed objects: 100% (360229/360229), done.
  Loading known commits in commit graph: 100% (78366/78366), done.
  Expanding reachable commits in commit graph: 78366, done.
  Clearing commit marks in commit graph: 100% (78366/78366), done.
  Finding extra edges in commit graph: 100% (78366/78366), done.
  Computing commit graph topological levels: 100% (78366/78366), done.
  Computing commit graph generation numbers: 100% (78366/78366), done.
  fatal: commit-graph requires overflow generation data but has none

Now being the enterprising fellow that I am, I was able to get out of it
like this:

  $ rm -f objects/info/commit-graph
  $ git gc

But I wonder if this is a foot-gun waiting for some other user. I'm not
sure how I got into the broken state exactly. The repo was last touched
in December using a version of Git running 'next'. It worked fine with
versions of Git prior to 6dbf4b8172 (commit-graph: declare bankruptcy on
GDAT chunks, 2022-03-02). It's entirely possible that the bad state was
generated by a version of Git that wasn't ever released, and this isn't
a problem that normal humans would ever run into. It does feel a bit
unfriendly that neither gc nor commit-graph could unstick things,
though. Especially because 6dbf4b8172 says:

  [...]a previous version of Git wrote possibly erroneous data in these
  chunks with the IDs "GDAT" and "GDOV". By changing the IDs, newer
  versions of Git will silently ignore those older chunks[...]

Presumably we _are_ ignoring those chunks, but some other part of the
commit-graph file has a dependency on them (and of course we don't have
the new GDA2/GDO2 chunks to read in their place). If that's true, then
the solution may be a more graceful "we can't use this commit graph"
error return rather than the "fatal:" message seen above.

I have a copy of the broken repo state if anybody would care to look at
it.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-06-08 19:33 commit-graph overflow generation chicken and egg Jeff King
@ 2022-06-08 20:08 ` Derrick Stolee
  2022-06-08 23:17   ` Jeff King
  2022-06-09  7:49   ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 13+ messages in thread
From: Derrick Stolee @ 2022-06-08 20:08 UTC (permalink / raw)
  To: Jeff King, git

On 6/8/2022 3:33 PM, Jeff King wrote:
> I hadn't touched my git.git repository for a while, so I upgraded to the
> most recent version of Git (v2.36.1) and was met with this:
> 
>   $ git rev-list --all
>   fatal: commit-graph requires overflow generation data but has none
> 
> Not very friendly, but OK, maybe my commit graph is out of date. Let's
> regenerate it:
> 
>   $ git gc
>   fatal: commit-graph requires overflow generation data but has none
>   fatal: failed to run repack
> 
> OK, we can't get far enough in the gc to rebuild the commit graph. Let's
> try doing it manually:
> 
>   $ git commit-graph write --reachable
>   fatal: commit-graph requires overflow generation data but has none
>   $ git commit-graph write
>   fatal: commit-graph requires overflow generation data but has none
> 
> Yikes. Here's where it happens within the write process:
> 
>   $ GIT_PROGRESS_DELAY=0 git commit-graph write
>   Finding commits for commit graph among packed objects: 100% (360229/360229), done.
>   Loading known commits in commit graph: 100% (78366/78366), done.
>   Expanding reachable commits in commit graph: 78366, done.
>   Clearing commit marks in commit graph: 100% (78366/78366), done.
>   Finding extra edges in commit graph: 100% (78366/78366), done.
>   Computing commit graph topological levels: 100% (78366/78366), done.
>   Computing commit graph generation numbers: 100% (78366/78366), done.
>   fatal: commit-graph requires overflow generation data but has none
> 
> Now being the enterprising fellow that I am, I was able to get out of it
> like this:
> 
>   $ rm -f objects/info/commit-graph
>   $ git gc
> 
> But I wonder if this is a foot-gun waiting for some other user. I'm not
> sure how I got into the broken state exactly. The repo was last touched
> in December using a version of Git running 'next'. It worked fine with
> versions of Git prior to 6dbf4b8172 (commit-graph: declare bankruptcy on
> GDAT chunks, 2022-03-02). It's entirely possible that the bad state was
> generated by a version of Git that wasn't ever released, and this isn't
> a problem that normal humans would ever run into. It does feel a bit
> unfriendly that neither gc nor commit-graph could unstick things,
> though. Especially because 6dbf4b8172 says:
> 
>   [...]a previous version of Git wrote possibly erroneous data in these
>   chunks with the IDs "GDAT" and "GDOV". By changing the IDs, newer
>   versions of Git will silently ignore those older chunks[...]

You've done enough homework to discover exactly what's going on here,
including talking about the commit that I was going to point out.

> Presumably we _are_ ignoring those chunks, but some other part of the
> commit-graph file has a dependency on them (and of course we don't have
> the new GDA2/GDO2 chunks to read in their place). If that's true, then
> the solution may be a more graceful "we can't use this commit graph"
> error return rather than the "fatal:" message seen above.
> 
> I have a copy of the broken repo state if anybody would care to look at
> it.

I'd love to see the full binary, but for the sake of sharing on the
list, could you give the following output?

	xxd .git/objects/info/commit-graph | head

or any other command that shows the first few hex bytes along with
their ASCII equivalents. Here is one that used Git 2.34.0:

00000000: 4347 5048 0101 0500 4f49 4446 0000 0000  CGPH....OIDF....
00000010: 0000 0050 4f49 444c 0000 0000 0000 0450  ...POIDL.......P
00000020: 4344 4154 0000 0000 002d cf0c 4744 4154  CDAT.....-..GDAT
00000030: 0000 0000 0080 3bf8 4544 4745 0000 0000  ......;.EDGE....
00000040: 0089 6484 0000 0000 0000 0000 0089 6660  ..d...........f`
00000050: 0000 0246 0000 0498 0000 06c8 0000 0908  ...F............
00000060: 0000 0b4b 0000 0d58 0000 0f9e 0000 1207  ...K...X........
00000070: 0000 1423 0000 168e 0000 18d3 0000 1b36  ...#...........6
00000080: 0000 1d67 0000 1f9b 0000 2219 0000 2456  ...g......"...$V
00000090: 0000 26a2 0000 2908 0000 2b5c 0000 2d96  ..&...)...+\..-.

and here is one with latest master:

00000000: 4347 5048 0101 0500 4f49 4446 0000 0000  CGPH....OIDF....
00000010: 0000 0050 4f49 444c 0000 0000 0000 0450  ...POIDL.......P
00000020: 4344 4154 0000 0000 002e 20b0 4744 4132  CDAT...... .GDA2
00000030: 0000 0000 0081 2090 4544 4745 0000 0000  ...... .EDGE....
00000040: 008a 5970 0000 0000 0000 0000 008a 5b4c  ..Yp..........[L
00000050: 0000 024c 0000 04a1 0000 06d6 0000 091a  ...L............
00000060: 0000 0b60 0000 0d70 0000 0fbb 0000 1229  ...`...p.......)
00000070: 0000 1448 0000 16b6 0000 1905 0000 1b6c  ...H...........l
00000080: 0000 1da3 0000 1fd9 0000 225b 0000 249d  .........."[..$.
00000090: 0000 26f0 0000 2956 0000 2bac 0000 2dea  ..&...)V..+...-.

If you have a GDA2 chunk, then we are somehow incorrectly writing the
offset there. However, I don't see this happening in the latest code.

We have this macro in commit.h:

#define GENERATION_NUMBER_V2_OFFSET_MAX ((1ULL << 31) - 1)

and this in commit-graph.c:

#define CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW (1ULL << 31)

being used here:

static int write_graph_chunk_generation_data(struct hashfile *f,
					     void *data)
{
	struct write_commit_graph_context *ctx = data;
	int i, num_generation_data_overflows = 0;

	for (i = 0; i < ctx->commits.nr; i++) {
		struct commit *c = ctx->commits.list[i];
		timestamp_t offset;
		repo_parse_commit(ctx->r, c);
		offset = commit_graph_data_at(c)->generation - c->date;
		display_progress(ctx->progress, ++ctx->progress_cnt);

		if (offset > GENERATION_NUMBER_V2_OFFSET_MAX) {
			offset = CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW | num_generation_data_overflows;
			num_generation_data_overflows++;
		}

		hashwrite_be32(f, offset);
	}

	return 0;
}

But this hasn't been changed since eb9071912f (commit-graph: anonymize
data in chunk_write_fn, 2021-02-05) and 90cb1c47c7 (commit-graph: always
parse before commit_graph_data_at(), 2021-02-01) which were not meaningfully
different from the first implementation in e8b63005c4 (commit-graph:
implement generation data chunk, 2021-01-16).

However, the lack of the large offset chunk could be due to the bug fixed by
75979d9460 (commit-graph: fix ordering bug in generation numbers,
2022-03-01). Perhaps that was the thing that was missing from your version?

But otherwise, I'm stumped. I'd be very interested to see a repro from a
fresh repository. That is: what situation do we need to be in to write such
an offset without including the large offset chunk?

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-06-08 20:08 ` Derrick Stolee
@ 2022-06-08 23:17   ` Jeff King
  2022-07-01 12:06     ` Patrick Steinhardt
  2022-06-09  7:49   ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 13+ messages in thread
From: Jeff King @ 2022-06-08 23:17 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On Wed, Jun 08, 2022 at 04:08:03PM -0400, Derrick Stolee wrote:

> I'd love to see the full binary, but for the sake of sharing on the
> list, could you give the following output?
> 
> 	xxd .git/objects/info/commit-graph | head
> 
> or any other command that shows the first few hex bytes along with
> their ASCII equivalents. Here is one that used Git 2.34.0:
> [...]

Interesting. My earlier email was a bit misleading. I do in fact have a
GDA2 chunk. And looking at the timestamp on the commit-graph file, it's
from May 24th. I hadn't been keeping the repo up to date regularly, but
I did occasionally pull and rebuild. So I think it was a much more
recent version of Git that built the problematic file, though it's
possible it was carrying forward bad data.

So 6dbf4b8172ef may be a bit of a red herring, if the file has a GDA2
section that was simply ignored before that commit.

Looking at my reflog, my best guess for the version of Git that produced
the file is e46751e96fa.

> However, the lack of the large offset chunk could be due to the bug fixed by
> 75979d9460 (commit-graph: fix ordering bug in generation numbers,
> 2022-03-01). Perhaps that was the thing that was missing from your version?

So I _think_ I would have had that, though there's a good chance that an
older version of the commit-graph file was written using a version of
Git without it.

> But otherwise, I'm stumped. I'd be very interested to see a repro from a
> fresh repository. That is: what situation do we need to be in to write such
> an offset without including the large offset chunk?

Not exactly a fresh reproduction, but you can grab my broken file from:

  https://peff.net/tmp/broken-commit-graph

Dropping it into a fresh clone of git.git shows the problem.

I tried a few obvious from-scratch reproductions like building a file
with 75979d9460^ (so with the generation number bug), and then jumping
forward to e46751e96fa (so bug fixed, but now we write GDA2), but
couldn't get it to trigger.

It may not be worth spending too much time on, if this is a weird
one-off caused by a mix of buggy unreleased versions of Git. If real
users aren't seeing it, and we know the nuclear option is "rm
commit-graph", then that may be enough.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-06-08 20:08 ` Derrick Stolee
  2022-06-08 23:17   ` Jeff King
@ 2022-06-09  7:49   ` Ævar Arnfjörð Bjarmason
  2022-06-09 15:26     ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-09  7:49 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Jeff King, git, Abhishek Kumar


On Wed, Jun 08 2022, Derrick Stolee wrote:

> On 6/8/2022 3:33 PM, Jeff King wrote:
>> [...]
> You've done enough homework to discover exactly what's going on here,
> including talking about the commit that I was going to point out.
>
>> Presumably we _are_ ignoring those chunks, but some other part of the
>> commit-graph file has a dependency on them (and of course we don't have
>> the new GDA2/GDO2 chunks to read in their place). If that's true, then
>> the solution may be a more graceful "we can't use this commit graph"
>> error return rather than the "fatal:" message seen above.
>> 
>> I have a copy of the broken repo state if anybody would care to look at
>> it.
>
> I'd love to see the full binary, but for the sake of sharing on the
> list, could you give the following output?
>
> [...]
>
> But otherwise, I'm stumped. I'd be very interested to see a repro from a
> fresh repository. That is: what situation do we need to be in to write such
> an offset without including the large offset chunk?

It's certainly interesting to see *how* we got to this state, but just
so we're on the same page: I fundimentally don't think it matters to the
*real* bug here.

Which is that at the very least f90fca638e9 (commit-graph: consolidate
fill_commit_graph_info, 2021-01-16) and e8b63005c48 (commit-graph:
implement generation data chunk, 2021-01-16) (CC'd author) have a bad
regression on earlier fixes that read-only operations of the
commit-graph *must not die*. I.e. the "parse" and "verify" paths of the
commit-graph.c code shouldn't call exit(), die() etc.

I.e. the changes I made in 2ac138d568a (commit-graph: fix segfault on
e.g. "git status", 2019-03-25), 61df89c8e55 (commit-graph: don't early
exit(1) on e.g. "git status", 2019-03-25) and 43d35618055 (commit-graph
write: don't die if the existing graph is corrupt, 2019-03-25).

The below patch is a start at fixing some of that, but as noted in the
TODO comments I really wouldn't trust it as-is (and haven't looked
deeply into this).

If you replace your graph with Jeff's corrupt one and run "git status",
"git log" etc. it's still emitting one verbose complaint, but it no
longer does so in loops (at least for these paths, but e.g. "git gc" is
still doing that).

But it does get us to where we can run "git gc", and while complaining
too much along the way will write out a new & valid commit graph at the
end ("[... comments are mine"):
	
	$ git gc
	[...]
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
	[... a lot of the above lines snipped out ...]
	Enumerating objects: 1306881, done.
	Counting objects: 100% (1306881/1306881), done.
	Delta compression using up to 8 threads
	Compressing objects: 100% (262837/262837), done.
	Writing objects: 100% (1306881/1306881), done.
	Selecting bitmap commits: 378069, done.
	Building bitmaps: 100% (380/380), done.
	Total 1306881 (delta 1047925), reused 1293996 (delta 1036311), pack-reused 0
	Removing duplicate objects: 100% (256/256), done.
	commit-graph requires overflow generation data but has none
	commit-graph requires overflow generation data but has none
        [... these 2x warnings came from "git prune" ...]
	Checking connectivity: 10359, done.
	Expanding reachable commits in commit graph: 344133, done.
	commit-graph requires overflow generation data but has none
	Computing commit changed paths Bloom filters: 100% (344133/344133), done.
	Writing out commit graph in 7 passes: 100% (2408931/2408931), done.

If anyone wants to run with it the below hack has my SOB.

diff --git a/commit-graph.c b/commit-graph.c
index 92d45033366..58e238aaa57 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -202,6 +202,20 @@ static struct commit_graph *alloc_commit_graph(void)
 
 extern int read_replace_refs;
 
+static int verify_commit_graph_error;
+
+__attribute__((format (printf, 1, 2)))
+static void graph_report(const char *fmt, ...)
+{
+	va_list ap;
+
+	verify_commit_graph_error = 1;
+	va_start(ap, fmt);
+	vfprintf(stderr, fmt, ap);
+	fprintf(stderr, "\n");
+	va_end(ap);
+}
+
 static int commit_graph_compatible(struct repository *r)
 {
 	if (!r->gitdir)
@@ -220,6 +234,9 @@ static int commit_graph_compatible(struct repository *r)
 	if (is_repository_shallow(r))
 		return 0;
 
+	if (verify_commit_graph_error) /* TODO: will be stale? */
+		return 0;
+
 	return 1;
 }
 
@@ -625,6 +642,10 @@ static int prepare_commit_graph(struct repository *r)
 {
 	struct object_directory *odb;
 
+	if (verify_commit_graph_error) /* TODO: will be stale? */
+		return 0;
+
+
 	/*
 	 * Early return if there is no git dir or if the commit graph is
 	 * disabled.
@@ -766,7 +787,7 @@ static struct commit_list **insert_parent_or_die(struct repository *r,
 	return &commit_list_insert(c, pptr)->next;
 }
 
-static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, uint32_t pos)
+static int fill_commit_graph_info(struct commit *item, struct commit_graph *g, uint32_t pos)
 {
 	const unsigned char *commit_data;
 	struct commit_graph_data *graph_data;
@@ -776,8 +797,10 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
 	while (pos < g->num_commits_in_base)
 		g = g->base_graph;
 
-	if (pos >= g->num_commits + g->num_commits_in_base)
-		die(_("invalid commit position. commit-graph is likely corrupt"));
+	if (pos >= g->num_commits + g->num_commits_in_base) {
+		graph_report(_("invalid commit position. commit-graph is likely corrupt"));
+		return 0;
+	}
 
 	lex_index = pos - g->num_commits_in_base;
 	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
@@ -793,8 +816,10 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
 		offset = (timestamp_t)get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index);
 
 		if (offset & CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW) {
-			if (!g->chunk_generation_data_overflow)
-				die(_("commit-graph requires overflow generation data but has none"));
+			if (!g->chunk_generation_data_overflow) {
+				graph_report(_("commit-graph requires overflow generation data but has none"));
+				return 0;
+			}
 
 			offset_pos = offset ^ CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW;
 			graph_data->generation = item->date + get_be64(g->chunk_generation_data_overflow + 8 * offset_pos);
@@ -805,6 +830,8 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
 
 	if (g->topo_levels)
 		*topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_len + 8) >> 2;
+
+	return 1;
 }
 
 static inline void set_commit_tree(struct commit *c, struct tree *t)
@@ -825,7 +852,8 @@ static int fill_commit_in_graph(struct repository *r,
 	while (pos < g->num_commits_in_base)
 		g = g->base_graph;
 
-	fill_commit_graph_info(item, g, pos);
+	if (!fill_commit_graph_info(item, g, pos))
+		return 0;
 
 	lex_index = pos - g->num_commits_in_base;
 	commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index;
@@ -946,6 +974,8 @@ int parse_commit_in_graph(struct repository *r, struct commit *item)
 void load_commit_graph_info(struct repository *r, struct commit *item)
 {
 	uint32_t pos;
+	if (verify_commit_graph_error) /* TODO: will be stale? */
+		return;
 	if (!prepare_commit_graph(r))
 		return;
 	if (find_commit_pos_in_graph(item, r->objects->commit_graph, &pos))
@@ -2444,19 +2474,6 @@ int write_commit_graph(struct object_directory *odb,
 }
 
 #define VERIFY_COMMIT_GRAPH_ERROR_HASH 2
-static int verify_commit_graph_error;
-
-__attribute__((format (printf, 1, 2)))
-static void graph_report(const char *fmt, ...)
-{
-	va_list ap;
-
-	verify_commit_graph_error = 1;
-	va_start(ap, fmt);
-	vfprintf(stderr, fmt, ap);
-	fprintf(stderr, "\n");
-	va_end(ap);
-}
 
 #define GENERATION_ZERO_EXISTS 1
 #define GENERATION_NUMBER_EXISTS 2
@@ -2510,9 +2527,11 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
 		}
 
 		graph_commit = lookup_commit(r, &cur_oid);
-		if (!parse_commit_in_graph_one(r, g, graph_commit))
+		if (!parse_commit_in_graph_one(r, g, graph_commit)) {
 			graph_report(_("failed to parse commit %s from commit-graph"),
 				     oid_to_hex(&cur_oid));
+			return verify_commit_graph_error;
+		}
 	}
 
 	while (cur_fanout_pos < 256) {



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-06-09  7:49   ` Ævar Arnfjörð Bjarmason
@ 2022-06-09 15:26     ` Jeff King
  2022-06-09 15:39       ` Derrick Stolee
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2022-06-09 15:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee, git, Abhishek Kumar

On Thu, Jun 09, 2022 at 09:49:15AM +0200, Ævar Arnfjörð Bjarmason wrote:

> It's certainly interesting to see *how* we got to this state, but just
> so we're on the same page: I fundimentally don't think it matters to the
> *real* bug here.
> 
> Which is that at the very least f90fca638e9 (commit-graph: consolidate
> fill_commit_graph_info, 2021-01-16) and e8b63005c48 (commit-graph:
> implement generation data chunk, 2021-01-16) (CC'd author) have a bad
> regression on earlier fixes that read-only operations of the
> commit-graph *must not die*. I.e. the "parse" and "verify" paths of the
> commit-graph.c code shouldn't call exit(), die() etc.

Yeah, I'd agree that this is a good philosophy to follow. The
commit-graph data is meant to be an optimization, and we can always
continue without it.

> If you replace your graph with Jeff's corrupt one and run "git status",
> "git log" etc. it's still emitting one verbose complaint, but it no
> longer does so in loops (at least for these paths, but e.g. "git gc" is
> still doing that).
> 
> But it does get us to where we can run "git gc", and while complaining
> too much along the way will write out a new & valid commit graph at the
> end ("[... comments are mine"):

Yeah, getting through "git gc" is the key thing here. Then the problem
solves itself, sometimes even automatically (via auto-gc).

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-06-09 15:26     ` Jeff King
@ 2022-06-09 15:39       ` Derrick Stolee
  0 siblings, 0 replies; 13+ messages in thread
From: Derrick Stolee @ 2022-06-09 15:39 UTC (permalink / raw)
  To: Jeff King, Ævar Arnfjörð Bjarmason; +Cc: git, Abhishek Kumar

On 6/9/2022 11:26 AM, Jeff King wrote:
> On Thu, Jun 09, 2022 at 09:49:15AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
>> It's certainly interesting to see *how* we got to this state, but just
>> so we're on the same page: I fundimentally don't think it matters to the
>> *real* bug here.
>>
>> Which is that at the very least f90fca638e9 (commit-graph: consolidate
>> fill_commit_graph_info, 2021-01-16) and e8b63005c48 (commit-graph:
>> implement generation data chunk, 2021-01-16) (CC'd author) have a bad
>> regression on earlier fixes that read-only operations of the
>> commit-graph *must not die*. I.e. the "parse" and "verify" paths of the
>> commit-graph.c code shouldn't call exit(), die() etc.
> 
> Yeah, I'd agree that this is a good philosophy to follow. The
> commit-graph data is meant to be an optimization, and we can always
> continue without it.

I agree that the die() is part of what is frustrating here, but we need
to be careful: when we recognize that the commit-graph data is erroneous
_at this stage_ we may have already made decisions based on the existence
of a commit-graph (such as "we should trust generation numbers" or "we
have parsed some of the commits using the commit-graph") and so we cannot
guarantee that the process will complete with correct results from this
point.
 
>> If you replace your graph with Jeff's corrupt one and run "git status",
>> "git log" etc. it's still emitting one verbose complaint, but it no
>> longer does so in loops (at least for these paths, but e.g. "git gc" is
>> still doing that).
>>
>> But it does get us to where we can run "git gc", and while complaining
>> too much along the way will write out a new & valid commit graph at the
>> end ("[... comments are mine"):
> 
> Yeah, getting through "git gc" is the key thing here. Then the problem
> solves itself, sometimes even automatically (via auto-gc).

Perhaps we could change the die() behavior to be a warning() plus a
reset of all commit data if we are in a mode that can handle that error.
Specifically, during a commit-graph write.

I think the current die() behavior is the safest thing to do right now
until someone has time to think through these scenarios carefully.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-06-08 23:17   ` Jeff King
@ 2022-07-01 12:06     ` Patrick Steinhardt
  2022-07-04 10:46       ` Patrick Steinhardt
  0 siblings, 1 reply; 13+ messages in thread
From: Patrick Steinhardt @ 2022-07-01 12:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Derrick Stolee, git

[-- Attachment #1: Type: text/plain, Size: 3983 bytes --]

On Wed, Jun 08, 2022 at 07:17:39PM -0400, Jeff King wrote:
> On Wed, Jun 08, 2022 at 04:08:03PM -0400, Derrick Stolee wrote:
> 
> > I'd love to see the full binary, but for the sake of sharing on the
> > list, could you give the following output?
> > 
> > 	xxd .git/objects/info/commit-graph | head
> > 
> > or any other command that shows the first few hex bytes along with
> > their ASCII equivalents. Here is one that used Git 2.34.0:
> > [...]
> 
> Interesting. My earlier email was a bit misleading. I do in fact have a
> GDA2 chunk. And looking at the timestamp on the commit-graph file, it's
> from May 24th. I hadn't been keeping the repo up to date regularly, but
> I did occasionally pull and rebuild. So I think it was a much more
> recent version of Git that built the problematic file, though it's
> possible it was carrying forward bad data.
> 
> So 6dbf4b8172ef may be a bit of a red herring, if the file has a GDA2
> section that was simply ignored before that commit.
> 
> Looking at my reflog, my best guess for the version of Git that produced
> the file is e46751e96fa.
> 
> > However, the lack of the large offset chunk could be due to the bug fixed by
> > 75979d9460 (commit-graph: fix ordering bug in generation numbers,
> > 2022-03-01). Perhaps that was the thing that was missing from your version?
> 
> So I _think_ I would have had that, though there's a good chance that an
> older version of the commit-graph file was written using a version of
> Git without it.
> 
> > But otherwise, I'm stumped. I'd be very interested to see a repro from a
> > fresh repository. That is: what situation do we need to be in to write such
> > an offset without including the large offset chunk?
> 
> Not exactly a fresh reproduction, but you can grab my broken file from:
> 
>   https://peff.net/tmp/broken-commit-graph
> 
> Dropping it into a fresh clone of git.git shows the problem.
> 
> I tried a few obvious from-scratch reproductions like building a file
> with 75979d9460^ (so with the generation number bug), and then jumping
> forward to e46751e96fa (so bug fixed, but now we write GDA2), but
> couldn't get it to trigger.
> 
> It may not be worth spending too much time on, if this is a weird
> one-off caused by a mix of buggy unreleased versions of Git. If real
> users aren't seeing it, and we know the nuclear option is "rm
> commit-graph", then that may be enough.
> 
> -Peff

I have also repeatedly run into the same problem. I had already
discussed this with Derrick in the past in [1], but back then we also
declared bancruptcy and said that this seems to only be caused by some
weird in-between states of Git versions.

I have experienced the issue again in git.git now, again without having
a clue how I arrived at that state. The funny thing is that I explicitly
tried to reproduce the error in that repo a few days ago, without any
success at all, by writing commit-graphs with different Git versions.
Only today when I got back to it completely unsuspecting did Git start
to complain.

But more imporantly, we started to see the issue in one of our repos in
our staging systems as well [2], where we're currently running with a
mixture of Git v2.35.1 and v2.36.1 with a small set of patches on top of
them. None of the patches are related to commit-graphs though. The repo
in question is a pooled repository (like last time I reported the bug),
where the pool itself has a single commit-graph with GDAT chunks and the
pool member has a single commit-graph with GDA2 chunks.

I spent a lot of time today to try and come up with a reproducer to get
to this state from a clean repo, but again with no success so far. Also,
staring at the code for extended periods of time didn't result in any
insights.

This issue continues to puzzle me.

Patrick

[1]: http://public-inbox.org/git/Yh3rZX6cJpkHmRZc@ncase/
[2]: https://gitlab.com/gitlab-org/gitlab/-/issues/365903

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-07-01 12:06     ` Patrick Steinhardt
@ 2022-07-04 10:46       ` Patrick Steinhardt
  2022-07-04 20:50         ` Derrick Stolee
  0 siblings, 1 reply; 13+ messages in thread
From: Patrick Steinhardt @ 2022-07-04 10:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Derrick Stolee, git

[-- Attachment #1: Type: text/plain, Size: 5477 bytes --]

On Fri, Jul 01, 2022 at 02:07:03PM +0200, Patrick Steinhardt wrote:
> On Wed, Jun 08, 2022 at 07:17:39PM -0400, Jeff King wrote:
> > On Wed, Jun 08, 2022 at 04:08:03PM -0400, Derrick Stolee wrote:
> > 
> > > I'd love to see the full binary, but for the sake of sharing on the
> > > list, could you give the following output?
> > > 
> > > 	xxd .git/objects/info/commit-graph | head
> > > 
> > > or any other command that shows the first few hex bytes along with
> > > their ASCII equivalents. Here is one that used Git 2.34.0:
> > > [...]
> > 
> > Interesting. My earlier email was a bit misleading. I do in fact have a
> > GDA2 chunk. And looking at the timestamp on the commit-graph file, it's
> > from May 24th. I hadn't been keeping the repo up to date regularly, but
> > I did occasionally pull and rebuild. So I think it was a much more
> > recent version of Git that built the problematic file, though it's
> > possible it was carrying forward bad data.
> > 
> > So 6dbf4b8172ef may be a bit of a red herring, if the file has a GDA2
> > section that was simply ignored before that commit.
> > 
> > Looking at my reflog, my best guess for the version of Git that produced
> > the file is e46751e96fa.
> > 
> > > However, the lack of the large offset chunk could be due to the bug fixed by
> > > 75979d9460 (commit-graph: fix ordering bug in generation numbers,
> > > 2022-03-01). Perhaps that was the thing that was missing from your version?
> > 
> > So I _think_ I would have had that, though there's a good chance that an
> > older version of the commit-graph file was written using a version of
> > Git without it.
> > 
> > > But otherwise, I'm stumped. I'd be very interested to see a repro from a
> > > fresh repository. That is: what situation do we need to be in to write such
> > > an offset without including the large offset chunk?
> > 
> > Not exactly a fresh reproduction, but you can grab my broken file from:
> > 
> >   https://peff.net/tmp/broken-commit-graph
> > 
> > Dropping it into a fresh clone of git.git shows the problem.
> > 
> > I tried a few obvious from-scratch reproductions like building a file
> > with 75979d9460^ (so with the generation number bug), and then jumping
> > forward to e46751e96fa (so bug fixed, but now we write GDA2), but
> > couldn't get it to trigger.
> > 
> > It may not be worth spending too much time on, if this is a weird
> > one-off caused by a mix of buggy unreleased versions of Git. If real
> > users aren't seeing it, and we know the nuclear option is "rm
> > commit-graph", then that may be enough.
> > 
> > -Peff
> 
> I have also repeatedly run into the same problem. I had already
> discussed this with Derrick in the past in [1], but back then we also
> declared bancruptcy and said that this seems to only be caused by some
> weird in-between states of Git versions.
> 
> I have experienced the issue again in git.git now, again without having
> a clue how I arrived at that state. The funny thing is that I explicitly
> tried to reproduce the error in that repo a few days ago, without any
> success at all, by writing commit-graphs with different Git versions.
> Only today when I got back to it completely unsuspecting did Git start
> to complain.
> 
> But more imporantly, we started to see the issue in one of our repos in
> our staging systems as well [2], where we're currently running with a
> mixture of Git v2.35.1 and v2.36.1 with a small set of patches on top of
> them. None of the patches are related to commit-graphs though. The repo
> in question is a pooled repository (like last time I reported the bug),
> where the pool itself has a single commit-graph with GDAT chunks and the
> pool member has a single commit-graph with GDA2 chunks.
> 
> I spent a lot of time today to try and come up with a reproducer to get
> to this state from a clean repo, but again with no success so far. Also,
> staring at the code for extended periods of time didn't result in any
> insights.
> 
> This issue continues to puzzle me.
> 
> Patrick
> 
> [1]: http://public-inbox.org/git/Yh3rZX6cJpkHmRZc@ncase/
> [2]: https://gitlab.com/gitlab-org/gitlab/-/issues/365903

While I still haven't been able to reproduce the error, I did find a
different error. Here's the reproducer, which works with Git v2.37.0 and
older:

```
+ rm -rf /tmp/repo
+ git init /tmp/repo
Initialized empty Git repository in /tmp/repo/.git/
+ cd /tmp/repo/
+ GIT_COMMITTER_DATE='2000-01-01T00:00:00 +0100'
+ git commit --allow-empty -mx
[main (root-commit) 62ebc8d] x
+ git branch other
+ GIT_COMMITTER_DATE='1970-01-01T00:00:00 +0100'
+ git commit --allow-empty -mx
[main c628d6d] x
+ GIT_COMMITTER_DATE='2040-01-01T00:00:00 +0100'
+ git commit --allow-empty -mx
[main 0d73218] x
+ git commit-graph write --reachable --split=replace
+ git switch other
Switched to branch 'other'
+ GIT_COMMITTER_DATE='2000-01-01T00:00:00 +0100'
+ git commit --allow-empty -mx
[other 7d03e12] x
+ git commit-graph write --reachable --split=replace
+ git commit-graph verify
commit date for commit c628d6dc7292b6d481f0ec4ed39ed2bb4a8cff49 in commit-graph is 17179865584 != 18446744073709548016
Verifying commits in commit graph: 100% (4/4), done.
```

I may have a look at a later point at what's happening, but for the time
being I'll continue to hunt down the other bug. Still wanted to document
my finding out here.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-07-04 10:46       ` Patrick Steinhardt
@ 2022-07-04 20:50         ` Derrick Stolee
  2022-07-05 21:03           ` Will Chandler
  2022-07-06  9:11           ` Jeff King
  0 siblings, 2 replies; 13+ messages in thread
From: Derrick Stolee @ 2022-07-04 20:50 UTC (permalink / raw)
  To: Patrick Steinhardt, Jeff King; +Cc: git

On 7/4/22 4:46 AM, Patrick Steinhardt wrote:
> On Fri, Jul 01, 2022 at 02:07:03PM +0200, Patrick Steinhardt wrote:

> While I still haven't been able to reproduce the error, I did find a
> different error. Here's the reproducer, which works with Git v2.37.0 and
> older:


Hi Patrick. Thanks for taking a close look. There is
an issue here, and it's due to using a negative
timestamp:

> + GIT_COMMITTER_DATE='1970-01-01T00:00:00 +0100'

Because of the "+0100" in the time zone, this date
becomes a negative value. The commit-graph does not
store dates with more than 34 bits (and Git does
not handle negative timestamps very well? Peff can
clarify here).

The commit-graph could certainly warn better here to
say we do not have enough date bits to store this
timestamp (the same would happen with a date beyond
2138 or something like that).

However, this is a failure since the commit-graph first
started parsing dates in 177722b3442 (commit: integrate
commit graph with commit parsing, 2018-04-10).

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-07-04 20:50         ` Derrick Stolee
@ 2022-07-05 21:03           ` Will Chandler
  2022-07-05 22:28             ` Taylor Blau
  2022-07-06  9:11           ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Will Chandler @ 2022-07-05 21:03 UTC (permalink / raw)
  To: derrickstolee; +Cc: git, peff, ps

Hi all,

I think I've reproduced this problem on v2.36.1 and main.

First, remove any existing commit graphs:

  $ rm -rf .git/objects/info/commit-graph*

Generate a commit graph without GDA2:

  $ git -c commitGraph.generationVersion=1 commit-graph write

Peff's bad commit graph included bloom filters, so let's update the
commit graph to include them and add GDA2:

  $ git -c commitGraph.generationVersion=2 commit-graph write --changed-paths

The commit graph is now broken:

  $ git rev-list --all
  fatal: commit-graph requires overflow generation data but has none

The header has a GDA2 section, but no GDO2:

  $ xxd .git/objects/info/commit-graphs/*graph | head
  00000000: 4347 5048 0101 0700 4f49 4446 0000 0000  CGPH....OIDF....
  00000010: 0000 0068 4f49 444c 0000 0000 0000 0468  ...hOIDL.......h
  00000020: 4344 4154 0000 0000 0015 43b4 4744 4132  CDAT......C.GDA2
  00000030: 0000 0000 003b 8270 4544 4745 0000 0000  .....;.pEDGE....
  00000040: 003f c24c 4249 4458 0000 0000 003f c3cc  .?.LBIDX.....?..
  00000050: 4244 4154 0000 0000 0044 03a8 0000 0000  BDAT.....D......
  00000060: 0000 0000 0049 79df 0000 010e 0000 0216  .....Iy.........
  00000070: 0000 031f 0000 042f 0000 0535 0000 0623  ......./...5...#
  00000080: 0000 0730 0000 086d 0000 096d 0000 0a8c  ...0...m...m....
  00000090: 0000 0b8e 0000 0c96 0000 0d9c 0000 0ea2  ................

Viewing the GDA2 block, the overflow bit is set for virtually all of the commits:

  $ xxd -s 0x003b8270 .git/objects/info/commit-graphs/*.graph | head
  003b8270: 8000 0000 8000 0001 8000 0002 8000 0003  ................
  003b8280: 8000 0004 8000 0005 8000 0006 8000 0007  ................
  003b8290: 8000 0008 8000 0009 8000 000a 8000 000b  ................
  003b82a0: 8000 000c 8000 000d 8000 000e 8000 000f  ................
  003b82b0: 8000 0010 8000 0011 8000 0012 8000 0013  ................
  003b82c0: 8000 0014 8000 0015 8000 0016 8000 0017  ................
  003b82d0: 8000 0018 8000 0019 8000 001a 8000 001b  ................
  003b82e0: 8000 001c 8000 001d 8000 001e 8000 001f  ................
  003b82f0: 8000 0020 8000 0021 8000 0022 8000 0023  ... ...!..."...#
  003b8300: 8000 0024 8000 0025 8000 0026 8000 0027  ...$...%...&...'

Stepping though, it looks like this is the scenario:

compute_generation_numbers correctly calculates the generation, e.g.:

  {
    graph_pos = 0,
    generation = 1502820840
  }

Because the existing commit graph did not include generation data,
graph.read_generation_data is 0. When compute_bloom_filters executes, it
will call fill_commit_graph_info which checks that value and falls back
to the older generation calculation if false:

  if (g->read_generation_data) {
  	offset = (timestamp_t)get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index);

  	if (offset & CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW) {
  		if (!g->chunk_generation_data_overflow)
  			die(_("commit-graph requires overflow generation data but has none"));

  		offset_pos = offset ^ CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW;
  		graph_data->generation = item->date + get_be64(g->chunk_generation_data_overflow + 8 * offset_pos);
  	} else
  		graph_data->generation = item->date + offset;
  } else
  	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;

This re-writes the commit data to:

  {
    graph_pos = 0,
    generation = 17631
  }

The smaller generation value underflows when the commit date is
subtracted from it by write_graph_chunk_generation_data, causing the
overflow flag to be set:

  static int write_graph_chunk_generation_data(struct hashfile *f,
					     void *data)
  {
  	struct write_commit_graph_context *ctx = data;
  	int i, num_generation_data_overflows = 0;

  	for (i = 0; i < ctx->commits.nr; i++) {
  		struct commit *c = ctx->commits.list[i];
  		timestamp_t offset;
  		repo_parse_commit(ctx->r, c);
  		offset = commit_graph_data_at(c)->generation - c->date;
  		display_progress(ctx->progress, ++ctx->progress_cnt);

  		if (offset > GENERATION_NUMBER_V2_OFFSET_MAX) {
  			offset = CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW | num_generation_data_overflows;
  			num_generation_data_overflows++;
  		}

  		hashwrite_be32(f, offset);
  	}

  	return 0;
  }

We've already decided by this point to skip writing the overflow chunk
based on compute_generation_numbers, so the commit graph is now
invalid.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-07-05 21:03           ` Will Chandler
@ 2022-07-05 22:28             ` Taylor Blau
  2022-07-06  8:52               ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2022-07-05 22:28 UTC (permalink / raw)
  To: Will Chandler; +Cc: derrickstolee, git, peff, ps

On Tue, Jul 05, 2022 at 05:03:44PM -0400, Will Chandler wrote:
> Hi all,
>
> I think I've reproduced this problem on v2.36.1 and main.

Great find. I played around with this a little bit locally and your
reproduction works for me on any sufficiently-large repository. Looking
around in the code, I'm pretty confident that this is the issue.

>   $ git -c commitGraph.generationVersion=2 commit-graph write --changed-paths

This step is critical: without it we never end up overwriting the
`->generation` data, and the conversion works fine.

> Because the existing commit graph did not include generation data,
> graph.read_generation_data is 0. When compute_bloom_filters executes, it
> will call fill_commit_graph_info which checks that value and falls back
> to the older generation calculation if false:
>
>   if (g->read_generation_data) {
>   	offset = (timestamp_t)get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index);
>
>   	if (offset & CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW) {
>   		if (!g->chunk_generation_data_overflow)
>   			die(_("commit-graph requires overflow generation data but has none"));
>
>   		offset_pos = offset ^ CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW;
>   		graph_data->generation = item->date + get_be64(g->chunk_generation_data_overflow + 8 * offset_pos);
>   	} else
>   		graph_data->generation = item->date + offset;
>   } else
>   	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
>
> This re-writes the commit data to:
>
>   {
>     graph_pos = 0,
>     generation = 17631
>   }

Nicely explained. To me, it seems like the problem is that we're reusing
the same slab to:

  - store data that we're going to write as a part of commit-graph
    generation, and
  - store auxiliary data about commits that we have *read* from a
    commit-graph

A complete fix might be to use a separate slab to store data that we
read from data that we are about to write, to avoid polluting the latter
with the former.

In the meantime, a more minimal fix would be to avoid reading and
overwriting the generation data where we can avoid it. AFAICT, this is
the only spot that would need to be changed. The following patch does
the trick for me:

--- >8 ---

diff --git a/bloom.c b/bloom.c
index 5e297038bb..863052fa68 100644
--- a/bloom.c
+++ b/bloom.c
@@ -30,10 +30,9 @@ static inline unsigned char get_bitmask(uint32_t pos)

 static int load_bloom_filter_from_graph(struct commit_graph *g,
 					struct bloom_filter *filter,
-					struct commit *c)
+					uint32_t graph_pos)
 {
 	uint32_t lex_pos, start_index, end_index;
-	uint32_t graph_pos = commit_graph_position(c);

 	while (graph_pos < g->num_commits_in_base)
 		g = g->base_graph;
@@ -203,9 +202,11 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
 	filter = bloom_filter_slab_at(&bloom_filters, c);

 	if (!filter->data) {
-		load_commit_graph_info(r, c);
-		if (commit_graph_position(c) != COMMIT_NOT_FROM_GRAPH)
-			load_bloom_filter_from_graph(r->objects->commit_graph, filter, c);
+		uint32_t pos;
+		if ((repo_find_commit_pos_in_graph(r, c, &pos)) &&
+		    (pos != COMMIT_NOT_FROM_GRAPH))
+			load_bloom_filter_from_graph(r->objects->commit_graph,
+						     filter, pos);
 	}

 	if (filter->data && filter->len)
diff --git a/commit-graph.c b/commit-graph.c
index 92d4503336..2d9fad5910 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -889,6 +889,14 @@ static int find_commit_pos_in_graph(struct commit *item, struct commit_graph *g,
 	}
 }

+int repo_find_commit_pos_in_graph(struct repository *r, struct commit *c,
+				  uint32_t *pos)
+{
+	if (!prepare_commit_graph(r))
+		return 0;
+	return find_commit_pos_in_graph(c, r->objects->commit_graph, pos);
+}
+
 struct commit *lookup_commit_in_graph(struct repository *repo, const struct object_id *id)
 {
 	struct commit *commit;
diff --git a/commit-graph.h b/commit-graph.h
index 2e3ac35237..52ddfbe349 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -40,6 +40,9 @@ int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
  */
 int parse_commit_in_graph(struct repository *r, struct commit *item);

+int repo_find_commit_pos_in_graph(struct repository *r, struct commit *c,
+				  uint32_t *pos);
+
 /*
  * Look up the given commit ID in the commit-graph. This will only return a
  * commit if the ID exists both in the graph and in the object database such

--- 8< ---

Thanks,
Taylor

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-07-05 22:28             ` Taylor Blau
@ 2022-07-06  8:52               ` Jeff King
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff King @ 2022-07-06  8:52 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Will Chandler, derrickstolee, git, ps

On Tue, Jul 05, 2022 at 06:28:47PM -0400, Taylor Blau wrote:

> > This re-writes the commit data to:
> >
> >   {
> >     graph_pos = 0,
> >     generation = 17631
> >   }
> 
> Nicely explained. To me, it seems like the problem is that we're reusing
> the same slab to:
> 
>   - store data that we're going to write as a part of commit-graph
>     generation, and
>   - store auxiliary data about commits that we have *read* from a
>     commit-graph
> 
> A complete fix might be to use a separate slab to store data that we
> read from data that we are about to write, to avoid polluting the latter
> with the former.

Yeah, that was my first instinct, too. But from a skim of the code, I
think that may be complicated, as the two uses don't seem to be
differentiated well. Or at least it resisted some basic hacky efforts I
made (e.g., splitting commit_graph_data_at() from the more limited
readers). I'm not that familiar with the commit-graph writing code; I
suspect Stolee might have something intelligent to say.

> In the meantime, a more minimal fix would be to avoid reading and
> overwriting the generation data where we can avoid it. AFAICT, this is
> the only spot that would need to be changed. The following patch does
> the trick for me:

Yeah, it fixes it for me, too, but it leaves me with many questions. ;)
Specifically:

  - if it's so easy to get the position, why do we have the position in
    the slab in the first place? Or is it for the cases where we're
    writing (though then the question is: why do we fill it for the read
    case, then)?

  - if I understand correctly, the real sin here was calling
    commit_graph_position() during the write. How can we find
    other potential problem spots? From my (again, admittedly not well
    developed) view of the code, it seems like an intentional choice
    that you could intermingle computed and from-disk results.

At any rate, thanks to both you and Will for moving this forward.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: commit-graph overflow generation chicken and egg
  2022-07-04 20:50         ` Derrick Stolee
  2022-07-05 21:03           ` Will Chandler
@ 2022-07-06  9:11           ` Jeff King
  1 sibling, 0 replies; 13+ messages in thread
From: Jeff King @ 2022-07-06  9:11 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Patrick Steinhardt, git

On Mon, Jul 04, 2022 at 02:50:57PM -0600, Derrick Stolee wrote:

> > + GIT_COMMITTER_DATE='1970-01-01T00:00:00 +0100'
> 
> Because of the "+0100" in the time zone, this date
> becomes a negative value. The commit-graph does not
> store dates with more than 34 bits (and Git does
> not handle negative timestamps very well? Peff can
> clarify here).

I don't think we handle them at all. Certainly we'll refuse to parse
any, but here we get tricked by the timezone, and end up with the
underflow. Git writes 2^64 - 3600 into the commit object with this one.

The code in gm_time_t() tries to catch this (as well as overflow), but
this one gets handled by parse_date_basic(), which doesn't. Adding this
causes it barf:

diff --git a/date.c b/date.c
index 68a260c214..07a60fee35 100644
--- a/date.c
+++ b/date.c
@@ -893,8 +893,11 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset)
 		}
 	}
 
-	if (!tm_gmt)
+	if (!tm_gmt) {
+		if (*timestamp < *offset * 60)
+			return -1;
 		*timestamp -= *offset * 60;
+	}
 	return 0; /* success */
 }
 

We should probably check for overflow there, as well, though I think in
practice it would get caught earlier. The timestamp variable comes from
tm_to_time_t(), which already has some limits.

Of course the right fix is for us to handle negative timestamps. I have
some patches working towards that, but...

> The commit-graph could certainly warn better here to
> say we do not have enough date bits to store this
> timestamp (the same would happen with a date beyond
> 2138 or something like that).

Right. By the time it gets to the commit-graph, it's a gigantic positive
time. But either way, yes, it should realize it's not representable and
omit the commit from the graph file.

That's something I haven't looked at for my negative timestamp patches
at all. I think it's OK for the commit-graph to simply punt on them, as
long as it doesn't die or produce wrong answers.

-Peff

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-07-06  9:14 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-08 19:33 commit-graph overflow generation chicken and egg Jeff King
2022-06-08 20:08 ` Derrick Stolee
2022-06-08 23:17   ` Jeff King
2022-07-01 12:06     ` Patrick Steinhardt
2022-07-04 10:46       ` Patrick Steinhardt
2022-07-04 20:50         ` Derrick Stolee
2022-07-05 21:03           ` Will Chandler
2022-07-05 22:28             ` Taylor Blau
2022-07-06  8:52               ` Jeff King
2022-07-06  9:11           ` Jeff King
2022-06-09  7:49   ` Ævar Arnfjörð Bjarmason
2022-06-09 15:26     ` Jeff King
2022-06-09 15:39       ` Derrick Stolee

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).