git@vger.kernel.org mailing list mirror (one of many)
From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: Derrick Stolee <stolee@gmail.com>, Taylor Blau <me@ttaylorr.com>,
	Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, dstolee@microsoft.com
Subject: Re: [PATCH 1/1] commit-graph.c: avoid unnecessary tag dereference when merging
Date: Sun, 22 Mar 2020 09:47:49 -0600
Message-ID: <20200322154749.GB53402@syl.local>
In-Reply-To: <20200322060434.GC578498@coredump.intra.peff.net>

On Sun, Mar 22, 2020 at 02:04:34AM -0400, Jeff King wrote:
> On Sun, Mar 22, 2020 at 01:49:16AM -0400, Jeff King wrote:
>
> > [1] I'm actually not quite sure about correctness here. It should be
> >     fine to generate a graph file without any given commit; readers will
> >     just have to load that commit the old-fashioned way. But at this
> >     phase of "commit-graph write", I think we'll already have done the
> >     close_reachable() check. What does it mean to throw away a commit at
> >     this stage? If we're the parent of another commit, then it will have
> >     trouble referring to us by a uint32_t. Will the actual writing phase
> >     barf, or will we generate an invalid graph file?
>
> It doesn't seem great. If I instrument Git like this to simulate an
> object temporarily "missing" (if it were really missing the whole repo
> would be corrupt; we're trying to see what would happen if a race causes
> us to momentarily not see it):

This is definitely a problem with or without this patch, as demonstrated
by the fact that you applied your changes without my patch on top (and
my patch doesn't change anything substantial in this area, such as
removing the 'continue' statement).

Should we address this before moving on with my patch? I think that we
*could*, but I'd rather go forward with what we have for now, since it's
only improving the situation, and not introducing a new bug.

> diff --git a/commit-graph.c b/commit-graph.c
> index 3da52847e4..71419c2532 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1596,6 +1596,19 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
>  	}
>  }
>
> +static int pretend_commit_is_missing(const struct object_id *oid)
> +{
> +	static int initialized;
> +	static struct object_id missing;
> +	if (!initialized) {
> +		const char *x = getenv("PRETEND_COMMIT_IS_MISSING");
> +		if (x)
> +			get_oid_hex(x, &missing);
> +		initialized = 1;
> +	}
> +	return oideq(&missing, oid);
> +}
> +
>  static void merge_commit_graph(struct write_commit_graph_context *ctx,
>  			       struct commit_graph *g)
>  {
> @@ -1612,6 +1625,11 @@ static void merge_commit_graph(struct write_commit_graph_context *ctx,
>
>  		load_oid_from_graph(g, i + offset, &oid);
>
> +		if (pretend_commit_is_missing(&oid)) {
> +			warning("pretending %s is missing", oid_to_hex(&oid));
> +			continue;
> +		}
> +
>  		/* only add commits if they still exist in the repo */
>  		result = lookup_commit_reference_gently(ctx->r, &oid, 1);
>
>
> and then I make a fully-graphed repo like this:
>
>   git init repo
>   cd repo
>   for i in $(seq 10); do
>     git commit --allow-empty -m $i
>   done
>   git commit-graph write --input=reachable --split=no-merge
>
> if we pretend a parent is missing, I get a BUG():
>
>   $ git rev-parse HEAD |
>     PRETEND_COMMIT_IS_MISSING=$(git rev-parse HEAD^) \
>     git commit-graph write --stdin-commits --split=merge-all
>   warning: pretending 35e6e15c738cf2bfbe495957b2a941c2efe86dd9 is missing
>   BUG: commit-graph.c:879: missing parent 35e6e15c738cf2bfbe495957b2a941c2efe86dd9 for commit d4141fb57a9bbe26b247f23c790d63d078977833
>   Aborted
>
> So it seems like just skipping here (either with the new patch or
> without) isn't really a good strategy.
>
> -Peff
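
To make the failure mode above concrete, here is a minimal sketch of why
dropping a commit during the merge is a problem once reachability has
already been checked. It is not Git's code: 'struct toy_commit',
'find_position()', 'merge_skipping()' and 'count_dangling_parents()' are
hypothetical names invented for illustration, and the linear search
stands in for whatever lookup the real writer uses. The only point is
that a parent has to be expressible as a uint32_t position in the list
being written, and a skipped commit has no such position.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NO_POSITION UINT32_MAX

/* Hypothetical, simplified commit: an id and at most one parent id. */
struct toy_commit {
	const char *id;
	const char *parent; /* NULL for a root commit */
};

/* Return the position of 'id' in 'list', or NO_POSITION if it is absent. */
static uint32_t find_position(const struct toy_commit *list, uint32_t nr,
			      const char *id)
{
	uint32_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(list[i].id, id))
			return i;
	return NO_POSITION;
}

/*
 * Copy 'src' into 'dst', dropping the commit named 'skip' (if any), much
 * like a 'continue' in a merge loop would. Returns the number of commits
 * copied. 'dst' must have room for 'nr' entries.
 */
static uint32_t merge_skipping(const struct toy_commit *src, uint32_t nr,
			       const char *skip, struct toy_commit *dst)
{
	uint32_t i, out = 0;

	assert(src && dst);
	for (i = 0; i < nr; i++) {
		if (skip && !strcmp(src[i].id, skip))
			continue; /* the commit silently disappears here */
		dst[out++] = src[i];
	}
	return out;
}

/* Count commits whose parent can no longer be written as a position. */
static uint32_t count_dangling_parents(const struct toy_commit *list,
				       uint32_t nr)
{
	uint32_t i, dangling = 0;

	for (i = 0; i < nr; i++) {
		if (!list[i].parent)
			continue;
		if (find_position(list, nr, list[i].parent) == NO_POSITION)
			dangling++;
	}
	return dangling;
}
```

With a three-commit chain A <- B <- C, merging without skipping anything
leaves no dangling parents, while skipping B leaves C pointing at a
parent that has no position in the merged list. In this sketch that is
only counted; a writer that hits the same situation has to either abort
or emit a file with a bad parent reference.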

Thanks,
Taylor
