git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Mike Hommey <mh@glandium.org>
Cc: git@vger.kernel.org, Shawn Pearce <spearce@spearce.org>,
	Jonathan Nieder <jrnieder@gmail.com>, Jeff King <peff@peff.net>
Subject: Re: [PATCH] fast-import: Do less work when given "from" matches current branch head
Date: Thu, 09 Jul 2015 13:37:02 -0700	[thread overview]
Message-ID: <xmqqk2u9uky9.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <1436424609-26159-1-git-send-email-mh@glandium.org> (Mike Hommey's message of "Thu, 9 Jul 2015 15:50:09 +0900")

Mike Hommey <mh@glandium.org> writes:

Cc'ed a few people who appear at the top of "shortlog --no-merges";
I think the end result is not incorrect, but I want to hear second
opinions on this one.  I do not know Shawn still remembers this
code, but what is under discussion seems to have come mostly from
ea5e370a (fast-import: Support reusing 'from' and brown paper bag
fix reset., 2007-02-12).

>  	if (!skip_prefix(command_buf.buf, "from ", &from))
>  		return 0;
>  
> -	if (b->branch_tree.tree) {
> -		release_tree_content_recursive(b->branch_tree.tree);
> -		b->branch_tree.tree = NULL;
> -	}
> +	hashcpy(sha1, b->branch_tree.versions[1].sha1);
>  
>  	s = lookup_branch(from);
>  	if (b == s)

The part that deals with a branch that is different from the current
one is not visible in the context (i.e. when s = lookup_branch(from)
returned a non-NULL result that is different from b) but it used to,
and continues to with this patch, copy sha1 from branch_tree.sha1
and branch_tree.versions[] from sha1 and branch_tree.versions[1] of
the specified branch.

That codepath used to release the contents of branch_tree.tree when
it did so, but it no longer does so after this patch because of the
removal we see above.

Does that mean the original code was doing a release that was
unnecessary?  Or does it mean this patch changes what happens on
that codepath, namely (1) leaking resource, and/or (2) keeping a
tree of the original 'b' that does not have anything to do with the
tree of 's', preventing the later lazy-load code from reading the
tree of 's' and instead of building on top of a wrong tree content?

... me goes and reads on ...

> @@ -2610,14 +2608,16 @@ static int parse_from(struct branch *b)
>  		struct object_entry *oe = find_mark(idnum);
>  		if (oe->type != OBJ_COMMIT)
>  			die("Mark :%" PRIuMAX " not a commit", idnum);
> -		hashcpy(b->sha1, oe->idx.sha1);
> -		if (oe->pack_id != MAX_PACK_ID) {
> -			unsigned long size;
> -			char *buf = gfi_unpack_entry(oe, &size);
> -			parse_from_commit(b, buf, size);
> -			free(buf);
> -		} else
> -			parse_from_existing(b);
> +		if (hashcmp(b->sha1, oe->idx.sha1)) {
> +			hashcpy(b->sha1, oe->idx.sha1);
> +			if (oe->pack_id != MAX_PACK_ID) {
> +				unsigned long size;
> +				char *buf = gfi_unpack_entry(oe, &size);
> +				parse_from_commit(b, buf, size);
> +				free(buf);
> +			} else
> +				parse_from_existing(b);
> +		}
>  	} else if (!get_sha1(from, b->sha1)) {
>  		parse_from_existing(b);
>  		if (is_null_sha1(b->sha1))

This part is straight-forward.

> @@ -2626,6 +2626,11 @@ static int parse_from(struct branch *b)
>  	else
>  		die("Invalid ref name or SHA1 expression: %s", from);
>  
> +	if (b->branch_tree.tree && hashcmp(sha1, b->branch_tree.versions[1].sha1)) {
> +		release_tree_content_recursive(b->branch_tree.tree);
> +		b->branch_tree.tree = NULL;
> +	}
> +

This looks like an attempt to compensate for that "what happens if
(s != NULL && s != b)?" issue, and also for the surviving codepaths.

As both parse_from_commit() and parse_from_existing() only touch
branch_tree.versions[] and they do not seem to get affected if
b->branch_tree.tree holds a stale and unrelated content, this looks
OK to me from a cursory reading, but it does make me feel dirty that
it has to put *b temporarily into an inconsistent state.

  reply	other threads:[~2015-07-09 20:37 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-06 22:07 suboptimal behavior of fast-import in some cases with "from" Mike Hommey
2015-07-06 22:54 ` Junio C Hamano
2015-07-09  5:03   ` Mike Hommey
2015-07-09  5:52     ` Mike Hommey
2015-07-09  6:50       ` [PATCH] fast-import: Do less work when given "from" matches current branch head Mike Hommey
2015-07-09 20:37         ` Junio C Hamano [this message]
2015-07-09 22:30           ` Mike Hommey
2015-07-09 22:44             ` Junio C Hamano
2015-07-09 22:47               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqk2u9uky9.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    --cc=mh@glandium.org \
    --cc=peff@peff.net \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).