git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Mike Hommey <mh@glandium.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: suboptimal behavior of fast-import in some cases with "from"
Date: Thu, 9 Jul 2015 14:03:15 +0900	[thread overview]
Message-ID: <20150709050314.GA6013@glandium.org> (raw)
In-Reply-To: <xmqq4mlgzyl0.fsf@gitster.dls.corp.google.com>

On Mon, Jul 06, 2015 at 03:54:35PM -0700, Junio C Hamano wrote:
> Mike Hommey <mh@glandium.org> writes:
> 
> > One of the first things parse_from does is unconditionally throw away
> > the tree for the given branch, and then the "from" tree is loaded. So
> > when the "from" commit is the current head of the branch, that make
> > fast-import do more work than necessary.
> 
> If it is very common that the next commit the input stream wants to
> create is often on top of the commit that was just created, and if
> it is very common that the input stream producer knows what the
> commit object name of the commit that was just created, then
> optimising for that case does not sound too bad.  It really depends
> on two factors:
> 
>  - How likely is it that other people make the same mistake?

Looks like the answer is: very. Assuming my quick glance at the code is
not mistaken, the same mistake is made in at least git-p4.py (yes, the
one that comes with git), felipec's git-remote-hg, and hg-fast-export,
and that's 100% of the sample I looked at.

I won't claim to know what fast-import is doing, not having looked at
more than the parse_from* functions and the commit message for 4cabf858,
but it seems plausible this also skips making tree deltas for those
trees.

>  - How bad the damage to parse_from() would be if we wanted to
>    optimize for this case?

I /think/ it would look like this (untested), which doesn't seem too
damaging:

diff --git a/fast-import.c b/fast-import.c
index e78ca10..cb232e0 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2588,14 +2588,12 @@ static int parse_from(struct branch *b)
 {
        const char *from;
        struct branch *s;
+       unsigned char sha1[20];
 
        if (!skip_prefix(command_buf.buf, "from ", &from))
                return 0;
 
-       if (b->branch_tree.tree) {
-               release_tree_content_recursive(b->branch_tree.tree);
-               b->branch_tree.tree = NULL;
-       }
+       hashcpy(sha1, b->branch_tree.versions[1].sha1);
 
        s = lookup_branch(from);
        if (b == s)
@@ -2626,6 +2624,11 @@ static int parse_from(struct branch *b)
        else
                die("Invalid ref name or SHA1 expression: %s", from);
 
+       if (b->branch_tree.tree && hashcmp(sha1, b->branch_tree.versions[1].sha1)) {
+               release_tree_content_recursive(b->branch_tree.tree);
+               b->branch_tree.tree = NULL;
+       }
+
        read_next_command();
        return 1;
 }

Mike

  reply	other threads:[~2015-07-09  5:03 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-06 22:07 suboptimal behavior of fast-import in some cases with "from" Mike Hommey
2015-07-06 22:54 ` Junio C Hamano
2015-07-09  5:03   ` Mike Hommey [this message]
2015-07-09  5:52     ` Mike Hommey
2015-07-09  6:50       ` [PATCH] fast-import: Do less work when given "from" matches current branch head Mike Hommey
2015-07-09 20:37         ` Junio C Hamano
2015-07-09 22:30           ` Mike Hommey
2015-07-09 22:44             ` Junio C Hamano
2015-07-09 22:47               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150709050314.GA6013@glandium.org \
    --to=mh@glandium.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).