git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Thomas Rast <trast@inf.ethz.ch>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Ramkumar Ramachandra" <artagnon@gmail.com>,
	"Alex Bennée" <kernel-hacker@bennee.com>,
	"Antoine Pelisse" <apelisse@gmail.com>,
	"John Keeping" <john@keeping.me.uk>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [PATCH 2/2] lookup_commit_reference_gently: do not read non-{tag,commit}
Date: Fri, 31 May 2013 10:08:06 +0200	[thread overview]
Message-ID: <87sj138tcp.fsf@linux-k42r.v.cablecom.net> (raw)
In-Reply-To: <20130530212223.GA2135@sigill.intra.peff.net> (Jeff King's message of "Thu, 30 May 2013 17:22:23 -0400")

Jeff King <peff@peff.net> writes:

> On Thu, May 30, 2013 at 10:00:23PM +0200, Thomas Rast wrote:
>
>> lookup_commit_reference_gently unconditionally parses the object given
>> to it.  This slows down git-describe a lot if you have a repository
>> with large tagged blobs in it: parse_object() will read the entire
>> blob and verify that its sha1 matches, only to then throw it away.
>> 
>> Speed it up by checking the type with sha1_object_info() prior to
>> unpacking.
>
> This would speed up the case where we do not end up looking at the
> object at all, but it will slow down the (presumably common) case where
> we will in fact find a commit and end up parsing the object anyway.
>
> Have you measured the impact of this on normal operations? During a
> traversal, we spend a measurable amount of time looking up commits in
> packfiles, and this would presumably double it.

I don't think so, but admittedly I didn't measure it.

The reason why it's unlikely is that this is specific to
lookup_commit_reference_gently, which according to some grepping is
usually done on refs or values that refs might have; e.g. on the old&new
sides of a fetch in remote.c, or in many places in the callback of some
variant of for_each_ref.

Of course if you have a ridiculously large number of refs (and I gather
_you_ do), this will hurt somewhat in the usual case, but speed up the
case where there is a ref (usually a lightweight tag) directly pointing
at a large blob.

I'm not sure this can be fixed without the change you outline here:

> This is not the first time I have seen this tradeoff in git.  It would
> be nice if our object access was structured to do incremental
> examination of the objects (i.e., store the packfile index lookup or
> partial unpack of a loose object header, and then use that to complete
> the next step of actually getting the contents).

But in any case I see the point, I should try and gather some
performance numbers.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

  parent reply	other threads:[~2013-05-31  8:08 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-30 10:38 Poor performance of git describe in big repos Alex Bennée
2013-05-30 11:33 ` Ramkumar Ramachandra
2013-05-30 13:09   ` Alex Bennée
2013-05-30 14:32     ` Ramkumar Ramachandra
2013-05-30 15:01       ` Alex Bennée
2013-05-30 15:17         ` Ramkumar Ramachandra
2013-05-30 15:33     ` Thomas Rast
2013-05-30 16:01       ` Alex Bennée
2013-05-30 16:21         ` Thomas Rast
2013-05-30 16:44           ` Thomas Rast
2013-05-30 19:01             ` Antoine Pelisse
2013-05-30 20:00             ` [PATCH 1/2] sha1_file: silence sha1_loose_object_info Thomas Rast
2013-05-30 20:00               ` [PATCH 2/2] lookup_commit_reference_gently: do not read non-{tag,commit} Thomas Rast
2013-05-30 21:22                 ` Jeff King
2013-05-31  0:52                   ` Duy Nguyen
2013-05-31  8:08                   ` Thomas Rast [this message]
2013-05-31 16:00                     ` Jeff King
2013-05-31  6:43                 ` Ramkumar Ramachandra
2013-05-31  8:16                   ` Thomas Rast
2013-05-30 19:30           ` Poor performance of git describe in big repos John Keeping
2013-05-31  8:14             ` Alex Bennée
2013-05-31  8:24               ` Thomas Rast
2013-05-31  8:40                 ` Alex Bennée
2013-05-31  8:46                   ` Thomas Rast
2013-05-31  9:57                     ` Alex Bennée
2013-06-03  8:02                       ` Alex Bennée
2013-06-03 16:32                         ` Junio C Hamano
2013-06-03 17:48                           ` Junio C Hamano
2013-05-31 10:27                     ` Thomas Rast
2013-05-31 16:17                       ` Jeff King
2013-06-03  8:39                         ` Alex Bennée
2013-06-03 14:49                           ` Jeff King
2013-05-31  8:32               ` John Keeping
2013-05-31  8:49                 ` Alex Bennée
2013-05-31  8:59                   ` John Keeping
2013-05-30 11:48 ` John Keeping
2013-05-30 12:29   ` Alex Bennée
2013-05-30 13:20     ` Duy Nguyen
     [not found]       ` <CAJ-05NPacjAEC99Ntd9eMnTD9_PMMYFob-_tAx5CeSB79TkRSg@mail.gmail.com>
2013-05-30 13:45         ` Duy Nguyen
2013-05-30 14:02           ` Alex Bennée
2013-05-30 13:16   ` Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sj138tcp.fsf@linux-k42r.v.cablecom.net \
    --to=trast@inf.ethz.ch \
    --cc=apelisse@gmail.com \
    --cc=artagnon@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=john@keeping.me.uk \
    --cc=kernel-hacker@bennee.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).