git@vger.kernel.org mailing list mirror (one of many)
From: "Alex Bennée" <kernel-hacker@bennee.com>
To: Thomas Rast <trast@inf.ethz.ch>
Cc: John Keeping <john@keeping.me.uk>,
	Ramkumar Ramachandra <artagnon@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Poor performance of git describe in big repos
Date: Fri, 31 May 2013 09:40:01 +0100
Message-ID: <CAJ-05NOdg5TvjzEMrXaPgogU5z5W6kywZhD-82eTUmvE9Hp=Lw@mail.gmail.com>
In-Reply-To: <87obbr5zg3.fsf@linux-k42r.v.cablecom.net>

On 31 May 2013 09:24, Thomas Rast <trast@inf.ethz.ch> wrote:
> Alex Bennée <kernel-hacker@bennee.com> writes:
>> On 30 May 2013 20:30, John Keeping <john@keeping.me.uk> wrote:
>>> On Thu, May 30, 2013 at 06:21:55PM +0200, Thomas Rast wrote:
>>>> Alex Bennée <kernel-hacker@bennee.com> writes:
>>>> > On 30 May 2013 16:33, Thomas Rast <trast@inf.ethz.ch> wrote:
> <snip>
>>>> No, my theory is that you tagged *the blobs*.  Git supports this.
>>
>> Wait, is this the difference between annotated and non-annotated tags?
>> I thought a non-annotated tag just acted like a reference to a
>> particular tree state?
>
> A tag is just a ref.  It can point at anything, in particular also a
> blob (= some file *contents*).
>
> An annotated tag is just a tag pointing at a "tag object".  A tag object
> contains tagger name/email/date, a reference to an object, and a tag
> message.
>
> The slowness I found relates to having tags that point at blobs directly
> (unannotated).
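
To make the distinction above concrete, here is a minimal sketch. It
assumes a throwaway repository created with mktemp; the tag names
(light-commit, light-blob, annotated-commit), the file name and the
identity settings are made up for illustration. It uses
git cat-file -t to show what type of object each tag resolves to.

```shell
#!/bin/sh
# Sketch: lightweight vs. annotated tags, and a tag pointing at a blob.
# Everything happens in a scratch repository; all names are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "example@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "initial commit"

# Object name of the blob that holds the contents of file.txt.
blob=$(git rev-parse HEAD:file.txt)

git tag light-commit HEAD                        # ref -> commit
git tag light-blob "$blob"                       # ref -> blob (file contents)
git tag -a -m "annotated" annotated-commit HEAD  # ref -> tag object -> commit

# Ask git what kind of object each tag name resolves to.
light_commit_type=$(git cat-file -t light-commit)
light_blob_type=$(git cat-file -t light-blob)
annotated_type=$(git cat-file -t annotated-commit)
# The ^{} suffix peels the tag object down to what it references.
peeled_type=$(git cat-file -t 'annotated-commit^{}')

echo "light-commit:        $light_commit_type"
echo "light-blob:          $light_blob_type"
echo "annotated-commit:    $annotated_type"
echo "annotated-commit^{}: $peeled_type"
```

The lightweight tags report the type of whatever they point at (commit
or blob), while the annotated tag reports "tag" until it is peeled with
^{}.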

I think you are right. I was brave (well I assumed the tags would come
back from the upstream repo) and ran:

git for-each-ref | grep "refs/tags" | grep "commit" | cut -d '/' -f 3 |
  xargs git tag -d

And boom:

09:19 ajb@sloy/x86_64 [work.git] >time /usr/bin/git --no-pager describe --long --tags
ajb-build-test-5225-2-gdc0b771

real    0m0.009s
user    0m0.008s
sys     0m0.000s

Which is much better performance. So it does look like unannotated
tags pointing at binary blobs are the failure case.

<snip>
>
> I would be more interested in this:
>
>   git for-each-ref | grep ' blob'

Hmmm, that gives nothing. All the refs are either tag or commit.

> and
>
>   (git for-each-ref | grep ' blob' | cut -d\  -f1 |
>    xargs -n1 git cat-file blob) | wc -c

However, I have some big commits, it seems:

09:37 ajb@sloy/x86_64 [work.git] >(git for-each-ref | grep ' commit' |
cut -d\  -f1 | xargs -n1 git cat-file commit) | wc -c
1147231984

>
> The first tells you if you have any refs pointing at blobs.  The second
> computes their total unpacked size.  My theory is that the second yields
> some large number (hundreds of megabytes at least).
>
> It would be nice if you checked, because if there turn out to be big
> blobs, we have all the pieces and just need to assemble the best
> solution.  Otherwise, there's something else going on and the problem
> remains open.
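
The two checks above can be folded into one helper. This is a rough
sketch rather than the commands from the thread: the function name
total_ref_object_size is hypothetical, and it reads each object's size
with git cat-file -s instead of piping the contents through wc -c.

```shell
#!/bin/sh
# total_ref_object_size TYPE
# Print the combined size in bytes of every object of the given type
# (blob, commit, tag or tree) that a ref points at directly.
# Hypothetical helper; run it from inside the repository to inspect.
total_ref_object_size() {
    want=$1
    # One line per ref: "<object type> <object name>".
    git for-each-ref --format='%(objecttype) %(objectname)' |
    {
        total=0
        while read -r type oid; do
            if [ "$type" = "$want" ]; then
                # cat-file -s reports the object's size without
                # writing its contents to the pipe.
                size=$(git cat-file -s "$oid")
                total=$((total + size))
            fi
        done
        echo "$total"
    }
}
```

Running "total_ref_object_size blob" prints 0 when no ref points
straight at a blob. Refs that reach a blob through an annotated tag are
listed with type "tag", so this sketch does not count them.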

If you want any other numbers I'm only too happy to help. Sorry I
can't share the repo though...

-- 
Alex, homepage: http://www.bennee.com/~alex/
