git@vger.kernel.org mailing list mirror (one of many)
From: Thomas Rast <trast@inf.ethz.ch>
To: "Alex Bennée" <kernel-hacker@bennee.com>
Cc: John Keeping <john@keeping.me.uk>,
	Ramkumar Ramachandra <artagnon@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Poor performance of git describe in big repos
Date: Fri, 31 May 2013 10:24:44 +0200
Message-ID: <87obbr5zg3.fsf@linux-k42r.v.cablecom.net>
In-Reply-To: <CAJ-05NOEuxOVy7LFp_XRa_08G-Mj0x7q+RiR=u71-iyfOXpHow@mail.gmail.com> ("Alex Bennée"'s message of "Fri, 31 May 2013 09:14:49 +0100")

Alex Bennée <kernel-hacker@bennee.com> writes:

> On 30 May 2013 20:30, John Keeping <john@keeping.me.uk> wrote:
>> On Thu, May 30, 2013 at 06:21:55PM +0200, Thomas Rast wrote:
>>> Alex Bennée <kernel-hacker@bennee.com> writes:
>>>
>>> > On 30 May 2013 16:33, Thomas Rast <trast@inf.ethz.ch> wrote:
>>> >> Alex Bennée <kernel-hacker@bennee.com> writes:
>> <snip>
>>> > Will it be loading the blob for every commit it traverses or just ones that hit
>>> > a tag? Why does it need to load the blob at all? Surely the commit tree state
>>> > doesn't need to be walked down?
>>>
>>> No, my theory is that you tagged *the blobs*.  Git supports this.
>
> Wait is this the difference between annotated and non-annotated tags?
> I thought a non-annotated just acted like references to a particular
> tree state?

A tag is just a ref.  It can point at anything, in particular also a
blob (= some file *contents*).

An annotated tag is just a tag pointing at a "tag object".  A tag object
contains tagger name/email/date, a reference to an object, and a tag
message.

The slowness I found relates to having tags that point at blobs directly
(unannotated).
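
To make the distinction concrete, here is a minimal sketch in a throwaway
repository.  The names (file.txt, blob-tag, v1.0) and the identity settings
are made up for illustration and are not taken from the repository under
discussion.

```shell
# Sketch only: everything happens in a scratch repository under a temp dir.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
echo 'some file contents' >file.txt
git add file.txt
git commit -q -m 'initial commit'

# Unannotated tag pointing straight at a blob (the contents of file.txt).
blob=$(git rev-parse HEAD:file.txt)
git tag blob-tag "$blob"

# Annotated tag: the ref points at a tag object, which points at the commit.
git tag -a -m 'release 1.0' v1.0 HEAD

git cat-file -t refs/tags/blob-tag     # type of the object the ref points at
git cat-file -t refs/tags/v1.0         # the tag object itself
git cat-file -t 'refs/tags/v1.0^{}'    # what the tag object peels to
```

The three cat-file calls should report blob, tag and commit respectively.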

>> You can see if that is the case by doing something like this:
>>
>>     eval $(git for-each-ref --shell --format '
>>         test $(git cat-file -t %(objectname)^{}) = commit ||
>>         echo %(refname);')
>>
>> That will print out the name of any ref that doesn't point at a
>> commit.
>
> Hmm that didn't seem to work. But looking at the output by hand I
> certainly have a mix of tags that are commits vs tags:
>
>
> 09:08 ajb@sloy/x86_64 [work.git] >git for-each-ref | grep "refs/tags" | grep "commit" | wc -l
> 1345
> 09:12 ajb@sloy/x86_64 [work.git] >git for-each-ref | grep "refs/tags" | grep -v "commit" | wc -l
> 66
>
> Unfortunately I can't just delete those tags as they do refer to known
> releases which we obviously care about. If I delete the tags on my
> local repo and test for a speed increase can I re-create them as
> annotated tag objects?

I would be more interested in this:

  git for-each-ref | grep ' blob'

and

  (git for-each-ref | grep ' blob' | cut -d\  -f1 | xargs -n1 git cat-file blob) | wc -c

The first tells you if you have any refs pointing at blobs.  The second
computes their total unpacked size.  My theory is that the second yields
some large number (hundreds of megabytes at least).
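
If a per-ref breakdown is more useful than a single total, something along
these lines might work as well.  It is only a sketch: the function name
blob_ref_sizes is made up, and it relies on the %(objecttype) and
%(objectsize) fields of for-each-ref rather than on cat-file, so it reports
sizes without writing the blob contents to a pipe.

```shell
# Hypothetical helper: list every ref that points directly at a blob,
# with the blob's size in bytes, followed by the total.
blob_ref_sizes () {
	git for-each-ref --format='%(objecttype) %(objectsize) %(refname)' |
	awk '
		$1 == "blob" { total += $2; print $2, $3 }
		END { print "total:", total + 0 }
	'
}
```

Run blob_ref_sizes from inside the repository in question; a total of 0
would mean no ref points directly at a blob.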

It would be nice if you checked, because if there turn out to be big
blobs, we have all the pieces and just need to assemble the best
solution.  Otherwise, there's something else going on and the problem
remains open.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
