git@vger.kernel.org mailing list mirror (one of many)
* Git performance problems with many tags
@ 2007-03-26  4:53 Tim Allen
  2007-03-26  6:07 ` Shawn O. Pearce
  2007-03-26  6:13 ` Junio C Hamano
  0 siblings, 2 replies; 3+ messages in thread
From: Tim Allen @ 2007-03-26  4:53 UTC
  To: git

I'm not subscribed, please Cc me on replies.

My company is considering switching from CVS to a more branch-friendly
version-control tool, and so of course we've been playing with git.
We imported our CVS repository into git with git-cvsimport, which worked
well enough, and resulted in a tree about the same size as the official
kernel repository: 454121 objects, 334977 deltas.

However, operations like 'git-fetch' take much, much longer in our
repository than in the kernel repository: a git-fetch that pulls no
updates in the kernel repository takes 1.7s, while our repository
(fetching from one repository to a clone on the same local disk) takes
about 20 seconds. After some experimentation, we discovered that
deleting all the 5557 imported CVS tags made things fast again.
(Interestingly, "git-fetch --no-tags" was not appreciably quicker, while
the tags were still around)

I searched the mailing list archives for similar problems, and the
closest thread I could find was this one:

    http://thread.gmane.org/gmane.comp.version-control.git/20682/

...however, that thread seems to have decided that large numbers of
binary files were the problem, which is not the case in our repository.

Does git have known scalability problems with large numbers of tags? Is
there anything we can do to mitigate this slowdown, apart from just not
using git's tag feature at all? Are there any details I've overlooked or
misunderstood?

Tim Allen
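
A minimal Python sketch of how the measurements above could be reproduced, assuming `git` is on the PATH and Python 3.7 or later is available. The function names and the repository path in the usage note are hypothetical and not part of the original report; only `git for-each-ref` and `git fetch` are real commands.

```python
import subprocess
import time


def count_refs(listing, prefix="refs/tags/"):
    """Count ref names in `listing` (one per line) that start with `prefix`."""
    return sum(
        1 for line in listing.splitlines() if line.strip().startswith(prefix)
    )


def list_refs(repo):
    """Return the ref names of the repository at `repo`, one per line."""
    result = subprocess.run(
        ["git", "for-each-ref", "--format=%(refname)"],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout


def time_fetch(repo, *extra_args):
    """Run `git fetch` inside `repo` and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        ["git", "fetch", *extra_args],
        cwd=repo, check=True, capture_output=True,
    )
    return time.perf_counter() - start
```

For a clone at a hypothetical path, `count_refs(list_refs("/path/to/clone"))` gives the number of tags, and comparing `time_fetch("/path/to/clone")` with `time_fetch("/path/to/clone", "--no-tags")` repeats the comparison described in the message.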

* Re: Git performance problems with many tags
From: Shawn O. Pearce @ 2007-03-26  6:07 UTC
  To: Tim Allen; +Cc: git

Tim Allen <tim@commsecure.com.au> wrote:
> However, operations like 'git-fetch' take much, much longer in our
> repository than in the kernel repository: a git-fetch that pulls no
> updates in the kernel repository takes 1.7s, while our repository
> (fetching from one repository to a clone on the same local disk) takes
> about 20 seconds. After some experimentation, we discovered that
> deleting all the 5557 imported CVS tags made things fast again.

Yes.  git-fetch in the current stable versions is a Bourne shell
script.  It's not very fast as it loops through the refs (tags)
that the two ends have.

There is work under development (and being tested) that improves
this by converting some of the critical parts to C.
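
As a rough illustration of why per-ref work in a loop scales badly with thousands of tags, here is a Python sketch. It is not the code of git-fetch, and the real script's bottleneck may well be different (for example, the cost of running external commands once per ref). Both functions and the matching-by-scan strategy are hypothetical; they only show how the work grows with the number of refs on the two ends.

```python
def match_refs_by_scan(local_refs, remote_refs):
    """Pair up refs by scanning the remote list once per local ref.

    Returns (matches, comparisons) so the amount of work is visible.
    """
    matches = []
    comparisons = 0
    for name in local_refs:
        for other in remote_refs:
            comparisons += 1
            if name == other:
                matches.append(name)
                break
    return matches, comparisons


def match_refs_by_lookup(local_refs, remote_refs):
    """Pair up refs using a set, so each local ref costs one lookup.

    Returns (matches, lookups).
    """
    remote = set(remote_refs)
    matches = [name for name in local_refs if name in remote]
    return matches, len(local_refs)
```

With two identical lists of 5557 tag names, the scanning version performs about 15 million comparisons, while the lookup version performs 5557 set lookups. Under these assumptions, that is the kind of gap that moving the hot loop out of a shell script can close.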

> (Interestingly, "git-fetch --no-tags" was not appreciably quicker, while
> the tags were still around)

Yes, because that switch didn't bypass the section of the fetch code
where the slowdown is occurring.

> Does git have known scalability problems with large numbers of tags?

Yup.

> Is
> there anything we can do to mitigate this slowdown, apart from just not
> using git's tag feature at all?

Upgrade to the upcoming 1.5.1 release.  Junio recently tagged
1.5.1-rc1.  You can get it by cloning git.git and building the
'master' branch.  It is quite stable.  I would encourage you to
give it a try.

-- 
Shawn.

* Re: Git performance problems with many tags
From: Junio C Hamano @ 2007-03-26  6:13 UTC
  To: Tim Allen; +Cc: git

Tim Allen <tim@commsecure.com.au> writes:

> However, operations like 'git-fetch' take much, much longer in our
> repository than in the kernel repository: a git-fetch that pulls no
> updates in the kernel repository takes 1.7s, while our repository
> (fetching from one repository to a clone on the same local disk) takes
> about 20 seconds.
> ...
> Does git have known scalability problems with large numbers of tags? Is
> there anything we can do to mitigate this slowdown, apart from just not
> using git's tag feature at all? Are there any details I've overlooked or
> misunderstood?

An optimization for the "fetching between repositories with an insane
number of refs" problem was merged after v1.5.0, so if you try
the current tip of 'master' (which happens to be at v1.5.1-rc2
right now), I suspect you may see significant improvements.
