git@vger.kernel.org mailing list mirror (one of many)
From: "Shawn O. Pearce" <spearce@spearce.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: git@vger.kernel.org
Subject: Re: q: git-fetch a tad slow?
Date: Tue, 29 Jul 2008 21:48:55 -0700
Message-ID: <20080730044855.GA7225@spearce.org>
In-Reply-To: <20080729090802.GA11373@elte.hu>

Ingo Molnar <mingo@elte.hu> wrote:
> * Shawn O. Pearce <spearce@spearce.org> wrote:
> > Ingo Molnar <mingo@elte.hu> wrote:
> > > 
> > > Setup/background: distributed kernel testing cluster, [...]
> > > 
> > > Problem: i noticed that git-fetch is a tad slow:
> > > 
> > >   titan:~/tip> time git-fetch
> > >   real    0m2.372s
>
> note that titan is a very beefy box, almost 3 GHz Core2Duo:

That isn't going to matter if you have a quadratic algorithm and a
large dataset.  Especially when the inner loops are doing multiple
system calls per item in a long list of items.  :-|   Linux is fast,
but it isn't magic pixie dust.  It cannot fix broken applications.
 
> [...] So if we have a quadratic overhead on number of 
> branches, that's going to be quite a PITA.

Right.

> > I wonder if git-pack-refs + fetching only a single branch will get you 
> > closer to the tip-fetch time.
> 
> should i pack on both repos? I dont explicitly pack anything, but on the 
> server it goes into regular gc runs. (which will pack most stuff, 
> right?)

git-gc automatically runs `git pack-refs --all --prune` like I
recommended, unless you disabled it with config gc.packrefs = false.
So it's probably already packed.

What does `find .git/refs -type f | wc -l` give for the repository
on the central server?  If it's more than a handful (~20) I would
suggest running git-gc before testing again.
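
The same check can be scripted, for instance to watch several
repositories.  A minimal Python sketch, assuming a non-bare layout
with loose refs stored as regular files under `.git/refs`; the
function name `count_loose_refs` and the throwaway demo directory
are illustrative only and not part of Git:

```python
import os
import tempfile


def count_loose_refs(git_dir):
    """Count regular files under <git_dir>/refs.

    Mirrors `find <git_dir>/refs -type f | wc -l`: symlinks are not
    counted and a missing refs directory yields 0.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(os.path.join(git_dir, "refs")):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += 1
    return total


# Demo on a throwaway directory that imitates three loose branch refs.
with tempfile.TemporaryDirectory() as tmp:
    heads = os.path.join(tmp, "refs", "heads")
    os.makedirs(heads)
    for branch in ("master", "topic-a", "topic-b"):
        with open(os.path.join(heads, branch), "w") as f:
            f.write("0" * 40 + "\n")  # placeholder object id
    demo_count = count_loose_refs(tmp)

print(demo_count)
```

Against a real repository the call would be `count_loose_refs(".git")`;
a result well above the ~20 mentioned above suggests the refs have not
been packed recently.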

But I'm really suspecting that this is just our quadratic matching
algorithm running up against a large number of branches, causing
it to suck.

jgit at least uses an O(N) algorithm here, but since it is written
in Java it's of course slow compared to C Git.  Takes a while to
get that JVM running.
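
To illustrate the difference between quadratic and O(N) matching,
here is a rough Python sketch.  It is not the actual code from Git
or jgit; the function names and the (name, object id) tuple layout
are assumptions made for the example, and ref names are assumed to
be unique within each list:

```python
def match_refs_quadratic(remote_refs, local_refs):
    """For every remote ref, scan the whole local list: O(N*M) comparisons."""
    matched = []
    for name, remote_id in remote_refs:
        for local_name, local_id in local_refs:
            if local_name == name:
                matched.append((name, local_id, remote_id))
                break
    return matched


def match_refs_linear(remote_refs, local_refs):
    """Index the local refs by name once, then do one lookup per remote ref."""
    by_name = dict(local_refs)
    return [
        (name, by_name[name], remote_id)
        for name, remote_id in remote_refs
        if name in by_name
    ]


# Hypothetical data: 500 branches known on both sides, plus one that
# exists only on the remote and therefore has no local match.
local = [(f"refs/heads/b{i}", f"old{i}") for i in range(500)]
remote = [(f"refs/heads/b{i}", f"new{i}") for i in range(500)]
remote.append(("refs/heads/only-remote", "new-x"))

slow = match_refs_quadratic(remote, local)
fast = match_refs_linear(remote, local)
print(len(slow), slow == fast)
```

Both functions return the same pairs; only the amount of work differs.
With the nested loop the comparison count grows with the product of
the two list lengths, while the indexed version grows with their sum.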

I'll try to find some time to reproduce the issue and look at the
bottleneck here.  I'm two days into a new job so my git time has
been really quite short this week.  :-|

-- 
Shawn.
