git@vger.kernel.org list mirror (unofficial, one of many)
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: upload-pack is slow with lots of refs
Date: Wed, 3 Oct 2012 14:03:24 -0400
Message-ID: <20121003180324.GB27446@sigill.intra.peff.net>
In-Reply-To: <CACBZZX70NTic2WtrXooTg+yBbiFFDAEX_Y-b=W=rAkcYKJ3T2g@mail.gmail.com>

On Wed, Oct 03, 2012 at 02:36:00PM +0200, Ævar Arnfjörð Bjarmason wrote:

> I'm creating a system where a lot of remotes constantly fetch from a
> central repository for deployment purposes, but I've noticed that even
> with a remote.$name.fetch configuration to only get certain refs, a
> "git fetch" will still call git-upload-pack, which will provide a list
> of all references.
> 
> This is being done against a repository with tens of thousands of refs
> (it has a tag for each deployment), so it ends up burning a lot of CPU
> time on the uploader/receiver side.

Where is the CPU being burned? Are your refs packed (that's a huge
savings)? What are the refs like? Are they .have refs from an alternates
repository, or real refs? Are they pointing to commits or tag objects?

What version of git are you using?  In the past year or so, I've made
several tweaks to speed up large numbers of refs, including:

  - cff38a5 (receive-pack: eliminate duplicate .have refs, v1.7.6); note
    that this only helps if they are being pulled in by an alternates
    repo. And even then, it only helps if they are mostly duplicates;
    distinct ones are still O(n^2).

  - 7db8d53 (fetch-pack: avoid quadratic behavior in remove_duplicates)
    a0de288 (fetch-pack: avoid quadratic loop in filter_refs)
    Both in v1.7.11. I think there is still a potential quadratic loop
    in mark_complete().

  - 90108a2 (upload-pack: avoid parsing tag destinations)
    926f1dd (upload-pack: avoid parsing objects during ref advertisement)
    Both in v1.7.10. Note that tag objects are more expensive to
    advertise than commits, because we have to load and peel them.
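
To illustrate the kind of quadratic behavior the fetch-pack fixes above
target, here is a minimal sketch in Python. It is not the code from those
commits; the function names and the sort-then-scan approach are assumptions
chosen only to contrast an O(n^2) duplicate check with an O(n log n) one.

```python
def remove_duplicates_quadratic(names):
    """Keep the first occurrence of each ref name.

    Each membership test scans the whole 'kept' list, so the total
    cost grows as O(n^2) with the number of refs.
    """
    kept = []
    for name in names:
        if name not in kept:  # linear scan on every iteration
            kept.append(name)
    return kept


def remove_duplicates_sorted(names):
    """Sort once, then drop adjacent duplicates in a single pass.

    Sorting costs O(n log n); the scan afterwards is O(n).
    The result comes back in sorted order, not input order.
    """
    out = []
    for name in sorted(names):
        if not out or out[-1] != name:
            out.append(name)
    return out
```

With tens of thousands of refs the first version does on the order of
hundreds of millions of comparisons, while the second stays close to the
cost of the sort.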

Even with those patches, though, I found that it took roughly 2s
to advertise 100,000 refs.

> Has there been any work on extending the protocol so that the client
> tells the server what refs it's interested in?

I don't think so. It would be hard to do in a backwards-compatible way,
because the advertisement is the first thing the server says, before it
has negotiated any capabilities with the client at all.
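
A rough sketch of why that is, written in Python rather than taken from
git's source: the server's opening message is the full ref list in
pkt-line framing, with its capabilities tacked onto the first ref line
after a NUL byte. The helper names (pkt_line, advertise) are hypothetical,
and the sketch ignores details such as peeled tag entries and the special
line sent when there are no refs at all.

```python
FLUSH = b"0000"  # flush-pkt: marks the end of the advertisement


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line.

    The 4-digit hex length prefix counts its own 4 bytes.
    """
    return b"%04x" % (len(payload) + 4) + payload


def advertise(refs, capabilities):
    """Build the server's opening message for a list of (sha, name) refs.

    Every ref is sent before the client has said anything, and the
    capabilities ride along on the first ref line only.
    """
    out = []
    for i, (sha, name) in enumerate(refs):
        line = sha.encode() + b" " + name.encode()
        if i == 0:
            line += b"\0" + " ".join(capabilities).encode()
        out.append(pkt_line(line + b"\n"))
    out.append(FLUSH)
    return b"".join(out)
```

Since the client only learns the server's capabilities from inside this
message, it has no earlier point at which to ask for a filtered list.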

-Peff

