git@vger.kernel.org list mirror (unofficial, one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: upload-pack is slow with lots of refs
Date: Thu, 4 Oct 2012 01:47:00 +0200
Message-ID: <CACBZZX764OOH82CiLYPr+_qNU65U4Zxuod_7G5ef8yAtHApXog@mail.gmail.com>
In-Reply-To: <20121003232115.GB11618@sigill.intra.peff.net>

On Thu, Oct 4, 2012 at 1:21 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Oct 04, 2012 at 12:32:35AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> On Wed, Oct 3, 2012 at 8:03 PM, Jeff King <peff@peff.net> wrote:
>> > What version of git are you using?  In the past year or so, I've made
>> > several tweaks to speed up large numbers of refs, including:
>> >
>> >   - cff38a5 (receive-pack: eliminate duplicate .have refs, v1.7.6); note
>> >     that this only helps if they are being pulled in by an alternates
>> >     repo. And even then, it only helps if they are mostly duplicates;
>> >     distinct ones are still O(n^2).
>> >
>> >   - 7db8d53 (fetch-pack: avoid quadratic behavior in remove_duplicates)
>> >     a0de288 (fetch-pack: avoid quadratic loop in filter_refs)
>> >     Both in v1.7.11. I think there is still a potential quadratic loop
>> >     in mark_complete()
>> >
>> >   - 90108a2 (upload-pack: avoid parsing tag destinations)
>> >     926f1dd (upload-pack: avoid parsing objects during ref advertisement)
>> >     Both in v1.7.10. Note that tag objects are more expensive to
>> >     advertise than commits, because we have to load and peel them.
>> >
>> > Even with those patches, though, I found that it was something like ~2s
>> > to advertise 100,000 refs.
>>
>> FWIW I bisected between 1.7.9 and 1.7.10 and found that the point at
>> which it went from 1.5/s to 2.5/s upload-pack runs on the pathological
>> git.git repository was none of those, but:
>>
>>     ccdc6037fe - parse_object: try internal cache before reading object db
>
> Ah, yeah, I forgot about that one. That implies that you have a lot of
> refs pointing to the same objects (since the benefit of that commit is
> to avoid reading from disk when we have already seen it).
>
> Out of curiosity, what does your repo contain? I saw a lot of speedup
> with that commit because my repos are big object stores, where we have
> the same duplicated tag refs for every fork of the repo.
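
To make the caching effect described above concrete, here is a minimal
sketch of "check the in-memory cache before reading the object db". It is
not git's actual C code: the class name ObjectCache, the read_from_disk
callback, and the disk_reads counter are all hypothetical names invented
for illustration.

```python
class ObjectCache:
    """Toy model of consulting an in-memory cache before the object db.

    Hypothetical illustration only; git's real implementation is in C
    and looks nothing like this.
    """

    def __init__(self, read_from_disk):
        self._read_from_disk = read_from_disk  # expensive loader (assumed)
        self._seen = {}                        # object id -> parsed object
        self.disk_reads = 0                    # how often we hit "disk"

    def get(self, oid):
        # Only fall back to the expensive read for ids we have not seen.
        if oid not in self._seen:
            self._seen[oid] = self._read_from_disk(oid)
            self.disk_reads += 1
        return self._seen[oid]


def count_disk_reads(ref_targets):
    """Return how many disk reads a list of ref targets would cost."""
    cache = ObjectCache(lambda oid: {"oid": oid})
    for oid in ref_targets:
        cache.get(oid)
    return cache.disk_reads
```

With this model, the number of reads equals the number of distinct
targets, so the saving only shows up when many refs share the same object.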

Things are much faster with your monkeypatch; I got up to around 10
runs/s.

The repository mainly contains a lot of tags generated by git-deploy[1],
which are added on every rollout to several subsystems.

Of the ~50k references in the repo, 75% point to a commit that no other
reference points to. Around 98% of the references are annotated tags;
the rest are branches.
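
For anyone who wants to gather the same kind of numbers for their own
repository, something along these lines should work. It is a rough sketch,
not the exact method used for the figures above: it assumes the
for-each-ref atoms %(objecttype), %(objectname), %(*objectname) and
%(refname), and the helper names read_refs and ref_stats are made up for
this example.

```python
import subprocess
from collections import Counter

# Tab-separated so that an empty %(*objectname) (non-tag refs) stays a field.
FORMAT = "%(objecttype)\t%(objectname)\t%(*objectname)\t%(refname)"


def read_refs(repo="."):
    """Return for-each-ref output lines for the repository at `repo`."""
    result = subprocess.run(
        ["git", "-C", repo, "for-each-ref", "--format=" + FORMAT],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


def ref_stats(lines):
    """Count refs, annotated tags, and refs whose target no other ref shares."""
    targets = []
    annotated = 0
    for line in lines:
        if not line.strip():
            continue
        objtype, objname, deref, _refname = line.rstrip("\n").split("\t")
        if objtype == "tag":
            annotated += 1
        # Use the dereferenced object for annotated tags, else the object itself.
        targets.append(deref or objname)
    counts = Counter(targets)
    unique = sum(1 for t in targets if counts[t] == 1)
    return {
        "total": len(targets),
        "annotated_tags": annotated,
        "unique_targets": unique,
    }
```

Calling ref_stats(read_refs("/path/to/repo")) gives the three counts; the
percentages are then unique_targets / total and annotated_tags / total.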

1. https://github.com/git-deploy/git-deploy

