git@vger.kernel.org list mirror (unofficial, one of many)
* upload-pack is slow with lots of refs
@ 2012-10-03 12:36 Ævar Arnfjörð Bjarmason
  2012-10-03 13:06 ` Nguyen Thai Ngoc Duy
                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2012-10-03 12:36 UTC (permalink / raw)
  To: Git Mailing List

I'm creating a system where a lot of remotes constantly fetch from a
central repository for deployment purposes, but I've noticed that even
with a remote.$name.fetch configuration to only get certain refs, a
"git fetch" will still call git-upload-pack, which will provide a list
of all references.
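For context, a narrow fetch configuration like the one described can be set as follows (remote and branch names are placeholders); with the current protocol this limits only what the client asks for and stores, not what the server advertises:

```shell
# Limit what "git fetch" requests to a single branch; the server-side
# ref advertisement is unaffected by this setting.
git config remote.origin.fetch \
    '+refs/heads/master:refs/remotes/origin/master'
```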

This is being done against a repository with tens of thousands of refs
(it has a tag for each deployment), so it ends up burning a lot of CPU
time on the uploader/receiver side.

Has there been any work on extending the protocol so that the client
tells the server what refs it's interested in?

I've been looking at turning this into a push mechanism instead of a
poll mechanism, but I've also noticed that even when you push one tag
you still end up listing all refs on the remote side:

    $ GIT_TRACE=1 git push origin my-new-tag -n
    trace: built-in: git 'push' 'origin' 'my-new-tag' '-n'
    trace: run_command: 'ssh' 'avar@git.example.com' 'git-receive-pack
'\''/gitroot/example.git'\'''
    nohup: redirecting stderr to stdout
    To ssh://avar@git.example.com/gitroot/example.git
     * [new tag]         my-new-tag -> my-new-tag

Which seems like a lot of superfluous work when it presumably only
needs to check whether there's a remote "my-new-tag" tag that conflicts
with what you're pushing.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: upload-pack is slow with lots of refs
  2012-10-03 12:36 upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason
@ 2012-10-03 13:06 ` Nguyen Thai Ngoc Duy
  2012-10-03 18:03 ` Jeff King
  2012-10-03 19:13 ` Junio C Hamano
  2 siblings, 0 replies; 38+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-10-03 13:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

On Wed, Oct 3, 2012 at 7:36 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> I'm creating a system where a lot of remotes constantly fetch from a
> central repository for deployment purposes, but I've noticed that even
> with a remote.$name.fetch configuration to only get certain refs a
> "git fetch" will still call git-upload-pack which will provide a list
> of all references.
>
> This is being done against a repository with tens of thousands of refs
> (it has a tag for each deployment), so it ends up burning a lot of CPU
> time on the uploader/receiver side.

If all refs are packed, will it still burn lots of CPU on server side?
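(Packing the refs Duy asks about is a one-liner; packed refs live in a single .git/packed-refs file instead of one loose file per ref:)

```shell
# Consolidate loose refs into .git/packed-refs; --prune removes the
# now-redundant loose files under .git/refs/.
git pack-refs --all --prune
wc -l .git/packed-refs
```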

> Has there been any work on extending the protocol so that the client
> tells the server what refs it's interested in?

It'll be a new protocol, not an extension of the git protocol. Ref
advertising is step 1, and capabilities are advertised much later, so
the client has no chance to tell the server what protocol version it
would like to use for step 1. (I looked at this protocol extension
from a different angle: I wanted to compress the ref list for the git
protocol, but git over http compresses well so I don't care much.)
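(Duy's compression observation is easy to check locally; an immediate flush packet makes upload-pack print only the advertisement, so run this inside any repository:)

```shell
# Compare raw vs. gzip-compressed size of the ref advertisement.
echo 0000 | git upload-pack . | wc -c
echo 0000 | git upload-pack . | gzip -9 | wc -c
```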

On that git-over-http note, maybe the git client could send something
as HTTP headers, recognized by the server end, to negotiate which ref
patterns it is interested in?
-- 
Duy


* Re: upload-pack is slow with lots of refs
  2012-10-03 12:36 upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason
  2012-10-03 13:06 ` Nguyen Thai Ngoc Duy
@ 2012-10-03 18:03 ` Jeff King
  2012-10-03 18:53   ` Junio C Hamano
                     ` (2 more replies)
  2012-10-03 19:13 ` Junio C Hamano
  2 siblings, 3 replies; 38+ messages in thread
From: Jeff King @ 2012-10-03 18:03 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

On Wed, Oct 03, 2012 at 02:36:00PM +0200, Ævar Arnfjörð Bjarmason wrote:

> I'm creating a system where a lot of remotes constantly fetch from a
> central repository for deployment purposes, but I've noticed that even
> with a remote.$name.fetch configuration to only get certain refs a
> "git fetch" will still call git-upload-pack which will provide a list
> of all references.
> 
> This is being done against a repository with tens of thousands of refs
> (it has a tag for each deployment), so it ends up burning a lot of CPU
> time on the uploader/receiver side.

Where is the CPU being burned? Are your refs packed (that's a huge
savings)? What are the refs like? Are they .have refs from an alternates
repository, or real refs? Are they pointing to commits or tag objects?

What version of git are you using?  In the past year or so, I've made
several tweaks to speed up large numbers of refs, including:

  - cff38a5 (receive-pack: eliminate duplicate .have refs, v1.7.6); note
    that this only helps if they are being pulled in by an alternates
    repo. And even then, it only helps if they are mostly duplicates;
    distinct ones are still O(n^2).

  - 7db8d53 (fetch-pack: avoid quadratic behavior in remove_duplicates)
    a0de288 (fetch-pack: avoid quadratic loop in filter_refs)
    Both in v1.7.11. I think there is still a potential quadratic loop
    in mark_complete()

  - 90108a2 (upload-pack: avoid parsing tag destinations)
    926f1dd (upload-pack: avoid parsing objects during ref advertisement)
    Both in v1.7.10. Note that tag objects are more expensive to
    advertise than commits, because we have to load and peel them.

Even with those patches, though, I found that it was something like ~2s
to advertise 100,000 refs.

> Has there been any work on extending the protocol so that the client
> tells the server what refs it's interested in?

I don't think so. It would be hard to do in a backwards-compatible way,
because the advertisement is the first thing the server says, before it
has negotiated any capabilities with the client at all.
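The ordering problem is visible on the wire: the very first pkt-line the server sends already carries a ref, with the capability list tucked after a NUL byte on that same line. For example, against a local repository:

```shell
# Show the first advertisement line; everything after the NUL on it is
# the capability list, sent before the client has said anything at all.
echo 0000 | git upload-pack . | head -c 300 | tr '\000' '\n' | head -n 2
```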

-Peff


* Re: upload-pack is slow with lots of refs
  2012-10-03 18:03 ` Jeff King
@ 2012-10-03 18:53   ` Junio C Hamano
  2012-10-03 18:55     ` Jeff King
  2012-10-03 20:16   ` Ævar Arnfjörð Bjarmason
  2012-10-03 22:32   ` upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-10-03 18:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

Jeff King <peff@peff.net> writes:

>> Has there been any work on extending the protocol so that the client
>> tells the server what refs it's interested in?
>
> I don't think so. It would be hard to do in a backwards-compatible way,
> because the advertisement is the first thing the server says, before it
> has negotiated any capabilities with the client at all.

That is being discussed but hasn't surfaced on the list.


* Re: upload-pack is slow with lots of refs
  2012-10-03 18:53   ` Junio C Hamano
@ 2012-10-03 18:55     ` Jeff King
  2012-10-03 19:41       ` Shawn Pearce
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-03 18:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

On Wed, Oct 03, 2012 at 11:53:35AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> Has there been any work on extending the protocol so that the client
> >> tells the server what refs it's interested in?
> >
> > I don't think so. It would be hard to do in a backwards-compatible way,
> > because the advertisement is the first thing the server says, before it
> > has negotiated any capabilities with the client at all.
> 
> That is being discussed but hasn't surfaced on the list.

Out of curiosity, how are you thinking about triggering such a new
behavior in a backwards-compatible way? Invoke git-upload-pack2, and
fall back to reconnecting to start git-upload-pack if it fails?

-Peff


* Re: upload-pack is slow with lots of refs
  2012-10-03 12:36 upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason
  2012-10-03 13:06 ` Nguyen Thai Ngoc Duy
  2012-10-03 18:03 ` Jeff King
@ 2012-10-03 19:13 ` Junio C Hamano
  2 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2012-10-03 19:13 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Jeff King, spearce

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I'm creating a system where a lot of remotes constantly fetch from a
> central repository for deployment purposes, but I've noticed that even
> with a remote.$name.fetch configuration to only get certain refs a
> "git fetch" will still call git-upload-pack which will provide a list
> of all references.

It has been observed that the sender has to advertise megabytes of
refs, because it has to speak first before knowing what the receiver
wants, even when the receiver is interested in getting updates from
only one of them, or worse yet, when the receiver is only trying to
peek at whether the ref it is interested in has been updated.

I do not think upload-pack, running on the sender side with millions
of refs and asked for a single "want", feeds all the refs that it has
to the revision machinery, and if you observed that it does, I cannot
explain why it happens.
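(The "peek" case above is what "git ls-remote" does today; with the current protocol the server still sends the full advertisement and the pattern is applied on the client side:)

```shell
# Ask about a single ref; the entire advertisement still crosses the
# wire, and the client filters it down to the requested pattern.
git ls-remote origin refs/tags/my-new-tag
```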


* Re: upload-pack is slow with lots of refs
  2012-10-03 18:55     ` Jeff King
@ 2012-10-03 19:41       ` Shawn Pearce
  2012-10-03 20:13         ` Jeff King
  2012-10-05  6:24         ` Johannes Sixt
  0 siblings, 2 replies; 38+ messages in thread
From: Shawn Pearce @ 2012-10-03 19:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Ævar Arnfjörð, Git Mailing List

On Wed, Oct 3, 2012 at 11:55 AM, Jeff King <peff@peff.net> wrote:
> On Wed, Oct 03, 2012 at 11:53:35AM -0700, Junio C Hamano wrote:
>> Jeff King <peff@peff.net> writes:
>>
>> >> Has there been any work on extending the protocol so that the client
>> >> tells the server what refs it's interested in?
>> >
>> > I don't think so. It would be hard to do in a backwards-compatible way,
>> > because the advertisement is the first thing the server says, before it
>> > has negotiated any capabilities with the client at all.
>>
>> That is being discussed but hasn't surfaced on the list.
>
> Out of curiosity, how are you thinking about triggering such a new
> behavior in a backwards-compatible way? Invoke git-upload-pack2, and
> fall back to reconnecting to start git-upload-pack if it fails?

Basically, yes. New clients connect for git-upload-pack2. Over git://
the remote peer will just close the TCP socket with no messages, and
the client can fall back to git-upload-pack and try again. Over SSH a
similar thing happens, in the sense that there is no data output from
the remote side, so the client can try again. This has the downside of
authenticating twice over SSH, which may prompt for a password twice.
But users can get out of this by setting remote.NAME.uploadpack =
git-upload-pack, forcing the Git client to use the current protocol if
they have a new client but must continue to work over SSH with an old
server, and don't use an ssh-agent.

Over HTTP we can request ?service=git-upload-pack2 and retry just like
git:// would, or be a bit smarter and say
?service=git-upload-pack&v=2, and determine the protocol support of
the remote peer based on the response we get. If we see an immediate
advertisement, it's still the "v1" protocol; if we get back the "yes I
speak v2" response like git:// would see, we can continue the
conversation from there.
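A minimal sketch of the client-side fallback described here, with the two transports abstracted as commands (git-upload-pack2 is a proposed service, not an existing one):

```shell
# Try the v2 service first; an old server yields no output, so fall
# back to the classic service on a second connection.
negotiate() {
    v2_cmd=$1 v1_cmd=$2
    out=$($v2_cmd 2>/dev/null)
    if [ -n "$out" ]; then
        printf '%s\n' "$out"    # server understood v2
    else
        $v1_cmd                 # old server: reconnect with v1
    fi
}
```

Usage would look like `negotiate "ssh host git-upload-pack2 repo" "ssh host git-upload-pack repo"`, paying the double-authentication cost only in the fallback case.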


* Re: upload-pack is slow with lots of refs
  2012-10-03 19:41       ` Shawn Pearce
@ 2012-10-03 20:13         ` Jeff King
  2012-10-04 21:52           ` Sascha Cunz
  2012-10-05  6:24         ` Johannes Sixt
  1 sibling, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-03 20:13 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Junio C Hamano, Ævar Arnfjörð, Git Mailing List

On Wed, Oct 03, 2012 at 12:41:38PM -0700, Shawn O. Pearce wrote:

> > Out of curiosity, how are you thinking about triggering such a new
> > behavior in a backwards-compatible way? Invoke git-upload-pack2, and
> > fall back to reconnecting to start git-upload-pack if it fails?
> 
> Basically, yes. New clients connect for git-upload-pack2. Over git://
> the remote peer will just close the TCP socket with no messages. The
> client can fallback to git-upload-pack and try again. Over SSH a
> similar thing will happen in the sense there is no data output from
> the remote side, so the client can try again. This has the downside of
> authentication twice over SSH, which may prompt for a password twice.
> But the user can get out of this by setting remote.NAME.uploadpack =
> git-upload-pack and thus force the Git client to use the current
> protocol if they have a new client and must continue to work over SSH
> with an old server, and don't use an ssh-agent.

It's a shame that we have to reestablish the TCP or ssh connection to do
the retry. The password thing is annoying, but also it just wastes a
round-trip. It means we'd probably want to default the v2 probe to off
(and let the user turn it on for a specific remote) until v2 is much
more common than v1. Otherwise everyone pays the price.

It may also be worth designing v2 to handle more graceful capability
negotiation so this doesn't come up again.

Another alternative would be to tweak git-daemon to allow more graceful
fallback. That wouldn't help us now, but it would if we ever wanted a
v3. For stock ssh, you could send:

  sh -c 'git upload-pack2; test $? = 127 && git-upload-pack'

which would work if you have an unrestricted shell on the other side.
But it would break for a restricted shell or other "fake" ssh
environment. It's probably too ugly to have restricted shells recognize
that as a magic token (well, I could maybe even live with the ugliness,
but it is not strictly backwards compatible).

I was hoping we could do something like "git upload-pack --v2", but I'm
pretty sure current git-daemon would reject that.

> Over HTTP we can request ?service=git-upload-pack2 and retry just like
> git:// would, or be a bit smarter and say
> ?service=git-upload-pack&v=2, and determine the protocol support of
> the remote peer based on the response we get. If we see an immediate
> advertisement, it's still the "v1" protocol; if we get back the "yes I
> speak v2" response like git:// would see, we can continue the
> conversation from there.

Yeah, I would think "&v=2" would be better simply to avoid the
round-trip if we fail. It should be safe to turn the new protocol on by
default for http, then.

-Peff


* Re: upload-pack is slow with lots of refs
  2012-10-03 18:03 ` Jeff King
  2012-10-03 18:53   ` Junio C Hamano
@ 2012-10-03 20:16   ` Ævar Arnfjörð Bjarmason
  2012-10-03 21:20     ` Jeff King
  2012-10-03 22:32   ` upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 38+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2012-10-03 20:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Wed, Oct 3, 2012 at 8:03 PM, Jeff King <peff@peff.net> wrote:
> On Wed, Oct 03, 2012 at 02:36:00PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> I'm creating a system where a lot of remotes constantly fetch from a
>> central repository for deployment purposes, but I've noticed that even
>> with a remote.$name.fetch configuration to only get certain refs a
>> "git fetch" will still call git-upload-pack which will provide a list
>> of all references.
>>
>> This is being done against a repository with tens of thousands of refs
>> (it has a tag for each deployment), so it ends up burning a lot of CPU
>> time on the uploader/receiver side.
>
> Where is the CPU being burned? Are your refs packed (that's a huge
> savings)? What are the refs like? Are they .have refs from an alternates
> repository, or real refs? Are they pointing to commits or tag objects?
>
> What version of git are you using?  In the past year or so, I've made
> several tweaks to speed up large numbers of refs, including:
>
>   - cff38a5 (receive-pack: eliminate duplicate .have refs, v1.7.6); note
>     that this only helps if they are being pulled in by an alternates
>     repo. And even then, it only helps if they are mostly duplicates;
>     distinct ones are still O(n^2).
>
>   - 7db8d53 (fetch-pack: avoid quadratic behavior in remove_duplicates)
>     a0de288 (fetch-pack: avoid quadratic loop in filter_refs)
>     Both in v1.7.11. I think there is still a potential quadratic loop
>     in mark_complete()
>
>   - 90108a2 (upload-pack: avoid parsing tag destinations)
>     926f1dd (upload-pack: avoid parsing objects during ref advertisement)
>     Both in v1.7.10. Note that tag objects are more expensive to
>     advertise than commits, because we have to load and peel them.
>
> Even with those patches, though, I found that it was something like ~2s
> to advertise 100,000 refs.

I can't provide all the details now (I don't have access to that
machine right now), but briefly:

 * The git client/server version is 1.7.8

 * The repository has around 50k refs, they're "real" refs, almost all
   of them (say all but 0.5k-1k) are annotated tags, the rest are
   branches.

 * >99% of them are packed; there's a weekly cronjob that packs them
   all up, though there were a few newly pushed branches and tags
   outside of the packed refs.

 * I tried "echo -n | git upload-pack <repo>" on both that 50k
   repository and a repository with <100 refs, the former took around
   ~1-2s to run on a 24 core box and the latter ~500ms.

 * When I ran git-upload-pack with GNU parallel I managed around 20
   runs/s on the 24-core box for the 50k-ref repository, and 40/s for
   the 100-ref one.

 * A co-worker who was working on this today tried it on 1.7.12 and
   claimed that it had the same performance characteristics.

 * I tried to profile it under gcc -pg && echo -n | ./git-upload-pack
   <repo> but it doesn't produce a profile like that, presumably
   because the process exits unsuccessfully.

   Maybe someone here knows offhand what mock data I could feed
   git-upload-pack to make it happy to just list the refs, or better
   yet do a bit more work which it would do if it were actually doing
   the fetch (I suppose I could just do a fetch, but I wanted to do
   this from a locally compiled checkout).

>> Has there been any work on extending the protocol so that the client
>> tells the server what refs it's interested in?
>
> I don't think so. It would be hard to do in a backwards-compatible way,
> because the advertisement is the first thing the server says, before it
> has negotiated any capabilities with the client at all.

I suppose at least for the ssh protocol we could just do:

    ssh server "(git upload-pack <repo> --refs=* || git upload-pack <repo>)"

And something similar with HTTP headers, but that of course leaves the
git:// protocol.


* Re: upload-pack is slow with lots of refs
  2012-10-03 20:16   ` Ævar Arnfjörð Bjarmason
@ 2012-10-03 21:20     ` Jeff King
  2012-10-03 22:15       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-03 21:20 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

On Wed, Oct 03, 2012 at 10:16:56PM +0200, Ævar Arnfjörð Bjarmason wrote:

> I can't provide all the details now (not with access to that machine
> now), but briefly:
> 
>  * The git client/server version is 1.7.8
> 
>  * The repository has around 50k refs, they're "real" refs, almost all
>    of them (say all but 0.5k-1k) are annotated tags, the rest are
>    branches.

I'd definitely try upgrading, then; I got measurable speedups from this
exact case using the patches in v1.7.10.

>  * >99% of them are packed; there's a weekly cronjob that packs them
>    all up, though there were a few newly pushed branches and tags
>    outside of the packed refs.

A few strays shouldn't make a big difference. The killer is calling
open(2) 50,000 times, but having most of it packed should prevent that.
I suspect Michael Haggerty's work on the ref cache may help, too
(otherwise we have to try each packed ref in the filesystem to make sure
nobody has written it since we packed).

>  * I tried "echo -n | git upload-pack <repo>" on both that 50k
>    repository and a repository with <100 refs, the former took around
>    ~1-2s to run on a 24 core box and the latter ~500ms.

More cores won't help, of course, as dumping the refs is single-threaded.

With v1.7.12, my ~400K test repository takes about 0.8s to run (on my
2-year-old 1.8 GHz i7, though it is probably turbo-boosting to 3 GHz).
So I'm surprised it is so slow.

Your 100-ref case is slow, too. Upload-pack's initial advertisement on
my linux-2.6 repository (with about 900 refs) is more like 20ms.

>  * A co-worker who was working on this today tried it on 1.7.12 and
>    claimed that it had the same performance characteristics.

That's surprising to me. Can you try to verify those numbers?

>  * I tried to profile it under gcc -pg && echo -n | ./git-upload-pack
>    <repo> but it doesn't produce a profile like that, presumably
>    because the process exits unsuccessfully.

If it's a recent version of Linux, you'll get much nicer results with
perf. Here's what my 400K-ref case looks like:

  $ time echo 0000 | perf record git-upload-pack . >/dev/null
  real    0m0.808s
  user    0m0.660s
  sys     0m0.136s

  $ perf report | grep -v ^# | head
  11.40%  git-upload-pack  libc-2.13.so        [.] vfprintf
   9.70%  git-upload-pack  git-upload-pack     [.] find_pack_entry_one
   7.64%  git-upload-pack  git-upload-pack     [.] check_refname_format
   6.81%  git-upload-pack  libc-2.13.so        [.] __memcmp_sse4_1
   5.79%  git-upload-pack  libc-2.13.so        [.] getenv
   4.20%  git-upload-pack  libc-2.13.so        [.] __strlen_sse42
   3.72%  git-upload-pack  git-upload-pack     [.] ref_entry_cmp_sslice
   3.15%  git-upload-pack  git-upload-pack     [.] read_packed_refs
   2.65%  git-upload-pack  git-upload-pack     [.] sha1_to_hex
   2.44%  git-upload-pack  libc-2.13.so        [.] _IO_default_xsputn

So nothing too surprising, though there is some room for improvement
(e.g., it looks like we are calling getenv in a tight loop, which could
be hoisted out to a single call).

Do note that this version of git was compiled with -O3. Compiling with
-O0 produces very different results (it's more like 1.3s, and the
hotspots are check_refname_component and sha1_to_hex).

>    Maybe someone here knows offhand what mock data I could feed
>    git-upload-pack to make it happy to just list the refs, or better
>    yet do a bit more work which it would do if it were actually doing
>    the fetch (I suppose I could just do a fetch, but I wanted to do
>    this from a locally compiled checkout).

If you feed "0000" as I did above, that is the flush signal for "I have
no more lines to send you", which means that we are not actually
fetching anything. I.e., this is the exact same conversation a no-op
"git fetch" would produce.
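For reference, pkt-lines are length-prefixed: four hex digits giving the total length including the prefix itself, with "0000" reserved as the flush packet being fed here. A tiny encoder:

```shell
# Emit a pkt-line: a 4-hex-digit length (payload plus the 4 prefix
# bytes), then the payload. "0000" is the special flush packet.
pkt_line() {
    printf '%04x%s' $(( ${#1} + 4 )) "$1"
}

pkt_line 'want 1234'
printf '0000'   # flush-pkt: "I have no more lines to send you"
```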

-Peff


* Re: upload-pack is slow with lots of refs
  2012-10-03 21:20     ` Jeff King
@ 2012-10-03 22:15       ` Ævar Arnfjörð Bjarmason
  2012-10-03 23:15         ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2012-10-03 22:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Wed, Oct 3, 2012 at 11:20 PM, Jeff King <peff@peff.net> wrote:

Thanks for all that info, it's really useful.

>>  * A co-worker who was working on this today tried it on 1.7.12 and
>>    claimed that it had the same performance characteristics.
>
> That's surprising to me. Can you try to verify those numbers?

I think he was wrong, I tested this on git.git by first creating a lot
of tags:

     parallel --eta "git tag -a -m"{}" test-again-{}" ::: $(git rev-list HEAD)

Then doing:

    git pack-refs --all
    git repack -A -d

And compiled with -g -O3 I get around 1.55 runs/s of git-upload-pack
on 1.7.8 and 2.59/s on the master branch.
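The runs/s figures can be reproduced with a plain timing loop in the repository being tested (iteration count is arbitrary):

```shell
# Time ten advertisement-only runs of upload-pack; dividing 10 by the
# elapsed time gives the runs/s figure quoted above.
time for i in $(seq 1 10); do
    echo 0000 | git upload-pack . >/dev/null
done
```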

>>  * I tried to profile it under gcc -pg && echo -n | ./git-upload-pack
>>    <repo> but it doesn't produce a profile like that, presumably
>>    because the process exits unsuccessfully.
>
> If it's a recent version of Linux, you'll get much nicer results with
> perf. Here's what my 400K-ref case looks like:
>
>   $ time echo 0000 | perf record git-upload-pack . >/dev/null
>   real    0m0.808s
>   user    0m0.660s
>   sys     0m0.136s
>
>   $ perf report | grep -v ^# | head
>   11.40%  git-upload-pack  libc-2.13.so        [.] vfprintf
>    9.70%  git-upload-pack  git-upload-pack     [.] find_pack_entry_one
>    7.64%  git-upload-pack  git-upload-pack     [.] check_refname_format
>    6.81%  git-upload-pack  libc-2.13.so        [.] __memcmp_sse4_1
>    5.79%  git-upload-pack  libc-2.13.so        [.] getenv
>    4.20%  git-upload-pack  libc-2.13.so        [.] __strlen_sse42
>    3.72%  git-upload-pack  git-upload-pack     [.] ref_entry_cmp_sslice
>    3.15%  git-upload-pack  git-upload-pack     [.] read_packed_refs
>    2.65%  git-upload-pack  git-upload-pack     [.] sha1_to_hex
>    2.44%  git-upload-pack  libc-2.13.so        [.] _IO_default_xsputn

FWIW here are my results on the above pathological git.git

    $ uname -r; perf --version; echo 0000 | perf record ./git-upload-pack . >/dev/null; perf report | grep -v ^# | head
    3.2.0-2-amd64
    perf version 3.2.17
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.026 MB perf.data (~1131 samples) ]
        29.08%  git-upload-pack  libz.so.1.2.7       [.] inflate
        17.99%  git-upload-pack  libz.so.1.2.7       [.] 0xaec1
         6.21%  git-upload-pack  libc-2.13.so        [.] 0x117503
         5.69%  git-upload-pack  libcrypto.so.1.0.0  [.] 0x82c3d
         4.87%  git-upload-pack  git-upload-pack     [.] find_pack_entry_one
         3.18%  git-upload-pack  ld-2.13.so          [.] 0x886e
         2.96%  git-upload-pack  libc-2.13.so        [.] vfprintf
         2.83%  git-upload-pack  git-upload-pack     [.] search_for_subdir
         1.56%  git-upload-pack  [kernel.kallsyms]   [k] do_raw_spin_lock
         1.36%  git-upload-pack  libc-2.13.so        [.] vsnprintf

I wonder why your report doesn't note any time in libz. This is on
Debian testing, maybe your OS uses different strip settings so it
doesn't show up?

    $ ldd -r ./git-upload-pack
            linux-vdso.so.1 =>  (0x00007fff621ff000)
            libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f768feee000)
            libcrypto.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00007f768fb0a000)
            libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f768f8ed000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f768f566000)
            libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f768f362000)
            /lib64/ld-linux-x86-64.so.2 (0x00007f7690117000)


* Re: upload-pack is slow with lots of refs
  2012-10-03 18:03 ` Jeff King
  2012-10-03 18:53   ` Junio C Hamano
  2012-10-03 20:16   ` Ævar Arnfjörð Bjarmason
@ 2012-10-03 22:32   ` Ævar Arnfjörð Bjarmason
  2012-10-03 23:21     ` Jeff King
  2 siblings, 1 reply; 38+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2012-10-03 22:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Wed, Oct 3, 2012 at 8:03 PM, Jeff King <peff@peff.net> wrote:
> What version of git are you using?  In the past year or so, I've made
> several tweaks to speed up large numbers of refs, including:
>
>   - cff38a5 (receive-pack: eliminate duplicate .have refs, v1.7.6); note
>     that this only helps if they are being pulled in by an alternates
>     repo. And even then, it only helps if they are mostly duplicates;
>     distinct ones are still O(n^2).
>
>   - 7db8d53 (fetch-pack: avoid quadratic behavior in remove_duplicates)
>     a0de288 (fetch-pack: avoid quadratic loop in filter_refs)
>     Both in v1.7.11. I think there is still a potential quadratic loop
>     in mark_complete()
>
>   - 90108a2 (upload-pack: avoid parsing tag destinations)
>     926f1dd (upload-pack: avoid parsing objects during ref advertisement)
>     Both in v1.7.10. Note that tag objects are more expensive to
>     advertise than commits, because we have to load and peel them.
>
> Even with those patches, though, I found that it was something like ~2s
> to advertise 100,000 refs.

FWIW I bisected between 1.7.9 and 1.7.10 and found that the commit
that took me from 1.5/s to 2.5/s upload-pack runs on the pathological
git.git repository was none of those, but:

    ccdc6037fe - parse_object: try internal cache before reading object db


* Re: upload-pack is slow with lots of refs
  2012-10-03 22:15       ` Ævar Arnfjörð Bjarmason
@ 2012-10-03 23:15         ` Jeff King
  2012-10-03 23:54           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-03 23:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

On Thu, Oct 04, 2012 at 12:15:47AM +0200, Ævar Arnfjörð Bjarmason wrote:

> I think he was wrong, I tested this on git.git by first creating a lot
> of tags:
> 
>      parallel --eta "git tag -a -m"{}" test-again-{}" ::: $(git rev-list HEAD)
> 
> Then doing:
> 
>     git pack-refs --all
>     git repack -A -d
> 
> And compiled with -g -O3 I get around 1.55 runs/s of git-upload-pack
> on 1.7.8 and 2.59/s on the master branch.

Thanks for the update, that's more like what I expected.

> FWIW here are my results on the above pathological git.git
> 
>     $ uname -r; perf --version; echo 0000 | perf record ./git-upload-pack . >/dev/null; perf report | grep -v ^# | head
>     3.2.0-2-amd64
>     perf version 3.2.17
>     [ perf record: Woken up 1 times to write data ]
>     [ perf record: Captured and wrote 0.026 MB perf.data (~1131 samples) ]
>         29.08%  git-upload-pack  libz.so.1.2.7       [.] inflate
>         17.99%  git-upload-pack  libz.so.1.2.7       [.] 0xaec1
>          6.21%  git-upload-pack  libc-2.13.so        [.] 0x117503
>          5.69%  git-upload-pack  libcrypto.so.1.0.0  [.] 0x82c3d
>          4.87%  git-upload-pack  git-upload-pack     [.] find_pack_entry_one
>          3.18%  git-upload-pack  ld-2.13.so          [.] 0x886e
>          2.96%  git-upload-pack  libc-2.13.so        [.] vfprintf
>          2.83%  git-upload-pack  git-upload-pack     [.] search_for_subdir
>          1.56%  git-upload-pack  [kernel.kallsyms]   [k] do_raw_spin_lock
>          1.36%  git-upload-pack  libc-2.13.so        [.] vsnprintf
> 
> I wonder why your report doesn't note any time in libz. This is on
> Debian testing, maybe your OS uses different strip settings so it
> doesn't show up?

Mine was on Debian unstable. The difference is probably that I have 400K
refs, but only 12K unique ones (this is the master alternates repo
containing every ref from every fork of rails/rails on GitHub). So I
spend proportionally more time fiddling with refs and outputting than
I do actually inflating tag objects.

Hmm. It seems like we should not need to open the tags at all. The main
reason is to produce the "peeled" advertisement just after it. But for a
packed ref with a modern version of git that supports the "peeled"
extension, we should already have that information.

The hack-ish patch below tries to reuse that. The interface is terrible,
and we should probably just pass the peel information via for_each_ref
(peel_ref tries to do a similar thing, but it also has a bad interface;
if we don't have the information already, it will redo the ref lookup.
We could probably get away with a peel_sha1 which uses the same
optimization trick as peel_ref).

With this patch my 800ms upload-pack drops to 600ms. I suspect it will
have an even greater impact for you, since you are spending much more of
your time on object loading than I am.

And note of course that while these micro-optimizations are neat, we're
still going to end up shipping quite a lot of data over the wire. Moving
to a protocol where we are advertising fewer refs would solve a lot more
problems in the long run.

---
diff --git a/refs.c b/refs.c
index 551a0f9..68eca3a 100644
--- a/refs.c
+++ b/refs.c
@@ -510,6 +510,14 @@ static struct ref_entry *current_ref;
 
 static struct ref_entry *current_ref;
 
+/* XXX horrible interface due to implied argument. not for real use */
+const unsigned char *peel_current_ref(void)
+{
+	if (!current_ref || !(current_ref->flag & REF_KNOWS_PEELED))
+		return NULL;
+	return current_ref->u.value.peeled;
+}
+
 static int do_one_ref(const char *base, each_ref_fn fn, int trim,
 		      int flags, void *cb_data, struct ref_entry *entry)
 {
diff --git a/refs.h b/refs.h
index 9d14558..88c5445 100644
--- a/refs.h
+++ b/refs.h
@@ -14,6 +14,8 @@ struct ref_lock {
 #define REF_ISPACKED 0x02
 #define REF_ISBROKEN 0x04
 
+const unsigned char *peel_current_ref(void);
+
 /*
  * Calls the specified function for each ref file until it returns
  * nonzero, and returns the value.  Please note that it is not safe to
diff --git a/upload-pack.c b/upload-pack.c
index 8f4703b..cdf43b0 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -736,8 +736,9 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 		" include-tag multi_ack_detailed";
 	struct object *o = lookup_unknown_object(sha1);
 	const char *refname_nons = strip_namespace(refname);
+	const unsigned char *peeled = peel_current_ref();
 
-	if (o->type == OBJ_NONE) {
+	if (!peeled && o->type == OBJ_NONE) {
 		o->type = sha1_object_info(sha1, NULL);
 		if (o->type < 0)
 		    die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
@@ -756,11 +757,13 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 		o->flags |= OUR_REF;
 		nr_our_refs++;
 	}
-	if (o->type == OBJ_TAG) {
+	if (!peeled && o->type == OBJ_TAG) {
 		o = deref_tag_noverify(o);
 		if (o)
-			packet_write(1, "%s %s^{}\n", sha1_to_hex(o->sha1), refname_nons);
+			peeled = o->sha1;
 	}
+	if (peeled && !is_null_sha1(peeled))
+		packet_write(1, "%s %s^{}\n", sha1_to_hex(peeled), refname_nons);
 	return 0;
 }
 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: upload-pack is slow with lots of refs
  2012-10-03 22:32   ` upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason
@ 2012-10-03 23:21     ` Jeff King
  2012-10-03 23:47       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-03 23:21 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

On Thu, Oct 04, 2012 at 12:32:35AM +0200, Ævar Arnfjörð Bjarmason wrote:

> On Wed, Oct 3, 2012 at 8:03 PM, Jeff King <peff@peff.net> wrote:
> > What version of git are you using?  In the past year or so, I've made
> > several tweaks to speed up large numbers of refs, including:
> >
> >   - cff38a5 (receive-pack: eliminate duplicate .have refs, v1.7.6); note
> >     that this only helps if they are being pulled in by an alternates
> >     repo. And even then, it only helps if they are mostly duplicates;
> >     distinct ones are still O(n^2).
> >
> >   - 7db8d53 (fetch-pack: avoid quadratic behavior in remove_duplicates)
> >     a0de288 (fetch-pack: avoid quadratic loop in filter_refs)
> >     Both in v1.7.11. I think there is still a potential quadratic loop
> >     in mark_complete()
> >
> >   - 90108a2 (upload-pack: avoid parsing tag destinations)
> >     926f1dd (upload-pack: avoid parsing objects during ref advertisement)
> >     Both in v1.7.10. Note that tag objects are more expensive to
> >     advertise than commits, because we have to load and peel them.
> >
> > Even with those patches, though, I found that it was something like ~2s
> > to advertise 100,000 refs.
> 
> FWIW I bisected between 1.7.9 and 1.7.10 and found that the point at
> which it went from 1.5/s to 2.5/s upload-pack runs on the pathological
> git.git repository was none of those, but:
> 
>     ccdc6037fe - parse_object: try internal cache before reading object db

Ah, yeah, I forgot about that one. That implies that you have a lot of
refs pointing to the same objects (since the benefit of that commit is
to avoid reading from disk when we have already seen it).
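The pattern behind that commit is simply "consult the in-memory object table
before paying for a disk read". A toy model of the idea, with hypothetical
names standing in for git's real object machinery:

```c
#include <assert.h>
#include <string.h>

/*
 * Toy model of the ccdc6037fe idea: try the internal cache before
 * reading from the object database.  All names here are made-up
 * stand-ins, not git's actual implementation.
 */
#define MAX_OBJS 16
#define OBJ_NONE 0
#define OBJ_TAG  4

struct object {
	char sha1[41];
	int type;  /* OBJ_NONE until parsed */
};

static struct object cache[MAX_OBJS];
static int cache_nr;
static int disk_reads;  /* counts expensive loads, for demonstration */

static struct object *lookup_object(const char *sha1)
{
	for (int i = 0; i < cache_nr; i++)
		if (!strcmp(cache[i].sha1, sha1))
			return &cache[i];
	return NULL;
}

/* stand-in for the expensive path: inflating the object from disk */
static int read_type_from_disk(const char *sha1)
{
	(void)sha1;
	disk_reads++;
	return OBJ_TAG;
}

static struct object *parse_object_cached(const char *sha1)
{
	struct object *o = lookup_object(sha1);

	if (o && o->type != OBJ_NONE)
		return o;  /* seen before: no disk access at all */
	if (!o) {
		o = &cache[cache_nr++];
		strncpy(o->sha1, sha1, 40);
		o->sha1[40] = '\0';
	}
	o->type = read_type_from_disk(sha1);
	return o;
}
```

With many refs pointing at the same objects, only the first lookup per object
pays the disk cost, which is why the commit helps most in fork/alternates
repositories.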

Out of curiosity, what does your repo contain? I saw a lot of speedup
with that commit because my repos are big object stores, where we have
the same duplicated tag refs for every fork of the repo.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: upload-pack is slow with lots of refs
  2012-10-03 23:21     ` Jeff King
@ 2012-10-03 23:47       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 38+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2012-10-03 23:47 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Thu, Oct 4, 2012 at 1:21 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Oct 04, 2012 at 12:32:35AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> On Wed, Oct 3, 2012 at 8:03 PM, Jeff King <peff@peff.net> wrote:
>> > What version of git are you using?  In the past year or so, I've made
>> > several tweaks to speed up large numbers of refs, including:
>> >
>> >   - cff38a5 (receive-pack: eliminate duplicate .have refs, v1.7.6); note
>> >     that this only helps if they are being pulled in by an alternates
>> >     repo. And even then, it only helps if they are mostly duplicates;
>> >     distinct ones are still O(n^2).
>> >
>> >   - 7db8d53 (fetch-pack: avoid quadratic behavior in remove_duplicates)
>> >     a0de288 (fetch-pack: avoid quadratic loop in filter_refs)
>> >     Both in v1.7.11. I think there is still a potential quadratic loop
>> >     in mark_complete()
>> >
>> >   - 90108a2 (upload-pack: avoid parsing tag destinations)
>> >     926f1dd (upload-pack: avoid parsing objects during ref advertisement)
>> >     Both in v1.7.10. Note that tag objects are more expensive to
>> >     advertise than commits, because we have to load and peel them.
>> >
>> > Even with those patches, though, I found that it was something like ~2s
>> > to advertise 100,000 refs.
>>
>> FWIW I bisected between 1.7.9 and 1.7.10 and found that the point at
>> which it went from 1.5/s to 2.5/s upload-pack runs on the pathological
>> git.git repository was none of those, but:
>>
>>     ccdc6037fe - parse_object: try internal cache before reading object db
>
> Ah, yeah, I forgot about that one. That implies that you have a lot of
> refs pointing to the same objects (since the benefit of that commit is
> to avoid reading from disk when we have already seen it).
>
> Out of curiosity, what does your repo contain? I saw a lot of speedup
> with that commit because my repos are big object stores, where we have
> the same duplicated tag refs for every fork of the repo.

Things are much faster with your monkeypatch; I got up to around 10
runs/s.

The repository mainly contains a lot of git-deploy[1] generated tags
which are added for every rollout to several subsystems.

Of the ~50k references in the repo 75% point to a commit that no other
reference points to. Around 98% of the references are annotated tags,
the rest are branches.

1. https://github.com/git-deploy/git-deploy

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: upload-pack is slow with lots of refs
  2012-10-03 23:15         ` Jeff King
@ 2012-10-03 23:54           ` Ævar Arnfjörð Bjarmason
  2012-10-04  7:56             ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2012-10-03 23:54 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Thu, Oct 4, 2012 at 1:15 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Oct 04, 2012 at 12:15:47AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> I think he was wrong, I tested this on git.git by first creating a lot
>> of tags:
>>
>>      parallel --eta "git tag -a -m"{}" test-again-{}" ::: $(git rev-list HEAD)
>>
>> Then doing:
>>
>>     git pack-refs --all
>>     git repack -A -d
>>
>> And compiled with -g -O3 I get around 1.55 runs/s of git-upload-pack
>> on 1.7.8 and 2.59/s on the master branch.
>
> Thanks for the update, that's more like what I expected.
>
>> FWIW here are my results on the above pathological git.git
>>
>>     $ uname -r; perf --version; echo 0000 | perf record
>> ./git-upload-pack .>/dev/null; perf report | grep -v ^# | head
>>     3.2.0-2-amd64
>>     perf version 3.2.17
>>     [ perf record: Woken up 1 times to write data ]
>>     [ perf record: Captured and wrote 0.026 MB perf.data (~1131 samples) ]
>>         29.08%  git-upload-pack  libz.so.1.2.7       [.] inflate
>>         17.99%  git-upload-pack  libz.so.1.2.7       [.] 0xaec1
>>          6.21%  git-upload-pack  libc-2.13.so        [.] 0x117503
>>          5.69%  git-upload-pack  libcrypto.so.1.0.0  [.] 0x82c3d
>>          4.87%  git-upload-pack  git-upload-pack     [.] find_pack_entry_one
>>          3.18%  git-upload-pack  ld-2.13.so          [.] 0x886e
>>          2.96%  git-upload-pack  libc-2.13.so        [.] vfprintf
>>          2.83%  git-upload-pack  git-upload-pack     [.] search_for_subdir
>>          1.56%  git-upload-pack  [kernel.kallsyms]   [k] do_raw_spin_lock
>>          1.36%  git-upload-pack  libc-2.13.so        [.] vsnprintf
>>
>> I wonder why your report doesn't note any time in libz. This is on
>> Debian testing, maybe your OS uses different strip settings so it
>> doesn't show up?
>
> Mine was on Debian unstable. The difference is probably that I have 400K
> refs, but only 12K unique ones (this is the master alternates repo
> containing every ref from every fork of rails/rails on GitHub). So I
> spend proportionally more time fiddling with refs and producing output
> than actually inflating tag objects.

An updated profile with your patch:

    $ uname -r; perf --version; echo 0000 | perf record
./git-upload-pack .>/dev/null; perf report | grep -v ^# | head
    3.2.0-2-amd64
    perf version 3.2.17
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.015 MB perf.data (~662 samples) ]
        14.45%  git-upload-pack  libc-2.13.so        [.] 0x78140
        12.13%  git-upload-pack  [kernel.kallsyms]   [k] walk_component
        11.01%  git-upload-pack  libc-2.13.so        [.] _IO_getline_info
        10.74%  git-upload-pack  git-upload-pack     [.] find_pack_entry_one
         8.96%  git-upload-pack  [kernel.kallsyms]   [k] __mmdrop
         8.64%  git-upload-pack  git-upload-pack     [.] sha1_to_hex
         6.73%  git-upload-pack  libc-2.13.so        [.] vfprintf
         4.07%  git-upload-pack  libc-2.13.so        [.] strchrnul
         4.00%  git-upload-pack  libc-2.13.so        [.] getenv
         3.37%  git-upload-pack  git-upload-pack     [.] packet_write

> Hmm. It seems like we should not need to open the tags at all. The main
> reason is to produce the "peeled" advertisement just after it. But for a
> packed ref with a modern version of git that supports the "peeled"
> extension, we should already have that information.

By the way, do you plan to submit this as a non-hack? I'd like to have
it in git.git, so if you're not going to I could pick it up and clean
it up a bit. But I think it would be better coming from you.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 0/4] optimizing upload-pack ref peeling
  2012-10-03 23:54           ` Ævar Arnfjörð Bjarmason
@ 2012-10-04  7:56             ` Jeff King
  2012-10-04  7:58               ` [PATCH 1/4] peel_ref: use faster deref_tag_noverify Jeff King
                                 ` (4 more replies)
  0 siblings, 5 replies; 38+ messages in thread
From: Jeff King @ 2012-10-04  7:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

On Thu, Oct 04, 2012 at 01:54:47AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Hmm. It seems like we should not need to open the tags at all. The main
> > reason is to produce the "peeled" advertisement just after it. But for a
> > packed ref with a modern version of git that supports the "peeled"
> > extension, we should already have that information.
> 
> B.t.w. do you plan to submit this as a non-hack, I'd like to have it
> in git.git, so if you're not going to I could pick it up and clean it
> up a bit. But I think it would be better coming from you.

Here's a cleaned-up version. I realized that peel_ref actually is quite
close to what we want; the key thing is that when you are calling it
from a for_each_ref loop it will not only use peeled information from
packed-refs (if available), but it will also short-circuit the ref
lookup for non-packed refs. So it really does what we want.
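The calling convention being relied on here (return 0 on success with the
peeled id filled in, -1 when there is nothing to peel) can be sketched with a
toy, table-driven stand-in for the real ref machinery; the names and ids below
are invented for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Toy model of the peel_ref() contract: 0 on success (peeled id
 * filled in), -1 when there is nothing to peel.  Everything here is
 * a made-up stand-in, not git's actual ref code.
 */
struct fake_ref {
	const char *name;
	const char *sha1;
	const char *peeled;  /* NULL when the ref is not an annotated tag */
};

static const struct fake_ref refs[] = {
	{ "refs/heads/master", "aaaaaaaa", NULL },
	{ "refs/tags/v1.0",    "bbbbbbbb", "cccccccc" },
};

static int fake_peel_ref(const char *refname, char *peeled_out, size_t outlen)
{
	for (size_t i = 0; i < sizeof(refs) / sizeof(refs[0]); i++) {
		if (strcmp(refs[i].name, refname))
			continue;
		if (!refs[i].peeled)
			return -1;  /* nothing to peel */
		snprintf(peeled_out, outlen, "%s", refs[i].peeled);
		return 0;
	}
	return -1;
}

/* advertise "<sha1> <ref>", plus "<peeled> <ref>^{}" for tags */
static int advertise(const struct fake_ref *r, char *out, size_t outlen)
{
	char peeled[64];
	int n = snprintf(out, outlen, "%s %s\n", r->sha1, r->name);

	if (!fake_peel_ref(r->name, peeled, sizeof(peeled)))
		n += snprintf(out + n, outlen - n, "%s %s^{}\n",
			      peeled, r->name);
	return n;
}
```

The advertiser never has to know whether the peeled id came from the
packed-refs cache or from actually parsing a tag; that is the whole point of
hiding the fallback inside the function.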

I put some timings into the final patch. The results are very satisfying
from a pure numbers standpoint. However, I'm not 100% sure this is all
that useful a benchmark. The repos where you get a noticeable benefit
are quite big, so I suspect the time to send the advertisements out over
the wire will dominate. Still, a CPU second saved is a CPU second
earned.

And the series actually ends up making the code cleaner and shares some
of the optimizations I put into upload-pack with other users of
peel_ref. So I think I'd be inclined to use it, even if the speedup
doesn't help that much in practice. We'll see what others think.

  [1/4]: peel_ref: use faster deref_tag_noverify
  [2/4]: peel_ref: do not return a null sha1
  [3/4]: peel_ref: check object type before loading
  [4/4]: upload-pack: use peel_ref for ref advertisements

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/4] peel_ref: use faster deref_tag_noverify
  2012-10-04  7:56             ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
@ 2012-10-04  7:58               ` Jeff King
  2012-10-04 18:24                 ` Junio C Hamano
  2012-10-04  8:00               ` [PATCH 2/4] peel_ref: do not return a null sha1 Jeff King
                                 ` (3 subsequent siblings)
  4 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-04  7:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

When we are asked to peel a ref to a sha1, we internally call
deref_tag, which will recursively parse each tagged object
until we reach a non-tag. This has the benefit that we will
verify our ability to load and parse the pointed-to object.

However, there is a performance downside: we may not need to
load that object at all (e.g., if we are simply listing
peeled refs), or it may be a large object
that should follow a streaming code path (e.g., an annotated
tag of a large blob).

It makes more sense for peel_ref to choose the fast thing
rather than performing the extra check, for two reasons:

  1. We will already sometimes short-circuit the tag parsing
     in favor of a peeled entry from a packed-refs file. So
     we are already favoring speed in some cases, and it is
     not wise for a caller to rely on peel_ref to detect
     corruption.

  2. We already silently ignore much larger corruptions,
     like a ref that points to a non-existent object, or a
     tag object that exists but is corrupted.

  3. peel_ref is not the right place to check for such
     database corruption. It is returning only the sha1
     anyway, not the actual object. Any callers which use
     that sha1 to load an object will soon discover the
     corruption anyway, so we are really just pushing back
     the discovery to later in the program.

Signed-off-by: Jeff King <peff@peff.net>
---
This is basically bringing the optimization from 90108a2 (upload-pack:
avoid parsing tag destinations, 2012-01-06) to users of peel_ref.

 refs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index da74a2b..0a916a0 100644
--- a/refs.c
+++ b/refs.c
@@ -1225,7 +1225,7 @@ fallback:
 fallback:
 	o = parse_object(base);
 	if (o && o->type == OBJ_TAG) {
-		o = deref_tag(o, refname, 0);
+		o = deref_tag_noverify(o);
 		if (o) {
 			hashcpy(sha1, o->sha1);
 			return 0;
-- 
1.8.0.rc0.10.g8dd2a92

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 2/4] peel_ref: do not return a null sha1
  2012-10-04  7:56             ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
  2012-10-04  7:58               ` [PATCH 1/4] peel_ref: use faster deref_tag_noverify Jeff King
@ 2012-10-04  8:00               ` Jeff King
  2012-10-04 18:32                 ` Junio C Hamano
  2012-10-04  8:02               ` [PATCH 3/4] peel_ref: check object type before loading Jeff King
                                 ` (2 subsequent siblings)
  4 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-04  8:00 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

The idea of the peel_ref function is to dereference tag
objects recursively until we hit a non-tag, and return the
sha1. Conceptually, it should return 0 if it is successful
(and fill in the sha1), or -1 if there was nothing to peel.

However, the current behavior is much more confusing. For a
regular loose ref, the behavior is as described above. But
there is an optimization to reuse the peeled-ref value for a
ref that came from a packed-refs file. If we have such a
ref, we return its peeled value, even if that peeled value
is null (indicating that we know the ref definitely does
_not_ peel).

It might seem like such information is useful to the caller,
who would then know not to bother loading and trying to peel
the object. Except that they should not bother loading and
trying to peel the object _anyway_, because that fallback is
already handled by peel_ref. In other words, the whole point
of calling this function is that it handles those details
internally, and you either get a sha1, or you know that it
is not peel-able.

This patch catches the null sha1 case internally and
converts it into a -1 return value (i.e., there is nothing
to peel). This simplifies callers, which do not need to
bother checking themselves.

Two callers are worth noting:

  - in pack-objects, a comment indicates that there is a
    difference between non-peelable tags and unannotated
    tags. But that is not the case (before or after this
    patch). Whether you get a null sha1 has to do with
    internal details of how peel_ref operated.

  - in show-ref, if peel_ref returns a failure, the caller
    tries to decide whether to try peeling manually based on
    whether the REF_ISPACKED flag is set. But this doesn't
    make any sense. If the flag is set, that does not
    necessarily mean the ref came from a packed-refs file
    with the "peeled" extension. But it doesn't matter,
    because even if it didn't, there's no point in trying to
    peel it ourselves, as peel_ref would already have done
    so. In other words, the fallback peeling is guaranteed
    to fail.

Signed-off-by: Jeff King <peff@peff.net>
---
Sorry for the long-winded explanation. I'd love it if somebody could
eyeball the code and make sure what I wrote above is correct. I was
mightily confused at first by the way that some of the callers treated
the return value. I think it is simply a case of code growing crufty as
features were added (originally the packed-refs optimization did not
exist, and was bolted on later), but I may have mis-read something.

 builtin/describe.c     |  2 +-
 builtin/pack-objects.c |  1 -
 builtin/show-ref.c     | 23 +++--------------------
 refs.c                 |  2 ++
 4 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 9fe11ed..04c185b 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -144,7 +144,7 @@ static int get_name(const char *path, const unsigned char *sha1, int flag, void
 	if (!all && !might_be_tag)
 		return 0;
 
-	if (!peel_ref(path, peeled) && !is_null_sha1(peeled)) {
+	if (!peel_ref(path, peeled)) {
 		is_tag = !!hashcmp(sha1, peeled);
 	} else {
 		hashcpy(peeled, sha1);
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5e14064..f069462 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2033,7 +2033,6 @@ static int add_ref_tag(const char *path, const unsigned char *sha1, int flag, vo
 
 	if (!prefixcmp(path, "refs/tags/") && /* is a tag? */
 	    !peel_ref(path, peeled)        && /* peelable? */
-	    !is_null_sha1(peeled)          && /* annotated tag? */
 	    locate_object_entry(peeled))      /* object packed? */
 		add_object_entry(sha1, OBJ_TAG, NULL, 0);
 	return 0;
diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index 4eb016d..8d9b76a 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -28,7 +28,6 @@ static int show_ref(const char *refname, const unsigned char *sha1, int flag, vo
 
 static int show_ref(const char *refname, const unsigned char *sha1, int flag, void *cbdata)
 {
-	struct object *obj;
 	const char *hex;
 	unsigned char peeled[20];
 
@@ -79,25 +78,9 @@ match:
 	if (!deref_tags)
 		return 0;
 
-	if ((flag & REF_ISPACKED) && !peel_ref(refname, peeled)) {
-		if (!is_null_sha1(peeled)) {
-			hex = find_unique_abbrev(peeled, abbrev);
-			printf("%s %s^{}\n", hex, refname);
-		}
-	}
-	else {
-		obj = parse_object(sha1);
-		if (!obj)
-			die("git show-ref: bad ref %s (%s)", refname,
-			    sha1_to_hex(sha1));
-		if (obj->type == OBJ_TAG) {
-			obj = deref_tag(obj, refname, 0);
-			if (!obj)
-				die("git show-ref: bad tag at ref %s (%s)", refname,
-				    sha1_to_hex(sha1));
-			hex = find_unique_abbrev(obj->sha1, abbrev);
-			printf("%s %s^{}\n", hex, refname);
-		}
+	if (!peel_ref(refname, peeled)) {
+		hex = find_unique_abbrev(peeled, abbrev);
+		printf("%s %s^{}\n", hex, refname);
 	}
 	return 0;
 }
diff --git a/refs.c b/refs.c
index 0a916a0..f672ad9 100644
--- a/refs.c
+++ b/refs.c
@@ -1202,6 +1202,8 @@ int peel_ref(const char *refname, unsigned char *sha1)
 	if (current_ref && (current_ref->name == refname
 		|| !strcmp(current_ref->name, refname))) {
 		if (current_ref->flag & REF_KNOWS_PEELED) {
+			if (is_null_sha1(current_ref->u.value.peeled))
+			    return -1;
 			hashcpy(sha1, current_ref->u.value.peeled);
 			return 0;
 		}
-- 
1.8.0.rc0.10.g8dd2a92

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 3/4] peel_ref: check object type before loading
  2012-10-04  7:56             ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
  2012-10-04  7:58               ` [PATCH 1/4] peel_ref: use faster deref_tag_noverify Jeff King
  2012-10-04  8:00               ` [PATCH 2/4] peel_ref: do not return a null sha1 Jeff King
@ 2012-10-04  8:02               ` Jeff King
  2012-10-04 19:06                 ` Junio C Hamano
  2012-10-04  8:03               ` [PATCH 4/4] upload-pack: use peel_ref for ref advertisements Jeff King
  2012-10-04  8:04               ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
  4 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-04  8:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

The point of peel_ref is to dereference tags; if the base
object is not a tag, then we can return early without even
loading the object into memory.

This patch accomplishes that by checking sha1_object_info
for the type. For a packed object, we can get away with just
looking in the pack index. For a loose object, we only need
to inflate the first couple of header bytes.

This is a bit of a gamble; if we do find a tag object, then
we will end up loading the content anyway, and the extra
lookup will have been wasteful. However, if it is not a tag
object, then we save loading the object entirely. Depending
on the ratio of non-tags to tags in the input, this can be a
minor win or minor loss.

However, it does give us one potential major win: if a ref
points to a large blob (e.g., via an unannotated tag), then
we can avoid looking at it entirely.

Signed-off-by: Jeff King <peff@peff.net>
---
This optimization is the one that gave me the most pause. While
upload-pack does call peel_ref on everything, the other callers all
constrain themselves to refs/tags/. So for many projects, we will be
calling it mostly on annotated tags, and it may be a very small net
loss. But in practice, it will not matter for most projects with a sane
number of normal tags, and saving even one accidental giant blob load
can have a huge impact.

 refs.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index f672ad9..02e47b1 100644
--- a/refs.c
+++ b/refs.c
@@ -1225,8 +1225,15 @@ fallback:
 	}
 
 fallback:
-	o = parse_object(base);
-	if (o && o->type == OBJ_TAG) {
+	o = lookup_unknown_object(base);
+	if (o->type == OBJ_NONE) {
+		int type = sha1_object_info(base, NULL);
+		if (type < 0)
+			return -1;
+		o->type = type;
+	}
+
+	if (o->type == OBJ_TAG) {
 		o = deref_tag_noverify(o);
 		if (o) {
 			hashcpy(sha1, o->sha1);
-- 
1.8.0.rc0.10.g8dd2a92

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 4/4] upload-pack: use peel_ref for ref advertisements
  2012-10-04  7:56             ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
                                 ` (2 preceding siblings ...)
  2012-10-04  8:02               ` [PATCH 3/4] peel_ref: check object type before loading Jeff King
@ 2012-10-04  8:03               ` Jeff King
  2012-10-04  8:04               ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
  4 siblings, 0 replies; 38+ messages in thread
From: Jeff King @ 2012-10-04  8:03 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

When upload-pack advertises refs, we attempt to peel tags
and advertise the peeled version. We currently hand-roll the
tag dereferencing, and use as many optimizations as we can
to avoid loading non-tag objects into memory.

Not only has peel_ref recently learned these optimizations,
but it also contains an even more important one: it
has access to the "peeled" data from the pack-refs file.
That means we can avoid not only loading annotated tags
entirely, but also avoid doing any kind of object lookup at
all.

This cut the CPU time to advertise refs by 50% in the
linux-2.6 repo, as measured by:

  echo 0000 | git-upload-pack . >/dev/null

best-of-five, warm cache, objects and refs fully packed:

  [before]             [after]
  real    0m0.026s     real    0m0.013s
  user    0m0.024s     user    0m0.008s
  sys     0m0.000s     sys     0m0.000s

Those numbers are irrelevantly small compared to an actual
fetch. Here's a larger repo (400K refs, of which 12K are
unique, and of which only 107 are unique annotated tags):

  [before]             [after]
  real    0m0.704s     real    0m0.596s
  user    0m0.600s     user    0m0.496s
  sys     0m0.096s     sys     0m0.092s

This shows only a 15% speedup (mostly because it has fewer
actual tags to parse), but a larger absolute value (100ms,
which isn't a lot compared to a real fetch, but this
advertisement happens on every fetch, even if the client is
just finding out they are completely up to date).

In truly pathological cases, where you have a large number
of unique annotated tags, it can make an even bigger
difference. Here are the numbers for a linux-2.6 repository
that has had every seventh commit tagged (so about 50K
tags):

  [before]             [after]
  real    0m0.443s     real    0m0.097s
  user    0m0.416s     user    0m0.080s
  sys     0m0.024s     sys     0m0.012s

Signed-off-by: Jeff King <peff@peff.net>
---
 upload-pack.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 2e90ccb..6142421 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -727,12 +727,7 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 		" include-tag multi_ack_detailed";
 	struct object *o = lookup_unknown_object(sha1);
 	const char *refname_nons = strip_namespace(refname);
-
-	if (o->type == OBJ_NONE) {
-		o->type = sha1_object_info(sha1, NULL);
-		if (o->type < 0)
-		    die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
-	}
+	unsigned char peeled[20];
 
 	if (capabilities)
 		packet_write(1, "%s %s%c%s%s agent=%s\n",
@@ -747,11 +742,8 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 		o->flags |= OUR_REF;
 		nr_our_refs++;
 	}
-	if (o->type == OBJ_TAG) {
-		o = deref_tag_noverify(o);
-		if (o)
-			packet_write(1, "%s %s^{}\n", sha1_to_hex(o->sha1), refname_nons);
-	}
+	if (!peel_ref(refname, peeled))
+		packet_write(1, "%s %s^{}\n", sha1_to_hex(peeled), refname_nons);
 	return 0;
 }
 
-- 
1.8.0.rc0.10.g8dd2a92

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/4] optimizing upload-pack ref peeling
  2012-10-04  7:56             ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
                                 ` (3 preceding siblings ...)
  2012-10-04  8:03               ` [PATCH 4/4] upload-pack: use peel_ref for ref advertisements Jeff King
@ 2012-10-04  8:04               ` Jeff King
  2012-10-04  9:01                 ` Ævar Arnfjörð Bjarmason
  4 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-04  8:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List

On Thu, Oct 04, 2012 at 03:56:09AM -0400, Jeff King wrote:

>   [1/4]: peel_ref: use faster deref_tag_noverify
>   [2/4]: peel_ref: do not return a null sha1
>   [3/4]: peel_ref: check object type before loading
>   [4/4]: upload-pack: use peel_ref for ref advertisements

I included my own timings in the final one, but my "pathological" case
at the end is a somewhat made-up attempt to emulate what you described.
Can you double-check that this series still has a nice impact on your
real-world repository?

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/4] optimizing upload-pack ref peeling
  2012-10-04  8:04               ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
@ 2012-10-04  9:01                 ` Ævar Arnfjörð Bjarmason
  2012-10-04 12:14                   ` Nazri Ramliy
  0 siblings, 1 reply; 38+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2012-10-04  9:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Thu, Oct 4, 2012 at 10:04 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Oct 04, 2012 at 03:56:09AM -0400, Jeff King wrote:
>
>>   [1/4]: peel_ref: use faster deref_tag_noverify
>>   [2/4]: peel_ref: do not return a null sha1
>>   [3/4]: peel_ref: check object type before loading
>>   [4/4]: upload-pack: use peel_ref for ref advertisements
>
> I included my own timings in the final one, but my "pathological" case
> at the end is a somewhat made-up attempt to emulate what you described.
> Can you double-check that this series still has a nice impact on your
> real-world repository?

It does, here's best of five for, all compiled with -g -O3:

v1.7.8:

    $ time (echo 0000 | ~/g/git/git-upload-pack . | pv >/dev/null)
    3.49MB 0:00:00 [ 5.3MB/s] [  <=>  ]

    real    0m0.660s
    user    0m0.604s
    sys     0m0.248s

master without your patches:

    $ time (echo 0000 | ~/g/git/git-upload-pack . | pv >/dev/null)
    3.49MB 0:00:00 [10.2MB/s] [  <=>  ]

    real    0m0.344s
    user    0m0.300s
    sys     0m0.172s

master with your patches:

    $ time (echo 0000 | ~/g/git/git-upload-pack . | pv >/dev/null)
    3.49MB 0:00:00 [31.8MB/s] [  <=>  ]

    real    0m0.113s
    user    0m0.088s
    sys     0m0.088s

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/4] optimizing upload-pack ref peeling
  2012-10-04  9:01                 ` Ævar Arnfjörð Bjarmason
@ 2012-10-04 12:14                   ` Nazri Ramliy
  0 siblings, 0 replies; 38+ messages in thread
From: Nazri Ramliy @ 2012-10-04 12:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jeff King, Git Mailing List

On Thu, Oct 4, 2012 at 5:01 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>     $ time (echo 0000 | ~/g/git/git-upload-pack . | pv >/dev/null)

Totally off-topic, but ...

Thanks for making me look up what pv is. I remember checking it out
quite some time ago and had forgotten about it altogether.

Now I can replace my poorly written, half-assed attempt at a shell script to do
something similar to what pv is already capable of!

Nazri.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/4] peel_ref: use faster deref_tag_noverify
  2012-10-04  7:58               ` [PATCH 1/4] peel_ref: use faster deref_tag_noverify Jeff King
@ 2012-10-04 18:24                 ` Junio C Hamano
  0 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2012-10-04 18:24 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

Jeff King <peff@peff.net> writes:

> When we are asked to peel a ref to a sha1, we internally call
> deref_tag, which will recursively parse each tagged object
> until we reach a non-tag. This has the benefit that we will
> verify our ability to load and parse the pointed-to object.
>
> However, there is a performance downside: we may not need to
> load that object at all (e.g., if we are simply listing
> peeled refs),...

Both correct.

I checked the existing callsites and nobody seems to assume that it
can use the pointed-to object without checking because peel_ref has
already checked, so it makes sense.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/4] peel_ref: do not return a null sha1
  2012-10-04  8:00               ` [PATCH 2/4] peel_ref: do not return a null sha1 Jeff King
@ 2012-10-04 18:32                 ` Junio C Hamano
  0 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2012-10-04 18:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

Jeff King <peff@peff.net> writes:

> The idea of the peel_ref function is to dereference tag
> objects recursively until we hit a non-tag, and return the
> sha1. Conceptually, it should return 0 if it is successful
> (and fill in the sha1), or -1 if there was nothing to peel.
>
> However, the current behavior is much more confusing. For a
> regular loose ref, the behavior is as described above. But
> there is an optimization to reuse the peeled-ref value for a
> ref that came from a packed-refs file. If we have such a
> ref, we return its peeled value, even if that peeled value
> is null (indicating that we know the ref definitely does
> _not_ peel).
>
> It might seem like such information is useful to the caller,
> who would then know not to bother loading and trying to peel
> the object. Except that they should not bother loading and
> trying to peel the object _anyway_, because that fallback is
> already handled by peel_ref. In other words, the whole point
> of calling this function is that it handles those details
> internally, and you either get a sha1, or you know that it
> is not peel-able.
>
> This patch catches the null sha1 case internally and
> converts it into a -1 return value (i.e., there is nothing
> to peel). This simplifies callers, which do not need to
> bother checking themselves.
>
> Two callers are worth noting:
>
>   - in pack-objects, a comment indicates that there is a
>     difference between non-peelable tags and unannotated
>     tags. But that is not the case (before or after this
>     patch). Whether you get a null sha1 has to do with
>     internal details of how peel_ref operated.

Yeah, this is what I was wondering while reviewing [1/4].

We traditionally said both HEAD^0 and HEAD^0^0 peel to HEAD, but
this, at least at the internal API level, redefines them as "these do
not peel at all, and that is a failure".  In other words, a
peel_ref(ref, sha1) call that returns 0 is a way to see if ref points
at a real tag object, and if the function returns non-zero, the caller
cannot tell whether the failure is because it was a valid commit or a
corrupt object.

The check !is_null_sha1(peeled) always looked fishy to me.  Good
riddance.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] peel_ref: check object type before loading
  2012-10-04  8:02               ` [PATCH 3/4] peel_ref: check object type before loading Jeff King
@ 2012-10-04 19:06                 ` Junio C Hamano
  2012-10-04 19:41                   ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-10-04 19:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

Jeff King <peff@peff.net> writes:

> The point of peel_ref is to dereference tags; if the base
> object is not a tag, then we can return early without even
> loading the object into memory.
>
> This patch accomplishes that by checking sha1_object_info
> for the type. For a packed object, we can get away with just
> looking in the pack index. For a loose object, we only need
> to inflate the first couple of header bytes.

We look at the pack index and have to follow its delta chain down to
the base to find its type; if the object is deeply deltified, this
certainly is an overall loss.

The only case sha1_object_info() could work well for an object near
the tip of a deep delta chain is to find its size, as the diff-delta
encodes the size of the base and the size of the result of applying
the delta to the base, so you do not have to follow the chain when
you are only interested in the final size.

But alas nobody calls sha1_object_info() for only size but not type
(there are some callers that are interested in only type but not
size).

> This is a bit of a gamble; if we do find a tag object, then
> we will end up loading the content anyway, and the extra
> lookup will have been wasteful. However, if it is not a tag
> object, then we save loading the object entirely. Depending
> on the ratio of non-tags to tags in the input, this can be a
> minor win or minor loss.
>
> However, it does give us one potential major win: if a ref
> points to a large blob (e.g., via an unannotated tag), then
> we can avoid looking at it entirely.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> This optimization is the one that gave me the most pause. While
> upload-pack does call peel_ref on everything, the other callers all
> constrain themselves to refs/tags/. So for many projects, we will be
> calling it mostly on annotated tags, and it may be a very small net
> loss. But in practice, it will not matter for most projects with a sane
> number of normal tags, and saving even one accidental giant blob load
> can have a huge impact.

I may be missing something, but the above description is not
convincing to me.  When was the last time you pointed a ref directly
at a blob, whether large or small, and whether within refs/tags or
outside?

>
>  refs.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index f672ad9..02e47b1 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1225,8 +1225,15 @@ fallback:
>  	}
>  
>  fallback:
> -	o = parse_object(base);
> -	if (o && o->type == OBJ_TAG) {
> +	o = lookup_unknown_object(base);
> +	if (o->type == OBJ_NONE) {
> +		int type = sha1_object_info(base, NULL);
> +		if (type < 0)
> +			return -1;
> +		o->type = type;
> +	}
> +
> +	if (o->type == OBJ_TAG) {
>  		o = deref_tag_noverify(o);
>  		if (o) {
>  			hashcpy(sha1, o->sha1);

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] peel_ref: check object type before loading
  2012-10-04 19:06                 ` Junio C Hamano
@ 2012-10-04 19:41                   ` Jeff King
  2012-10-04 20:41                     ` Junio C Hamano
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2012-10-04 19:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

On Thu, Oct 04, 2012 at 12:06:16PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > The point of peel_ref is to dereference tags; if the base
> > object is not a tag, then we can return early without even
> > loading the object into memory.
> >
> > This patch accomplishes that by checking sha1_object_info
> > for the type. For a packed object, we can get away with just
> > looking in the pack index. For a loose object, we only need
> > to inflate the first couple of header bytes.
> 
> We look at the pack index and have to follow its delta chain down to
> the base to find its type; if the object is deeply deltified, this
> certainly is an overall loss.

Yes. Although in practice I would expect most things pointed to by refs
to not be delta'd. And indeed, comparing "git index-pack -v" to "git
for-each-ref" in my git.git repository yields 950 non-delta tips and
only 42 delta (and those are mostly rebased commit objects). In my
linux-2.6 repo, I have 0 delta ref tips (probably because I do not
actually do kernel work, but keep it around for querying).

The numbers I mentioned in 926f1dd (which did this exact same
optimization for upload-pack) showed a slight improvement in practice.
Because keep in mind that the alternative to this multi-step delta
lookup for the type is to load the object. Which will do the _same_
lookup, but will also have to actually compute the delta.

So the only real loss is that you have to repeat the lookup (or in the
case of a loose object, re-open and re-inflate the first 32 bytes). The
best way to do this would be to have sha1_object_info return some kind
of handle to the object where we could get the meta info cheaply, and
then reuse the handle to get the actual object data later (so for a
packed object, cache the actual pack location; for a loose object, keep
the descriptor and zlib stream open).

Of course, the speed up in 926f1dd probably won't matter anymore because
the whole point of this series is to avoid touching the object at all.
So hitting sha1_object_info in peel_ref is already the slow path. On
most objects, an optimization like this will not be noticeable either
way. The interesting thing is that we can avoid touching a large or
expensive object at all.

> > This optimization is the one that gave me the most pause. While
> > upload-pack does call peel_ref on everything, the other callers all
> > constrain themselves to refs/tags/. So for many projects, we will be
> > calling it mostly on annotated tags, and it may be a very small net
> > loss. But in practice, it will not matter for most projects with a sane
> > number of normal tags, and saving even one accidental giant blob load
> > can have a huge impact.
> 
> I may be missing something, but the above description is not
> convincing to me.  When was the last time you pointed a blob
> directly with a ref, whether large or small, and whether within
> refs/tags or outside?

Not often (though you have such a ref in git.git). I started down this
road because just applying patch 4/4 produces a regression in the test
suite. t1050 tries to clone a tag pointing to a large object with
GIT_ALLOC_LIMIT set. Using peel_ref regresses this to actually load the
whole object into memory.

That particular regression is fixed by 1/1 (because it's an annotated
tag pointing to the large object). This would fix the same case if it
were an unannotated tag.

Now is that a real-world test? No, I'm not sure it is. But I consider it
a win for large-object git[1] every time we can avoid loading an object,
because it only takes one unnecessary load to ruin the user experience.
So even if it's unlikely, if it isn't hurting the regular code path, I
think it's worth doing (and I'm not sure it is hurting; my numbers
showed that it may be helping, though I suspect it is doing nothing
after my 4/4).

-Peff

[1] One thing I've been toying with is "external alternates": dumping
    your large objects in some relatively slow data store (e.g., a
    RESTful HTTP service). You could cache and cheaply query a list of
    "sha1 / size / type" for each object from the store, but getting the
    actual objects would be much more expensive. But again, it would
    depend on whether you would actually have such a store directly
    accessible by a ref.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] peel_ref: check object type before loading
  2012-10-04 19:41                   ` Jeff King
@ 2012-10-04 20:41                     ` Junio C Hamano
  2012-10-04 21:59                       ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-10-04 20:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

Jeff King <peff@peff.net> writes:

> [1] One thing I've been toying with is "external alternates": dumping
>     your large objects in some relatively slow data store (e.g., a
>     RESTful HTTP service). You could cache and cheaply query a list of
>     "sha1 / size / type" for each object from the store, but getting the
>     actual objects would be much more expensive. But again, it would
>     depend on whether you would actually have such a store directly
>     accessible by a ref.

Yeah, that actually has been another thing we were discussing
locally, without coming to something concrete enough to present to
the list.

The basic idea is to mark such paths with attributes, and use a
variant of smudge/clean filter that is _not_ a filter (as we do not
want to have the interface to this external helper to be "we feed
the whole big blob to you").  Instead, these smudgex/cleanx things
work on a pathname.

 - Your in-tree objects store a blob that records a description of
   the large thing.  Call such a blob a surrogate.  "clone", "fetch"
   and "push" all deal only with surrogates so your in-history data
   will stay small.

 - When checking out, the attributes mechanism kicks in and runs the
   "not filter" variant of smudge with the data in the surrogate.

   The surrogate records how to get the real thing from where, and
   how to validate what you got is correct.  A hand-wavy example may
   look like this:

   	get: download http://cdn.example.com/67def20
        sha1sum: f84667def209e4a84e37e8488a08e9eca3f208c1

   to tell you to download a single URL with whatever means suitable
   for your platform (perhaps curl or wget), and verify the result
   by running sha1sum.  Or it may involve

	get: git-fetch git://git.example.com/images.git/ master
        object: 85a094f22f02c54c740448f6716da608a5e89a80

   to tell you to "git fetch" from the given git-reachable resource
   into some place and grab the object via "git cat-file", possibly
   streaming it out.  The details do not matter at this point in the
   design process.

   The smudgex helper is responsible for caching previously fetched
   large contents, maintaining association between the surrogate
   blob and its real data, so that once the real thing is
   downloaded, and the contents of the path needs to change to
   something else (e.g. user checks out a different branch) and
   then change to the previous thing again (e.g. user comes back to
   the original branch), it does not download it again.

 - When checking if the working tree is clean relative to the index,
   the smudgex/cleanx helper will be consulted.  It will be given
   the surrogate data in the index and the path in the working tree.
   We may want to allow the helper implementation to give a read-only
   hardlink directly into helper's cache storage, so that it can
   consult its database of surrogate-to-real mapping and perform
   this verification cheaply by inode comparison, or something.

 - When running "git add" on modified large content prepared in the
   working tree, the cleanx helper is called to prepare a new
   surrogate, and that is what is registered in the index.  The helper
   is also responsible for storing the new large content away and
   arranging for it to be retrievable when others see and use this
   surrogate.

The initial scope of supporting something like that in core-git
would be to add the necessary infrastructure to arrange for such
smudgex and cleanx helpers to be called when a path is marked as a
surrogate in the attribute system, and to supply a sample helper.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: upload-pack is slow with lots of refs
  2012-10-03 20:13         ` Jeff King
@ 2012-10-04 21:52           ` Sascha Cunz
  2012-10-05  0:20             ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Sascha Cunz @ 2012-10-04 21:52 UTC (permalink / raw)
  To: Jeff King
  Cc: Shawn Pearce, Junio C Hamano, Ævar Arnfjörð,
	Git Mailing List

On Wednesday, October 3, 2012, 16:13:16, Jeff King wrote:
> On Wed, Oct 03, 2012 at 12:41:38PM -0700, Shawn O. Pearce wrote:
> > > Out of curiosity, how are you thinking about triggering such a new
> > > behavior in a backwards-compatible way? Invoke git-upload-pack2, and
> > > fall back to reconnecting to start git-upload-pack if it fails?
> > 
> > Basically, yes. New clients connect for git-upload-pack2. Over git://
> > the remote peer will just close the TCP socket with no messages. The
> > client can fallback to git-upload-pack and try again. Over SSH a
> > similar thing will happen in the sense there is no data output from
> > the remote side, so the client can try again. This has the downside of
> > authentication twice over SSH, which may prompt for a password twice.
> > But the user can get out of this by setting remote.NAME.uploadpack =
> > git-upload-pack and thus force the Git client to use the current
> > protocol if they have a new client and must continue to work over SSH
> > with an old server, and don't use an ssh-agent.
> 
> It's a shame that we have to reestablish the TCP or ssh connection to do
> the retry. The password thing is annoying, but also it just wastes a
> round-trip. It means we'd probably want to default the v2 probe to off
> (and let the user turn it on for a specific remote) until v2 is much
> more common than v1. Otherwise everyone pays the price.

Would it be possible to use this workflow:

- Every client connects to v1 by default

- If server is capable of v2, it sends a flag along with the usual response
  (A v1 server will obviously not send that flag)

- If client is also capable of v2 and gets the flag, it enables v2 for
  just that remote (probably unless the user said, "i never want to")

- Next time the client connects to that remote it will use v2.

I'm not sure if this is possible, since I seem to remember having read
something in the Documentation folder along the lines of: capabilities
announced by the server mean "I want you to use exactly these flags".

Sascha

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/4] peel_ref: check object type before loading
  2012-10-04 20:41                     ` Junio C Hamano
@ 2012-10-04 21:59                       ` Jeff King
  0 siblings, 0 replies; 38+ messages in thread
From: Jeff King @ 2012-10-04 21:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

On Thu, Oct 04, 2012 at 01:41:40PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > [1] One thing I've been toying with is "external alternates": dumping
> >     your large objects in some relatively slow data store (e.g., a
> >     RESTful HTTP service). You could cache and cheaply query a list of
> >     "sha1 / size / type" for each object from the store, but getting the
> >     actual objects would be much more expensive. But again, it would
> >     depend on whether you would actually have such a store directly
> >     accessible by a ref.
> 
> Yeah, that actually has been another thing we were discussing
> locally, without coming to something concrete enough to present to
> the list.
> 
> The basic idea is to mark such paths with attributes, and use a
> variant of smudge/clean filter that is _not_ a filter (as we do not
> want to have the interface to this external helper to be "we feed
> the whole big blob to you").  Instead, these smudgex/cleanx things
> work on a pathname.

There are a few ways that smudge/clean filters are not up to the task
currently. One is definitely the efficiency of the interface.

Another is that this distinction is not "canonical in-repo versus
working tree representation".  Yes, part of it is about what you store
in the repo versus what you checkout. But you may also want to diff or
merge these large items. Almost certainly not with the internal text
tools, but you might want to feed the _real_ blobs (not the fake
surrogates) to external tools, and smudge and clean are not a natural
fit for that. If you teach the external tools to dereference the
surrogates, it would work. I think you could also get around it by
giving these smudgex/cleanx filters slightly different semantics.

But the thing I most dislike about a solution like this is that the
representation pollutes git's data model. That is, git stores a blob
representing the surrogate, and that is the official sha1 that goes into
the tree. That means your surrogate decisions are locked into history
forever, which has a few implications:

  1. If I store a surrogate and you store the real blob, we do not get
     the same tree (and in fact get conflicts).

  2. Once I store a blob, I can never revise my decision not to store
     that blob in every subsequent clone without rewriting history. I
     can convert it into a surrogate, but old versions of that blob will
     always be required for object connectivity.

  3. If your surrogate contains more than just the sha1 (e.g., if it
     points to http://cdn.example.com), your history is forever tied to
     that (and you are stuck with configuring run-time redirects if your
     URL changes).

My thinking was to make it all happen below the git object level, at the
same level as packs and loose objects. This is more invasive to git, but
much more flexible.

The basic implementation is pretty straightforward. You configure one or
more helpers which have two operations: provide a list of sha1s, and
fetch a single sha1. Whenever an object lookup fails in the packs and
loose objects, we check the helper lists.  We can cache the lists in
some mmap'able format similar to a pack index, so we can do quick checks
for object existence, type, and size. And if the lookup actually needs
the object, we fetch and cache (where the caching policy would all be
determined by the external script).

The real challenges are:

  1. It is expensive to meet normal reachability guarantees. So you
     would not want a remote to delta against a blob that you _could_
     get, but do not have.

  2. You need to tell remotes that you can access some objects in a
     different way, and not to send them as part of an object transfer.

  3. Many commands need to be taught not to load objects carelessly. For
     the most part, we do well with this because it's already expensive
     to load large objects from disk. I think we've got most of them on
     the diff code path, but I wouldn't be surprised if one or two more
     crop up. Fsck would need to learn to handle these objects
     differently.

Item (3) is really just about trying it and seeing where the problems
come up. For items (1) and (2), we'd need a protocol extension.

Having the receiver send something like "I have these blobs from my external
source, don't send them" is a nice idea, but it doesn't scale well (you
have to advertise the whole list for each transfer, because the receiver
doesn't know which ones are actually referenced).

Having the receiver say "I have access to external database $FOO, go check
the list of objects there and don't send anything it has" unnecessarily
ties the sender to the external database (i.e., they have to implement
$FOO for it to work).

Something simple like "do not send blobs larger than N bytes, nor make
deltas against such blobs" would work. It does mean that your value of N
really needs to match up with what goes into your external database, but
I don't think that will be a huge problem in practice (after the
transfer you can do a consistency check that between the fetched objects
and the external db, you have everything).

> [details on smudgex/cleanx]

All of what you wrote seems very sane; I think the real question is
whether this should be "above" or "below" git's user-visible data model
layer.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: upload-pack is slow with lots of refs
  2012-10-04 21:52           ` Sascha Cunz
@ 2012-10-05  0:20             ` Jeff King
  0 siblings, 0 replies; 38+ messages in thread
From: Jeff King @ 2012-10-05  0:20 UTC (permalink / raw)
  To: Sascha Cunz
  Cc: Shawn Pearce, Junio C Hamano, Ævar Arnfjörð,
	Git Mailing List

On Thu, Oct 04, 2012 at 11:52:13PM +0200, Sascha Cunz wrote:

> Would it be possible to use this workflow:
> 
> - Every client connects to v1 by default
> 
> - If server is capable of v2, it sends a flag along with the usual response
>   (A v1 server will obviously not send that flag)

That is more or less the strategy we use for existing extensions (your
"flag" is a space-separated list of capability strings). But in this
case, the idea would be to change what the "usual response" is. Since a
v1 client would be expecting the response, we must send it, but at that
point it is too late to make the change. So we need to see some flag
from the client before the server says anything.

And the problem is that the client sending that flag will break v1
servers, and the client would need to waste time doing a retry when
connecting to the (initially more common) v1 servers.

> - If client is also capable of v2 and gets the flag, it enables v2 for
>   just that remote (probably unless the user said, "i never want to")
> 
> - Next time the client connects to that remote it will use v2.

So yeah, that would work to help with the wasted time. We'd have
git-upload-pack2 to do the v2 protocol, but the v1 git-upload-pack for
the server would say "by the way, next time you connect, try v2 first".
So the client would have to store a version number for each remote.
Which is not too onerous.

Another way to think of it is phasing it in like this:

  1. Add v2 support to client and server. Initially, clients try only
     v1.

  2. Add a remote.*.preferProtocol config option, defaulting to v1. This
     lets people turn on v2 for remotes they know support it. If v2
     fails, still fall back to v1.

  3. Add a server upload-pack capability that says "by the way, try v2
     next time".  Have the client set the preferProtocol config option
     for a remote if we see that capability.

  4. Wait a while until v2 is very popular.

  5. Switch the default for preferProtocol to v2 (but still fall back to
     v1).

So always fall back and remain compatible, and let the config option
just be an optimization to avoid extra failed requests.

> I'm not sure if this is possible, since I seem to remember having read
> something in the Documentation folder along the lines of: capabilities
> announced by the server mean "I want you to use exactly these flags".

No, the server capability says "I can do this", and the client should
respond with "I want you to do this". Because the server might be
talking to an older client that does not know what "this" is, it must
handle the case that the capability does not come back.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: upload-pack is slow with lots of refs
  2012-10-03 19:41       ` Shawn Pearce
  2012-10-03 20:13         ` Jeff King
@ 2012-10-05  6:24         ` Johannes Sixt
  2012-10-05 16:57           ` Shawn Pearce
  1 sibling, 1 reply; 38+ messages in thread
From: Johannes Sixt @ 2012-10-05  6:24 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð,
	Git Mailing List

On 10/3/2012 21:41, Shawn Pearce wrote:
> On Wed, Oct 3, 2012 at 11:55 AM, Jeff King <peff@peff.net> wrote:
>> On Wed, Oct 03, 2012 at 11:53:35AM -0700, Junio C Hamano wrote:
>>> Jeff King <peff@peff.net> writes:
>>>
>>>>> Has there been any work on extending the protocol so that the client
>>>>> tells the server what refs it's interested in?
>>>>
>>>> I don't think so. It would be hard to do in a backwards-compatible way,
>>>> because the advertisement is the first thing the server says, before it
>>>> has negotiated any capabilities with the client at all.
>>>
>>> That is being discussed but hasn't surfaced on the list.
>>
>> Out of curiosity, how are you thinking about triggering such a new
>> behavior in a backwards-compatible way? Invoke git-upload-pack2, and
>> fall back to reconnecting to start git-upload-pack if it fails?
> 
> Basically, yes. New clients connect for git-upload-pack2. Over git://
> the remote peer will just close the TCP socket with no messages. The
> client can fallback to git-upload-pack and try again. Over SSH a
> similar thing will happen in the sense there is no data output from
> the remote side, so the client can try again.

These connections are bidirectional. Upload-pack can just start
advertising refs in the "v1" way and announce a "v2" capability and listen
for response in parallel. A v2 capable client can start sending "wants" or
some other signal as soon as it sees the "v2" capability. Upload-pack,
which was listening for responses in parallel, can interrupt its
advertisements and continue with v2 protocol from here.

This sounds so simple (not the implementation, of course) - I must be
missing something.

-- Hannes

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: upload-pack is slow with lots of refs
  2012-10-05  6:24         ` Johannes Sixt
@ 2012-10-05 16:57           ` Shawn Pearce
  2012-10-08 15:05             ` Johannes Sixt
  0 siblings, 1 reply; 38+ messages in thread
From: Shawn Pearce @ 2012-10-05 16:57 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð,
	Git Mailing List

On Thu, Oct 4, 2012 at 11:24 PM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> On 10/3/2012 21:41, Shawn Pearce wrote:
>> On Wed, Oct 3, 2012 at 11:55 AM, Jeff King <peff@peff.net> wrote:
>>> On Wed, Oct 03, 2012 at 11:53:35AM -0700, Junio C Hamano wrote:
>>>> Jeff King <peff@peff.net> writes:
>>>>
>>>>>> Has there been any work on extending the protocol so that the client
>>>>>> tells the server what refs it's interested in?
>>>>>
>>>>> I don't think so. It would be hard to do in a backwards-compatible way,
>>>>> because the advertisement is the first thing the server says, before it
>>>>> has negotiated any capabilities with the client at all.
>>>>
>>>> That is being discussed but hasn't surfaced on the list.
>>>
>>> Out of curiosity, how are you thinking about triggering such a new
>>> behavior in a backwards-compatible way? Invoke git-upload-pack2, and
>>> fall back to reconnecting to start git-upload-pack if it fails?
>>
>> Basically, yes. New clients connect for git-upload-pack2. Over git://
>> the remote peer will just close the TCP socket with no messages. The
>> client can fallback to git-upload-pack and try again. Over SSH a
>> similar thing will happen in the sense there is no data output from
>> the remote side, so the client can try again.
>
> These connections are bidirectional.

Smart HTTP is not bidirectional.

> Upload-pack can just start
> advertising refs in the "v1" way and announce a "v2" capability and listen
> for response in parallel. A v2 capable client can start sending "wants" or
> some other signal as soon as it sees the "v2" capability. Upload-pack,
> which was listening for responses in parallel, can interrupt its
> advertisements and continue with v2 protocol from here.
>
> This sounds so simple (not the implementation, of course) - I must be
> missing something.

Smart HTTP is not bidirectional. The client can't cut off the server.
It's also more complex to code the server to listen for a stop command
from the client at the same time it is blasting out useless
references to the client.


* Re: upload-pack is slow with lots of refs
  2012-10-05 16:57           ` Shawn Pearce
@ 2012-10-08 15:05             ` Johannes Sixt
  2012-10-09  6:46               ` Shawn Pearce
  0 siblings, 1 reply; 38+ messages in thread
From: Johannes Sixt @ 2012-10-08 15:05 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð,
	Git Mailing List

Am 05.10.2012 18:57, schrieb Shawn Pearce:
> On Thu, Oct 4, 2012 at 11:24 PM, Johannes Sixt <j.sixt@viscovery.net> wrote:
>> Upload-pack can just start
>> advertising refs in the "v1" way and announce a "v2" capability and listen
>> for response in parallel. A v2 capable client can start sending "wants" or
>> some other signal as soon as it sees the "v2" capability. Upload-pack,
>> which was listening for responses in parallel, can interrupt its
>> advertisements and continue with v2 protocol from here.
>>
>> This sounds so simple (not the implementation, of course) - I must be
>> missing something.
> 
> Smart HTTP is not bidirectional. The client can't cut off the server.

Smart HTTP does not need it: you already posted a better solution (I'm
referring to "&v=2").

> Its also more complex to code the server to listen for a stop command
> from the client at the same time the server is blasting out useless
> references to the client.

At least the server side does not seem to be that complex; see below.
Granted, the server will have blasted out some refs by then, but I'm
confident that in practice the client will be able to signal v2
capability after only a few packets of advertisements. You can switch
on TCP_NODELAY for the first line with the capabilities to ensure it
goes out on the wire ASAP.
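
(For illustration only, a minimal sketch of that TCP_NODELAY tweak.
This is not part of the patch below, and it assumes the descriptor is
a TCP socket; under ssh it is a pipe, where setsockopt simply fails
and can be ignored:)

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Best-effort: enable TCP_NODELAY so the first pkt-line carrying the
 * capability list is pushed onto the wire immediately instead of
 * waiting for Nagle's algorithm to coalesce it with later refs. */
static void enable_nodelay(int fd)
{
	int one = 1;

	/* Ignore errors: fd may be a pipe (ssh), not a socket. */
	setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```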

diff --git a/upload-pack.c b/upload-pack.c
index 2e90ccb..c29ae04 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -720,11 +720,20 @@ static void receive_needs(void)
 	free(shallows.objects);
 }
 
+static int client_spoke(void)
+{
+	struct pollfd pfd;
+	pfd.fd = 0;
+	pfd.events = POLLIN;
+	return poll(&pfd, 1, 0) > 0 &&
+		(pfd.revents & (POLLIN|POLLHUP));
+}
+
 static int send_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
 {
 	static const char *capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
-		" include-tag multi_ack_detailed";
+		" include-tag multi_ack_detailed version2";
 	struct object *o = lookup_unknown_object(sha1);
 	const char *refname_nons = strip_namespace(refname);
 
@@ -752,7 +761,8 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 		if (o)
 			packet_write(1, "%s %s^{}\n", sha1_to_hex(o->sha1), refname_nons);
 	}
-	return 0;
+
+	return client_spoke();
 }
 
 static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
@@ -771,8 +781,14 @@ static void upload_pack(void)
 {
 	if (advertise_refs || !stateless_rpc) {
 		reset_timeout();
-		head_ref_namespaced(send_ref, NULL);
-		for_each_namespaced_ref(send_ref, NULL);
+		if (head_ref_namespaced(send_ref, NULL) ||
+		    for_each_namespaced_ref(send_ref, NULL)) {
+			/*
+			 * TODO: continue with protocol version 2
+			 * optimization: do not send refs
+			 * that were already sent
+			 */
+		}
 		packet_flush(1);
 	} else {
 		head_ref_namespaced(mark_our_ref, NULL);


* Re: upload-pack is slow with lots of refs
  2012-10-08 15:05             ` Johannes Sixt
@ 2012-10-09  6:46               ` Shawn Pearce
  2012-10-09 20:30                 ` Johannes Sixt
  0 siblings, 1 reply; 38+ messages in thread
From: Shawn Pearce @ 2012-10-09  6:46 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð,
	Git Mailing List

On Mon, Oct 8, 2012 at 8:05 AM, Johannes Sixt <j6t@kdbg.org> wrote:
> Am 05.10.2012 18:57, schrieb Shawn Pearce:
>> On Thu, Oct 4, 2012 at 11:24 PM, Johannes Sixt <j.sixt@viscovery.net> wrote:
>>> Upload-pack can just start
>>> advertising refs in the "v1" way and announce a "v2" capability and listen
>>> for response in parallel. A v2 capable client can start sending "wants" or
>>> some other signal as soon as it sees the "v2" capability. Upload-pack,
>>> which was listening for responses in parallel, can interrupt its
>>> advertisements and continue with v2 protocol from here.
>>>
>>> This sounds so simple (not the implementation, of course) - I must be
>>> missing something.
>>
>> Smart HTTP is not bidirectional. The client can't cut off the server.
>
> Smart HTTP does not need it: you already posted a better solution (I'm
> refering to "&v=2").

Yes but then it diverges even further from the native bidirectional protocol.

>> Its also more complex to code the server to listen for a stop command
>> from the client at the same time the server is blasting out useless
>> references to the client.
>
> At least the server side does not seem to be that complex. See below.
> Of course, the server blasted out some refs, but I'm confident that in
> practice the client will be able to signal v2 capability after a few packets
> of advertisements. You can switch on TCP_NODELAY for the first line with
> the capabilities to ensure it goes out on the wire ASAP.
...
> +static int client_spoke(void)
> +{
> +       struct pollfd pfd;
> +       pfd.fd = 0;
> +       pfd.events = POLLIN;
> +       return poll(&pfd, 1, 0) > 0 &&
> +               (pfd.revents & (POLLIN|POLLHUP));

Except doing this in Java is harder on an arbitrary InputStream type.
I guess we really only care about basic TCP, where we can use NIO to
implement an emulation of poll, and SSH, where MINA SSHD probably
doesn't provide a way to see if the client has given us data without
blocking. That makes supporting v2 really hard in e.g. Gerrit Code
Review. You could argue that it's improper to attempt to implement a
network protocol in a language whose standard libraries have gone out
of their way to prevent you from polling to see whether data is
immediately available, but I prefer to ignore such arguments.

As it turns out we don't really have this problem with git://. Clients
can bury a v2 request in the extended headers where the host line
appears today. It's a bit tricky because of that \0 bug causing
infinite looping, but IIRC using \0\0 is safe even against ancient
servers. So git:// and http:// both have a way for the client to ask
for v2 support before the server speaks, and have the request be
transparently ignored by ancient servers.
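
(A sketch of what such a request line could look like on the wire.
The "version=2" parameter name is hypothetical here, something the
list would still have to agree on; the point is that the 4-byte hex
pkt-line length prefix covers the embedded NULs, and old daemons stop
parsing at the \0\0:)

```c
#include <stdio.h>
#include <string.h>

/* Build a git:// daemon request with an extra parameter tucked behind
 * the double NUL after the host line. Old daemons see a plain v1
 * request; a v2-aware daemon can look past the \0\0. Returns the
 * total length including the 4-byte hex pkt-line prefix. */
static int build_request(char *buf, size_t size,
			 const char *path, const char *host)
{
	char len[5];
	int body = snprintf(buf + 4, size - 4,
			    "git-upload-pack %s%chost=%s%c%cversion=2%c",
			    path, 0, host, 0, 0, 0);

	snprintf(len, sizeof(len), "%04x", body + 4);
	memcpy(buf, len, 4);	/* avoid snprintf's trailing NUL here */
	return body + 4;
}
```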


The only place we have a problem is SSH. That exec of the remote
binary is just super-strict. It's good to be paranoid, but it also
locks out any chance we have of doing the upgrade over SSH without
having to run two SSH commands in the worst case. I guess the best
approach is to try the v1 protocol by default, have the remote
advertise that it supports v2, and remember this on a per-host basis
in ~/.gitconfig for future requests. Users could always force a
specific preference with the remote.NAME.uploadpack variable or the
--uploadpack command line flag.


* Re: upload-pack is slow with lots of refs
  2012-10-09  6:46               ` Shawn Pearce
@ 2012-10-09 20:30                 ` Johannes Sixt
  2012-10-09 20:46                   ` Johannes Sixt
  0 siblings, 1 reply; 38+ messages in thread
From: Johannes Sixt @ 2012-10-09 20:30 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð,
	Git Mailing List

Am 09.10.2012 08:46, schrieb Shawn Pearce:
> On Mon, Oct 8, 2012 at 8:05 AM, Johannes Sixt <j6t@kdbg.org> wrote:
>> Am 05.10.2012 18:57, schrieb Shawn Pearce:
>>> Smart HTTP is not bidirectional. The client can't cut off the server.
>>
>> Smart HTTP does not need it: you already posted a better solution (I'm
>> refering to "&v=2").
> 
> Yes but then it diverges even further from the native bidirectional protocol.

I won't argue here because I know next to nothing about Smart HTTP.
But it sounds like you get either compatibility with a diverging
protocol (or at least implementation), or no compatibility at all.

>> +static int client_spoke(void)
>> +{
>> +       struct pollfd pfd;
>> +       pfd.fd = 0;
>> +       pfd.events = POLLIN;
>> +       return poll(&pfd, 1, 0) > 0 &&
>> +               (pfd.revents & (POLLIN|POLLHUP));
> 
> Except doing this in Java is harder on an arbitrary InputStream type.
> I guess we really only care about basic TCP, in which case we can use
> NIO to implement an emulation of poll, and SSH, where MINA SSHD
> probably doesn't provide a way to see if the client has given us data
> without blocking. That makes supporting v2 really hard in e.g. Gerrit
> Code Review. You could argue that its improper to attempt to implement
> a network protocol in a language whose standard libraries have gone
> out of their way to prevent you from polling to see if data is
> immediately available, but I prefer to ignore such arguments.

Can't you read the inbound stream in a second thread while the first
thread writes the advertisements to the outbound stream? Then you don't
even need to poll; you can just read the 4-byte length header, stash it
away, and set a flag. The implementation of client_spoke() would only
amount to checking that flag.
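
(Sketched in C with POSIX threads rather than Java, just to make the
idea concrete; the names are illustrative and none of this is from
upload-pack:)

```c
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

/* A reader thread blocks on the inbound stream and flips a flag once
 * the client's first 4-byte pkt-line length header arrives;
 * client_spoke() merely checks the flag, never blocking. */
static atomic_int client_has_spoken;
static int inbound_fd;	/* read end of the client connection */

static void *watch_client(void *arg)
{
	char hdr[4];

	(void)arg;
	/* Block until the client sends its first pkt-line header. */
	if (read(inbound_fd, hdr, sizeof(hdr)) == sizeof(hdr))
		atomic_store(&client_has_spoken, 1);
	/* A real server would hand hdr over to the protocol parser. */
	return NULL;
}

static int client_spoke(void)
{
	return atomic_load(&client_has_spoken);
}
```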

> As it turns out we don't really have this problem with git://. Clients
> can bury a v2 request in the extended headers where the host line
> appears today.

I tried, but it seems that todays git-daemons are too strict and accept
only \0host=foo\0, nothing else :-(

-- Hannes


* Re: upload-pack is slow with lots of refs
  2012-10-09 20:30                 ` Johannes Sixt
@ 2012-10-09 20:46                   ` Johannes Sixt
  0 siblings, 0 replies; 38+ messages in thread
From: Johannes Sixt @ 2012-10-09 20:46 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð,
	Git Mailing List

Am 09.10.2012 22:30, schrieb Johannes Sixt:
> Am 09.10.2012 08:46, schrieb Shawn Pearce:
>> As it turns out we don't really have this problem with git://. Clients
>> can bury a v2 request in the extended headers where the host line
>> appears today.
> 
> I tried, but it seems that todays git-daemons are too strict and accept
> only \0host=foo\0, nothing else :-(

I take that back: Modern git-daemons accept "\0host=foo\0\0version=2\0",
as you said.

It looks like SSH is the only stubborn protocol.

-- Hannes


end of thread

Thread overview: 38+ messages
2012-10-03 12:36 upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason
2012-10-03 13:06 ` Nguyen Thai Ngoc Duy
2012-10-03 18:03 ` Jeff King
2012-10-03 18:53   ` Junio C Hamano
2012-10-03 18:55     ` Jeff King
2012-10-03 19:41       ` Shawn Pearce
2012-10-03 20:13         ` Jeff King
2012-10-04 21:52           ` Sascha Cunz
2012-10-05  0:20             ` Jeff King
2012-10-05  6:24         ` Johannes Sixt
2012-10-05 16:57           ` Shawn Pearce
2012-10-08 15:05             ` Johannes Sixt
2012-10-09  6:46               ` Shawn Pearce
2012-10-09 20:30                 ` Johannes Sixt
2012-10-09 20:46                   ` Johannes Sixt
2012-10-03 20:16   ` Ævar Arnfjörð Bjarmason
2012-10-03 21:20     ` Jeff King
2012-10-03 22:15       ` Ævar Arnfjörð Bjarmason
2012-10-03 23:15         ` Jeff King
2012-10-03 23:54           ` Ævar Arnfjörð Bjarmason
2012-10-04  7:56             ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
2012-10-04  7:58               ` [PATCH 1/4] peel_ref: use faster deref_tag_noverify Jeff King
2012-10-04 18:24                 ` Junio C Hamano
2012-10-04  8:00               ` [PATCH 2/4] peel_ref: do not return a null sha1 Jeff King
2012-10-04 18:32                 ` Junio C Hamano
2012-10-04  8:02               ` [PATCH 3/4] peel_ref: check object type before loading Jeff King
2012-10-04 19:06                 ` Junio C Hamano
2012-10-04 19:41                   ` Jeff King
2012-10-04 20:41                     ` Junio C Hamano
2012-10-04 21:59                       ` Jeff King
2012-10-04  8:03               ` [PATCH 4/4] upload-pack: use peel_ref for ref advertisements Jeff King
2012-10-04  8:04               ` [PATCH 0/4] optimizing upload-pack ref peeling Jeff King
2012-10-04  9:01                 ` Ævar Arnfjörð Bjarmason
2012-10-04 12:14                   ` Nazri Ramliy
2012-10-03 22:32   ` upload-pack is slow with lots of refs Ævar Arnfjörð Bjarmason
2012-10-03 23:21     ` Jeff King
2012-10-03 23:47       ` Ævar Arnfjörð Bjarmason
2012-10-03 19:13 ` Junio C Hamano
