* Huge performance bottleneck reading packs
@ 2016-10-12 22:30 Vegard Nossum
  2016-10-12 22:45 ` Junio C Hamano
  2016-10-12 23:01 ` Jeff King
  0 siblings, 2 replies; 22+ messages in thread
From: Vegard Nossum @ 2016-10-12 22:30 UTC (permalink / raw)
  To: git
  Cc: Quentin Casasnovas, Shawn Pearce, Jeff King,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

Hi all,

I've bisected a performance regression (noticed by Quentin and myself)
which caused a 'git fetch' to go from ~1m30s to ~2m40s:

commit 47bf4b0fc52f3ad5823185a85f5f82325787c84b
Author: Jeff King <peff@peff.net>
Date:   Mon Jun 30 13:04:03 2014 -0400

     prepare_packed_git_one: refactor duplicate-pack check

Reverting this commit from a recent mainline master brings the time back
down from ~2m24s to ~1m19s.

The bisect log:

v2.8.1 -- 2m41s, 2m50s (bad)
v1.9.0 -- 1m39s, 1m46s (good)

2.3.4.312.gea1fd48 -- 2m40s
2.1.0.18.gc285171 -- 2m42s
2.0.0.140.g6753d8a -- 1m27s
2.0.1.480.g60e2f5a -- 1m34s
2.0.2.631.gad25da0 -- 2m39s
2.0.1.565.ge0a064a -- 1m30s
2.0.1.622.g2e42338 -- 2m29s
2.0.0.rc1.32.g5165dd5 -- 1m30s
2.0.1.607.g5418212 -- 1m32s
2.0.1.7.g6dda4e6 -- 1m28s
2.0.1.619.g6e40947 -- 2m25s
2.0.1.9.g47bf4b0 -- 2m18s
2.0.1.8.gd6cd00c -- 1m36.542s

However, the commit found by 'git bisect' above appears just fine to me;
I haven't been able to spot a bug in it.

A closer inspection reveals the problem to really be that this is an
extremely hot path with more than -- holy cow -- 4,106,756,451
iterations on the 'packed_git' list for a single 'git fetch' on my
repository. I'm guessing the patch above just made the inner loop
ever so slightly slower.
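
For reference, here is the duplicate-pack check that sits on this hot
path (quoting prepare_packed_git_one() in sha1_file.c as of the commit
above; 'p', 'path', and 'base_len' are locals of the surrounding
function). It walks the entire packed_git list once for every .idx
file seen during a rescan:

  for (p = packed_git; p; p = p->next) {
          size_t len;
          if (strip_suffix(p->pack_name, ".pack", &len) &&
              len == base_len &&
              !memcmp(p->pack_name, path.buf, len))
                  break;
  }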

My .git/objects/pack/ has ~2088 files (1042 idx files, 1042 pack files,
and 4 tmp_pack_* files).

I am convinced that it is not necessary to rescan the entire pack
directory 11,348 times or do all 4 _BILLION_ memcmp() calls for a single
'git fetch', even for a large repository like mine.

I could try to write a patch to reduce the number of times we rescan the
pack directory. However, I've never even looked at the file before
today, so any hints regarding what would need to be done would be
appreciated.

Thanks,

(Cced some people with changes in the area.)


Vegard


* Re: Huge performance bottleneck reading packs
  2016-10-12 22:30 Huge performance bottleneck reading packs Vegard Nossum
@ 2016-10-12 22:45 ` Junio C Hamano
  2016-10-13  7:17   ` Vegard Nossum
  2016-10-12 23:01 ` Jeff King
  1 sibling, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2016-10-12 22:45 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce, Jeff King,
	Nguyễn Thái Ngọc Duy

Vegard Nossum <vegard.nossum@oracle.com> writes:

> A closer inspection reveals the problem to really be that this is an
> extremely hot path with more than -- holy cow -- 4,106,756,451
> iterations on the 'packed_git' list for a single 'git fetch' on my
> repository. I'm guessing the patch above just made the inner loop
> ever so slightly slower.

Very plausible, and this ...

> My .git/objects/pack/ has ~2088 files (1042 idx files, 1042 pack files,
> and 4 tmp_pack_* files).

... may explain why nobody else has seen a difference.  

Is there a reason why your repository has that many pack files?  Is
automatic GC not working for some reason?

"gc" would try to make sure that you have a reasonably low number of
packs, as having too many packs is detrimental to performance for
multiple reasons, including:

 * Any object in a pack that is stored in delta format (i.e. only
   the difference from another object is stored) must eventually,
   through its chain of deltas, reach a base object recorded in
   the full format in the same packfile.

 * A single packfile records a single object only once, but it is
   normal (and often required because of the point above) that the
   same object appears in multiple packfiles.

 * Locating an object in a single packfile uses its .idx file, a
   binary search over a sorted list of object names, which is
   efficient, but that cost is multiplied by the number of packs
   you have in your repository (see the sketch below).
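
Conceptually, that per-object lookup is shaped like this (a simplified
sketch, not the exact code):

  struct packed_git *p;

  for (p = packed_git; p; p = p->next)          /* linear in nr of packs */
          if (find_pack_entry_one(sha1, p))     /* binary search in .idx */
                  return p;                     /* found it */
  /* not in any pack; fall back to loose objects */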



* Re: Huge performance bottleneck reading packs
  2016-10-12 22:30 Huge performance bottleneck reading packs Vegard Nossum
  2016-10-12 22:45 ` Junio C Hamano
@ 2016-10-12 23:01 ` Jeff King
  2016-10-12 23:18   ` Jeff King
  2016-10-13  7:20   ` Vegard Nossum
  1 sibling, 2 replies; 22+ messages in thread
From: Jeff King @ 2016-10-12 23:01 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On Thu, Oct 13, 2016 at 12:30:52AM +0200, Vegard Nossum wrote:

> However, the commit found by 'git bisect' above appears just fine to me;
> I haven't been able to spot a bug in it.
> 
> A closer inspection reveals the problem to really be that this is an
> extremely hot path with more than -- holy cow -- 4,106,756,451
> iterations on the 'packed_git' list for a single 'git fetch' on my
> repository. I'm guessing the patch above just made the inner loop
> ever so slightly slower.
> 
> My .git/objects/pack/ has ~2088 files (1042 idx files, 1042 pack files,
> and 4 tmp_pack_* files).

Yeah. I agree that the commit you found makes the check a little more
expensive, but I think the root of the problem is calling
prepare_packed_git_one many times. This _should_ happen once for each
pack directory at program startup, and possibly again if we need to
re-scan it to account for racing with a simultaneous repack.

The latter is generally triggered when we fail to look up an object we
expect to exist. So I'd suspect 45e8a74 (has_sha1_file: re-check pack
directory before giving up, 2013-08-30) is playing a part. We dealt with
that to some degree in 0eeb077 (index-pack: avoid excessive re-reading
of pack directory, 2015-06-09), but it would not surprise me if there is
another spot that needs similar treatment.
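
For reference, the check in question looks roughly like this
(paraphrasing sha1_file.c):

  int has_sha1_file_with_flags(const unsigned char *sha1, int flags)
  {
          struct pack_entry e;

          if (find_pack_entry(sha1, &e))
                  return 1;
          if (has_loose_object(sha1))
                  return 1;
          if (flags & HAS_SHA1_QUICK)
                  return 0;
          reprepare_packed_git();         /* re-scans the pack directory */
          return find_pack_entry(sha1, &e);
  }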

Does the patch below help?

diff --git a/builtin/fetch.c b/builtin/fetch.c
index d5329f9..c0f3c2c 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -243,7 +243,7 @@ static void find_non_local_tags(struct transport *transport,
 		if (ends_with(ref->name, "^{}")) {
 			if (item && !has_object_file(&ref->old_oid) &&
 			    !will_fetch(head, ref->old_oid.hash) &&
-			    !has_sha1_file(item->util) &&
+			    !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) &&
 			    !will_fetch(head, item->util))
 				item->util = NULL;
 			item = NULL;
@@ -256,7 +256,8 @@ static void find_non_local_tags(struct transport *transport,
 		 * to check if it is a lightweight tag that we want to
 		 * fetch.
 		 */
-		if (item && !has_sha1_file(item->util) &&
+		if (item &&
+		    !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) &&
 		    !will_fetch(head, item->util))
 			item->util = NULL;
 
@@ -276,7 +277,8 @@ static void find_non_local_tags(struct transport *transport,
 	 * We may have a final lightweight tag that needs to be
 	 * checked to see if it needs fetching.
 	 */
-	if (item && !has_sha1_file(item->util) &&
+	if (item &&
+	    !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) &&
 	    !will_fetch(head, item->util))
 		item->util = NULL;
 

-Peff


* Re: Huge performance bottleneck reading packs
  2016-10-12 23:01 ` Jeff King
@ 2016-10-12 23:18   ` Jeff King
  2016-10-12 23:47     ` Jeff King
  2016-10-13  7:20   ` Vegard Nossum
  1 sibling, 1 reply; 22+ messages in thread
From: Jeff King @ 2016-10-12 23:18 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On Wed, Oct 12, 2016 at 07:01:43PM -0400, Jeff King wrote:

> On Thu, Oct 13, 2016 at 12:30:52AM +0200, Vegard Nossum wrote:
> 
> > However, the commit found by 'git bisect' above appears just fine to me;
> > I haven't been able to spot a bug in it.
> > 
> > A closer inspection reveals the problem to really be that this is an
> > extremely hot path with more than -- holy cow -- 4,106,756,451
> > iterations on the 'packed_git' list for a single 'git fetch' on my
> > repository. I'm guessing the patch above just made the inner loop
> > ever so slightly slower.
> > 
> > My .git/objects/pack/ has ~2088 files (1042 idx files, 1042 pack files,
> > and 4 tmp_pack_* files).
> 
> Yeah. I agree that the commit you found makes the check a little more
> expensive, but I think the root of the problem is calling
> prepare_packed_git_one many times. This _should_ happen once for each
> pack directory at program startup, and possibly again if we need to
> re-scan it to account for racing with a simultaneous repack.
> 
> The latter is generally triggered when we fail to look up an object we
> expect to exist. So I'd suspect 45e8a74 (has_sha1_file: re-check pack
> directory before giving up, 2013-08-30) is playing a part. We dealt with
> that to some degree in 0eeb077 (index-pack: avoid excessive re-reading
> of pack directory, 2015-06-09), but it would not surprise me if there is
> another spot that needs similar treatment.
> 
> Does the patch below help?

Also, is it possible to make the repository in question available? I
might be able to reproduce based on your description, but it would save
time if I could directly run gdb on your example.

Specifically, I'm interested in the call graph that leads to calling
reprepare_packed_git().

-Peff


* Re: Huge performance bottleneck reading packs
  2016-10-12 23:18   ` Jeff King
@ 2016-10-12 23:47     ` Jeff King
  2016-10-13  9:04       ` Vegard Nossum
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff King @ 2016-10-12 23:47 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On Wed, Oct 12, 2016 at 07:18:07PM -0400, Jeff King wrote:

> Also, is it possible to make the repository in question available? I
> might be able to reproduce based on your description, but it would save
> time if I could directly run gdb on your example.

I tried this by making a bunch of packs in linux.git (my standard "this
is pretty big" repo), like so:

  for i in $(seq 1000); do
    git rev-list --objects HEAD~$((i+1))..HEAD~$i |
    git pack-objects --delta-base-offset .git/objects/pack/pack
  done

and then doing a 25,000-object fetch from upstream (no significance to
the number; that's just how far behind upstream I happened to be).

However, I didn't notice any regression. In fact, it was much _slower_
under v1.9.0, because that older version didn't have threaded index-pack.

If you can't share the repo directly, can you tell us more about your
fetch? How many objects are in your repository? How many objects are
fetched? How many refs are there on the remote side?

-Peff


* Re: Huge performance bottleneck reading packs
  2016-10-12 22:45 ` Junio C Hamano
@ 2016-10-13  7:17   ` Vegard Nossum
  2016-10-13 14:50     ` Jeff King
  0 siblings, 1 reply; 22+ messages in thread
From: Vegard Nossum @ 2016-10-13  7:17 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Quentin Casasnovas, Shawn Pearce, Jeff King,
	Nguyễn Thái Ngọc Duy

On 10/13/2016 12:45 AM, Junio C Hamano wrote:
 > Vegard Nossum <vegard.nossum@oracle.com> writes:
 >
 >> A closer inspection reveals the problem to really be that this is an
 >> extremely hot path with more than -- holy cow -- 4,106,756,451
 >> iterations on the 'packed_git' list for a single 'git fetch' on my
 >> repository. I'm guessing the patch above just made the inner loop
 >> ever so slightly slower.
 >
 > Very plausible, and this ...
 >
 >> My .git/objects/pack/ has ~2088 files (1042 idx files, 1042 pack files,
 >> and 4 tmp_pack_* files).
 >
 > ... may explain why nobody else has seen a difference.
 >
 > Is there a reason why your repository has that many pack files?  Is
 > automatic GC not working for some reason?

Oops. I disabled gc a while ago; one reason I did that is that it takes
a long time to run and it has a tendency to kick in at the worst time. I
guess I should really put it in cron then.

I'm not sure if this is related, but I also had a problem with GitPython
and large pack files in the past ("ValueError: Couldn't obtain fanout
table" or "warning: packfile ./objects/pack/....pack cannot be accessed"),
and I have pack.packSizeLimit set to 512m to fix that.
The whole repo is 17G, though, so I guess it shouldn't be necessary to
have that many pack files.

Will try Jeff's patch, then a gc. Thanks!


Vegard


* Re: Huge performance bottleneck reading packs
  2016-10-12 23:01 ` Jeff King
  2016-10-12 23:18   ` Jeff King
@ 2016-10-13  7:20   ` Vegard Nossum
  2016-10-13 15:26     ` Jeff King
  1 sibling, 1 reply; 22+ messages in thread
From: Vegard Nossum @ 2016-10-13  7:20 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On 10/13/2016 01:01 AM, Jeff King wrote:
> On Thu, Oct 13, 2016 at 12:30:52AM +0200, Vegard Nossum wrote:
>
>> However, the commit found by 'git bisect' above appears just fine to me;
>> I haven't been able to spot a bug in it.
>>
>> A closer inspection reveals the problem to really be that this is an
>> extremely hot path with more than -- holy cow -- 4,106,756,451
>> iterations on the 'packed_git' list for a single 'git fetch' on my
>> repository. I'm guessing the patch above just made the inner loop
>> ever so slightly slower.
>>
>> My .git/objects/pack/ has ~2088 files (1042 idx files, 1042 pack files,
>> and 4 tmp_pack_* files).
>
> Yeah. I agree that the commit you found makes the check a little more
> expensive, but I think the root of the problem is calling
> prepare_packed_git_one many times. This _should_ happen once for each
> pack directory at program startup, and possibly again if we need to
> re-scan it to account for racing with a simultaneous repack.
>
> The latter is generally triggered when we fail to look up an object we
> expect to exist. So I'd suspect 45e8a74 (has_sha1_file: re-check pack
> directory before giving up, 2013-08-30) is playing a part. We dealt with
> that to some degree in 0eeb077 (index-pack: avoid excessive re-reading
> of pack directory, 2015-06-09), but it would not surprise me if there is
> another spot that needs similar treatment.
>
> Does the patch below help?

Yes, ~2m10s -> ~1m25s when I test a git fetch this morning (the other
variation in time may be due to different CPU usage by other programs,
but I ran with/without the patch multiple times and the difference is
consistent).

Thanks,


Vegard


* Re: Huge performance bottleneck reading packs
  2016-10-12 23:47     ` Jeff King
@ 2016-10-13  9:04       ` Vegard Nossum
  2016-10-14  9:35         ` Jakub Narębski
  0 siblings, 1 reply; 22+ messages in thread
From: Vegard Nossum @ 2016-10-13  9:04 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On 10/13/2016 01:47 AM, Jeff King wrote:
> On Wed, Oct 12, 2016 at 07:18:07PM -0400, Jeff King wrote:
>
>> Also, is it possible to make the repository in question available? I
>> might be able to reproduce based on your description, but it would save
>> time if I could directly run gdb on your example.

I won't be able to make the repository available, sorry.

> I tried this by making a bunch of packs in linux.git (my standard "this
> is pretty big" repo), like so:
>
>   for i in $(seq 1000); do
>     git rev-list --objects HEAD~$((i+1))..HEAD~$i |
>     git pack-objects --delta-base-offset .git/objects/pack/pack
>   done
>
> and then doing a 25,000-object fetch from upstream (no significance to
> the number; that's just how far behind upstream I happened to be).
>
> However, I didn't notice any regression. In fact, it was much _slower_
> under v1.9.0, because that older version didn't have threaded index-pack.
>
> If you can't share the repo directly, can you tell us more about your
> fetch? How many objects are in your repository? How many objects are
> fetched? How many refs are there on the remote side?

The fetch doesn't actually get anything from the remote as everything is
already up to date (that makes the 2m40s times even more frustrating in
a way :-)). Here's count-objects:

$ git count-objects -v
warning: garbage found: .git/objects/pack/tmp_pack_pAZcu4
warning: garbage found: .git/objects/pack/tmp_pack_KhzrrI
warning: garbage found: .git/objects/pack/tmp_pack_mycfro
warning: garbage found: .git/objects/pack/tmp_pack_2kxKOn
count: 51609
size: 288768
in-pack: 23902336
packs: 1044
size-pack: 16588157
prune-packable: 48628
garbage: 4
size-garbage: 84792

There are some 20k refs on the remote, closer to 25k locally.

I'll try to get a profile (without your patch) before doing a gc run.


Vegard


* Re: Huge performance bottleneck reading packs
  2016-10-13  7:17   ` Vegard Nossum
@ 2016-10-13 14:50     ` Jeff King
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff King @ 2016-10-13 14:50 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Junio C Hamano, git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy

On Thu, Oct 13, 2016 at 09:17:34AM +0200, Vegard Nossum wrote:

> Oops. I disabled gc a while ago; one reason I did that is that it takes
> a long time to run and it has a tendency to kick in at the worst time. I
> guess I should really put it in cron then.
> 
> I'm not sure if this is related, but I also had a problem with GitPython
> and large pack files in the past ("ValueError: Couldn't obtain fanout
> table" or "warning: packfile ./objects/pack/....pack cannot be accessed"),

Sounds like they didn't correctly implement the extra index fanout that
happens for packs above 2G. The old Grit library had a similar bug.

> and I have pack.packSizeLimit set to 512m to fix that.
> The whole repo is 17G, though, so I guess it shouldn't be necessary to
> have that many pack files.

Using packSizeLimit does "solve" that problem, but it comes with its own
set of issues. There is a very good chance that your repository would be
much smaller than 17G as a single packfile, because Git does not allow
deltas across packs, and it does not optimize the placement of objects
to keep delta-related objects in a single pack. So you'll quite often be
storing full copies of objects that could otherwise be stored as a tiny
delta.

You might want to compare the resulting size for a full repack with and
without pack.packSizeLimit.

But I agree that is not the cause of your thousand packs. They are more
likely the accumulated cruft of a thousand fetches.

-Peff


* Re: Huge performance bottleneck reading packs
  2016-10-13  7:20   ` Vegard Nossum
@ 2016-10-13 15:26     ` Jeff King
  2016-10-13 16:53       ` [PATCH] fetch: use "quick" has_sha1_file for tag following Jeff King
  2016-10-13 18:18       ` Huge performance bottleneck reading packs Vegard Nossum
  0 siblings, 2 replies; 22+ messages in thread
From: Jeff King @ 2016-10-13 15:26 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On Thu, Oct 13, 2016 at 09:20:07AM +0200, Vegard Nossum wrote:

> > Does the patch below help?
> 
> Yes, ~2m10s -> ~1m25s when I test a git fetch this morning (the other
> variation in time may be due to different CPU usage by other programs,
> but I ran with/without the patch multiple times and the difference is
> consistent).
> [...]
> There are some 20k refs on the remote, closer to 25k locally.

OK, that makes some sense. For whatever reason, your remote has a bunch
of tags that point to objects you do not already have. That could
happen, I think, if the remote added a lot of tags since you cloned
(because clone will grab all of the tags), but those tags do not point
to history that you are otherwise fetching (since fetch by default will
"auto-follow" such tags).

It might be interesting to see the results of:

  # all the objects we have
  git cat-file --batch-all-objects --batch-check='%(objectname)' >us

  # all the objects the remote is advertising
  git ls-remote origin | cut -f1 | sort -u >them

  # what they are advertising that we don't have
  comm -13 us them | wc -l

My guess is that the number is relatively high. And that would explain
why nobody else has really complained much; such a pattern is probably
uncommon.

> The fetch doesn't actually get anything from the remote as everything is
> already up to date (that makes the 2m40s times even more frustrating in
> a way :-)). Here's count-objects:

If the fetch is largely a noop, then that makes me wonder why we are
spending even a minute in the "good" case. I wonder if there is some
other spot that is wasting CPU on some inefficient data structure
related to the number of refs you have. If you can do any profiling that
points to a hot spot, that would be interesting to see (and also whether
a gc improves things).

I see in find_non_local_tags() that we build up a sorted string_list via
repeated calls to string_list_insert(), which will keep the thing sorted
at each stage. That's not as efficient as just sorting at the end, but I
think it's probably OK in practice because we actually feed it via
for_each_ref(), whose output is sorted, and so we'd always just be
appending to the end.

-Peff


* [PATCH] fetch: use "quick" has_sha1_file for tag following
  2016-10-13 15:26     ` Jeff King
@ 2016-10-13 16:53       ` Jeff King
  2016-10-13 17:04         ` Jeff King
  2016-10-13 18:18       ` Huge performance bottleneck reading packs Vegard Nossum
  1 sibling, 1 reply; 22+ messages in thread
From: Jeff King @ 2016-10-13 16:53 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On Thu, Oct 13, 2016 at 11:26:32AM -0400, Jeff King wrote:

> On Thu, Oct 13, 2016 at 09:20:07AM +0200, Vegard Nossum wrote:
> 
> > > Does the patch below help?
> > 
> > Yes, ~2m10s -> ~1m25s when I test a git fetch this morning (the other
> > variation in time may be due to different CPU usage by other programs,
> > but I ran with/without the patch multiple times and the difference is
> > consistent).
> > [...]
> > There are some 20k refs on the remote, closer to 25k locally.
> 
> OK, that makes some sense. For whatever reason, your remote has a bunch
> of tags that point to objects you do not already have. That could
> happen, I think, if the remote added a lot of tags since you cloned
> (because clone will grab all of the tags), but those tags do not point
> to history that you are otherwise fetching (since fetch by default will
> "auto-follow" such tags).

Armed with this information, I was able to reproduce the issue locally.
However, once my patch is applied, it's now quite fast. So I still don't
know where your other 1m25s is going.

So here's that same patch wrapped up with a commit message. Note that I
converted one more call site to the "quick" form; it would trigger when
the candidate tags are real tag objects, not just pointers to commits.
That might improve your runtime more, depending on what is actually in
your repository.

-- >8 --
Subject: [PATCH] fetch: use "quick" has_sha1_file for tag following

When we auto-follow tags in a fetch, we look at all of the
tags advertised by the remote and fetch ones where we don't
already have the tag, but we do have the object it peels to.
This involves a lot of calls to has_sha1_file(), some of
which we can reasonably expect to fail. Since 45e8a74
(has_sha1_file: re-check pack directory before giving up,
2013-08-30), this may cause many calls to
reprepare_packed_git(), which is potentially expensive.

This has gone unnoticed for several years because it
requires a fairly unique setup to matter:

  1. You need to have a lot of packs on the client side to
     make reprepare_packed_git() expensive (the most
     expensive part is finding duplicates in an unsorted
     list, which is currently quadratic).

  2. You need a large number of tag refs on the server side
     that are candidates for auto-following (i.e., that the
     client doesn't have). Each one triggers a re-read of
     the pack directory.

  3. Under normal circumstances, the client would
     auto-follow those tags and after one large fetch, (2)
     would no longer be true. But if those tags point to
     history which is disconnected from what the client
     otherwise fetches, then it will never auto-follow, and
     those candidates will impact it on every fetch.

So when all three are true, each fetch pays an extra
O(nr_tags * nr_packs^2) cost, mostly in string comparisons
on the pack names. This was exacerbated by 47bf4b0
(prepare_packed_git_one: refactor duplicate-pack check,
2014-06-30) which uses a slightly more expensive string
check, under the assumption that the duplicate check doesn't
happen very often (and it shouldn't; the real problem here
is how often we are calling reprepare_packed_git()).

This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice
accuracy for speed, in cases where we might be racy with a
simultaneous repack. This is similar to the fix in 0eeb077
(index-pack: avoid excessive re-reading of pack directory,
2015-06-09). As with that case, it's OK for has_sha1_file() to
occasionally say "no I don't have it" when we do, because
the worst case is not a corruption, but simply that we may
fail to auto-follow a tag that points to it.

Here are results from the included perf script, which sets
up a situation similar to the one described above:

Test            HEAD^               HEAD
----------------------------------------------------------
5550.4: fetch   11.21(10.42+0.78)   0.08(0.04+0.02) -99.3%

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/fetch.c            | 11 ++++--
 cache.h                    |  1 +
 sha1_file.c                |  5 +++
 t/perf/p5550-fetch-tags.sh | 99 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 112 insertions(+), 4 deletions(-)
 create mode 100755 t/perf/p5550-fetch-tags.sh

diff --git a/builtin/fetch.c b/builtin/fetch.c
index d5329f9..74c0546 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -241,9 +241,10 @@ static void find_non_local_tags(struct transport *transport,
 		 * as one to ignore by setting util to NULL.
 		 */
 		if (ends_with(ref->name, "^{}")) {
-			if (item && !has_object_file(&ref->old_oid) &&
+			if (item &&
+			    !has_object_file_with_flags(&ref->old_oid, HAS_SHA1_QUICK) &&
 			    !will_fetch(head, ref->old_oid.hash) &&
-			    !has_sha1_file(item->util) &&
+			    !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) &&
 			    !will_fetch(head, item->util))
 				item->util = NULL;
 			item = NULL;
@@ -256,7 +257,8 @@ static void find_non_local_tags(struct transport *transport,
 		 * to check if it is a lightweight tag that we want to
 		 * fetch.
 		 */
-		if (item && !has_sha1_file(item->util) &&
+		if (item &&
+		    !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) &&
 		    !will_fetch(head, item->util))
 			item->util = NULL;
 
@@ -276,7 +278,8 @@ static void find_non_local_tags(struct transport *transport,
 	 * We may have a final lightweight tag that needs to be
 	 * checked to see if it needs fetching.
 	 */
-	if (item && !has_sha1_file(item->util) &&
+	if (item &&
+	    !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) &&
 	    !will_fetch(head, item->util))
 		item->util = NULL;
 
diff --git a/cache.h b/cache.h
index 2cfb1ca..ec791a6 100644
--- a/cache.h
+++ b/cache.h
@@ -1154,6 +1154,7 @@ static inline int has_sha1_file(const unsigned char *sha1)
 
 /* Same as the above, except for struct object_id. */
 extern int has_object_file(const struct object_id *oid);
+extern int has_object_file_with_flags(const struct object_id *oid, int flags);
 
 /*
  * Return true iff an alternate object database has a loose object
diff --git a/sha1_file.c b/sha1_file.c
index 309e87d..c652cb6 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3309,6 +3309,11 @@ int has_object_file(const struct object_id *oid)
 	return has_sha1_file(oid->hash);
 }
 
+int has_object_file_with_flags(const struct object_id *oid, int flags)
+{
+	return has_sha1_file_with_flags(oid->hash, flags);
+}
+
 static void check_tree(const void *buf, size_t size)
 {
 	struct tree_desc desc;
diff --git a/t/perf/p5550-fetch-tags.sh b/t/perf/p5550-fetch-tags.sh
new file mode 100755
index 0000000..a5dc39f
--- /dev/null
+++ b/t/perf/p5550-fetch-tags.sh
@@ -0,0 +1,99 @@
+#!/bin/sh
+
+test_description='performance of tag-following with many tags
+
+This tests a fairly pathological case, so rather than rely on a real-world
+case, we will construct our own repository. The situation is roughly as
+follows.
+
+The parent repository has a large number of tags which are disconnected from
+the rest of history. That makes them candidates for tag-following, but we never
+actually grab them (and thus they will impact each subsequent fetch).
+
+The child repository is a clone of parent, without the tags, and is at least
+one commit behind the parent (meaning that we will fetch one object and then
+examine the tags to see if they need to be followed). Furthermore, it has
+a large number of packs.
+
+The exact values of "large" here are somewhat arbitrary; I picked values that
+start to show a noticeable performance problem on my machine, but without
+taking too long to set up and run the tests.
+'
+. ./perf-lib.sh
+
+# make a long nonsense history on branch $1, consisting of $2 commits, each
+# with a unique file pointing to the blob in $3.
+create_history () {
+	perl -le '
+		my ($branch, $n, $blob) = @ARGV;
+		for (1..$n) {
+			print "commit refs/heads/$branch";
+			print "committer nobody <nobody@example.com> now";
+			print "data 4";
+			print "foo";
+			print "M 100644 $blob $_";
+		}
+	' "$@" |
+	git fast-import --date-format=now
+}
+
+# make a series of tags, one per commit in the revision range given by $@
+create_tags () {
+	git rev-list "$@" |
+	perl -lne 'print "create refs/tags/$. $_"' |
+	git update-ref --stdin
+}
+
+# create $1 nonsense packs, each with a single blob
+create_packs () {
+	perl -le '
+		my ($n) = @ARGV;
+		for (1..$n) {
+			print "blob";
+			print "data <<EOF";
+			print "$_";
+			print "EOF";
+		}
+	' "$@" |
+	git fast-import &&
+
+	git cat-file --batch-all-objects --batch-check='%(objectname)' |
+	while read sha1
+	do
+		echo $sha1 | git pack-objects .git/objects/pack/pack
+	done
+}
+
+test_expect_success 'create parent and child' '
+	git init parent &&
+	git -C parent commit --allow-empty -m base &&
+	git clone parent child &&
+	git -C parent commit --allow-empty -m trigger-fetch
+'
+
+test_expect_success 'populate parent tags' '
+	(
+		cd parent &&
+		blob=$(echo content | git hash-object -w --stdin) &&
+		create_history cruft 3000 $blob &&
+		create_tags cruft &&
+		git branch -D cruft
+	)
+'
+
+test_expect_success 'create child packs' '
+	(
+		cd child &&
+		git config gc.auto 0 &&
+		git config gc.autopacklimit 0 &&
+		create_packs 500
+	)
+'
+
+test_perf 'fetch' '
+	# make sure there is something to fetch on each iteration
+	git -C child update-ref -d refs/remotes/origin/master &&
+	git -C child fetch
+'
+
+test_done
-- 
2.10.1.589.gca16299



* Re: [PATCH] fetch: use "quick" has_sha1_file for tag following
  2016-10-13 16:53       ` [PATCH] fetch: use "quick" has_sha1_file for tag following Jeff King
@ 2016-10-13 17:04         ` Jeff King
  2016-10-13 20:06           ` Jeff King
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff King @ 2016-10-13 17:04 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On Thu, Oct 13, 2016 at 12:53:44PM -0400, Jeff King wrote:

> -- >8 --
> Subject: [PATCH] fetch: use "quick" has_sha1_file for tag following

A few comments on my own patch...

> This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice
> accuracy for speed, in cases where we might be racy with a
> simultaneous repack. This is similar to the fix in 0eeb077
> (index-pack: avoid excessive re-reading of pack directory,
> 2015-06-09). As with that case, it's OK for has_sha1_file() to
> occasionally say "no I don't have it" when we do, because
> the worst case is not a corruption, but simply that we may
> fail to auto-follow a tag that points to it.

Failing in this direction doesn't make me feel great. I had hoped it
would fail the _other_ way, and request an object that we might already
have.

There are two alternatives that might be worth pursuing:

  1. The thing that makes this really painful is the quadratic
     duplicate-search in prepare_packed_git_one(). We could speed that
     up pretty easily by keeping a hash of the packs indexed by their
     names, and looking up in that.

     That drops the quadratic factor, but it's still going to be
     O(nr_tags * nr_packs), as we have to at the very least readdir()
     all of objects/pack to see that nothing new showed up.

     I wonder if we could trust the timestamp on the objects/pack
     directory to avoid re-reading it at all. That turns it into a
     single stat() call.

  2. We could stop bothering to reprepare_packed_git() only after the
     nth time it is called (a rough sketch follows below). This
     basically turns on the "sacrifice accuracy for speed" behavior
     automatically, but it means that most cases would never do so,
     because it wouldn't be creating a performance issue in the first
     place.

     It feels weird and flaky that git might behave differently based on
     the number of unfetched tags the remote happens to have, though.
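
A rough sketch of what (2) might look like (invented names, not a real
patch):

  /* After a handful of full re-scans in one process, assume further
   * lookup misses are genuine and stop hitting the filesystem; from
   * then on we effectively behave as if HAS_SHA1_QUICK were always
   * set.
   */
  #define REPREPARE_LIMIT 5
  static int reprepare_count;

  static void maybe_reprepare_packed_git(void)
  {
          if (reprepare_count++ < REPREPARE_LIMIT)
                  reprepare_packed_git();
  }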

> Here are results from the included perf script, which sets
> up a situation similar to the one described above:
> 
> Test            HEAD^               HEAD
> ----------------------------------------------------------
> 5550.4: fetch   11.21(10.42+0.78)   0.08(0.04+0.02) -99.3%

The situation in this perf script is obviously designed to show off this
one specific optimization. It feels a bit overly specific, and I doubt
anybody will be all that interested in running it again once the fix is
in. OTOH, I did want to document my reproduction steps, and this seemed
like the only reasonable place to do so. And as the perf suite is
already pretty expensive, perhaps nobody minds adding to it too much.

-Peff


* Re: Huge performance bottleneck reading packs
  2016-10-13 15:26     ` Jeff King
  2016-10-13 16:53       ` [PATCH] fetch: use "quick" has_sha1_file for tag following Jeff King
@ 2016-10-13 18:18       ` Vegard Nossum
  2016-10-13 20:43         ` Jeff King
  1 sibling, 1 reply; 22+ messages in thread
From: Vegard Nossum @ 2016-10-13 18:18 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On 10/13/2016 05:26 PM, Jeff King wrote:
> On Thu, Oct 13, 2016 at 09:20:07AM +0200, Vegard Nossum wrote:
>
>>> Does the patch below help?
>>
>> Yes, ~2m10s -> ~1m25s when I test a git fetch this morning (the other
>> variation in time may be due to different CPU usage by other programs,
>> but I ran with/without the patch multiple times and the difference is
>> consistent).
>> [...]
>> There are some 20k refs on the remote, closer to 25k locally.
>
> OK, that makes some sense. For whatever reason, your remote has a bunch
> of tags that point to objects you do not already have. That could
> happen, I think, if the remote added a lot of tags since you cloned
> (because clone will grab all of the tags), but those tags do not point
> to history that you are otherwise fetching (since fetch by default will
> "auto-follow" such tags).
>
> It might be interesting to see the results of:
>
>   # all the objects we have
>   git cat-file --batch-all-objects --batch-check='%(objectname)' >us
>
>   # all the objects the remote is advertising
>   git ls-remote origin | cut -f1 | sort -u >them
>
>   # what they are advertising that we don't have
>   comm -13 us them | wc -l
>
> My guess is that the number is relatively high. And that would explain
> why nobody else has really complained much; such a pattern is probably
> uncommon.

I get ~3,700 objects "they are advertising that we don't have".

They are all indeed tags which I don't have (nor want) locally.

>> The fetch doesn't actually get anything from the remote as everything is
>> already up to date (that makes the 2m40s times even more frustrating in
>> a way :-)). Here's count-objects:
>
> If the fetch is largely a noop, then that makes me wonder why we are
> spending even a minute in the "good" case. I wonder if there is some

It was user time I reported and this is over network, so some of it
might be IO wait on the network? With the (first) patch:

$ time git fetch

real    1m9.248s
user    0m40.808s
sys     0m2.996s

Also see the profile data below.

> other spot that is wasting CPU on some inefficient data structure
> related to the number of refs you have. If you can do any profiling that
> points to a hot spot, that would be interesting to see (and also whether
> a gc improves things).
>
> I see in find_non_local_tags() that we build up a sorted string_list via
> repeated calls to string_list_insert(), which will keep the thing sorted
> at each stage. That's not as efficient as just sorting at the end, but I
> think it's probably OK in practice because we actually feed it via
> for_each_ref(), whose output is sorted, and so we'd always just be
> appending to the end.

So without your patch I get these numbers:

   %   cumulative   self              self     total
  time   seconds   seconds    calls   s/call   s/call  name
  37.34     29.83    29.83 1948265116     0.00     0.00  strip_suffix_mem
  27.56     51.85    22.02    11358     0.00     0.01  prepare_packed_git_one
  14.24     63.23    11.38 1924457702     0.00     0.00  strip_suffix
   4.88     67.13     3.90 26045319     0.00     0.00  find_pack_entry_one
   4.17     70.46     3.33 84828149     0.00     0.00  hashcmp
   3.28     73.08     2.62 79760199     0.00     0.00  hashcmp
   2.44     75.03     1.95                             const_error

the call graph data shows all the reprepare_packed_git() calls to come
from a chain like this:

do_for_each_ref
do_for_each_ref_iterator
ref_iterator_advance
files_ref_iterator_advance
ref_resolves_to_object
has_sha1_file
has_sha1_file_with_flags
reprepare_packed_git

the do_for_each_ref() calls come from prepare_replace_object(),
do_fetch(), find_common(), everything_local(),
check_for_new_submodule_commits(), and find_non_local_tags().

With your (first) patch I get this instead:

   %   cumulative   self              self     total
  time   seconds   seconds    calls  ms/call  ms/call  name
  29.41      0.25     0.25  4490399     0.00     0.00  hashcmp
  16.47      0.39     0.14   843447     0.00     0.00  find_pack_entry_one
  10.59      0.48     0.09    60490     0.00     0.00  unpack_object_header_buffer
   4.71      0.52     0.04   167441     0.00     0.00  lookup_object
   3.53      0.55     0.03   843447     0.00     0.00  fill_pack_entry
   3.53      0.58     0.03   213169     0.00     0.00  check_refname_component
   2.35      0.60     0.02   550723     0.00     0.00  strip_suffix_mem
   2.35      0.62     0.02   214607     0.00     0.00  insert_obj_hash
   2.35      0.64     0.02   158866     0.00     0.00  hashcmp
   2.35      0.66     0.02   153936     0.00     0.00  nth_packed_object_offset

Do you have all the profile data you were interested in before I try a
'git gc'?

I really appreciate the quick response and the work you put into fixing
this, even when it's my fault for disabling gc, thanks!


Vegard


* Re: [PATCH] fetch: use "quick" has_sha1_file for tag following
  2016-10-13 17:04         ` Jeff King
@ 2016-10-13 20:06           ` Jeff King
  2016-10-14 17:39             ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff King @ 2016-10-13 20:06 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On Thu, Oct 13, 2016 at 01:04:43PM -0400, Jeff King wrote:

> > This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice
> > accuracy for speed, in cases where we might be racy with a
> > simultaneous repack. This is similar to the fix in 0eeb077
> > (index-pack: avoid excessive re-reading of pack directory,
> > 2015-06-09). As with that case, it's OK for has_sha1_file() to
> > occasionally say "no I don't have it" when we do, because
> > the worst case is not a corruption, but simply that we may
> > fail to auto-follow a tag that points to it.
> 
> Failing in this direction doesn't make me feel great. I had hoped it
> would fail the _other_ way, and request an object that we might already
> have.
> 
> There are two alternatives that might be worth pursuing:
> 
>   1. The thing that makes this really painful is the quadratic
>      duplicate-search in prepare_packed_git_one(). We could speed that
>      up pretty easily by keeping a hash of the packs indexed by their
>      names, and looking up in that.
> 
>      That drops the quadratic factor, but it's still going to be
>      O(nr_tags * nr_packs), as we have to at the very least readdir()
>      all of objects/pack to see that nothing new showed up.

That hash patch would look something like what is below.

Here are numbers for the perf script ("quick" is the one that skips
reprepare entirely, "hash" is this hash table patch, and "quick+hash" is
the combination of the two):

origin              quick                    hash                     quick+hash            
--------------------------------------------------------------------------------------------
11.09(10.38+0.70)   0.08(0.06+0.01) -99.3%   1.41(0.65+0.75) -87.3%   0.07(0.05+0.01) -99.4%

So yes, it does improve this case, but not nearly as much as the "quick"
approach. Putting it on top of the "quick" approach is barely noticeable
(it is speeding up the initial prepare_packed_git() call, but it's just
not expensive enough in the first place to be worth it).

The "hash" patch does fix a potentially quadratic behavior, though, so I
guess in theory it is worth it. But I had to go up to quite a large
number of packs to make it worthwhile. Here it is at 7500 packs, running
"git cat-file -e $some_object_that_exists":

  [origin]
  real    0m0.417s
  user    0m0.376s
  sys     0m0.040s

  [hash]
  real    0m0.054s
  user    0m0.008s
  sys     0m0.044s

So it's certainly better. But 7500 packs is just silly, and squeezing
out ~400ms there is hardly worth it. If you repack this same case into a
single pack, the command drops to 5ms. So yes, there's close to an order
of magnitude speedup here, but you get that _and_ another order of
magnitude just by repacking.

So I think it's only worth pursuing if we wanted to scrap my original
patch, and continue to aggressively reprepare_packed_git(). I'd worry
that it's still expensive on some systems, even with the hash, because
the opendir/readdir might cost a lot (OTOH, we'll by definition have
just done a stat() on the loose version of the object, so there's a
limit to how fast we can make it).

I dunno.

---
 cache.h     |  2 ++
 sha1_file.c | 44 +++++++++++++++++++++++++++++++++++---------
 2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/cache.h b/cache.h
index ec791a6..0d8b4e8 100644
--- a/cache.h
+++ b/cache.h
@@ -1411,6 +1411,7 @@ struct pack_window {
 };
 
 extern struct packed_git {
+	struct hashmap_entry name_entry;
 	struct packed_git *next;
 	struct pack_window *windows;
 	off_t pack_size;
@@ -1428,6 +1429,7 @@ extern struct packed_git {
 		 do_not_close:1;
 	unsigned char sha1[20];
 	struct revindex_entry *revindex;
+	size_t basename_len;
 	/* something like ".git/objects/pack/xxxxx.pack" */
 	char pack_name[FLEX_ARRAY]; /* more */
 } *packed_git;
diff --git a/sha1_file.c b/sha1_file.c
index c652cb6..eb6a5f3 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -602,6 +602,8 @@ struct packed_git *packed_git;
 static struct mru packed_git_mru_storage;
 struct mru *packed_git_mru = &packed_git_mru_storage;
 
+static struct hashmap pack_name_index;
+
 void pack_report(void)
 {
 	fprintf(stderr,
@@ -1260,6 +1262,30 @@ struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
 	return p;
 }
 
+static int pack_name_hashcmp(const void *va, const void *vb, const void *name)
+{
+	const struct packed_git *a = va, *b = vb;
+
+	if (a->basename_len != b->basename_len)
+		return -1;
+	else if (name)
+		return memcmp(a->pack_name, name, a->basename_len);
+	else
+		return memcmp(a->pack_name, b->pack_name, a->basename_len);
+}
+
+static int pack_loaded(const char *name, size_t len)
+{
+	struct packed_git key;
+
+	if (!pack_name_index.tablesize)
+		return 0;
+
+	hashmap_entry_init(&key, memhash(name, len));
+	key.basename_len = len;
+	return !!hashmap_get(&pack_name_index, &key, name);
+}
+
 void install_packed_git(struct packed_git *pack)
 {
 	if (pack->pack_fd != -1)
@@ -1267,6 +1293,12 @@ void install_packed_git(struct packed_git *pack)
 
 	pack->next = packed_git;
 	packed_git = pack;
+
+	if (!pack_name_index.tablesize)
+		hashmap_init(&pack_name_index, pack_name_hashcmp, 0);
+	strip_suffix(pack->pack_name, ".pack", &pack->basename_len);
+	hashmap_entry_init(pack, memhash(pack->pack_name, pack->basename_len));
+	hashmap_add(&pack_name_index, pack);
 }
 
 void (*report_garbage)(unsigned seen_bits, const char *path);
@@ -1347,15 +1379,9 @@ static void prepare_packed_git_one(char *objdir, int local)
 
 		base_len = path.len;
 		if (strip_suffix_mem(path.buf, &base_len, ".idx")) {
-			/* Don't reopen a pack we already have. */
-			for (p = packed_git; p; p = p->next) {
-				size_t len;
-				if (strip_suffix(p->pack_name, ".pack", &len) &&
-				    len == base_len &&
-				    !memcmp(p->pack_name, path.buf, len))
-					break;
-			}
-			if (p == NULL &&
+			if (
+			    /* Don't reopen a pack we already have. */
+			    !pack_loaded(path.buf, base_len) &&
 			    /*
 			     * See if it really is a valid .idx file with
 			     * corresponding .pack file that we can map.
-- 
2.10.1.589.gca16299



* Re: Huge performance bottleneck reading packs
  2016-10-13 18:18       ` Huge performance bottleneck reading packs Vegard Nossum
@ 2016-10-13 20:43         ` Jeff King
  2016-10-14  6:55           ` Vegard Nossum
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff King @ 2016-10-13 20:43 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On Thu, Oct 13, 2016 at 08:18:11PM +0200, Vegard Nossum wrote:

> > My guess is that the number is relatively high. And that would explain
> > why nobody else has really complained much; such a pattern is probably
> > uncommon.
> 
> I get ~3,700 objects "they are advertising that we don't have".
> 
> They are all indeed tags which I don't have (nor want) locally.

Thanks. That's about what I expected.

> So without your patch I get these numbers:
> 
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  37.34     29.83    29.83 1948265116     0.00     0.00  strip_suffix_mem
>  27.56     51.85    22.02    11358     0.00     0.01 prepare_packed_git_one
>  14.24     63.23    11.38 1924457702     0.00     0.00  strip_suffix
>   4.88     67.13     3.90 26045319     0.00     0.00  find_pack_entry_one
>   4.17     70.46     3.33 84828149     0.00     0.00  hashcmp
>   3.28     73.08     2.62 79760199     0.00     0.00  hashcmp
>   2.44     75.03     1.95                             const_error
> 
> the call graph data shows all the reprepare_packed_git() calls to come
> from a chain like this:
> 
> do_for_each_ref
> do_for_each_ref_iterator
> ref_iterator_advance
> files_ref_iterator_advance
> ref_resolves_to_object
> has_sha1_file
> has_sha1_file_with_flags
> reprepare_packed_git

Hrm. It seems weird that we'd hit reprepare_packed_git() via
do_for_each_ref(), because that's looking at _your_ refs. So any time
that code path hits reprepare_packed_git(), it should be complaining
about repository corruption to stderr.

And that also wouldn't make sense that my patch would improve it. Are
you sure you've read the perf output correctly (I'll admit that I am
often confused by the way it reports call graphs)?

> With your (first) patch I get this instead:
> 
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  29.41      0.25     0.25  4490399     0.00     0.00  hashcmp
>  16.47      0.39     0.14   843447     0.00     0.00  find_pack_entry_one

These functions appearing at the top are probably a sign that you have
too many packs (these are just object lookup, which has to linearly try
each pack).

So I'd expect things to improve after a git-gc (and especially if you
turn off the packSizeLimit).

> Do you have all the profile data you were interested in before I try a
> 'git gc'?

Yes, I think so. At least I'm satisfied that there's not another
HAS_SHA1_QUICK case that I'm missing.

> I really appreciate the quick response and the work you put into fixing
> this, even when it's my fault for disabling gc, thanks!

No problem. I do think you'll benefit a lot from packing everything into
a single pack, but it's also clear that Git was doing more work than it
needed to be. This was a known issue when we added the racy-check to
has_sha1_file(), and knew that we might have to field reports like this
one.

-Peff


* Re: Huge performance bottleneck reading packs
  2016-10-13 20:43         ` Jeff King
@ 2016-10-14  6:55           ` Vegard Nossum
  2016-10-14 19:00             ` Jeff King
  0 siblings, 1 reply; 22+ messages in thread
From: Vegard Nossum @ 2016-10-14  6:55 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On 10/13/2016 10:43 PM, Jeff King wrote:
> No problem. I do think you'll benefit a lot from packing everything into
> a single pack, but it's also clear that Git was doing more work than it
> needed to be. This was a known issue when we added the racy-check to
> has_sha1_file(), and we knew that we might have to field reports like
> this one.

After 'git gc' (with the .5G limit) I have 23 packs and with your patch
I now get:

$ time git fetch

real    0m26.520s
user    0m3.580s
sys     0m0.636s

Thanks!


vegard


* Re: Huge performance bottleneck reading packs
  2016-10-13  9:04       ` Vegard Nossum
@ 2016-10-14  9:35         ` Jakub Narębski
  0 siblings, 0 replies; 22+ messages in thread
From: Jakub Narębski @ 2016-10-14  9:35 UTC (permalink / raw)
  To: Vegard Nossum, Jeff King
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On 13.10.2016 at 11:04, Vegard Nossum wrote:
> On 10/13/2016 01:47 AM, Jeff King wrote:
>> On Wed, Oct 12, 2016 at 07:18:07PM -0400, Jeff King wrote:
>>
>>> Also, is it possible to make the repository in question available? I
>>> might be able to reproduce based on your description, but it would save
>>> time if I could directly run gdb on your example.
> 
> I won't be able to make the repository available, sorry.

We have 'git fast-export --anonymize ...', but it would obviously
not preserve the pack structure (though it can help in other cases
when one cannot make a repository available).

-- 
Jakub Narębski



* Re: [PATCH] fetch: use "quick" has_sha1_file for tag following
  2016-10-13 20:06           ` Jeff King
@ 2016-10-14 17:39             ` Junio C Hamano
  2016-10-14 18:59               ` Jeff King
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2016-10-14 17:39 UTC (permalink / raw)
  To: Jeff King
  Cc: Vegard Nossum, git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy

Jeff King <peff@peff.net> writes:

> So it's certainly better. But 7500 packs is just silly, and squeezing
> out ~400ms there is hardly worth it. If you repack this same case into a
> single pack, the command drops to 5ms. So yes, there's close to an order
> of magnitude speedup here, but you get that _and_ another order of
> magnitude just by repacking.

"7500 is silly" equally applies to the "quick" (and sloppy, if I am
reading your "Failing in this direction doesn't make me feel great."
correctly) approach, I think, which argues for not taking either
change X-<.

I agree that the fallout from the inaccuracy of the "quick" approach is
probably acceptable and the next "fetch" will correct it anyway, so
let's do the "quick but inaccurate" for now and perhaps cook it in
'next' for a bit longer than other topics?


* Re: [PATCH] fetch: use "quick" has_sha1_file for tag following
  2016-10-14 17:39             ` Junio C Hamano
@ 2016-10-14 18:59               ` Jeff King
  2016-10-17 17:30                 ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Jeff King @ 2016-10-14 18:59 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Vegard Nossum, git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy

On Fri, Oct 14, 2016 at 10:39:52AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > So it's certainly better. But 7500 packs is just silly, and squeezing
> > out ~400ms there is hardly worth it. If you repack this same case into a
> > single pack, the command drops to 5ms. So yes, there's close to an order
> > of magnitude speedup here, but you get that _and_ another order of
> > magnitude just by repacking.
> 
> "7500 is silly" equally applies to the "quick" (and sloppy, if I am
> reading your "Failing in this direction doesn't make me feel great."
> correctly) approach, I think, which argues for not taking either
> change X-<.

I wouldn't quite agree that they're the same. 7500 packs is silly
because a bunch of _other_ things are going to get equally or more slow
before the prepare_packed_git() slowdown is noticeable. And it's in your
power to clean up, and we should encourage users to do so.

Whereas having a bunch of unfetched tags is a state that may linger
indefinitely, and be outside the user's control.

I _am_ open to the argument that calling reprepare_packed_git() over and
over doesn't really matter much if you have a sane number of packs. If
you tweak my perf test like so:

diff --git a/t/perf/p5550-fetch-tags.sh b/t/perf/p5550-fetch-tags.sh
index a5dc39f..7e7ae24 100755
--- a/t/perf/p5550-fetch-tags.sh
+++ b/t/perf/p5550-fetch-tags.sh
@@ -86,7 +86,7 @@ test_expect_success 'create child packs' '
 		cd child &&
 		git config gc.auto 0 &&
 		git config gc.autopacklimit 0 &&
-		create_packs 500
+		create_packs 10
 	)
 '
 

you get:

Test            origin            quick                 
--------------------------------------------------------
5550.4: fetch   0.06(0.02+0.02)   0.02(0.01+0.00) -66.7%

Still an impressive speedup as a percentage, but negligible in absolute
terms. But that's on a local filesystem on a Linux machine. I'd worry
much more about a system with a slow readdir(), e.g., due to NFS.
Somebody's real-world NFS case[1] was what prompted us to do 0eeb077
(index-pack: avoid excessive re-reading of pack directory, 2015-06-09).

It looks like I _did_ look into optimizing this into a single stat()
call in the thread at [1]. I completely forgot about that. I did find
there that naively using stat_validity() on a directory is racy, though
I wonder if we could do something clever with gettimeofday() instead.
IOW, the patches there only bothered to re-read when they saw the mtime
on the directory jump, which suffers from 1-second precision problems.
But if we instead compared the mtime to the current time, we could err
in favor of re-reading the packs, and get false positives for at most 1
second.
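
Something like this hypothetical sketch (invented names, just to
illustrate; the stat_validity() patches in [1] would be the real
starting point):

  /* Reuse the cached scan of objects/pack only when the directory
   * mtime matches what we recorded after the last readdir() and is
   * comfortably in the past; within 1 second of "now", err toward
   * re-reading, since mtime granularity may be a full second.
   */
  static time_t cached_pack_dir_mtime;    /* set after each full scan */

  static int pack_dir_maybe_changed(const char *path)
  {
          struct stat st;

          if (stat(path, &st))
                  return 1;       /* be conservative on errors */
          if (st.st_mtime != cached_pack_dir_mtime)
                  return 1;       /* somebody touched the directory */
          if (st.st_mtime >= time(NULL) - 1)
                  return 1;       /* too recent; racy at 1s precision */
          return 0;
  }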

[1] http://public-inbox.org/git/7FAE15F0A93C0144AD8B5FBD584E1C5519758FC3@C111KXTEMBX51.ERF.thomson.com/

> I agree that the fallout from the inaccuracy of the "quick" approach is
> probably acceptable and the next "fetch" will correct it anyway, so
> let's do the "quick but inaccurate" for now and perhaps cook it in
> 'next' for a bit longer than other topics?

I doubt that cooking in 'next' for longer will turn up anything useful.
The case we care about is the race between a repack and a fetch. We
lived with the "quick" version of has_sha1_file() everywhere for 8
years. I only noticed the race on a hosting cluster which puts through
millions of operations per day (I could in theory test the patch on that
same cluster, but we actually don't do very many fetches).

-Peff


* Re: Huge performance bottleneck reading packs
  2016-10-14  6:55           ` Vegard Nossum
@ 2016-10-14 19:00             ` Jeff King
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff King @ 2016-10-14 19:00 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy, Junio C Hamano

On Fri, Oct 14, 2016 at 08:55:29AM +0200, Vegard Nossum wrote:

> On 10/13/2016 10:43 PM, Jeff King wrote:
> > No problem. I do think you'll benefit a lot from packing everything into
> > a single pack, but it's also clear that Git was doing more work than it
> > needed to be. This was a known issue when we added the racy-check to
> > has_sha1_file(), and knew that we might have to field reports like this
> > one.
> 
> After 'git gc' (with the .5G limit) I have 23 packs and with your patch
> I now get:
> 
> $ time git fetch
> 
> real    0m26.520s
> user    0m3.580s
> sys     0m0.636s

Cool. That's about what I'd expect based on the size numbers you gave
earlier. The other 23 seconds are likely being spent on passing the ref
advertisement over the network.

-Peff


* Re: [PATCH] fetch: use "quick" has_sha1_file for tag following
  2016-10-14 18:59               ` Jeff King
@ 2016-10-17 17:30                 ` Junio C Hamano
  2016-10-18 10:28                   ` Jeff King
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2016-10-17 17:30 UTC (permalink / raw)
  To: Jeff King
  Cc: Vegard Nossum, git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy

Jeff King <peff@peff.net> writes:

> Still an impressive speedup as a percentage, but negligible in absolute
> terms. But that's on a local filesystem on a Linux machine. I'd worry
> much more about a system with a slow readdir(), e.g., due to NFS.
> Somebody's real-world NFS case[1] was what prompted us to do 0eeb077
> (index-pack: avoid excessive re-reading of pack directory, 2015-06-09).

Yes.

> It looks like I _did_ look into optimizing this into a single stat()
> call in the thread at [1]. I completely forgot about that. I did find
> there that naively using stat_validity() on a directory is racy, though
> I wonder if we could do something clever with gettimeofday() instead.

It feels funny to hear an idea to compare an fs timestamp with
gettimeofday immediately after hearing the word NFS, though ;-).

>> I agree that the fallout from the inaccuracy of the "quick" approach is
>> probably acceptable and the next "fetch" will correct it anyway, so
>> let's do the "quick but inaccurate" for now and perhaps cook it in
>> 'next' for a bit longer than other topics?
>
> I doubt that cooking in 'next' for longer will turn up anything useful.
> The case we care about is the race between a repack and a fetch. We
> lived with the "quick" version of has_sha1_file() everywhere for 8
> years.

A very convincing argument.  I stand corrected.

Thanks.




* Re: [PATCH] fetch: use "quick" has_sha1_file for tag following
  2016-10-17 17:30                 ` Junio C Hamano
@ 2016-10-18 10:28                   ` Jeff King
  0 siblings, 0 replies; 22+ messages in thread
From: Jeff King @ 2016-10-18 10:28 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Vegard Nossum, git, Quentin Casasnovas, Shawn Pearce,
	Nguyễn Thái Ngọc Duy

On Mon, Oct 17, 2016 at 10:30:28AM -0700, Junio C Hamano wrote:

> > It looks like I _did_ look into optimizing this into a single stat()
> > call in the thread at [1]. I completely forgot about that. I did find
> > there that naively using stat_validity() on a directory is racy, though
> > I wonder if we could do something clever with gettimeofday() instead.
> 
> It feels funny to hear an idea to compare an fs timestamp with
> gettimeofday immediately after hearing the word NFS, though ;-).

Yeah, I had a funny feeling in my stomach as I wrote that.

What you really want to know is the current filesystem time. You'd
probably have to do something gross like creating a new file and then
comparing its timestamp. In theory you'd only have to do that _once_,
and then as long as the pack directory wasn't changing, you could say "I
don't know what time it is now, but I know it is at least time X, and I
know that X is greater than Y, the pack directory timestamp, therefore
the pack directory hasn't changed since I last looked".

That assumes monotonic clocks, but we basically already do so for the
racy-git checks, I think.
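
As a hypothetical sketch of that idea (all names invented):

  /* Learn a lower bound "X" on the current filesystem time by
   * creating a throwaway file and reading its mtime back.
   */
  static time_t fs_time_at_least(const char *probe_path)
  {
          struct stat st;
          FILE *f = fopen(probe_path, "w");

          if (!f)
                  return 0;       /* unknown; caller assumes "changed" */
          fclose(f);
          if (stat(probe_path, &st)) {
                  unlink(probe_path);
                  return 0;
          }
          unlink(probe_path);
          return st.st_mtime;
  }

  /* If a probe taken just after the last scan gave X, and X was
   * greater than the pack directory's mtime Y, then a later stat()
   * still showing mtime Y means nothing changed since that scan.
   */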

I dunno. It feels...complicated. And bad to require writing to the
repository for what would otherwise be a read-only operation. But I
don't see any fundamental reason it could not work.

-Peff

