git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Slow git pushes: sitting 1 minute in pack-objects
@ 2015-03-05 21:03 Stephen Morton
  2015-03-06  0:25 ` Duy Nguyen
  2015-03-09  7:53 ` Jeff King
  0 siblings, 2 replies; 8+ messages in thread
From: Stephen Morton @ 2015-03-05 21:03 UTC (permalink / raw)
  To: git@vger.kernel.org

(Apologies, after a day I'm cross-posting from git.users. I think the question
is maybe too technical for that group.)

I'm experiencing very slow git pushes. On the order of 1 minute to push a
trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
taking a lot of time in the pack-objects phase.

Others are not seeing this with the same repo, but I'm the only one working
in a VM.

```
~/ws/git/repo.1/repo > date; git push mortons; date
Wed Mar  4 15:03:11 EST 2015
15:03:11.086758 git.c:349               trace: built-in: git 'push' 'mortons'
15:03:11.126665 run-command.c:341       trace: run_command: 'ssh' '-p'
'7999' 'git@privacy.privacy' 'git-receive-pack
'\''~mortons/repo.git'\'''
15:03:20.383341 run-command.c:341       trace: run_command:
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
15:03:20.383945 exec_cmd.c:134          trace: exec: 'git'
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
15:03:20.385168 git.c:349               trace: built-in: git
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
Counting objects: 4, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 20.86 KiB | 0 bytes/s, done.
Total 4 (delta 0), reused 0 (delta 0)
To ssh://git@privacy.privacy:7999/~mortons/repo.git
   5fe662f..a137bda  my_branch -> my_branch
Wed Mar  4 15:04:22 EST 2015

```

After it was slow at first, I tried setting these which did not help

repack.writebitmaps=true
pack.windowmemory=100m



Details:
git version 2.1.4
OS: CentOS 6.6 64-bit in a VM.
repo size: huge. 6 GB .git directory, around 800 MB working tree.
VM has 8 MB RAM and 8 cores.
CPU: i7, 8 core (4 cores hyperthreaded)

It is an ext4 filesystem on the guest linux drive.
    On the host side, it is a .vmdk file and the virtualization software used is
virtualbox. While the drive is dynamically allocated, after I ran into
this issue,
    I used fallocate to create a 50GB dummy file and then delete it to
ensure that
there was headroom in the drive and that dynamic allocation slowness was not the
issue, and subsequent pushes were still slow.
    I have not experienced any filesystem slowness issues in the
months I've been using this vm.


Any ideas? I'm evaluating a move to git and this is the kind of thing
that could derail it.
Thanks,

Stephen

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Slow git pushes: sitting 1 minute in pack-objects
  2015-03-05 21:03 Slow git pushes: sitting 1 minute in pack-objects Stephen Morton
@ 2015-03-06  0:25 ` Duy Nguyen
  2015-03-06  0:37   ` Stephen Morton
  2015-03-09  7:53 ` Jeff King
  1 sibling, 1 reply; 8+ messages in thread
From: Duy Nguyen @ 2015-03-06  0:25 UTC (permalink / raw)
  To: Stephen Morton; +Cc: git@vger.kernel.org

On Fri, Mar 6, 2015 at 4:03 AM, Stephen Morton
<stephen.c.morton@gmail.com> wrote:
> I'm experiencing very slow git pushes. On the order of 1 minute to push a
> trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
> taking a lot of time in the pack-objects phase.
>
> Others are not seeing this with the same repo, but I'm the only one working
> in a VM.
>
> ...
>
> Details:
> git version 2.1.4
> OS: CentOS 6.6 64-bit in a VM.
> repo size: huge. 6 GB .git directory, around 800 MB working tree.
> VM has 8 MB RAM and 8 cores.

Is it 8 GB or MB RAM?

> CPU: i7, 8 core (4 cores hyperthreaded)
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Slow git pushes: sitting 1 minute in pack-objects
  2015-03-06  0:25 ` Duy Nguyen
@ 2015-03-06  0:37   ` Stephen Morton
  0 siblings, 0 replies; 8+ messages in thread
From: Stephen Morton @ 2015-03-06  0:37 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git@vger.kernel.org

8GB of RAM.

Sorry, typo.

Steve


On Thu, Mar 5, 2015 at 7:25 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Fri, Mar 6, 2015 at 4:03 AM, Stephen Morton
> <stephen.c.morton@gmail.com> wrote:
>> I'm experiencing very slow git pushes. On the order of 1 minute to push a
>> trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
>> taking a lot of time in the pack-objects phase.
>>
>> Others are not seeing this with the same repo, but I'm the only one working
>> in a VM.
>>
>> ...
>>
>> Details:
>> git version 2.1.4
>> OS: CentOS 6.6 64-bit in a VM.
>> repo size: huge. 6 GB .git directory, around 800 MB working tree.
>> VM has 8 MB RAM and 8 cores.
>
> Is it 8 GB or MB RAM?
>
>> CPU: i7, 8 core (4 cores hyperthreaded)
> --
> Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Slow git pushes: sitting 1 minute in pack-objects
  2015-03-05 21:03 Slow git pushes: sitting 1 minute in pack-objects Stephen Morton
  2015-03-06  0:25 ` Duy Nguyen
@ 2015-03-09  7:53 ` Jeff King
  2015-03-10  1:37   ` Stephen Morton
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff King @ 2015-03-09  7:53 UTC (permalink / raw)
  To: Stephen Morton; +Cc: git@vger.kernel.org

On Thu, Mar 05, 2015 at 04:03:07PM -0500, Stephen Morton wrote:

> I'm experiencing very slow git pushes. On the order of 1 minute to push a
> trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
> taking a lot of time in the pack-objects phase.

Can you tell what pack-objects is doing? Perhaps "strace -f -tt" might
help. Or even just "time git push", which will tell us whether we're
spinning on CPU or something else.

If you watch the progress meter, where does it spend its time? In
"counting objects", "compressing objects", or "writing objects"? Or
after that?

You said the repo is big. Is it reasonably packed (try running `git
gc`). Pack-objects may look at objects that you know the other side has
in order to create deltas against them. If you have some crazy situation
(e.g., you have several thousand packfiles) that can slow down object
access quite a bit. A minute is more than I would expect, but if VFS
operations in your VM are slow, that could be it.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Slow git pushes: sitting 1 minute in pack-objects
  2015-03-09  7:53 ` Jeff King
@ 2015-03-10  1:37   ` Stephen Morton
  2015-03-16 22:15     ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Morton @ 2015-03-10  1:37 UTC (permalink / raw)
  To: Jeff King; +Cc: git@vger.kernel.org

Thanks Peff,

I've done an strace and here's what I see. I'll try to put relevant
information in as legible a form as possible. The operation is
cpu-bound on a single core (note that yes, delta compression is using
8 threads. so that's obviously not the bottleneck) for the duration of
the pack-objects.

Timing is tough, because it's slower when straced.

I don't know if it matters, but this push is going to a fork on the
remote end most of whose objects are in the objects/info/alternatives
location. (Which is still on the local drive of the remote, not a
network drive or anything like that.)

1.
 > GIT_TRACE=1 time time git push
12:21:28.980410 git.c:349               trace: built-in: git 'push'
12:21:29.099547 run-command.c:341       trace: run_command: 'ssh' '-p'
'7999' 'git@server.privacy' 'git-receive-pack
'\''~smorton/panos.git'\'''
12:21:42.863968 run-command.c:341       trace: run_command:
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
12:21:42.871170 exec_cmd.c:134          trace: exec: 'git'
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'
12:21:42.907783 git.c:349               trace: built-in: git
'pack-objects' '--all-progress-implied' '--revs' '--stdout' '--thin'
'--delta-base-offset' '--progress'

2. At this point, "git pack-objects" is forked

3. Not sure how long this part takes. It takes 1/3 - 1/2 of the time
when straced, but I think it's much less, as little as 10s when not
straced.
It then reads a bunch of what look like objects from filehandle 0
(presumably stdin, read from the remote end?)
It then tries to lstat() these filenames in .git/<sha1>
./git/refs/<sha1>, .git/heads/<sha>, etc. It always fails ENOENT.
It fails some 120,000 times. This could be a problem. Though I haven't
checked to see if this happens on a fast push on another machine.

This takes a lot of time when straced, but I think less time when not straced.

4. Then it just sits there for almost 1 minute. It uses up almost 100%
of a single core during this period. It spends a lot of time running
lots of brk() (memory allocation) commands and then getting (polled
by?) a SIGALRM every 1s.
About 5.5+ GB (about the repo size) of VIRT memory is allocated. Only
about 400M is ever RES.


5. Then we get to the "Counting objects" phase.
Note the number of threads specified: so it

Counting objects: 4, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (4/4), done.
...


6. Time. Note the 121,000 pagefaults

46.08user 0.67system 0:49.47elapsed 94%CPU (0avgtext+0avgdata
1720240maxresident)k
0inputs+16outputs (0major+121199minor)pagefaults 0swaps

Steve



On Mon, Mar 9, 2015 at 3:53 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Mar 05, 2015 at 04:03:07PM -0500, Stephen Morton wrote:
>
>> I'm experiencing very slow git pushes. On the order of 1 minute to push a
>> trivial one-line change. When I set GIT_TRACE=1, I see that it seems to be
>> taking a lot of time in the pack-objects phase.
>
> Can you tell what pack-objects is doing? Perhaps "strace -f -tt" might
> help. Or even just "time git push", which will tell us whether we're
> spinning on CPU or something else.
>
> If you watch the progress meter, where does it spend its time? In
> "counting objects", "compressing objects", or "writing objects"? Or
> after that?
>
> You said the repo is big. Is it reasonably packed (try running `git
> gc`). Pack-objects may look at objects that you know the other side has
> in order to create deltas against them. If you have some crazy situation
> (e.g., you have several thousand packfiles) that can slow down object
> access quite a bit. A minute is more than I would expect, but if VFS
> operations in your VM are slow, that could be it.
>
> -Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Slow git pushes: sitting 1 minute in pack-objects
  2015-03-10  1:37   ` Stephen Morton
@ 2015-03-16 22:15     ` Jeff King
  2015-03-19 20:31       ` Stephen Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2015-03-16 22:15 UTC (permalink / raw)
  To: Stephen Morton; +Cc: git@vger.kernel.org

On Mon, Mar 09, 2015 at 09:37:25PM -0400, Stephen Morton wrote:

> 3. Not sure how long this part takes. It takes 1/3 - 1/2 of the time
> when straced, but I think it's much less, as little as 10s when not
> straced.
> It then reads a bunch of what look like objects from filehandle 0
> (presumably stdin, read from the remote end?)
> It then tries to lstat() these filenames in .git/<sha1>
> ./git/refs/<sha1>, .git/heads/<sha>, etc. It always fails ENOENT.
> It fails some 120,000 times. This could be a problem. Though I haven't
> checked to see if this happens on a fast push on another machine.

Hmm. The "push" process must feed the set of object boundaries to
"pack-objects" so it knows what to pack (i.e., what we want to send, and
what the other side has).

120,000 is an awfully large number of objects to be pass there, though.
Does the repo you are pushing to by any chance have an extremely large
number of refs (e.g., on the order of 120K tags)?

Can you try building git with this patch which would tell us more about
where those objects are coming from?

diff --git a/send-pack.c b/send-pack.c
index 9d2b0c5..17ace1f 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -74,10 +74,19 @@ static int pack_objects(int fd, struct ref *refs, struct sha1_array *extra, stru
 	 * We feed the pack-objects we just spawned with revision
 	 * parameters by writing to the pipe.
 	 */
+	warning("feeding %d .have objects", extra->nr);
 	for (i = 0; i < extra->nr; i++)
 		if (!feed_object(extra->sha1[i], po.in, 1))
 			break;
 
+	{
+		struct ref *p;
+		int count = 0;
+		for (p = refs; p; p = p->next)
+			count++;
+		warning("feeding %d refs", count);
+	}
+
 	while (refs) {
 		if (!is_null_sha1(refs->old_sha1) &&
 		    !feed_object(refs->old_sha1, po.in, 1))

> 4. Then it just sits there for almost 1 minute. It uses up almost 100%
> of a single core during this period. It spends a lot of time running
> lots of brk() (memory allocation) commands and then getting (polled
> by?) a SIGALRM every 1s.
> About 5.5+ GB (about the repo size) of VIRT memory is allocated. Only
> about 400M is ever RES.

The SIGALRM is normal. That's the progress code checking in every 2
seconds to see if there's anything to print. My guess is that the
allocation is probably coming as part of the "preferred base" code. This
is a fancy term for "stuff we are not going to send, but which we could
use as delta bases for objects we are sending".

Does the patch below speed things up (it is not a good thing to do in
general, but would let us know if that is the problematic area):

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d816587..c90a352 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2307,7 +2307,6 @@ static void show_object(struct object *obj,
 
 static void show_edge(struct commit *commit)
 {
-	add_preferred_base(commit->object.sha1);
 }
 
 struct in_pack_object {

If not, then the slowdown may come before we even get there, and
possibly this patch would help:

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d816587..473c0a3 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2531,6 +2531,8 @@ static void get_object_list(int ac, const char **av)
 			}
 			die("not a rev '%s'", line);
 		}
+		if (*line == '^')
+			continue;
 		if (handle_revision_arg(line, &revs, flags, REVARG_CANNOT_BE_FILENAME))
 			die("bad revision '%s'", line);
 	}

Those are all rather blunt debugging methods, but hopefully it can get
us a sense of where the time is going.

-Peff

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Slow git pushes: sitting 1 minute in pack-objects
  2015-03-16 22:15     ` Jeff King
@ 2015-03-19 20:31       ` Stephen Morton
  2015-03-19 21:07         ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Morton @ 2015-03-19 20:31 UTC (permalink / raw)
  To: Jeff King; +Cc: git@vger.kernel.org

On Mon, Mar 16, 2015 at 6:15 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Mar 09, 2015 at 09:37:25PM -0400, Stephen Morton wrote:
>
>> 3. Not sure how long this part takes. It takes 1/3 - 1/2 of the time
>> when straced, but I think it's much less, as little as 10s when not
>> straced.
>> It then reads a bunch of what look like objects from filehandle 0
>> (presumably stdin, read from the remote end?)
>> It then tries to lstat() these filenames in .git/<sha1>
>> ./git/refs/<sha1>, .git/heads/<sha>, etc. It always fails ENOENT.
>> It fails some 120,000 times. This could be a problem. Though I haven't
>> checked to see if this happens on a fast push on another machine.
>
> Hmm. The "push" process must feed the set of object boundaries to
> "pack-objects" so it knows what to pack (i.e., what we want to send, and
> what the other side has).
>
> 120,000 is an awfully large number of objects to be pass there, though.
> Does the repo you are pushing to by any chance have an extremely large
> number of refs (e.g., on the order of 120K tags)?


No. There are on the order of 9,500 refs (mostly tags) but nowhere near 120k.



>
> Can you try building git with this patch which would tell us more about
> where those objects are coming from?


    <patch snipped>

> Those are all rather blunt debugging methods, but hopefully it can get
> us a sense of where the time is going.
>
> -Peff


Thanks Peff,


I haven't tried your patch, but I tried to backtrack a bit and
double-check that the problem always happened in a fresh clone with no
other modifications.

It did _not_ happen in a new clone --a push took just 5s -- and I
think the culprit could have been "repack.writebitmaps=true". Although
I had thought writebitmaps was not originally enabled, I now suspect
that it was. Let me follow up on that first, before I recompile git
with your changes.


Thanks again, I really appreciate the help.

Steve

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Slow git pushes: sitting 1 minute in pack-objects
  2015-03-19 20:31       ` Stephen Morton
@ 2015-03-19 21:07         ` Jeff King
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2015-03-19 21:07 UTC (permalink / raw)
  To: Stephen Morton; +Cc: git@vger.kernel.org

On Thu, Mar 19, 2015 at 04:31:36PM -0400, Stephen Morton wrote:

> > Hmm. The "push" process must feed the set of object boundaries to
> > "pack-objects" so it knows what to pack (i.e., what we want to send, and
> > what the other side has).
> >
> > 120,000 is an awfully large number of objects to be pass there, though.
> > Does the repo you are pushing to by any chance have an extremely large
> > number of refs (e.g., on the order of 120K tags)?
> 
> No. There are on the order of 9,500 refs (mostly tags) but nowhere near 120k.

I think you mentioned that it uses alternates to share objects with
other repos. Does the repository (or repositories) pointed to by the
alternates file have a large number of refs (especially distinct refs,
as I think modern git will squelch duplicate sha1s).

> It did _not_ happen in a new clone --a push took just 5s -- and I
> think the culprit could have been "repack.writebitmaps=true". Although
> I had thought writebitmaps was not originally enabled, I now suspect
> that it was. Let me follow up on that first, before I recompile git
> with your changes.

It's certainly possible that bitmaps have an impact here. They should
not contribute to the 120K objects being passed to pack-objects, but
it's possible that size is a red herring (or possibly the number of
objects is choking something in the bitmap code path that does not have
problems otherwise).

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-03-19 21:07 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-05 21:03 Slow git pushes: sitting 1 minute in pack-objects Stephen Morton
2015-03-06  0:25 ` Duy Nguyen
2015-03-06  0:37   ` Stephen Morton
2015-03-09  7:53 ` Jeff King
2015-03-10  1:37   ` Stephen Morton
2015-03-16 22:15     ` Jeff King
2015-03-19 20:31       ` Stephen Morton
2015-03-19 21:07         ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).