git@vger.kernel.org mailing list mirror (one of many)
* Surprising use of memory and time when repacking mozilla's gecko repository
From: Mike Hommey @ 2019-07-04 10:05 UTC (permalink / raw)
  To: git

Hi,

I was looking at the disk size of the gecko repository on github[1],
which started at 4.7GB, and `git gc --aggressive`'d it, which brought
that down to 2.0GB. But achieving that required quite some resources.
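(For reference, this is roughly what I did; just a sketch, with the
sizes being roughly what I observed:)

  $ git clone https://github.com/mozilla/gecko
  $ du -sh gecko/.git    # ~4.7G before
  $ git -C gecko gc --aggressive
  $ du -sh gecko/.git    # ~2.0G after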

My first attempt failed with OOM, on an AWS instance with 16 cores and
32GB RAM. I then went to another AWS instance, with 36 cores and 96GB
RAM. And that went through after a while... with a peak memory usage
above 60GB!

Since then, Peff kindly repacked the repo on the github end, so it
doesn't really need repacking locally anymore, but I can still reproduce
the > 60GB memory usage with the packed repository.

I gathered some data[2], all on the same 36 cores, 96GB RAM instance, with
36, 16 and 1 threads, and here's what can be observed:

With 36 threads, the overall process takes 45 minutes:
- 50 seconds enumerating and counting objects.
- ~22 minutes compressing objects
- ~22 minutes writing objects

Of the 22 minutes compressing objects, more than 15 minutes are spent on
the last percent of objects, and only during that part does memory usage
balloon above 20GB.

Memory usage goes back to 2.4G after compression finishes.

With 16 threads, the overall process takes about the same time as above,
with roughly the same breakdown.

But less time is spent on compressing the last percent of objects, and
memory usage goes above 20GB later than with 36 threads.

Finally, with 1 thread, the picture changes greatly. The overall process
takes 2.5h:
- 50 seconds enumerating and counting objects.
- ~2.5h compressing objects.
- 3 minutes and 25 seconds writing objects!

Memory usage stays reasonable, except that after about 47 minutes it
starts climbing to 12.7GB, then goes back down about half an hour
later, all while stalling around the 13% progress mark.

My guess is all those stalls are happening when processing the files I
already had problems with in the past[3], except there are more of them
now (thankfully, they were removed, so there won't be more, but that
doesn't make the existing ones go away).

I never ended up working on making that diff faster; maybe that would
help a little here, but it would probably not help much wrt the memory
usage. I wonder what git could reasonably do to avoid OOMing in this
case. Reduce the window size temporarily? Trade memory for time, by not
keeping the objects in memory?

I'm puzzled by the fact writing objects is so much faster with 1 thread.

It's worth noting that the AWS instances don't have swap by default,
which is actually good in this case, because if it had started to swap,
it would have taken forever.

1. https://github.com/mozilla/gecko
2. https://docs.google.com/spreadsheets/d/1IE8E3BhKurXsXgwBYFXs4mRBT_512v--ip6Vhxc3o-Y/edit?usp=sharing
3. https://public-inbox.org/git/20180703223823.qedmoy2imp4dcvkp@glandium.org/T/

Any thoughts?

Mike


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Eric Wong @ 2019-07-04 12:04 UTC (permalink / raw)
  To: Mike Hommey; +Cc: git

Mike Hommey <mh@glandium.org> wrote:
> I'm puzzled by the fact writing objects is so much faster with 1 thread.

I/O contention in the multi-threaded cases?

"public-inbox-index" (reading from git, writing to Xapian+SQLite)
on a dev machine got slow because core count exceeded what SATA
could handle and had to cap the default Xapian shard count to 3
by default for v2 inboxes.


As for memory use, does git use the producer/consumer pattern
for malloc+free so free happens in a different thread from the
malloc?

Because current versions of glibc get pathological in that case
(but I have a long-term fix for it):

https://public-inbox.org/libc-alpha/20180731084936.g4yw6wnvt677miti@dcvr/T/

Only the locklessinc malloc seemed OK in the IPC department
(but I haven't kept up with jemalloc and tcmalloc developments in years).
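If glibc's arena behavior is a suspect, a cheap experiment (just a
sketch; use whatever repack flags you normally would) is to cap the
arena count and compare peak RSS:

  # glibc-specific knob; 1 arena serializes malloc but avoids arena bloat
  $ MALLOC_ARENA_MAX=1 /usr/bin/time -v git repack -adf 2>&1 |
      grep 'Maximum resident'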


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Mike Hommey @ 2019-07-04 13:13 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

On Thu, Jul 04, 2019 at 12:04:11PM +0000, Eric Wong wrote:
> Mike Hommey <mh@glandium.org> wrote:
> > I'm puzzled by the fact writing objects is so much faster with 1 thread.
> 
> I/O contention in the multi-threaded cases?
> 
> "public-inbox-index" (reading from git, writing to Xapian+SQLite)
> on a dev machine got slow because core count exceeded what SATA
> could handle and had to cap the default Xapian shard count to 3
> by default for v2 inboxes.
 
AFAICT, git doesn't write from multiple threads.
 
> As for memory use, does git use the producer/consumer pattern
> for malloc+free so free happens in a different thread from the
> malloc?
> 
> Because current versions of glibc get pathological in that case
> (but I have a long-term fix for it):
> 
> https://public-inbox.org/libc-alpha/20180731084936.g4yw6wnvt677miti@dcvr/T/
> 
> Only the locklessinc malloc seemed OK in the IPC department
> (but I haven't kept up with jemalloc and tcmalloc developments in years).

Oh right, I forgot to mention:
- I thought this memory usage thing was [1], but it turns out it was
  real memory usage.
- glibc's mallinfo stores values as int, so it's useless for knowing how
  much memory was allocated once that exceeds 4GB.
- glibc's malloc_stats relies on the same int data, so while it does
  print "in use" data, it can't print values above 4GB correctly.
- glibc has a malloc_info function that, according to its manual page,
  "addresses the deficiencies in malloc_stats and mallinfo", but while
  it outputs a large XML dump, it doesn't contain anything that looks
  remotely like the "in use" from malloc_stats.
- So, all in all, I used jemalloc to gather the "allocated" stats.
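For the record, one way to get those stats (just a sketch, not exactly
what I ran; the preload path varies per system):

  # preload jemalloc and have it dump its stats on exit
  $ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 \
    MALLOC_CONF=stats_print:true git repack -adf 2>jemalloc-stats.txt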

Mike

1. https://sourceware.org/bugzilla/show_bug.cgi?id=23416


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Mike Hommey @ 2019-07-05  0:22 UTC (permalink / raw)
  To: git

On Thu, Jul 04, 2019 at 07:05:30PM +0900, Mike Hommey wrote:
> Hi,
> 
> I was looking at the disk size of the gecko repository on github[1],
> which started at 4.7GB, and `git gc --aggressive`'d it, which brought
> that down to 2.0GB. But achieving that required quite some resources.
> 
> My first attempt failed with OOM, on an AWS instance with 16 cores and
> 32GB RAM. I then went to another AWS instance, with 36 cores and 96GB
> RAM. And that went through after a while... with a peak memory usage
> above 60GB!
> 
> Since then, Peff kindly repacked the repo on the github end, so it
> doesn't really need repacking locally anymore, but I can still reproduce
> the > 60GB memory usage with the packed repository.
> 
> I gathered some data[2], all on the same 36 cores, 96GB RAM instance, with
> 36, 16 and 1 threads, and here's what can be observed:
> 
> With 36 threads, the overall process takes 45 minutes:
> - 50 seconds enumerating and counting objects.
> - ~22 minutes compressing objects
> - ~22 minutes writing objects
> 
> Of the 22 minutes compressing objects, more than 15 minutes are spent on
> the last percent of objects, and only during that part does memory usage
> balloon above 20GB.
> 
> Memory usage goes back to 2.4G after compression finishes.
> 
> With 16 threads, the overall process takes about the same time as above,
> with roughly the same breakdown.
> 
> But less time is spent on compressing the last percent of objects, and
> memory usage goes above 20GB later than with 36 threads.
> 
> Finally, with 1 thread, the picture changes greatly. The overall process
> takes 2.5h:
> - 50 seconds enumerating and counting objects.
> - ~2.5h compressing objects.
> - 3 minutes and 25 seconds writing objects!
> 
> Memory usage stays reasonable, except that after about 47 minutes it
> starts climbing to 12.7GB, then goes back down about half an hour
> later, all while stalling around the 13% progress mark.
> 
> My guess is all those stalls are happening when processing the files I
> already had problems with in the past[3], except there are more of them
> now (thankfully, they were removed, so there won't be more, but that
> doesn't make the existing ones go away).
> 
> I never ended up working on making that diff faster; maybe that would
> help a little here, but it would probably not help much wrt the memory
> usage. I wonder what git could reasonably do to avoid OOMing in this
> case. Reduce the window size temporarily? Trade memory for time, by not
> keeping the objects in memory?
> 
> I'm puzzled by the fact writing objects is so much faster with 1 thread.

Here's a perf report from the portion of "Writing" that is particularly
slow when compression happened on 36 threads:
  100.00%     0.00%  git      [unknown]           [k] 0xffffffffffffffff                    
   99.97%     0.02%  git      git                 [.] write_one                             
   99.97%     0.00%  git      git                 [.] write_pack_file                       
   99.97%     0.00%  git      git                 [.] cmd_pack_objects                      
   99.96%     0.00%  git      git                 [.] write_object (inlined)                
   99.96%     0.00%  git      git                 [.] write_reuse_object (inlined)          
   99.92%     0.00%  git      git                 [.] write_no_reuse_object                 
   98.12%     0.00%  git      git                 [.] get_delta (inlined)                   
   72.36%     0.00%  git      git                 [.] diff_delta (inlined)                  
   64.86%    64.20%  git      git                 [.] create_delta_index                    
   26.32%     0.00%  git      git                 [.] repo_read_object_file (inlined)       
   26.32%     0.00%  git      git                 [.] read_object_file_extended             
   26.32%     0.00%  git      git                 [.] read_object                           
   26.32%     0.00%  git      git                 [.] oid_object_info_extended              
   26.25%     0.00%  git      git                 [.] packed_object_info                    
   26.24%     0.00%  git      git                 [.] cache_or_unpack_entry (inlined)       
   24.30%     0.01%  git      git                 [.] unpack_entry                          
   17.62%     0.00%  git      git                 [.] memcpy (inlined)                      
   17.52%    17.46%  git      libc-2.27.so        [.] __memmove_avx_unaligned_erms          
   15.98%     0.22%  git      git                 [.] patch_delta                           
    7.60%     0.00%  git      git                 [.] unpack_compressed_entry               
    7.49%     7.42%  git      git                 [.] create_delta                          
    7.29%     0.00%  git      git                 [.] git_inflate                           
    7.29%     0.23%  git      libz.so.1.2.11      [.] inflate                               
    1.94%     0.00%  git      git                 [.] xmemdupz                              
    1.14%     0.00%  git      git                 [.] do_compress                           
    0.98%     0.98%  git      libz.so.1.2.11      [.] adler32_z                             
    0.95%     0.00%  git      libz.so.1.2.11      [.] deflate                               

... that's a large portion of time spent on deltas...
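(In case it's useful, a sketch of how such a report can be captured;
not necessarily the exact options I used:)

  # sample the already-running pack-objects for a minute, then inspect
  $ perf record -g -p "$(pgrep -f 'git pack-objects')" -- sleep 60
  $ perf report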

Mike


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Mike Hommey @ 2019-07-05  4:45 UTC (permalink / raw)
  To: git

On Thu, Jul 04, 2019 at 07:05:30PM +0900, Mike Hommey wrote:
> My guess is all those stalls are happening when processing the files I
> already had problems with in the past[3], except there are more of them
> now (thankfully, they were removed, so there won't be more, but that
> doesn't make the existing ones go away).
>
> 3. https://public-inbox.org/git/20180703223823.qedmoy2imp4dcvkp@glandium.org/T/

I've more or less confirmed that's the cause of the long stalls during
the compression phase. It can be reproduced to some extent with:

$ git clone https://github.com/mozilla/gecko
$ cd gecko
$ git rev-list --all -- testing/web-platform/meta/MANIFEST.json |
    xargs -I '{}' git ls-tree '{}' testing/web-platform/meta/MANIFEST.json |
    sort -u | awk '{print $3}' |
    git cat-file --batch-check |
    sort -n -k 3 | awk '{print $1}' |
    git pack-objects --window=250 --depth=50 --no-reuse-delta my_pack

There might be some file other than testing/web-platform/meta/MANIFEST.json
involved, because this uses "only" 40GB RAM, but that file is the one I
know of.

This however doesn't reproduce the "writing takes forever" part of the
problem.

Mike


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Jeff King @ 2019-07-05  5:09 UTC (permalink / raw)
  To: Mike Hommey; +Cc: git

On Thu, Jul 04, 2019 at 07:05:30PM +0900, Mike Hommey wrote:

> With 36 threads, the overall process takes 45 minutes:
> - 50 seconds enumerating and counting objects.
> - ~22 minutes compressing objects
> - ~22 minutes writing objects

I noticed the long writing phase when I repacked as well. The main cost
there is going to be reconstructing deltas that wouldn't fit in the
cache. During the compression phase we generated a bunch of candidate
deltas. If we have space in our in-memory cache, we then store the
actual delta there so we can write it out immediately during the writing
phase. Otherwise, we just note the base object we used, and during the
writing phase we regenerate the delta on the fly.

So spending time there implies (and the perf output you posted in the
followup supports this) that we have a lot of entries that could not be
cached.

You might try poking at pack.deltaCacheSize and pack.deltaCacheLimit to
see if they help.
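Something like this (numbers pulled out of thin air; IIRC the defaults
are 256m and 1000 bytes, respectively):

  # cache more (and larger) deltas during the compression phase
  $ git -c pack.deltaCacheSize=8g -c pack.deltaCacheLimit=10000 repack -adf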

But...

> Finally, with 1 thread, the picture changes greatly. The overall process
> takes 2.5h:
> - 50 seconds enumerating and counting objects.
> - ~2.5h compressing objects.
> - 3 minutes and 25 seconds writing objects!

That's weird. I'd expect us to find similar amounts of deltas, but we
don't have the writing slow-down. I wonder if there is some bad
interaction between the multi-threaded code and the delta cache.

Did you run the second, single-thread run against the exact same
original repository you had? Or did you re-run it on the result of the
multi-thread run? Another explanation is that the original repository
had some poor patterns that made objects expensive to access (say, a ton
of really deep delta chains). And so the difference between the two runs
was not the threads, but just the on-disk repository state.

Kind of a long shot, but if that is what happened, try running another
multi-threaded "repack -f" and see if its writing phase is faster.

> Of the 22 minutes compressing objects, more than 15 minutes are spent on
> the last percent of objects, and only during that part does memory usage
> balloon above 20GB.

That's not surprising. To find deltas, we make an array with all of the
objects, sort it into a list in a way that puts "like" objects near to
each other, and then walk over that list with a sliding window, looking
for candidates. With multiple threads, each thread gets its own segment
of the array to look at. So two implications are:

  - one of the sorting criteria is size. So you'll often end up
    comparing a bunch of large objects to each other at the end, and the
    last few percent will take much more than the earlier parts.

  - each thread is holding $window_size objects in memory at once. There
    are a bunch of biggish objects, up to 17MB, in the mozilla/gecko
    repository. So during that last part with the large objects, the
    worst case is something like window*threads*17 = 250*36*17 = 150GB.

    It's not nearly that bad because you don't have enough large
    objects. ;) The 9000 (250*36) largest objects in that repo are only
    55GB. And we don't hit even that at peak because some threads are
    probably working on parts of the array with smaller objects.

You can compute the total size of those largest 9000 objects like:

  git cat-file --batch-all-objects --unordered --batch-check='%(objectsize)' |
  sort -rn | head -9000 |
  perl -lne '$total += $_; END { print $total }'

The simplest solution if you're running into RAM problems is to turn
down the number of threads. You can also disable deltas for certain
objects (over a certain size with core.bigFileThreshold or for certain
paths with the "delta" gitattribute). But in your case those big objects
really do make good deltas:

  $ git cat-file --batch-all-objects --unordered \
      --batch-check='%(objectsize) %(objectsize:disk) %(deltabase)' |
      sort -nr | head -10
  17977953 17963093 0000000000000000000000000000000000000000
  17817405 153050 a4ef0cf1db10163141b827863671050a58612408
  17816806 733 529a3ffd873183abf85c78e61e50da61ccb85976
  17816776 17939 410cc404992142e68dc53c9ecafca14adf21a839
  17815901 571 529a3ffd873183abf85c78e61e50da61ccb85976
  17815698 617 529a3ffd873183abf85c78e61e50da61ccb85976
  17815668 797 d687483fbb3982bb9953ea6e67e78a5c496e9b08
  17815668 560 10a7c4fbd756c1d961f71b7d55034011dbc0ec28
  17815379 539 f50422a9701a018dd5c57f29d70e37c637cd8e4b
  17815379 518 59303aa24bbbf7074ccef5dbabf0906e1c82f980

So those objects are all deltas (except the first, with all zeroes,
which is the root of the base tree for all the others). And we're
getting lots of savings (a few hundred bytes instead of 17MB).

So I wouldn't recommend disabling deltas. Most repacks will just reuse
those deltas verbatim, and you won't pay any cost for them. You only do
here because of the aggressive, --no-reuse-delta repack.
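(For completeness, if you did want to try those knobs, they look
something like this; the values are made up:)

  $ git -c pack.threads=8 repack -adf
  $ git config core.bigFileThreshold 64m
  $ echo 'testing/web-platform/meta/MANIFEST.json -delta' \
      >>.git/info/attributes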

> Memory usage goes back to 2.4G after compression finishes.
> 
> With 16 threads, the overall process takes about the same time as above,
> with roughly the same breakdown.
> 
> But less time is spent on compressing the last percent of objects, and
> memory usage goes above 20GB later than with 36 threads.

I think most of this follows from the explanations above.

> Memory usage stays reasonable, except that after about 47 minutes it
> starts climbing to 12.7GB, then goes back down about half an hour
> later, all while stalling around the 13% progress mark.

Again, sounds like that's where one (or more) of the threads hits that
chunk of big objects.

> My guess is all those stalls are happening when processing the files I
> already had problems with in the past[3], except there are more of them
> now (thankfully, they were removed, so there won't be more, but that
> doesn't make the existing ones go away).

Yep, I think that's right.

> I never ended up working on making that diff faster; maybe that would
> help a little here, but it would probably not help much wrt the memory
> usage. I wonder what git could reasonably do to avoid OOMing in this
> case. Reduce the window size temporarily? Trade memory for time, by not
> keeping the objects in memory?

There is a config option, pack.windowMemory, which tries to do this.
I've had mixed results experimenting with it, and have never really
used it outside of those experiments.
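E.g., something like this (the cap is per thread; the value is made up):

  $ git -c pack.windowMemory=1g repack -adf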

> I'm puzzled by the fact writing objects is so much faster with 1 thread.

Yes, that part puzzles me too. If your answer to my earlier question is
negative, then it might be worth digging more into the delta-base-cache
(perhaps starting with instrumenting it to print out which deltas did or
didn't get cached, along with their sizes).

-Peff


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Jeff King @ 2019-07-05  5:14 UTC (permalink / raw)
  To: Mike Hommey; +Cc: Eric Wong, git

On Thu, Jul 04, 2019 at 10:13:20PM +0900, Mike Hommey wrote:

> > "public-inbox-index" (reading from git, writing to Xapian+SQLite)
> > on a dev machine got slow because core count exceeded what SATA
> > could handle and had to cap the default Xapian shard count to 3
> > by default for v2 inboxes.
>  
> AFAICT, git doesn't write from multiple threads.

Right. That's always single-threaded, and the main difference there is
going to be what's in the delta base cache.
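If you want to experiment there, that cache's size is configurable via
core.deltaBaseCacheLimit (the value is made up; IIRC the default is 96m):

  $ git -c core.deltaBaseCacheLimit=1g repack -adf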

> Oh right, I forgot to mention:
> - I thought this memory usage thing was [1], but it turns out it was
>   real memory usage.
> - glibc's mallinfo stores values as int, so it's useless for knowing how
>   much memory was allocated once that exceeds 4GB.
> - glibc's malloc_stats relies on the same int data, so while it does
>   print "in use" data, it can't print values above 4GB correctly.
> - glibc has a malloc_info function that, according to its manual page,
>   "addresses the deficiencies in malloc_stats and mallinfo", but while
>   it outputs a large XML dump, it doesn't contain anything that looks
>   remotely like the "in use" from malloc_stats.
> - So, all in all, I used jemalloc to gather the "allocated" stats.

I think I explained all of the memory-usage questions in my earlier
response, but just for reference: if you have access to it, valgrind's
"massif" tool is really good for this kind of profiling. Something like:

  valgrind --tool=massif git pack-objects ...
  ms_print massif.out.*

which shows heap usage at various times, points out the snapshot with
peak usage, and shows a backtrace of the main culprits at a few
snapshots.

-Peff


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Mike Hommey @ 2019-07-05  5:45 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Fri, Jul 05, 2019 at 01:09:55AM -0400, Jeff King wrote:
> On Thu, Jul 04, 2019 at 07:05:30PM +0900, Mike Hommey wrote:
> > Finally, with 1 thread, the picture changes greatly. The overall process
> > takes 2.5h:
> > - 50 seconds enumerating and counting objects.
> > - ~2.5h compressing objects.
> > - 3 minutes and 25 seconds writing objects!
> 
> That's weird. I'd expect us to find similar amounts of deltas, but we
> don't have the writing slow-down. I wonder if there is some bad
> interaction between the multi-threaded code and the delta cache.
> 
> Did you run the second, single-thread run against the exact same
> original repository you had? Or did you re-run it on the result of the
> multi-thread run? Another explanation is that the original repository
> had some poor patterns that made objects expensive to access (say, a ton
> of really deep delta chains). And so the difference between the two runs
> was not the threads, but just the on-disk repository state.
> 
> Kind of a long shot, but if that is what happened, try running another
> multi-threaded "repack -f" and see if its writing phase is faster.

I've run 36 threads, 16 threads and 1 thread in sequence on the same
repo, so the 16-thread run was repacking what was repacked by the
36-thread run, and the 1-thread run was repacking what was repacked by
the 16-thread run. I assumed it wouldn't matter, but come to think of
it, I guess it can.

Mike


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Mike Hommey @ 2019-07-05  5:47 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Fri, Jul 05, 2019 at 01:14:13AM -0400, Jeff King wrote:
> On Thu, Jul 04, 2019 at 10:13:20PM +0900, Mike Hommey wrote:
> 
> > > "public-inbox-index" (reading from git, writing to Xapian+SQLite)
> > > on a dev machine got slow because core count exceeded what SATA
> > > could handle and had to cap the default Xapian shard count to 3
> > > by default for v2 inboxes.
> >  
> > AFAICT, git doesn't write from multiple threads.
> 
> Right. That's always single threaded, and the main difference there is
> going to be what's in the delta base cache.
> 
> > Oh right, I forgot to mention:
> > - I thought this memory usage thing was [1], but it turns out it was
> >   real memory usage.
> > - glibc's mallinfo stores values as int, so it's useless for knowing how
> >   much memory was allocated once that exceeds 4GB.
> > - glibc's malloc_stats relies on the same int data, so while it does
> >   print "in use" data, it can't print values above 4GB correctly.
> > - glibc has a malloc_info function that, according to its manual page,
> >   "addresses the deficiencies in malloc_stats and mallinfo", but while
> >   it outputs a large XML dump, it doesn't contain anything that looks
> >   remotely like the "in use" from malloc_stats.
> > - So, all in all, I used jemalloc to gather the "allocated" stats.
> 
> I think I explained all of the memory-usage questions in my earlier
> response, but just for reference: if you have access to it, valgrind's
> "massif" tool is really good for this kind of profiling. Something like:
> 
>   valgrind --tool=massif git pack-objects ...
>   ms_print massif.out.*
> 
> which shows heap usage at various times, points out the snapshot with
> peak usage, and shows a backtrace of the main culprits at a few
> snapshots.

At the expense of time ;) A run would likely last an entire day under
massif (by which I mean a full 24 hours, not a 9-5 day).

Mike


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Jakub Narebski @ 2019-07-05 11:29 UTC (permalink / raw)
  To: Mike Hommey; +Cc: Jeff King, git

Mike Hommey <mh@glandium.org> writes:

> On Fri, Jul 05, 2019 at 01:14:13AM -0400, Jeff King wrote:
>> On Thu, Jul 04, 2019 at 10:13:20PM +0900, Mike Hommey wrote:
[...]
>> I think I explained all of the memory-usage questions in my earlier
>> response, but just for reference: if you have access to it, valgrind's
>> "massif" tool is really good for this kind of profiling. Something like:
>> 
>>   valgrind --tool=massif git pack-objects ...
>>   ms_print massif.out.*
>> 
>> which shows heap usage at various times, points out the snapshot with
>> peak usage, and shows a backtrace of the main culprits at a few
>> snapshots.
>
> At the expense of time ;) A run would likely last an entire day under
> massif (by which I mean a full 24 hours, not a 9-5 day).

Valgrind, as I understand it, runs the program under emulation.  I
wonder whether a perf / OProfile-based solution could help here
(gathering memory-related events and metrics).

There is also trace2, built into Git, but I don't know whether it could
be used for this purpose.
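For example, something like this should at least show where the time
goes (I have not checked whether it reports anything memory-related):

  $ GIT_TRACE2_PERF=/tmp/repack-trace.perf git repack -adf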

Best,
--
Jakub Narębski


* Re: Surprising use of memory and time when repacking mozilla's gecko repository
From: Mike Hommey @ 2019-07-05 11:51 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Fri, Jul 05, 2019 at 02:45:16PM +0900, Mike Hommey wrote:
> On Fri, Jul 05, 2019 at 01:09:55AM -0400, Jeff King wrote:
> > On Thu, Jul 04, 2019 at 07:05:30PM +0900, Mike Hommey wrote:
> > > Finally, with 1 thread, the picture changes greatly. The overall process
> > > takes 2.5h:
> > > - 50 seconds enumerating and counting objects.
> > > - ~2.5h compressing objects.
> > > - 3 minutes and 25 seconds writing objects!
> > 
> > That's weird. I'd expect us to find similar amounts of deltas, but we
> > don't have the writing slow-down. I wonder if there is some bad
> > interaction between the multi-threaded code and the delta cache.
> > 
> > Did you run the second, single-thread run against the exact same
> > original repository you had? Or did you re-run it on the result of the
> > multi-thread run? Another explanation is that the original repository
> > had some poor patterns that made objects expensive to access (say, a ton
> > of really deep delta chains). And so the difference between the two runs
> > was not the threads, but just the on-disk repository state.
> > 
> > Kind of a long shot, but if that is what happened, try running another
> > multi-threaded "repack -f" and see if its writing phase is faster.
> 
> I've run 36 threads, 16 threads and 1 thread in sequence on the same
> repo, so the 16-thread run was repacking what was repacked by the
> 36-thread run, and the 1-thread run was repacking what was repacked by
> the 16-thread run. I assumed it wouldn't matter, but come to think of
> it, I guess it can.

I tried:
- fresh clone -> 36 threads
- fresh clone -> 1 thread -> 36 threads

The 36-thread gc in the latter was only marginally faster than in the
former (between 19 and 20 minutes instead of 22 for both "Compressing"
and "Writing").

Mike

