From: Mike Hommey <mh@glandium.org>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: Surprising use of memory and time when repacking mozilla's gecko repository
Date: Fri, 5 Jul 2019 20:51:44 +0900 [thread overview]
Message-ID: <20190705115144.7jqvc3qzanrpvpxq@glandium.org> (raw)
In-Reply-To: <20190705054516.mke7aqk2cdsffkpd@glandium.org>
On Fri, Jul 05, 2019 at 02:45:16PM +0900, Mike Hommey wrote:
> On Fri, Jul 05, 2019 at 01:09:55AM -0400, Jeff King wrote:
> > On Thu, Jul 04, 2019 at 07:05:30PM +0900, Mike Hommey wrote:
> > > Finally, with 1 thread, the picture changes greatly. The overall process
> > > takes 2.5h:
> > > - 50 seconds enumerating and counting objects.
> > > - ~2.5h compressing objects.
> > > - 3 minutes and 25 seconds writing objects!
> >
> > That's weird. I'd expect us to find similar amounts of deltas, but we
> > don't have the writing slow-down. I wonder if there is some bad
> > interaction between the multi-threaded code and the delta cache.
> >
> > Did you run the second, single-thread run against the exact same
> > original repository you had? Or did you re-run it on the result of the
> > multi-thread run? Another explanation is that the original repository
> > had some poor patterns that made objects expensive to access (say, a ton
> > of really deep delta chains). And so the difference between the two runs
> > was not the threads, but just the on-disk repository state.
> >
> > Kind of a long shot, but if that is what happened, try running another
> > multi-threaded "repack -f" and see if its writing phase is faster.
>
> I've run 36-threads, 16-threads and 1-thread in sequence on the same
> repo, so 16-threads was repacking what was repacked by the 36-threads,
> and 1-thread was repacking what was repacked by the 16-threads. I
> assumed it didn't matter, but come to think of it, I guess it can.
I tried:
- fresh clone -> 36-threads
- fresh clone -> 1-thread -> 36-threads
The 36-threads gc in the latter was only marginally faster than in the
former (between 19 and 20 minutes instead of 22 for both "Compressing"
and "Writing").
Mike
prev parent reply other threads:[~2019-07-05 11:51 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-04 10:05 Surprising use of memory and time when repacking mozilla's gecko repository Mike Hommey
2019-07-04 12:04 ` Eric Wong
2019-07-04 13:13 ` Mike Hommey
2019-07-05 5:14 ` Jeff King
2019-07-05 5:47 ` Mike Hommey
2019-07-05 11:29 ` Jakub Narebski
2019-07-05 0:22 ` Mike Hommey
2019-07-05 4:45 ` Mike Hommey
2019-07-05 5:09 ` Jeff King
2019-07-05 5:45 ` Mike Hommey
2019-07-05 11:51 ` Mike Hommey [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190705115144.7jqvc3qzanrpvpxq@glandium.org \
--to=mh@glandium.org \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).