From: "Shawn O. Pearce" <spearce@spearce.org>
To: Jon Smirl <jonsmirl@gmail.com>
Cc: Martin Koegler <mkoegler@auto.tuwien.ac.at>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: performance on repack
Date: Mon, 13 Aug 2007 23:12:36 -0400
Message-ID: <20070814031236.GC27913@spearce.org>
In-Reply-To: <9e4733910708120649g5a5e0f48pa71bd983f2bc2945@mail.gmail.com>

Jon Smirl <jonsmirl@gmail.com> wrote:
> On 8/12/07, Martin Koegler <mkoegler@auto.tuwien.ac.at> wrote:
> > Have you considered the impact on memory usage, if there are large
> > blobs in the repository?
> 
> The process size maxed at 650MB. I'm in 64b mode so there is no
> virtual memory limit.
> 
> On 32b there's windowing code for accessing the packfile since we can
> run out of address space, does this code get turned off for 64b?

The windowing code you are talking about defaults as follows:

  Parameter                  32b      64b
  -----------------------------------------
  core.packedGitWindowSize    32M     1G
  core.packedGitLimit        256M     8G

So I doubt you are having issues with the windowing code on a 64b
system, unless your repository is just *huge*.  I did not think that
anyone had a Git repository that exceeded 8G, though the window
size of 1G might be a tad too small if there are many packfiles
and they are each larger than 1G.
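
As a rough check on the defaults in the table, the arithmetic can be
sketched as below.  This is only an illustration: it assumes "M" and "G"
mean MiB and GiB, and that core.packedGitLimit caps the total bytes mapped
from packfiles at any one time.  The helper names are hypothetical and are
not part of Git.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3

# Defaults as listed in the table above (units assumed to be MiB / GiB).
DEFAULTS = {
    "32b": {"window": 32 * MiB, "limit": 256 * MiB},
    "64b": {"window": 1 * GiB, "limit": 8 * GiB},
}


def max_windows(window: int, limit: int) -> int:
    """Number of full-size windows that fit under the mapping limit."""
    return limit // window


def fits_under_limit(pack_sizes: list[int], limit: int) -> bool:
    """True if every packfile could be mapped in full without exceeding the limit."""
    return sum(pack_sizes) <= limit


if __name__ == "__main__":
    for arch, cfg in DEFAULTS.items():
        print(arch, "full-size windows:", max_windows(cfg["window"], cfg["limit"]))
```

Under these assumptions both columns allow the same number of full-size
windows; only a repository whose packfiles total more than the limit would
force windows to be recycled.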
 
> > * On the other hand, we could run all try_delta operations for one object
> >   parallel. This way, we would need not very much more memory, but
> >   require more synchronization (and more complex code).
> 
> This solution was my first thought too. Use the main thread to get
> everything needed for the object into RAM, then multi-thread the
> compute bound, in-memory delta search operation. Shared CPU caches
> might make this very fast.

I have been thinking about doing this, especially now that the
default window size is much larger.  I think the default is up as
high as 50, which means we'd keep that shiny new UltraSPARC T2 busy.
Not that I have one...  so anyone from Sun is welcome to send me
one if they want.  ;-)

I'm not sure it's that complex to run all try_delta calls of the
current window in parallel.  Might be a simple enough change that
it's actually worth the extra complexity, especially with these
multi-core systems being so readily available.  Repacking is the
most CPU intensive operation Git performs, and the one that is also
the easiest to make parallel.
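
The shape of that idea can be sketched as follows.  This is a minimal
illustration, not Git's implementation: the try_delta here is a
hypothetical stand-in that estimates delta cost by compressing the target
with the candidate base as a zlib preset dictionary, and best_base,
pseudo_random_bytes and the worker count are assumptions made for the
example.  The point is only the structure: the caller has the target and
every base of the window in memory, and each comparison is independent, so
they can be handed to a pool of workers.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def try_delta(target: bytes, base: bytes) -> int:
    """Hypothetical stand-in for a delta attempt: estimated cost in bytes
    of encoding `target` against `base` (smaller is better)."""
    comp = zlib.compressobj(zdict=base)
    return len(comp.compress(target) + comp.flush())


def best_base(target: bytes, window: list[bytes], workers: int = 4):
    """Run try_delta against every base in the window in parallel.
    Returns (index, cost) of the cheapest base, or None for an empty window."""
    if not window:
        return None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each call only reads shared in-memory data, so no locking is needed.
        costs = list(pool.map(lambda base: try_delta(target, base), window))
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


def pseudo_random_bytes(seed: int, n: int) -> bytes:
    """Deterministic, poorly compressible test data (simple LCG)."""
    out = bytearray()
    state = seed
    for _ in range(n):
        state = (state * 1103515245 + 12345) % (2 ** 31)
        out.append((state >> 16) & 0xFF)
    return bytes(out)


if __name__ == "__main__":
    target = pseudo_random_bytes(1, 2000)
    window = [
        pseudo_random_bytes(2, 2000),   # unrelated object
        target[:1900] + b"x" * 100,     # near-copy of the target
        pseudo_random_bytes(3, 2000),   # unrelated object
    ]
    print(best_base(target, window))
```

Only the final selection of the cheapest base needs the results gathered
in one place, which is the extra synchronization the quoted text refers to.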

Maybe someone else will beat me to it, but if not I might give such
a patch a shot in a few weeks.

-- 
Shawn.
