git@vger.kernel.org mailing list mirror (one of many)
From: "Shawn O. Pearce" <spearce@spearce.org>
To: Jon Smirl <jonsmirl@gmail.com>
Cc: Martin Koegler <mkoegler@auto.tuwien.ac.at>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: performance on repack
Date: Tue, 14 Aug 2007 01:13:21 -0400
Message-ID: <20070814051321.GG27913@spearce.org>
In-Reply-To: <9e4733910708132110u6cdf5e6bg10417317c70b82f1@mail.gmail.com>

Jon Smirl <jonsmirl@gmail.com> wrote:
> On 8/13/07, Shawn O. Pearce <spearce@spearce.org> wrote:
> > > On 32b there's windowing code for accessing the packfile since we can
> > > run out of address space, does this code get turned off for 64b?
> >
> > The windowing code you are talking about defaults as follows:
> >
> >   Parameter                  32b      64b
> >   -----------------------------------------
> >   core.packedGitWindowSize    32M     1G
> >   core.packedGitLimit        256M     8G
> >
> > So I doubt you are having issues with the windowing code on a 64b
> > system, unless your repository is just *huge*.  I did not think that
> > anyone had a Git repository that exceeded 8G, though the window
> > size of 1G might be a tad too small if there are many packfiles
> > and they are each larger than 1G.
> 
> Why use windows on 64b? Default core.packedGitWindowSize equal to
> core.packedGitLimit

That's *not* a good idea when you have more than one packfile.

The limit is for the sum of all packfiles.  The settings above allow
up to 8 packfiles to be opened and mapped at once on 64b systems,
with each packfile being up to 1G in size before we start shifting
the window(s) around.  Doing as you suggest would reduce the number
of open packfiles to 1, which would severely hurt performance
when there is more than one packfile and Git keeps bouncing around
between them to satisfy the current process' demands.
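To make "shifting the window" concrete, here is a minimal sketch of reading a file through a single movable mmap window. It is an illustration only, not Git's code: Git's windowing is written in C and handles several windows and packfiles at once. The class name `WindowedFile` and its `remaps` counter are hypothetical names made up for this example.

```python
import mmap
import os


class WindowedFile:
    """Hypothetical helper: read a file through one movable mmap window."""

    def __init__(self, path, window_size):
        gran = mmap.ALLOCATIONGRANULARITY
        # Round the window size up to a multiple of the allocation
        # granularity so every window start is a valid mmap offset.
        self.window_size = max(gran, (window_size + gran - 1) // gran * gran)
        self.fd = os.open(path, os.O_RDONLY)
        self.file_size = os.fstat(self.fd).st_size
        self.window = None
        self.window_start = 0
        self.remaps = 0  # how many times the window was (re)mapped

    def _map(self, offset):
        # Drop the old window and map the aligned window containing offset.
        start = offset - offset % self.window_size
        length = min(self.window_size, self.file_size - start)
        if self.window is not None:
            self.window.close()
        self.window = mmap.mmap(self.fd, length,
                                access=mmap.ACCESS_READ, offset=start)
        self.window_start = start
        self.remaps += 1

    def read_byte(self, offset):
        if not 0 <= offset < self.file_size:
            raise IndexError(offset)
        in_window = (self.window is not None and
                     self.window_start <= offset
                     < self.window_start + len(self.window))
        if not in_window:
            self._map(offset)  # this is the "window shift"
        return self.window[offset - self.window_start]

    def close(self):
        if self.window is not None:
            self.window.close()
            self.window = None
        os.close(self.fd)
```

Reads that stay inside the current window cost nothing extra; only a read outside it triggers a new mmap() call. With a single window, alternating between two distant regions remaps on every access, which is the bouncing problem described above.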

One could probably argue that the defaults for 64b are too small;
perhaps they should be closer to 4G/24G seeing as how the 64b address
space is so huge that we're unlikely to run into issues with being
able to use >24G of virtual address at once.
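The arithmetic behind these numbers is just limit divided by window size. A small sketch, assuming "M" and "G" mean binary units (MiB/GiB) and that every mapped window is full size; the function name `max_full_windows` is made up for this example:

```python
MiB = 1024 ** 2
GiB = 1024 ** 3


def max_full_windows(window_size, limit):
    """How many full-size windows fit under the mapping limit."""
    return limit // window_size


# (core.packedGitWindowSize, core.packedGitLimit) pairs from this thread.
settings = {
    "32b defaults": (32 * MiB, 256 * MiB),
    "64b defaults": (1 * GiB, 8 * GiB),
    "window == limit on 64b": (8 * GiB, 8 * GiB),
    "suggested 4G/24G": (4 * GiB, 24 * GiB),
}

for name, (window, limit) in settings.items():
    print(f"{name}: {max_full_windows(window, limit)} full-size windows")
```

Both current defaults come out to 8 windows, setting the window equal to the limit leaves room for 1, and 4G/24G would allow 6.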

> I haven't measured it but I suspect the OS calls for moving the
> windows are quite slow on a relative basis since they have to
> rewrite a bunch of page tables.

Maybe.  Add a call to pack_report() at the end of the program you
are interested in and run it.  We keep track of how often we move
windows around; you may find that we don't move them often enough
(or at all) to cause problems here.  Or just run it under strace
and watch mmap() activity, filtering out the uninteresting bits.
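For the strace route, a rough sketch of the filtering step. It assumes a log captured with something like `strace -f -e trace=mmap,munmap -o trace.log <command>` and the usual one-call-per-line output; the function name `count_file_mappings` is made up here. It only drops anonymous mappings (heap and stack allocations), so shared-library mappings still show up in the count and need to be discounted by eye.

```python
import re
from collections import Counter

# Matches e.g. "1234 mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f..."
# mmap2 is what 32-bit Linux systems typically report.
CALL_RE = re.compile(r"\b(mmap2?|munmap)\(([^)]*)\)")


def count_file_mappings(lines):
    """Count mmap/munmap calls in strace output, skipping anonymous maps."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.search(line)
        if match is None:
            continue  # some other syscall or a signal line
        name, args = match.group(1), match.group(2)
        if name.startswith("mmap") and "MAP_ANONYMOUS" in args:
            continue  # malloc()/stack style mapping, not a file window
        counts[name] += 1
    return counts


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as log:
        for call, n in sorted(count_file_mappings(log).items()):
            print(f"{call}: {n}")
```

If the file-backed mmap count stays small over a long run, window movement is unlikely to be where the time goes.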

> Why is the window so small on 32b? I
> thought we were up to about a 1GB packfile before running out of
> address space with Mozilla. Shouldn't the window simply be set as
> large as possible on 32b, this size being a function of the available
> address space, not the amount of physical memory?

Because programs need to malloc() stuff to work.  And we need
stack space.  And we need to let the runtime linker mmap() in the
shared libraries we are linked to.  All in all we do get tight in
some 32b cases.  The above defaults for 32b were chosen based on
the Linux kernel repository (it's under 256M) and based on some
(crude) performance testing on Linux (which seemed to say the
32M packedGitWindowSize wasn't really hurting us).  They were
basically set to give us maximum address space for working heap
and yet not have a negative impact on one of our (at the time)
largest user groups.

In particular repack (aka git-pack-objects) is a real memory pig,
especially now with its various caches.  The more address space we
can let it use in a 32b case the better off we probably are.

If someone can show that increasing these 32b defaults is the
right thing to do even in very large repositories, *especially*
with something really brutal like `git-blame` on a very busy file or
`git repack -f -a` then please submit a patch to boost them.  ;-)

-- 
Shawn.
