git@vger.kernel.org mailing list mirror (one of many)
From: david@lang.hm
To: Joshua Redstone <joshua.redstone@fb.com>
Cc: Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
	Joey Hess <joey@kitenet.net>,
	"dgma@mohsinc.com" <dgma@mohsinc.com>,
	Matt Graham <mdg149@gmail.com>,
	Tomas Carnecky <tom@dbservice.com>, Greg Troxel <gdt@ir.bbn.com>,
	David Barr <davidbarr@google.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Git performance results on a large repository
Date: Mon, 6 Feb 2012 17:28:06 -0800 (PST)
Message-ID: <alpine.DEB.2.02.1202061722310.1107@asgard.lang.hm>
In-Reply-To: <CB55A6A4.40AFD%joshua.redstone@fb.com>

On Mon, 6 Feb 2012, Joshua Redstone wrote:

> David Lang and David Barr, I generated the pack files by doing a repack:
> "git repack -a -d -f --max-pack-size=10g --depth=100 --window=250"  after
> generating the repo.

how many pack files does this end up creating?

I think that doing a full repack the way you did will group all revisions 
of a given file into a pack.

what I'm saying is that if you create the packs based on time, 
rather than space efficiency of the resulting pack files, you may end up 
not having to go through as much data when doing things like a git blame.

what you did was

initialize repo
4M commits
repack

what I'm saying is

initialize repo
loop
    500K commits
    repack (and set pack to .keep so it doesn't get overwritten)

so you will end up with ~8 sets of pack files (4M commits / 500K per 
batch), but time based, so that when you only need recent information you 
only look at the most recent pack file. If you need to go back through all 
time, the multiple pack files will be a little more expensive to process.

this has the added advantage that the 8 small repacks should be cheaper 
than the one large repack as it isn't trying to cover all commits each 
time.
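
a rough sketch of the loop body, to make the idea concrete. This is 
untested against your repo. The function names keep_new_packs and 
repack_batch are made up for illustration, and the repack flags are 
simply the ones from your command. It relies on git repack leaving 
alone any pack that has a matching .keep file.

```shell
#!/bin/sh
# Sketch of time-based packing: repack after each batch of commits, then
# mark the resulting pack(s) with .keep so later repacks leave them alone.
# keep_new_packs and repack_batch are hypothetical helper names.

# Create a .keep marker next to every pack in directory $1 that lacks one.
keep_new_packs() {
    pack_dir=$1
    for pack in "$pack_dir"/pack-*.pack; do
        [ -e "$pack" ] || continue      # glob matched nothing
        keep="${pack%.pack}.keep"
        [ -e "$keep" ] || : > "$keep"   # leave existing markers untouched
    done
}

# Repack everything that is not already in a kept pack, then freeze the
# result. Run this inside the repository after each batch of commits.
repack_batch() {
    git repack -a -d -f --max-pack-size=10g --depth=100 --window=250 &&
    keep_new_packs "$(git rev-parse --git-dir)/objects/pack"
}
```

you would call repack_batch once after each 500K commits. Because the 
earlier packs are marked .keep, each call only has to deal with the 
objects added since the previous batch.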

David Lang
