From: david@lang.hm
To: Joshua Redstone <joshua.redstone@fb.com>
Cc: Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
Joey Hess <joey@kitenet.net>,
"dgma@mohsinc.com" <dgma@mohsinc.com>,
Matt Graham <mdg149@gmail.com>,
Tomas Carnecky <tom@dbservice.com>, Greg Troxel <gdt@ir.bbn.com>,
David Barr <davidbarr@google.com>,
"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Git performance results on a large repository
Date: Mon, 6 Feb 2012 17:28:06 -0800 (PST)
Message-ID: <alpine.DEB.2.02.1202061722310.1107@asgard.lang.hm>
In-Reply-To: <CB55A6A4.40AFD%joshua.redstone@fb.com>
On Mon, 6 Feb 2012, Joshua Redstone wrote:
> David Lang and David Barr, I generated the pack files by doing a repack:
> "git repack -a -d -f --max-pack-size=10g --depth=100 --window=250" after
> generating the repo.
How many pack files does this end up creating?
I think that doing a full repack the way you did will group all revisions
of a given file into a pack.
What I'm suggesting instead is that if you create the packs based on time,
rather than on the space efficiency of the resulting pack files, you may
not have to go through as much data when doing things like a git blame.
What you did was:

    initialize repo
    4M commits
    repack

What I'm saying is:

    initialize repo
    loop
        500K commits
        repack (and set pack to .keep so it doesn't get overwritten)

So you will end up with ~8 sets of pack files, but time-based, so that when
you only need recent information you only look at the most recent pack
file. If you need to go back through all time, the multiple pack files
will be a little more expensive to process.
This has the added advantage that the 8 small repacks should be cheaper
than the one large repack, as each one isn't trying to cover all commits.
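
A minimal sketch of the loop above as POSIX shell, to make the ".keep" step
concrete. It assumes a non-bare repository (packs under .git/objects/pack)
and a git recent enough to support "git -C". The function names
mark_packs_kept and repack_batch are hypothetical, and the repack options
are simply copied from the quoted command. It relies on "git repack -a -d"
leaving packs that have a .keep file alone, so each call should only repack
objects added since the previous batch.

```shell
#!/bin/sh
# Sketch only: the helper names below are made up for illustration.

# Create an empty <name>.keep next to every <name>.pack in a pack
# directory that does not have one yet, so later repacks skip that pack.
mark_packs_kept() {
    pack_dir=$1
    for pack in "$pack_dir"/pack-*.pack; do
        [ -e "$pack" ] || continue          # glob matched nothing
        keep="${pack%.pack}.keep"
        [ -e "$keep" ] || : > "$keep"
    done
}

# Repack everything that is not already in a kept pack, then mark the
# resulting pack(s) as kept. Assumes $1 is a non-bare repository.
repack_batch() {
    repo=$1
    git -C "$repo" repack -a -d -f --max-pack-size=10g --depth=100 --window=250 &&
    mark_packs_kept "$repo/.git/objects/pack"
}
```

The idea would be to call repack_batch on the repository after each batch
of ~500K commits has been generated, instead of once at the end.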
David Lang