From: bdowning@lavos.net (Brian Downing)
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Jakub Narebski <jnareb@gmail.com>,
Brandon Casey <casey@nrlssc.navy.mil>,
Nicolas Pitre <nico@cam.org>, Jan Holesovsky <kendy@suse.cz>,
git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] RFC: git lazy clone proof-of-concept
Date: Thu, 14 Feb 2008 17:57:47 -0600
Message-ID: <20080214235747.GV27535@lavos.net>
In-Reply-To: <20080214235129.GU27535@lavos.net>
On Thu, Feb 14, 2008 at 05:51:29PM -0600, Brian Downing wrote:
> Do you by chance have repack.usedeltabaseoffset turned on? That has the
> unfortunate side effect of changing the output of verify-pack -v to be
> almost useless for my packinfo script (specifically, it no longer
> reports the parent SHA1 hash for deltas, and the script is basically all
> about delta tree statistics.) I suppose that should probably be fixed,
> but I never looked into it.
That being said, in my experience the most useful output for figuring
out where all the space in the pack is going comes from:
    git-verify-pack -v <pack>.idx | packinfo.pl -tree -filenames
That will produce a huge amount of output, which is basically the tree
structure of the delta chains in the file. If things aren't being
deltified together properly, it's usually pretty obvious.
A delta chain in this output looks approximately like this:
# 0 blob 03156f21... 1767 1767 Documentation/git-lost-found.txt @ tags/v1.2.0~142
# 1 blob f52a9d7f... 10 1777 Documentation/git-lost-found.txt @ tags/v1.5.0-rc1~74
# 2 blob a8cc5739... 51 1828 Documentation/git-lost+found.txt @ tags/v0.99.9h^0
# 3 blob 660e90b1... 15 1843 Documentation/git-lost+found.txt @ master~3222^2~2
# 4 blob 0cb8e3bb... 33 1876 Documentation/git-lost+found.txt @ master~3222^2~3
# 2 blob e48607f0... 311 2088 Documentation/git-lost-found.txt @ tags/v1.5.2-rc3~4
# size: count 6 total 2187 min 10 max 1767 mean 364.50 median 51 std_dev 635.85
# path size: count 6 total 11179 min 1767 max 2088 mean 1863.17 median 1843 std_dev 107.26
# The first number after the sha1 is the object size; the second
# number is the path size. The statistics are across all objects in
# the preceding delta tree. They are omitted for trees of one object.
# A path size is the sum of the sizes of the objects along the delta
# chain, from the base object down to the object itself. In other
# words, it's how many bytes need to be read to reassemble the file
# from deltas.
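To make the path size definition concrete, here is a minimal Python sketch that recomputes the path size column from the depth and object size columns of the example chain above. It is not taken from packinfo.pl; the function name path_sizes and the (depth, size) row format are made up for illustration, and it assumes rows arrive in the depth-first order shown in the output.

```python
def path_sizes(rows):
    """Return the path size for each (depth, size) row.

    rows must be in the depth-first order of the tree output, so the
    parent of a row at depth d is the most recent row at depth d - 1.
    (Hypothetical helper, not part of packinfo.pl.)
    """
    stack = []   # stack[d] holds the path size of the latest object at depth d
    result = []
    for depth, size in rows:
        del stack[depth:]                    # forget deeper and sibling entries
        parent_total = stack[-1] if stack else 0
        total = parent_total + size          # bytes read to rebuild this object
        stack.append(total)
        result.append(total)
    return result


# Depth and object size columns copied from the example chain above.
rows = [(0, 1767), (1, 10), (2, 51), (3, 15), (4, 33), (2, 311)]
print(path_sizes(rows))  # [1767, 1777, 1828, 1843, 1876, 2088]
```

Note that the last object sits at depth 2 again, so its path size is built on the depth-1 object (1777 + 311), not on the depth-4 object printed just before it.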
With -filenames this is also quite slow, as it runs git-ls-tree -t -r on
every commit in the repository to assign file names to blobs. If you
don't care about seeing filenames, leave the option out.
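As a rough illustration of why the filename lookup is expensive, the sketch below shows the kind of per-commit bookkeeping involved: parsing the "<mode> <type> <sha> TAB <path>" lines that git-ls-tree -t -r prints and remembering a name for each blob. It is written in Python rather than the script's Perl, the function name name_blobs is hypothetical, and keeping the first name seen for a blob is an assumption, not necessarily what packinfo.pl does.

```python
def name_blobs(ls_tree_output, commit_label, names=None):
    """Map blob sha -> 'path @ commit' from one commit's ls-tree output.

    ls_tree_output is the text printed by `git ls-tree -t -r <commit>`.
    The first name recorded for a blob wins (an assumption made for
    this sketch). Pass the same dict back in for every commit.
    """
    names = {} if names is None else names
    for line in ls_tree_output.splitlines():
        meta, tab, path = line.partition("\t")
        fields = meta.split()
        if not tab or len(fields) != 3:
            continue                      # skip anything that is not an entry
        _mode, obj_type, sha = fields
        if obj_type == "blob":            # -t also lists trees; ignore them
            names.setdefault(sha, f"{path} @ {commit_label}")
    return names


# Placeholder object names, not real hashes from any repository.
sample = (
    "040000 tree " + "b" * 40 + "\tDocumentation\n"
    "100644 blob " + "a" * 40 + "\tDocumentation/example.txt\n"
)
names = name_blobs(sample, "master~1")
names = name_blobs(sample, "master", names)   # later commit, same blob
print(names)
```

Doing this for every commit means one full tree walk per commit, which is where the time goes on a large history.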
-bcd