git@vger.kernel.org mailing list mirror (one of many)
From: Chris Mason <mason@suse.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Nicolas Pitre <nico@cam.org>, Alon Ziv <alonz@nolaviz.org>,
	git@vger.kernel.org
Subject: Re: [PATCH] add the ability to create and retrieve delta objects
Date: Tue, 3 May 2005 12:09:51 -0400
Message-ID: <200505031209.52460.mason@suse.com>
In-Reply-To: <Pine.LNX.4.58.0505030804170.3594@ppc970.osdl.org>

On Tuesday 03 May 2005 11:07, Linus Torvalds wrote:
> On Tue, 3 May 2005, Chris Mason wrote:
> > On the full import of all the bk->cvs changesets, the average file size
> > in .git is 4074 bytes.  73% of the files are 4096 bytes or smaller.
>
> Have you checked how many of those are blobs?
>
I've got cg-admin-lsobj running (effectively find .git -type f | xargs cat-file).
It is taking a looong time, but the ratios seem to stay pretty constant as it
makes progress:

total:  186863
blob:    93688  (6.6 per commit)
commit:  14172
tree:    79003  (5.5 per commit)
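
As a quick cross-check of the per-commit ratios, here is a minimal sketch that recomputes them from the counts in the listing. The helper name per_commit is made up for illustration and is not part of cg-admin-lsobj or any git tool.

```python
# Object counts copied from the listing above.
counts = {"blob": 93688, "commit": 14172, "tree": 79003}


def per_commit(kind: str) -> float:
    """Average number of objects of the given kind per commit (hypothetical helper)."""
    return counts[kind] / counts["commit"]


total = sum(counts.values())
print(f"total: {total}")
for kind in ("blob", "tree"):
    print(f"{kind}: {per_commit(kind):.2f} per commit")
```

The three counts add up to the reported total, and the ratios come out at roughly 6.61 blobs and 5.57 trees per commit, so the one-decimal figures in the listing look truncated rather than rounded.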

> For many commits, we generate as many (or more) _tree_ objects as we
> generate blobs.
>
> And tree objects from the same "supertree" really is something that I
> wouldn't mind packing some way, because they really tend to be very much
> related (since they refer to each other). Eg the commit and the top-level
> tree are almost always a pair, since you'd get a shared top-level tree
> only with two commits that have the exact same content (which definitely
> happens, don't get me wrong, but if we get some duplication for that case,
> we'd still be winning).
>

The packed item patch wouldn't duplicate info in this case.  When it initially 
creates the packed buffer (before compression), it checks for an existing 
file with the same sha1 and returns if one is found.  This is to preserve the 
optimizations for the write_tree case, where it frequently tries to create files
that already exist.
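
The check described above can be sketched roughly as follows. This is not the patch itself: the function names, the two-character fan-out directory layout, and hashing the raw uncompressed buffer are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile
import zlib
from typing import Tuple


def object_path(objdir: str, sha1_hex: str) -> str:
    # Assumed layout: first two hex digits name the directory, the rest the file.
    return os.path.join(objdir, sha1_hex[:2], sha1_hex[2:])


def write_object(objdir: str, buf: bytes) -> Tuple[str, bool]:
    """Store buf under its SHA-1 and return (sha1_hex, written).

    The existence check runs before compression, so an object that is
    already on disk costs one hash and one stat, nothing more.
    """
    sha1_hex = hashlib.sha1(buf).hexdigest()
    path = object_path(objdir, sha1_hex)
    if os.path.exists(path):
        return sha1_hex, False  # already stored: skip compression and the write
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(buf))
    return sha1_hex, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(write_object(d, b"tree contents"))  # first call writes the file
        print(write_object(d, b"tree contents"))  # second call returns early
```

Writing the same buffer twice yields the same sha1 both times, and only the first call touches the disk, which is the behaviour that keeps repeated write_tree runs cheap.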

-chris

