git@vger.kernel.org mailing list mirror (one of many)
From: Nicolas Pitre <nico@cam.org>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Sam Vilain <sam@vilain.net>, Steven Grimm <koreth@midwinter.com>,
	"Shawn O. Pearce" <spearce@spearce.org>,
	Junio C Hamano <junkio@cox.net>,
	Daniel Barkalow <barkalow@iabervon.org>,
	Theodore Ts'o <tytso@mit.edu>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH] Add --no-reuse-delta option to git-gc
Date: Mon, 11 Jun 2007 10:01:18 -0400 (EDT)
Message-ID: <alpine.LFD.0.99.0706110930170.12885@xanadu.home>
In-Reply-To: <Pine.LNX.4.64.0706111109430.4059@racer.site>

On Mon, 11 Jun 2007, Johannes Schindelin wrote:

> Hi,
> 
> On Sun, 10 Jun 2007, Nicolas Pitre wrote:
> 
> > On Sun, 10 Jun 2007, Sam Vilain wrote:
> > 
> > > Anyway it's a free world so be my guest to implement it, I guess if 
> > > this was selectable it would only be a minor annoyance waiting a bit 
> > > longer pulling from some repositories, and it would be 
> > > interesting to see if it did make a big difference with pack file 
> > > sizes.
> > 
> > It won't happen for a simple reason: to be backward compatible with 
> > older GIT clients.  If you have your repo compressed with bzip2 and an 
> > old client pulls it then the server would have to decompress and 
> > recompress everything with gzip.  If instead your repo remains with gzip 
> > and a new client asks for bzip2 then you have to recompress as well 
> > (slow).  So in practice it is best to remain with a single compression 
> > method.
> 
> With the extension mechanism we have in place, the client can send what 
> kind of compression it supports, and the server can actually refuse to 
> send anything if it does not want to recompress.
> 
> What I am trying to say: you do not necessarily have to allow every client 
> to access that particular repository. I agree that mixed-compression repos 
> are evil, but nothing stands in the way of a flag allowing (or 
> disallowing) recompression in a different format when fetching.

I know.

But is it worthwhile?  I think not.

However I won't stand in the way of anyone who wants to try and provide 
numbers.  I just don't believe this is worthwhile and am not inclined to 
do it.

OK... Well, I just performed a really quick test:

$ mkdir test-bzip2
$ mkdir test-gzip
$ cp git/*.[cho] test-bzip2
$ cp git/*.[cho] test-gzip
$ bzip2 test-bzip2/*
$ gzip test-gzip/*
$ du -s test-bzip2 test-gzip
5016    test-bzip2
4956    test-gzip

So on Git's own source files bzip2 actually comes out slightly larger 
(5016 vs. 4956).  It is true that bzip2 is better with large files, but 
we typically have very few of them in a Git repo, and in the presence of 
large files bzip2 then becomes _much_ slower than gzip.  So, given that 
Git objects are likely to be small in 98% of cases due to deltas, it 
appears that bzip2 won't be a gain at all but rather a waste, making a 
poor case for supporting it forever afterwards.
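
The small-object argument can be checked with a rough sketch like the one 
below.  It is in Python rather than shell, and it uses the zlib and bz2 
modules instead of the gzip and bzip2 command-line tools used above.  The 
helper name compressed_sizes and the synthetic blobs are assumptions made 
purely for illustration; they stand in for small delta-sized objects and 
are not real Git objects, so the totals will not match the du figures.

```python
import bz2
import zlib


def compressed_sizes(blobs):
    """Compress each blob on its own and return (zlib_total, bz2_total) in bytes."""
    zlib_total = sum(len(zlib.compress(blob)) for blob in blobs)
    bz2_total = sum(len(bz2.compress(blob)) for blob in blobs)
    return zlib_total, bz2_total


# Hypothetical stand-ins for small objects: 200 tiny, distinct byte strings.
small_blobs = [b"int x = %d;\n" % i for i in range(200)]

zlib_total, bz2_total = compressed_sizes(small_blobs)
print("zlib total:", zlib_total)
print("bz2 total: ", bz2_total)
```

Each blob is compressed separately because that is how the per-file test 
above works; with inputs this small, the fixed per-stream overhead of each 
format dominates the result.  Swapping in real file contents would give a 
more meaningful comparison.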

> So if you should decide someday to track data with Git (remember: Generic 
> Information Tracker, not just source code),

Bah... if you please.

> that is particularly unfit for 
> compression with gzip, but that you _need_ to store in a different 
> compressed manner, you can set up a repository which will _only_ _ever_ 
> use that compression.

Maybe.  But you'd better have a concrete data set and result numbers to 
convince me.  Designing software for hypothetical situations before they 
actually exist leads to bloatware.


Nicolas

