git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Dana How" <danahow@gmail.com>
To: "Nicolas Pitre" <nico@cam.org>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
	"Jakub Narebski" <jnareb@gmail.com>,
	git@vger.kernel.org, danahow@gmail.com
Subject: Re: If you would write git from scratch now, what would you change?
Date: Mon, 26 Nov 2007 21:59:15 -0800	[thread overview]
Message-ID: <56b7f5510711262159x2e1fd4fdw8e914cb4a22376a1@mail.gmail.com> (raw)
In-Reply-To: <alpine.LFD.0.99999.0711262346410.9605@xanadu.home>

On Nov 26, 2007 8:58 PM, Nicolas Pitre <nico@cam.org> wrote:
> On Mon, 26 Nov 2007, Shawn O. Pearce wrote:
> > - Loose objects storage is difficult to work with
> >
> >   The standard loose object format of DEFLATE("$type $size\0$data")
> >   makes it harder to work with as you need to inflate at least
> >   part of the object just to see what the hell it is or how big
> >   its final output buffer needs to be.
>
> It is a bit cumbersome indeed, but I'm afraid we're really stuck with it
> since every object SHA1 depends on that format.

Yes,  now I remember: this was the same argument you used to
convince me that losing the "new" (deprecated) loose format was OK.

However,  if we changed
WRITE(DEFLATE(SHA1("$type $size\0$data")))
(where SHA1(x) = x but has the side-effect of updating the SHA-1)
to
WRITE($pack_style_object_header)
SHA1("$type $size\0")
WRITE(DEFLATE(SHA1($data)))
then the SHA-1 result is the same but we get the pack-style header,
and blobs can be sucked straight into packs when not deltified.
The SHA-1 result is still usable at the end to rename the temporary
loose object file
(and put it in the correct xx subdirectory).

Because we can't change the SHA-1 result we unfortunately can
never drop the 2nd call above [this is something that could
have been different, to respond to the email that started this thread].
You didn't like the duplication between the 1st and 2nd call,
but I can't say I see that as a big deal.

> >   It also makes it very hard to stream into a packfile if you have
> >   determined its not worth creating a delta for the object (or no
> >   suitable delta base is available).
> >
> >   The new (now deprecated) loose object format that was based on
> >   the packfile header format simplified this and made it much
> >   easier to work with.
>
> Not really.  Since separate zlib compression levels for loose objects
> and packed objects were introduced, there was a bunch of correctness
> issues.  What do you do when both compression levels are different?
> Sometimes ignore them, sometimes not? Because the default loose object
> compression level is about speed and the default pack compression level
> is about good space reduction, the correct thing to do by default would
> have been to always decompress and recompress anyway when copying an
> otherwise unmodified loose object into a pack.
Not exactly.  I did think about this.  When you are packing to stdout,
and only sending the resulting packfile locally,  you don't want to
bother with recompressing everything.  [This is the "workgroup" case
that concerns me.]  Other cases,  sure,
recompression could help (e.g., packing to a file means the file
will probably be around for a while,  so you want to recompress
if the levels are unequal;  and you probably want to recompress
as well if the packfile will be sent over a "slow" link).

Thanks,
-- 
Dana L. How  danahow@gmail.com  +1 650 804 5991 cell

  reply	other threads:[~2007-11-27  5:59 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-25 21:48 If you would write git from scratch now, what would you change? Jakub Narebski
2007-11-25 22:23 ` Pierre Habouzit
2007-11-26  1:28   ` Steven Walter
2007-11-26  6:11     ` Junio C Hamano
2007-11-26  6:36       ` Adam Roben
2007-11-26 15:32         ` Carlos Rica
2007-11-26 16:40           ` Daniel Barkalow
2007-11-26 16:46 ` Andy Parkins
2007-11-26 17:10   ` Benoit Sigoure
2007-11-26 18:56     ` Jan Hudec
2007-11-26 19:12       ` David Kastrup
2007-11-26 19:34         ` Jan Hudec
2007-11-26 19:50           ` Michael Poole
2007-11-26 20:09             ` Jan Hudec
2007-11-26 20:31               ` Michael Poole
2007-11-26 20:48                 ` Jon Smirl
2007-11-26 20:11     ` Andy Parkins
2007-11-26 19:25   ` Marco Costalba
2007-11-27  1:20     ` Shawn O. Pearce
2007-11-27  1:46       ` Jakub Narebski
2007-11-27  1:58         ` Shawn O. Pearce
2007-11-27 11:39           ` Johannes Schindelin
2007-11-27 23:59             ` [RFC] git-gui USer's Survey 2007 (was: If you would write git from scratch now, what would you change?) Jakub Narebski
2007-11-28 12:32               ` Johannes Schindelin
2007-11-28 15:48                 ` Jason Sewall
2007-11-28 23:25                 ` Jan Hudec
2007-11-28 23:48                   ` Johannes Schindelin
2007-11-29  6:57                     ` Jan Hudec
2007-11-29 12:01                       ` Johannes Schindelin
2007-11-30 17:50                         ` Jan Hudec
2007-11-30 18:25                           ` Marco Costalba
2007-12-01  2:35                           ` Shawn O. Pearce
2007-12-01  2:53                             ` Marco Costalba
2007-11-28 13:18               ` [RFC] git-gui USer's Survey 2007 Sergei Organov
2007-11-27  8:45     ` If you would write git from scratch now, what would you change? Andy Parkins
2007-11-27 13:15       ` Marco Costalba
2007-11-27 23:56         ` Jan Hudec
2007-11-27 17:48       ` Johannes Schindelin
2007-12-04 11:00         ` Andy Parkins
2007-11-27 17:33   ` Jing Xue
2007-11-26 16:48 ` Jon Smirl
2007-11-26 17:11 ` David Kastrup
2007-11-26 19:27   ` Jan Hudec
2007-11-26 20:11     ` Benoit Sigoure
2007-11-26 20:36       ` Jan Hudec
2007-11-26 19:30   ` Nicolas Pitre
2007-11-26 19:34     ` David Kastrup
2007-11-26 19:57       ` Jan Hudec
2007-11-26 20:35         ` David Kastrup
2007-11-26 21:00           ` Jan Hudec
2007-11-26 21:28           ` Nicolas Pitre
2007-11-26 20:45         ` Wincent Colaiuta
2007-11-26 21:24           ` Junio C Hamano
2007-11-26 21:35             ` Nicolas Pitre
2007-11-26 21:47               ` Junio C Hamano
2007-11-26 22:03                 ` Nicolas Pitre
2007-11-27  1:03             ` Shawn O. Pearce
2007-11-27  3:35               ` Junio C Hamano
2007-11-27  5:10                 ` Steven Grimm
2007-11-26 21:27     ` Johannes Schindelin
2007-11-26 21:39       ` Nicolas Pitre
2007-11-26 21:40         ` Johannes Schindelin
2007-11-27 14:11     ` Andreas Ericsson
2007-11-27 14:38       ` Jakub Narebski
2007-11-26 19:18 ` Dana How
2007-11-26 19:52   ` Nicolas Pitre
2007-11-26 20:17     ` Dana How
2007-11-26 20:55       ` Nicolas Pitre
2007-11-26 22:02         ` Dana How
2007-11-26 22:22           ` Nicolas Pitre
2007-11-26 20:17   ` Jakub Narebski
2007-11-26 20:36     ` Dana How
2007-11-27  1:25   ` Shawn O. Pearce
2007-11-27  5:07     ` Nicolas Pitre
2007-11-27  1:48 ` Shawn O. Pearce
2007-11-27  1:54   ` Junio C Hamano
2007-11-27  1:59     ` Shawn O. Pearce
2007-11-27  2:15       ` Jakub Narebski
2007-11-27 11:47         ` C# binding, was " Johannes Schindelin
2007-11-27  4:58   ` Nicolas Pitre
2007-11-27  5:59     ` Dana How [this message]
2007-11-27  6:12       ` Shawn O. Pearce
2007-11-27 16:33     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56b7f5510711262159x2e1fd4fdw8e914cb4a22376a1@mail.gmail.com \
    --to=danahow@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    --cc=nico@cam.org \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).