git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Linus Torvalds <torvalds@osdl.org>
To: sf <sf-gmane@stephan-feder.de>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/3] sha1_file: add the ability to parse objects in "pack file format"
Date: Tue, 11 Jul 2006 15:09:07 -0700 (PDT)	[thread overview]
Message-ID: <Pine.LNX.4.64.0607111449190.5623@g5.osdl.org> (raw)
In-Reply-To: <44B4172B.3070503@stephan-feder.de>



On Tue, 11 Jul 2006, sf wrote:
> 
> And in the traditional format type and length are compressed whereas in
> the pack-file format they are not.

Ahh. Yes.

> > This should probably be applied to the main tree asap if we think
> > this is at all a worthwhile exercise. But somebody should verify that I 
> > got the format right first!
> 
> Sorry but see above.

Good catch, thanks indeed.

Doing that for unpacked objects would in fact make a lot of things much 
simpler, so it would be good to do. The _bad_ part is that this also makes 
it a lot harder to see the difference between a "binary header" and a 
"compressed ASCII header". The two are not "obviously different" any more.

The common byte sequence for a compressed stream is

	78 9c ...

where the first byte is the CMF byte (compression info and method).

But it's not the only possible such sequence according to the zlib format.

(The 16-bit hex number in MSB format, ie 0x789c above, is defined to have 
a built-in checksum, so that it must be a multiple of 31 according to the 
standard: 0x789c = 996 * 31).

So if we have a uncompressed header, we'd need to add a separate 2-byte 
fingerprint to it _before_ the real header that isn't divisible by 31, and 
use that as the thing to test.

Ho humm. I'll see what I can come up with.

		Linus

  reply	other threads:[~2006-07-11 22:09 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-07-10 23:01 Revisiting large binary files issue Carl Baldwin
2006-07-10 23:14 ` Junio C Hamano
2006-07-11  6:20   ` Peter Baumann
2006-07-10 23:28 ` Linus Torvalds
2006-07-11  9:40   ` [RFC]: Pack-file object format for individual objects (Was: Revisiting large binary files issue.) sf
2006-07-11 18:00     ` Linus Torvalds
2006-07-11 21:45       ` sf
2006-07-11 22:17         ` Linus Torvalds
2006-07-11 22:26           ` Linus Torvalds
2006-07-11 14:55   ` Revisiting large binary files issue Carl Baldwin
2006-07-11 17:09     ` Linus Torvalds
2006-07-11 17:10       ` [PATCH 1/3] Make the unpacked object header functions static to sha1_file.c Linus Torvalds
2006-07-11 17:12       ` [PATCH 2/3] sha1_file: add the ability to parse objects in "pack file format" Linus Torvalds
2006-07-11 18:40         ` Johannes Schindelin
2006-07-11 18:58           ` Linus Torvalds
2006-07-11 19:20             ` Johannes Schindelin
2006-07-11 19:48               ` Linus Torvalds
2006-07-11 21:25                 ` Johannes Schindelin
2006-07-11 21:47                 ` Junio C Hamano
2006-07-11 21:24         ` sf
2006-07-11 22:09           ` Linus Torvalds [this message]
2006-07-11 22:25             ` sf
2006-07-11 23:03             ` Junio C Hamano
2006-07-12  0:03               ` Linus Torvalds
2006-07-12  0:39                 ` Johannes Schindelin
2006-07-12  3:45                   ` Linus Torvalds
2006-07-12  4:31                     ` Linus Torvalds
2006-07-12  6:35                     ` Junio C Hamano
2006-07-12 16:29                       ` Linus Torvalds
2006-07-12  0:46                 ` Junio C Hamano
2006-07-12  3:42                   ` Linus Torvalds
2006-07-12  6:49                 ` Peter Baumann
2006-07-12  7:16                   ` Junio C Hamano
2006-07-12  8:28                     ` Peter Baumann
2006-07-12 15:13                   ` Linus Torvalds
2006-07-12 15:27                     ` Junio C Hamano
2006-07-11 17:16       ` [PATCH 3/3] Enable the new binary header format for unpacked objects Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0607111449190.5623@g5.osdl.org \
    --to=torvalds@osdl.org \
    --cc=git@vger.kernel.org \
    --cc=sf-gmane@stephan-feder.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).