From: Peter Baumann <Peter.B.Baumann@stud.informatik.uni-erlangen.de>
To: git@vger.kernel.org
Subject: Re: [PATCH 2/3] sha1_file: add the ability to parse objects in "pack file format"
Date: Wed, 12 Jul 2006 08:49:17 +0200
Message-ID: <slrneb96rd.dma.Peter.B.Baumann@xp.machine.xx>
In-Reply-To: Pine.LNX.4.64.0607111656250.5623@g5.osdl.org

On 2006-07-12, Linus Torvalds <torvalds@osdl.org> wrote:
[...]
> Anyway, I think this following patch replaces the old 2/3 and 3/3 (it 
> still depends on the original [1/3] cleanup.
>
> (It also renames and reverses the meaning of the config file option: it's 
> now "[core] LegacyHeaders = true" for using legacy headers.)
>
> Not heavily tested, but seems ok.
>
> sf? Dscho? Can you check this thing out?
>
> 		Linus
> ----
[...]
> diff --git a/sha1_file.c b/sha1_file.c
> index 8734d50..475b23d 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -684,26 +684,74 @@ static void *map_sha1_file_internal(cons
>  	return map;
>  }
>  
> -static int unpack_sha1_header(z_stream *stream, void *map, unsigned long mapsize, void *buffer, unsigned long size)
> +static int unpack_sha1_header(z_stream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz)
>  {
> +	unsigned char c;
> +	unsigned int word, bits;
> +	unsigned long size;
> +	static const char *typename[8] = {
> +		NULL,	/* OBJ_EXT */
> +		"commit", "tree", "blob", "tag",
> +		NULL, NULL, NULL
> +	};
> +	const char *type;
> +
>  	/* Get the data stream */
>  	memset(stream, 0, sizeof(*stream));
>  	stream->next_in = map;
>  	stream->avail_in = mapsize;
>  	stream->next_out = buffer;
> -	stream->avail_out = size;
> +	stream->avail_out = bufsiz;
> +
> +	/*
> +	 * Is it a zlib-compressed buffer? If so, the first byte
> +	 * must be 0x78 (15-bit window size, deflated), and the
> +	 * first 16-bit word is evenly divisible by 31
> +	 */
> +	word = (map[0] << 8) + map[1];
> +	if (map[0] == 0x78 && !(word % 31)) {
> +		inflateInit(stream);
> +		return inflate(stream, 0);
> +	}
> +
> +	c = *map++;
> +	mapsize--;
> +	type = typename[(c >> 4) & 7];
> +	if (!type)
> +		return -1;
> +
> +	bits = 4;
> +	size = c & 0xf;
> +	while (!(c & 0x80)) {
> +		if (bits >= 8*sizeof(long))
> +			return -1;
> +		c = *map++;
> +		size += (c & 0x7f) << bits;
> +		bits += 7;
> +		mapsize--;
> +	}

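This doesn't match the logic in unpack_object_header, which is used for
packs. There the high bit (0x80) means "size continues", so the loop runs
while the bit is set; your loop runs while it is clear: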

static unsigned long unpack_object_header(struct packed_git *p, unsigned long offset,
        enum object_type *type, unsigned long *sizep)
{
	unsigned shift;
	unsigned char *pack, c;
	unsigned long size;

	if (offset >= p->pack_size)
		die("object offset outside of pack file");

	pack =  (unsigned char *) p->pack_base + offset;
	c = *pack++;
	offset++;
	*type = (c >> 4) & 7;
	size = c & 15;
	shift = 4;
	while (c & 0x80) {			<==========
		if (offset >= p->pack_size)
			die("object offset outside of pack file");
		c = *pack++;
		offset++;
		size += (c & 0x7f) << shift;
		shift += 7;
	}
	*sizep = size;				<==========
	return offset;
}
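
To make the convention concrete, here is a rough sketch of the same
decoding done on a plain byte buffer. decode_pack_header is a
hypothetical helper, not something in git; the bounds and overflow
checks are my own additions for the illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not in git): decode a pack-style object header
 * from buf[0..len). Returns the number of header bytes consumed, or 0
 * if the header is truncated or the size would not fit in a long.
 */
static size_t decode_pack_header(const unsigned char *buf, size_t len,
				 unsigned *type, unsigned long *sizep)
{
	size_t used = 0;
	unsigned shift = 4;
	unsigned long size;
	unsigned char c;

	if (!len)
		return 0;
	c = buf[used++];
	*type = (c >> 4) & 7;	/* three type bits */
	size = c & 15;		/* low four size bits */
	while (c & 0x80) {	/* high bit set: another size byte follows */
		if (used >= len || shift >= 8 * sizeof(size))
			return 0;
		c = buf[used++];
		size += (unsigned long)(c & 0x7f) << shift;
		shift += 7;
	}
	*sizep = size;
	return used;
}
```

With this convention the two bytes 0xb8 0x3e decode to type 3 (blob)
and size 1000, consuming both bytes.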

> @@ -1414,6 +1462,49 @@ static int write_buffer(int fd, const vo
>  	return 0;
>  }
>  
> +static int write_binary_header(unsigned char *hdr, enum object_type type, unsigned long len)
> +{
> +	int hdr_len;
> +	unsigned char c;
> +
> +	c = (type << 4) | (len & 15);
> +	len >>= 4;
> +	hdr_len = 1;
> +	while (len) {
> +		*hdr++ = c;
> +		hdr_len++;
> +		c = (len & 0x7f);
> +		len >>= 7;
> +	}
> +	*hdr = c | 0x80;
> +	return hdr_len;
> +}
> +

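Ditto, but in this case see encode_header in pack-objects.c, which sets
0x80 on every byte except the last; yours sets it only on the last byte: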

/*
 * The per-object header is a pretty dense thing, which is
 *  - first byte: low four bits are "size", then three bits of "type",
 *    and the high bit is "size continues".
 *  - each byte afterwards: low seven bits are size continuation,
 *    with the high bit being "size continues"
 */
static int encode_header(enum object_type type, unsigned long size, unsigned char *hdr)
{
        int n = 1;
        unsigned char c;

        if (type < OBJ_COMMIT || type > OBJ_DELTA)
                die("bad type %d", type);

        c = (type << 4) | (size & 15);
        size >>= 4;
        while (size) {
                *hdr++ = c | 0x80;	<=======
                c = size & 0x7f;
                size >>= 7;
                n++;
        }
        *hdr = c;			<=======
        return n;
}
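
To show the difference in actual bytes, here is a small sketch with
both conventions side by side. The function names encode_pack_style and
encode_patch_style are made up for this mail, and the caller is assumed
to pass a buffer of at least 10 bytes (enough for a 64-bit size).

```c
/*
 * Pack convention (as in encode_header): 0x80 is set on every byte
 * except the last one. Hypothetical name; hdr needs room for 10 bytes.
 */
static int encode_pack_style(unsigned type, unsigned long size,
			     unsigned char *hdr)
{
	int n = 1;
	unsigned char c = (unsigned char)((type << 4) | (size & 15));

	size >>= 4;
	while (size) {
		*hdr++ = c | 0x80;	/* more size bytes follow */
		c = size & 0x7f;
		size >>= 7;
		n++;
	}
	*hdr = c;			/* last byte: high bit clear */
	return n;
}

/*
 * Convention in the quoted write_binary_header: 0x80 is set only on
 * the last byte. Hypothetical name; hdr needs room for 10 bytes.
 */
static int encode_patch_style(unsigned type, unsigned long size,
			      unsigned char *hdr)
{
	int n = 1;
	unsigned char c = (unsigned char)((type << 4) | (size & 15));

	size >>= 4;
	while (size) {
		*hdr++ = c;		/* high bit clear on non-final bytes */
		c = size & 0x7f;
		size >>= 7;
		n++;
	}
	*hdr = c | 0x80;		/* last byte: high bit set */
	return n;
}
```

For a blob (type 3) of size 1000 the first gives 0xb8 0x3e and the
second gives 0x38 0xbe, so a reader written for one convention cannot
parse headers written with the other.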



-Peter Baumann
