git@vger.kernel.org mailing list mirror (one of many)
From: "Steven E. Harris" <seh@panix.com>
To: git@vger.kernel.org
Subject: Re: Confused over packfile and index design
Date: Sun, 10 Apr 2011 16:10:46 -0400
Message-ID: <m24o657tq1.fsf@Spindle.sehlabs.com>
In-Reply-To: alpine.LFD.2.00.1104092147520.28032@xanadu.home

Nicolas Pitre <nico@fluxnic.net> writes:

> So the idea is to do that once to construct the pack index and allow
> for random access once the index is available.  Accessing a particular
> object without the pack index would be extremely costly otherwise,
> especially if it is towards the end of the pack.

Thanks for the explanation. It's clear now.
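
As a toy illustration of the point quoted above, here is a minimal sketch of
why an index has to be built by one sequential pass. It assumes a made-up
"pack" that is nothing more than zlib streams concatenated back to back; the
real packfile has per-object headers, deltas, and an index keyed by object
name, none of which is modelled here. The function names are hypothetical.

```python
import zlib


def build_toy_pack(objects):
    """Concatenate zlib streams back to back, with no length framing."""
    return b"".join(zlib.compress(obj) for obj in objects)


def build_toy_index(pack):
    """One sequential pass: inflate each stream to learn where the next starts."""
    offsets = []
    pos = 0
    while pos < len(pack):
        d = zlib.decompressobj()
        d.decompress(pack[pos:])
        if not d.eof:
            raise ValueError("truncated or corrupt stream at offset %d" % pos)
        offsets.append(pos)
        # unused_data holds whatever followed the end of this stream.
        pos = len(pack) - len(d.unused_data)
    return offsets


def read_at(pack, offset):
    """Random access: inflate only the stream that starts at `offset`."""
    return zlib.decompressobj().decompress(pack[offset:])
```

Without the offsets, reaching the last object means inflating every stream
before it; with them, read_at() can jump straight to the right place.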

> The reason for storing only the expanded data size is to have the
> exact buffer size allocated for the inflated data.  The zlib stream
> that follows is encoded to consume only the needed data to produce the
> inflated object.  When the output buffer is all used, the zlib library
> should flag the end of the deflated stream.  If not then there is an
> error in the pack data.

That provides some error checking, then: because the packfile format
has no delimiting or framing, we have to trust zlib's own judgment of
when it has consumed enough input.
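
To make that check concrete, here is a minimal sketch using Python's zlib
module rather than Git's C code. The helper name inflate_exact is
hypothetical. It is handed more input than the stream needs, relies on zlib
to say where the stream ends, and treats any mismatch with the recorded
expanded size as an error. It allows one spare byte of output so that an
oversized stream is detected instead of silently cut off.

```python
import zlib


def inflate_exact(buf, expected_size):
    """Inflate one zlib stream from the start of buf.

    Returns (data, consumed), where consumed is the number of input
    bytes zlib needed. Raises ValueError if the stream does not end
    exactly at expected_size bytes of output.
    """
    d = zlib.decompressobj()
    # Cap the output at one byte more than expected, so a stream that
    # inflates to too much data shows up as a size mismatch.
    out = d.decompress(buf, expected_size + 1)
    if not d.eof:
        raise ValueError("zlib did not report end of stream")
    if len(out) != expected_size:
        raise ValueError("inflated size does not match recorded size")
    # Bytes after the end of the stream are left in unused_data.
    consumed = len(buf) - len(d.unused_data)
    return out, consumed
```

The caller never states how long the deflated stream is; the consumed count
comes back from zlib, which is the trust relationship described above.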

By the way, I looked over the zlib manual¹, and I see that many of the
inflating/decompressing functions require the caller to specify the
number of input bytes available. There is inflateBack() that uses
callback functions to request more data upon underflow. The higher-level
inflate() function also looks like it can be called in a loop, refilling
the input buffer upon underflow. Is Git using one of these two functions
here?
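
For reference, the refill-on-underflow pattern I have in mind looks roughly
like the sketch below. It uses Python's zlib module as a stand-in for calling
inflate() in a loop, and it makes no claim about which function Git actually
uses. The name inflate_from_reader and the 16-byte chunk size are arbitrary
choices for illustration.

```python
import io
import zlib


def inflate_from_reader(reader, chunk_size=16):
    """Feed input to zlib in small chunks until it reports end of stream.

    Returns (data, leftover), where leftover is the part of the last
    chunk that lay beyond the end of the zlib stream.
    """
    d = zlib.decompressobj()
    out = []
    while not d.eof:
        chunk = reader.read(chunk_size)  # refill the input buffer
        if not chunk:
            raise ValueError("input ended before zlib reported end of stream")
        out.append(d.decompress(chunk))
    return b"".join(out), d.unused_data
```

Note that the last refill can read past the end of the stream, so a caller
working on a pack would have to hand the leftover bytes to whatever parses
the next entry.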

[...]

> When in doubt, the code is always the ultimate source of information.

Yes, I need to learn my way around in there to find the call sites
relevant to this discussion.


Footnotes: 
¹ http://www.zlib.net/manual.html

-- 
Steven E. Harris

Thread overview: 7+ messages
2011-04-08 23:58 Confused over packfile and index design Steven E. Harris
2011-04-09  0:20 ` Jeff King
2011-04-09  2:07 ` Shawn Pearce
2011-04-09 14:30   ` Steven E. Harris
2011-04-09 14:45     ` Shawn Pearce
2011-04-10  2:08 ` Nicolas Pitre
2011-04-10 20:10   ` Steven E. Harris [this message]
