From: Thomas Gummerer <t.gummerer@gmail.com>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: git@vger.kernel.org, trast@student.ethz.ch, gitster@pobox.com,
peff@peff.net, spearce@spearce.org, davidbarr@google.com
Subject: Re: Index format v5
Date: Mon, 21 May 2012 22:30:18 +0200 [thread overview]
Message-ID: <20120521203018.GA57389@tgummerer.unibz.it> (raw)
In-Reply-To: <4FB73268.4020204@alum.mit.edu>
On 05/19, Michael Haggerty wrote:
> I think it is possible to remove the last magic number and also to
> make the CRC handling easier. I have pushed some suggested changes
> to github [1]:
>
> 1. With the current code, trying to read a file that is less than 24
> bytes long would result in a struct.error (because it would try to
> unpack a string that is shorter than the struct) whereas the
> underlying error in this case should almost always be reported as a
> signature error. So it is correct to read the signature separately
> from the rest of the header, but it is even more correct to check
> the signature (including its length) before reading on.
I looked at the current git code, and the filesize is used for
checking if the index size is big enough. I'll implement it that way
in the prototype for now.
> 2. I introduce a class CRC to hold checksums, so that (a) the code
> for handling checksums can be encapsulated, and (b) an instance of
> this class can be passed into functions and mutated in-place, which
> is less cumbersome than requiring functions to return (data, crc)
> tuples. This, in turn, makes possible...
>
> 3. ...a new function read_struct(f, s, crc), which reads the data
> for a struct.Struct from f, checksums it, and returns the unpacked
> data. This function is more convenient to use than the old
> read_calc_crc().
>
> 4. The checksum instance can also be made responsible for verifying
> that the next four bytes in the file agree with the expected
> checksum. This removes some more code duplication. (See
> CRC.matches().)
>
> 5. You read the extension offsets using CRC_STRUCT.size, which is
> technically correct but misleading. In fact, the extension offsets
> should be documented using their own EXTENSION_INDEX_STRUCT. Also,
> it makes more sense to store the unpacked integer offsets to
> extoffsets rather than the raw 4-byte strings.
>
> 6. With a couple of more minor changes it is possible to replace the
> magic number "24" in read_index_entries(). With this change the
> computation is documented very explicitly and is also (somewhat)
> robust against future changes in the format.
>
> Look over my changes and take whatever you want.
Thanks, I didn't merge them directly, but I took the basics out and
merged them with my code.
--
Thomas
next prev parent reply other threads:[~2012-05-21 20:31 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-03 17:25 Index format v5 Thomas Gummerer
2012-05-03 18:16 ` Thomas Rast
2012-05-03 19:03 ` Junio C Hamano
2012-05-04 7:12 ` Michael Haggerty
2012-05-07 22:18 ` Robin Rosenberg
2012-05-03 18:21 ` Ronan Keryell
2012-05-03 20:36 ` Thomas Gummerer
2012-05-03 18:54 ` Junio C Hamano
2012-05-03 19:11 ` Thomas Rast
2012-05-03 19:31 ` Thomas Rast
2012-05-03 19:32 ` Thomas Rast
2012-05-03 20:32 ` Junio C Hamano
2012-05-03 21:38 ` Thomas Gummerer
2012-05-07 18:57 ` Robin Rosenberg
2012-05-03 19:38 ` solo-git
2012-05-04 13:20 ` Nguyen Thai Ngoc Duy
2012-05-04 15:44 ` Thomas Gummerer
2012-05-04 13:25 ` Philip Oakley
2012-05-04 15:46 ` Junio C Hamano
2012-05-06 10:23 ` Nguyen Thai Ngoc Duy
2012-05-07 13:44 ` Thomas Gummerer
2012-05-06 16:49 ` Phil Hord
2012-05-07 13:08 ` Thomas Gummerer
2012-05-07 15:15 ` Michael Haggerty
2012-05-08 14:11 ` Thomas Gummerer
2012-05-08 14:25 ` Nguyen Thai Ngoc Duy
2012-05-08 14:34 ` Nguyen Thai Ngoc Duy
2012-05-10 6:53 ` Thomas Gummerer
2012-05-10 11:06 ` Nguyen Thai Ngoc Duy
2012-05-09 8:37 ` Michael Haggerty
2012-05-10 12:19 ` Thomas Gummerer
2012-05-10 18:17 ` Michael Haggerty
2012-05-11 17:12 ` Thomas Gummerer
2012-05-13 19:50 ` Michael Haggerty
2012-05-14 15:01 ` Thomas Gummerer
2012-05-14 21:08 ` Michael Haggerty
2012-05-14 22:10 ` Thomas Rast
2012-05-15 6:43 ` Michael Haggerty
2012-05-15 13:49 ` Thomas Gummerer
2012-05-15 15:02 ` Michael Haggerty
2012-05-18 15:38 ` Thomas Gummerer
2012-05-19 13:00 ` Michael Haggerty
2012-05-21 7:45 ` Thomas Gummerer
2012-05-16 5:01 ` Michael Haggerty
2012-05-16 21:54 ` Thomas Gummerer
2012-05-19 5:40 ` Michael Haggerty
2012-05-21 20:30 ` Thomas Gummerer [this message]
2012-05-13 21:01 ` Philip Oakley
2012-05-14 14:54 ` Thomas Gummerer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120521203018.GA57389@tgummerer.unibz.it \
--to=t.gummerer@gmail.com \
--cc=davidbarr@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=mhagger@alum.mit.edu \
--cc=peff@peff.net \
--cc=spearce@spearce.org \
--cc=trast@student.ethz.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).