From: Michael Haggerty <mhagger@alum.mit.edu>
To: Jeff King <peff@peff.net>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
"Junio C Hamano" <gitster@pobox.com>,
"Stefan Beller" <sbeller@google.com>,
"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Brandon Williams" <bmwill@google.com>,
git@vger.kernel.org
Subject: Re: [PATCH v2 08/21] read_packed_refs(): read references with minimal copying
Date: Thu, 21 Sep 2017 09:34:04 +0200
Message-ID: <b7936c07-a2a2-f17c-b557-2b4916cac3bc@alum.mit.edu>
In-Reply-To: <20170920182732.wy6bojeaonpxb3mc@sigill.intra.peff.net>

On 09/20/2017 08:27 PM, Jeff King wrote:
> On Tue, Sep 19, 2017 at 08:22:16AM +0200, Michael Haggerty wrote:
>
>> Instead of copying data from the `packed-refs` file one line at time
>> and then processing it, process the data in place as much as possible.
>>
>> Also, instead of processing one line per iteration of the main loop,
>> process a reference line plus its corresponding peeled line (if
>> present) together.
>>
>> Note that this change slightly tightens up the parsing of the
>> `parse-ref` file. Previously, the parser would have accepted multiple
>
> s/parse-ref/packed-refs/, I assume

Thanks; will fix.

> The patch itself looks good, though I did notice an interesting tangent.
>
>> +        if (eof - pos < GIT_SHA1_HEXSZ + 2 ||
>> +            parse_oid_hex(p, &oid, &p) ||
>> +            !isspace(*p++))
>> +                die_invalid_line(refs->path, pos, eof - pos);
>
> I wondered why you didn't just check the output of parse_oid_hex(), and
> included the length check (since in the long run we'd like to get rid of
> uses of the static GIT_SHA1_HEXSZ macro). I imagine the answer is that
> this is an mmap'd buffer, and we can't guarantee that parse_oid_hex()
> wouldn't walk off the end of it.

Yes.

> That's fine for now, but I suspect it may become a problem when we move
> to having a second hash function with a different length. You can't just
> say "it must have as many bytes as the longest hash", because of course
> we could have the shorter hash at the end of the buffer. But we also
> can't say "it must have as many bytes as the shortest hash", because if
> the content implies it's a longer hash, we'd read off the end of the
> buffer.
>
> I think in the long run we will need a parse_oid_hex() function that
> takes a ptr/len (or start/end) pair.

Yes, that makes sense.
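
To make the start/end idea concrete, here is a minimal sketch of what a bounded parser might look like. It is only an illustration under stated assumptions: `parse_hex_bounded()` and `hex_digit_value()` are hypothetical names, not existing git functions, and the caller is assumed to pass the raw hash size it expects (`rawsz`) rather than having the parser infer it from the content.

```c
#include <stddef.h>

/* Hypothetical helper (not git API): value of one hex digit, or -1. */
static int hex_digit_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical bounded parser (not git API): decode 2 * rawsz hex
 * digits from the half-open range [start, end) into out[0..rawsz).
 * It never dereferences a pointer at or beyond `end`, so it is safe
 * to call on a buffer that is not NUL-terminated, such as an mmapped
 * file. Returns 0 and sets *next to the first unread byte on success;
 * returns -1 if the range is too short or contains a non-hex byte.
 */
static int parse_hex_bounded(const char *start, const char *end,
			     unsigned char *out, size_t rawsz,
			     const char **next)
{
	size_t i;

	if ((size_t)(end - start) < 2 * rawsz)
		return -1;
	for (i = 0; i < rawsz; i++) {
		int hi = hex_digit_value(start[2 * i]);
		int lo = hex_digit_value(start[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i] = (unsigned char)((hi << 4) | lo);
	}
	*next = start + 2 * rawsz;
	return 0;
}
```

With something of this shape, the length check moves inside the parser and no longer depends on a fixed macro. The caller would still have to confirm that `*next < end` before looking at the separator byte that follows the hash.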

> [...]

Michael