git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, peff@peff.net, jrnieder@gmail.com
Subject: Re: [PATCH 01/20] pack-revindex: introduce a new API
Date: Wed, 13 Jan 2021 11:23:30 -0500	[thread overview]
Message-ID: <X/8egmj9Tno3pvhC@nand.local> (raw)
In-Reply-To: <xmqqa6tdz2fo.fsf@gitster.c.googlers.com>

On Wed, Jan 13, 2021 at 12:06:03AM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > In the next several patches, we will prepare for loading a reverse index
> > either in memory, or from a yet-to-be-introduced on-disk format. To do
>
> Does "load revindex in memory" (as opposed to "from on-disk file")
> mean the good old "read the forward index and make inverse map
> in-core", or something else?

Indeed, that's what it means. I've made that clearer in the patch
message by saying it explicitly.

> IOW, is "We will prepare a reverse index either by computing in
> memory from forward index, or loading from on-disk file" what we
> want to say here?

Yep.

> Without knowing what exactly "pack position", "offset" and "index
> position" refer to, the above three are almost impossible to grok.
> Can we have one paragraph description for each?  Something along the
> lines of...

Yep, and I see later on in the thread that you want to discard this
suggestion since Peff has suggested similar changes in pack-revindex.h,
which I've applied.

> >     Unlike some of the callers that used to access '->offset' and '->nr'
> >     directly, the error checking around this call is somewhat more
> >     robust. This is important since callers can pass an offset which
> >     does not contain an object.
>
> Meaning "offset ought to point at the boundary between objects in
> the pack stream, and the API, unlike the direct access, makes sure
> that is the case"?  That is a good thing.

Indeed, and I've called that out directly in the patch message to
highlight it.

> > diff --git a/pack-revindex.c b/pack-revindex.c
> > index ecdde39cf4..6d86a85208 100644
> > --- a/pack-revindex.c
> > +++ b/pack-revindex.c
> > @@ -203,3 +203,35 @@ struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs)
> >
> >  	return p->revindex + pos;
> >  }
> > +
> > +int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos)
> > +{
> > +	int ret;
> > +
> > +	if (load_pack_revindex(p) < 0)
> > +		return -1;
> > +
> > +	ret = find_revindex_position(p, ofs);
> > +	if (ret < 0)
> > +		return -1;
>
> Why not "return ret"?  We know that find_revindex_position() would
> signal an error by returning -1, but is there a reason why we want
> to prevent it from returning richer errors in the future?

No reason, I've changed it to 'return ret' instead.

> > +	*pos = ret;
>
> The untold assumption is that uint32_t can fit the maximum returned
> value from find_revindex_position() and "signed int" can also big
> enough.  I guess it is OK to be limited to up-to 2 billion objects
> on 32-bit systems.

Indeed, and that is fixed in a later on patch. I'll make sure to call it
out there.

> > +	return 0;
> > +}
> > +
> > +uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos)
> > +{
> > +	if (!p->revindex)
> > +		BUG("pack_pos_to_index: reverse index not yet loaded");
>
> The previous function lazy loaded the revindex, but this one and the
> next one refuses to work without revindex.  Intended?

Yes, and discussed a little bit more here [1]. Obviously that discussion
doesn't do any good to those reading the git log in the future, so I've
summarized the important detail (that some callers are equipped to deal
with errors but others aren't) in the patch.

> > +	if (pos >= p->num_objects)
> > +		BUG("pack_pos_to_index: out-of-bounds object at %"PRIu32, pos);
>
> Personally I find it easier to place items on a single line in an
> ascending order of magnitude, i.e.
>
> 	if (p->num_objects <= pos)
> 		BUG("...");

More readable, thanks.

> The assertion requires pos to be strictly lower than p->num_objects,
> which is in line with how we usually count elements of an array of
> size p->num_objects, but the next one allows pos == p->num_objects;
> intended?
>
> p->revindex[] is an array of two-member struct, so if an element of
> the array is invalid for its .nr member here because pos is exactly
> at p->num_objects, I would imagine it is also invalid for its .offset
> member, too, no?
>
> Ah, perhaps the "offset beyond the end of the pack positions" is a
> sentinel element to give the in-pack-stream size of the object at
> the last pack position?  If that is the case, it deserves a comment,
> I would think.

Exactly. I added a detail about that in the patch, too.

Thanks,
Taylor

[1]: https://lore.kernel.org/git/X%2F3ODgaa9wr65M09@nand.local/

  parent reply	other threads:[~2021-01-13 16:26 UTC|newest]

Thread overview: 121+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-08 18:16 [PATCH 00/20] pack-revindex: prepare for on-disk reverse index Taylor Blau
2021-01-08 18:16 ` [PATCH 01/20] pack-revindex: introduce a new API Taylor Blau
2021-01-12  8:41   ` Jeff King
2021-01-12  9:41     ` Jeff King
2021-01-12 16:27     ` Taylor Blau
2021-01-13  8:06   ` Junio C Hamano
2021-01-13  8:54     ` Junio C Hamano
2021-01-13 13:17     ` Jeff King
2021-01-13 16:23     ` Taylor Blau [this message]
2021-01-08 18:16 ` [PATCH 02/20] write_reuse_object(): convert to new revindex API Taylor Blau
2021-01-12  8:47   ` Jeff King
2021-01-12 16:31     ` Taylor Blau
2021-01-13 13:02       ` Jeff King
2021-01-08 18:16 ` [PATCH 03/20] write_reused_pack_one(): " Taylor Blau
2021-01-12  8:50   ` Jeff King
2021-01-12 16:34     ` Taylor Blau
2021-01-08 18:16 ` [PATCH 04/20] write_reused_pack_verbatim(): " Taylor Blau
2021-01-12  8:50   ` Jeff King
2021-01-08 18:17 ` [PATCH 05/20] check_object(): " Taylor Blau
2021-01-11 11:43   ` Derrick Stolee
2021-01-11 16:15     ` Taylor Blau
2021-01-12  8:54       ` Jeff King
2021-01-12  8:51   ` Jeff King
2021-01-08 18:17 ` [PATCH 06/20] bitmap_position_packfile(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 07/20] show_objects_for_type(): " Taylor Blau
2021-01-12  8:57   ` Jeff King
2021-01-12 16:35     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 08/20] get_size_by_pos(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 09/20] try_partial_reuse(): " Taylor Blau
2021-01-12  9:06   ` Jeff King
2021-01-12 16:47     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 10/20] rebuild_existing_bitmaps(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 11/20] get_delta_base_oid(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 12/20] retry_bad_packed_offset(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 13/20] packed_object_info(): " Taylor Blau
2021-01-12  9:11   ` Jeff King
2021-01-12 16:51     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 14/20] unpack_entry(): " Taylor Blau
2021-01-12  9:22   ` Jeff King
2021-01-12 16:56     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 15/20] for_each_object_in_pack(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 16/20] builtin/gc.c: guess the size of the revindex Taylor Blau
2021-01-11 11:52   ` Derrick Stolee
2021-01-11 16:23     ` Taylor Blau
2021-01-11 17:09       ` Derrick Stolee
2021-01-12  9:28         ` Jeff King
2021-01-08 18:17 ` [PATCH 17/20] pack-revindex: remove unused 'find_pack_revindex()' Taylor Blau
2021-01-08 18:17 ` [PATCH 18/20] pack-revindex: remove unused 'find_revindex_position()' Taylor Blau
2021-01-11 11:57   ` Derrick Stolee
2021-01-11 16:27     ` Taylor Blau
2021-01-11 17:11       ` Derrick Stolee
2021-01-12  9:32   ` Jeff King
2021-01-12 16:59     ` Taylor Blau
2021-01-13 13:05       ` Jeff King
2021-01-08 18:18 ` [PATCH 19/20] pack-revindex: hide the definition of 'revindex_entry' Taylor Blau
2021-01-11 11:57   ` Derrick Stolee
2021-01-12  9:34   ` Jeff King
2021-01-08 18:18 ` [PATCH 20/20] pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()' Taylor Blau
2021-01-12  9:37   ` Jeff King
2021-01-12 17:02     ` Taylor Blau
2021-01-11 12:07 ` [PATCH 00/20] pack-revindex: prepare for on-disk reverse index Derrick Stolee
2021-01-11 16:30   ` Taylor Blau
2021-01-11 17:15     ` Derrick Stolee
2021-01-11 17:29       ` Taylor Blau
2021-01-11 18:40       ` Junio C Hamano
2021-01-12  9:45 ` Jeff King
2021-01-12 17:07   ` Taylor Blau
2021-01-13  0:23 ` Junio C Hamano
2021-01-13  0:52   ` Taylor Blau
2021-01-13  2:15     ` Junio C Hamano
2021-01-13  3:23       ` Taylor Blau
2021-01-13  8:21         ` Junio C Hamano
2021-01-13 13:13           ` Jeff King
2021-01-13 15:34             ` Taylor Blau
2021-01-13 20:06               ` Junio C Hamano
2021-01-13 20:13                 ` Taylor Blau
2021-01-13 20:22                 ` Jeff King
2021-01-13 22:23 ` [PATCH v2 " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 01/20] pack-revindex: introduce a new API Taylor Blau
2021-01-14  6:46     ` Junio C Hamano
2021-01-14 12:00       ` Derrick Stolee
2021-01-14 17:06       ` Taylor Blau
2021-01-14 19:19         ` Jeff King
2021-01-14 20:47           ` Junio C Hamano
2021-01-13 22:23   ` [PATCH v2 02/20] write_reuse_object(): convert to new revindex API Taylor Blau
2021-01-13 22:23   ` [PATCH v2 03/20] write_reused_pack_one(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 04/20] write_reused_pack_verbatim(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 05/20] check_object(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 06/20] bitmap_position_packfile(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 07/20] show_objects_for_type(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 08/20] get_size_by_pos(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 09/20] try_partial_reuse(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 10/20] rebuild_existing_bitmaps(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 11/20] get_delta_base_oid(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 12/20] retry_bad_packed_offset(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 13/20] packed_object_info(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 14/20] unpack_entry(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 15/20] for_each_object_in_pack(): " Taylor Blau
2021-01-14  6:43     ` Junio C Hamano
2021-01-14 17:00       ` Taylor Blau
2021-01-14 19:33       ` Jeff King
2021-01-14 20:11         ` Jeff King
2021-01-14 20:15           ` Taylor Blau
2021-01-15  2:22             ` Junio C Hamano
2021-01-15  2:29               ` Taylor Blau
2021-01-14 20:51         ` Junio C Hamano
2021-01-13 22:24   ` [PATCH v2 16/20] builtin/gc.c: guess the size of the revindex Taylor Blau
2021-01-14  6:33     ` Junio C Hamano
2021-01-14 16:53       ` Taylor Blau
2021-01-14 19:43         ` Jeff King
2021-01-13 22:24   ` [PATCH v2 17/20] pack-revindex: remove unused 'find_pack_revindex()' Taylor Blau
2021-01-13 22:25   ` [PATCH v2 18/20] pack-revindex: remove unused 'find_revindex_position()' Taylor Blau
2021-01-14  6:42     ` Junio C Hamano
2021-01-13 22:25   ` [PATCH v2 19/20] pack-revindex: hide the definition of 'revindex_entry' Taylor Blau
2021-01-13 22:25   ` [PATCH v2 20/20] pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()' Taylor Blau
2021-01-14  6:42     ` Junio C Hamano
2021-01-14 16:56       ` Taylor Blau
2021-01-14 19:51   ` [PATCH v2 00/20] pack-revindex: prepare for on-disk reverse index Jeff King
2021-01-14 20:53     ` Junio C Hamano
2021-01-15  9:32       ` Jeff King
2021-01-15  9:33         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=X/8egmj9Tno3pvhC@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).