From: Patrick Steinhardt <ps@pks.im>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, peff@peff.net, dstolee@microsoft.com
Subject: Re: [PATCH v2 2/8] repack: fix trying to use preferred pack in alternates
Date: Thu, 13 Apr 2023 11:31:43 +0200 [thread overview]
Message-ID: <ZDfL_6G-ySm94Vla@ncase> (raw)
In-Reply-To: <ZDb6gZycwTdsaB6o@nand.local>
[-- Attachment #1: Type: text/plain, Size: 5395 bytes --]
On Wed, Apr 12, 2023 at 02:37:53PM -0400, Taylor Blau wrote:
> On Wed, Apr 12, 2023 at 12:22:35PM +0200, Patrick Steinhardt wrote:
> > When doing a geometric repack with multi-pack-indices, then we ask
> > git-multi-pack-index(1) to use the largest packfile as the preferred
> > pack. It can happen though that the largest packfile is not part of the
> > main object database, but instead part of an alternate object database.
> > The result is that git-multi-pack-index(1) will not be able to find the
> > preferred pack and print a warning. It then falls back to use the first
> > packfile that the multi-pack-index shall reference.
> >
> > Fix this bug by only considering packfiles as preferred pack that are
> > local. This is the right thing to do given that a multi-pack-index
> > should never reference packfiles borrowed from an alternate.
> >
> > While at it, rename the function `get_largest_active_packfile()` to
> > `get_preferred_pack()` to better document its intent.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
>
> > @@ -464,7 +466,16 @@ static struct packed_git *get_largest_active_pack(struct pack_geometry *geometry
> > }
> > if (geometry->split == geometry->pack_nr)
> > return NULL;
> > - return geometry->pack[geometry->pack_nr - 1];
> > +
> > + for (i = geometry->pack_nr; i > 0; i--)
> > + /*
> > + * A pack that is not local would never be included in a
> > + * multi-pack index. We thus skip over any non-local packs.
> > + */
> > + if (geometry->pack[i - 1]->pack_local)
> > + return geometry->pack[i - 1];
> > +
> > + return NULL;
> > }
>
> Looking good, we want to go through this list in reverse order, since
> the packs are ordered largest to smallest. I think that you could
> slightly rewrite the loop condition to avoid having to always access
> `geometry->pack` at `i-1` instead of `i`, like so:
>
> --- 8< ---
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 9d36dc8b84..ba468ac44e 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -467,13 +467,15 @@ static struct packed_git *get_preferred_pack(struct pack_geometry *geometry)
> if (geometry->split == geometry->pack_nr)
> return NULL;
>
> - for (i = geometry->pack_nr; i > 0; i--)
> + for (i = geometry->pack_nr - 1; i >= 0; i--) {
> /*
> * A pack that is not local would never be included in a
> * multi-pack index. We thus skip over any non-local packs.
> */
> - if (geometry->pack[i - 1]->pack_local)
> - return geometry->pack[i - 1];
> + struct packed_git *p = geometry->pack[i];
> + if (p->pack_local)
> + return p;
> + }
>
> return NULL;
> }
> --- >8 ---
There's a gotcha: `i` is an `unit32_t`, so `i >= 0` would always be true
and thus we wrap around and would try to access the pack array at an
out-of-bound index.
> but I'm not sure that the loop condition is quite right to begin with.
> We don't want to iterate all the way down to the beginning of the pack
> list, since some of those packs may be below the "split" line, IOW their
> contents are going to be rolled up and the packs destroyed.
>
> So I think the right loop condition would be:
>
> for (i = geometry->pack_nr - 1; i >= geometry->split; i--)
>
> which I think makes the "if (geometry->split == geometry->pack_nr)"
> condition above redundant with this loop, so you could probably drop
> that.
>
> I would definitely appreciate a second set of eye here. The pack *at*
> the split line is considered frozen (IOW, we create a new pack
> consisting of the packs in range [0, geometry->split), and leave the
> packs in range [geometry->split, geometry->nr) alone).
>
> So it should be fine to enter that loop when "i == geometry->split",
> because it's OK to return that as a potential preferred pack.
That makes sense indeed. Will amend.
[snip]
> > + test $(wc -l <packs) = 1 &&
> > +
> > + # We should also see a multi-pack-index. This multi-pack-index should
> > + # never refer to any packfiles in the alternate object database.
> > + # Consequentially, it should be valid even with the alternate ODB
> > + # deleted.
> > + rm -r shared &&
> > + git -C member multi-pack-index verify
>
> To test this even more directly, I think that you could ensure that the
> set of packs (which should just be the single entry found in "packs"
> above) matches what we expect. That should look something like:
>
> test-tool read-midx member/.git/objects >packs.midx &&
> grep "^pack-.*\.idx$" packs.midx | sort >actual &&
> xargs -n 1 basename <packs | sort >expect
> test_cmp expect actual
>
> Note that the read-midx test helper outputs packs with the "*.idx"
> suffix, so you'd probably want to alter your find invocation a few lines
> above accordingly, or rewrite it before writing into "actual".
>
> Alternatively, since we expect there to be just a single pack here, I
> think we could just as easily write something like:
>
> test-tool read-midx member/.git/objects >packs.midx &&
> grep "^pack-.*\.idx$" packs.midx >actual &&
> test_line_count = 1 actual &&
> grep $(cat actual) packs
>
> though I slightly prefer the former since it is less brittle to changing
> this test. Either way, though.
Thanks, I've adopted that approach with a slight modification.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2023-04-13 9:31 UTC|newest]
Thread overview: 67+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-04 11:08 [PATCH] repack: fix geometric repacking with gitalternates Patrick Steinhardt
2023-04-04 18:55 ` Taylor Blau
2023-04-04 19:00 ` Taylor Blau
2023-04-05 7:08 ` Patrick Steinhardt
2023-04-10 15:06 ` Derrick Stolee
2023-04-10 23:49 ` Taylor Blau
2023-04-11 17:13 ` Patrick Steinhardt
2023-04-11 21:13 ` Taylor Blau
2023-04-12 9:37 ` Patrick Steinhardt
2023-04-11 17:06 ` Patrick Steinhardt
2023-04-11 17:26 ` Patrick Steinhardt
2023-04-11 21:14 ` Taylor Blau
2023-04-10 23:29 ` Taylor Blau
2023-04-12 10:22 ` [PATCH v2 0/8] " Patrick Steinhardt
2023-04-12 10:22 ` [PATCH v2 1/8] midx: fix segfault with no packs and invalid preferred pack Patrick Steinhardt
2023-04-12 17:56 ` Taylor Blau
2023-04-13 9:28 ` Patrick Steinhardt
2023-04-12 10:22 ` [PATCH v2 2/8] repack: fix trying to use preferred pack in alternates Patrick Steinhardt
2023-04-12 18:37 ` Taylor Blau
2023-04-13 9:31 ` Patrick Steinhardt [this message]
2023-04-12 10:22 ` [PATCH v2 3/8] repack: fix generating multi-pack-index with only non-local packs Patrick Steinhardt
2023-04-12 20:39 ` Taylor Blau
2023-04-12 10:22 ` [PATCH v2 4/8] pack-objects: fix error when packing same pack twice Patrick Steinhardt
2023-04-12 21:33 ` Taylor Blau
2023-04-12 10:22 ` [PATCH v2 5/8] pack-objects: fix error when same packfile is included and excluded Patrick Steinhardt
2023-04-12 21:52 ` Taylor Blau
2023-04-12 10:22 ` [PATCH v2 6/8] pack-objects: extend test coverage of `--stdin-packs` with alternates Patrick Steinhardt
2023-04-12 10:22 ` [PATCH v2 7/8] repack: honor `-l` when calculating pack geometry Patrick Steinhardt
2023-04-12 23:56 ` Junio C Hamano
2023-04-13 5:11 ` Junio C Hamano
2023-04-13 6:41 ` Patrick Steinhardt
2023-04-12 10:23 ` [PATCH v2 8/8] repack: disable writing bitmaps when doing a local geometric repack Patrick Steinhardt
2023-04-12 22:01 ` Taylor Blau
2023-04-13 9:54 ` Patrick Steinhardt
2023-04-13 10:14 ` Patrick Steinhardt
2023-04-12 22:02 ` [PATCH v2 0/8] repack: fix geometric repacking with gitalternates Taylor Blau
2023-04-13 11:16 ` [PATCH v3 00/10] " Patrick Steinhardt
2023-04-13 11:16 ` [PATCH v3 01/10] midx: fix segfault with no packs and invalid preferred pack Patrick Steinhardt
2023-04-13 13:49 ` Derrick Stolee
2023-04-13 11:16 ` [PATCH v3 02/10] repack: fix trying to use preferred pack in alternates Patrick Steinhardt
2023-04-13 11:16 ` [PATCH v3 03/10] repack: fix generating multi-pack-index with only non-local packs Patrick Steinhardt
2023-04-13 11:16 ` [PATCH v3 04/10] pack-objects: split out `--stdin-packs` tests into separate file Patrick Steinhardt
2023-04-13 11:16 ` [PATCH v3 05/10] pack-objects: fix error when packing same pack twice Patrick Steinhardt
2023-04-13 11:16 ` [PATCH v3 06/10] pack-objects: fix error when same packfile is included and excluded Patrick Steinhardt
2023-04-13 11:16 ` [PATCH v3 07/10] pack-objects: extend test coverage of `--stdin-packs` with alternates Patrick Steinhardt
2023-04-13 11:16 ` [PATCH v3 08/10] t/helper: allow chmtime to print verbosely without modifying mtime Patrick Steinhardt
2023-04-13 11:16 ` [PATCH v3 09/10] repack: honor `-l` when calculating pack geometry Patrick Steinhardt
2023-04-13 13:59 ` Derrick Stolee
2023-04-13 14:13 ` Patrick Steinhardt
2023-04-13 15:40 ` Junio C Hamano
2023-04-13 11:16 ` [PATCH v3 10/10] repack: disable writing bitmaps when doing a local repack Patrick Steinhardt
2023-04-13 14:54 ` [PATCH v3 00/10] repack: fix geometric repacking with gitalternates Derrick Stolee
2023-04-14 2:03 ` Junio C Hamano
2023-04-14 5:42 ` Patrick Steinhardt
2023-04-14 6:01 ` [PATCH v4 " Patrick Steinhardt
2023-04-14 6:01 ` [PATCH v4 01/10] midx: fix segfault with no packs and invalid preferred pack Patrick Steinhardt
2023-04-14 6:01 ` [PATCH v4 02/10] repack: fix trying to use preferred pack in alternates Patrick Steinhardt
2023-04-14 6:01 ` [PATCH v4 03/10] repack: fix generating multi-pack-index with only non-local packs Patrick Steinhardt
2023-04-14 6:01 ` [PATCH v4 04/10] pack-objects: split out `--stdin-packs` tests into separate file Patrick Steinhardt
2023-04-14 6:01 ` [PATCH v4 05/10] pack-objects: fix error when packing same pack twice Patrick Steinhardt
2023-04-14 6:01 ` [PATCH v4 06/10] pack-objects: fix error when same packfile is included and excluded Patrick Steinhardt
2023-04-14 6:01 ` [PATCH v4 07/10] pack-objects: extend test coverage of `--stdin-packs` with alternates Patrick Steinhardt
2023-04-14 6:02 ` [PATCH v4 08/10] t/helper: allow chmtime to print verbosely without modifying mtime Patrick Steinhardt
2023-04-14 6:02 ` [PATCH v4 09/10] repack: honor `-l` when calculating pack geometry Patrick Steinhardt
2023-04-14 6:02 ` [PATCH v4 10/10] repack: disable writing bitmaps when doing a local repack Patrick Steinhardt
2023-04-14 13:23 ` [PATCH v4 00/10] repack: fix geometric repacking with gitalternates Derrick Stolee
2023-04-14 17:29 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZDfL_6G-ySm94Vla@ncase \
--to=ps@pks.im \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).