git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: "Derrick Stolee" <derrickstolee@github.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>,
	git <git@vger.kernel.org>,
	"Kaartic Sivaram" <kaartic.sivaraam@gmail.com>,
	"Philip Oakley" <philipoakley@iee.email>,
	"Martin Ågren" <martin.agren@gmail.com>
Subject: Re: [PATCH v5 3/6] pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests
Date: Wed, 17 Aug 2022 15:32:31 +0530	[thread overview]
Message-ID: <CAPOJW5znYngr4n4tBBCgqZY4Hr38NArHC7Go=ujDkmsFXY57mQ@mail.gmail.com> (raw)
In-Reply-To: <YvwS5RlEMvgDm93m@nand.local>

Hello Taylor, extremely thanks for finding the reason for this failure.

On Wed, Aug 17, 2022 at 3:28 AM Taylor Blau <me@ttaylorr.com> wrote:
>
> Hi Abhradeep,
>
> When I took a close look at the process to generate MIDX bitmaps, I found a
> couple of interesting things. The first more trivial fix is that we
> incorrectly propagate the "preferred"-ness bit from packs in an existing
> MIDX when generating a new one. If the identity of the preferred pack
> changes, we should not drag forward those bits on objects already known
> (and preferred) by the existing MIDX:
>
> --- >8 ---
>
> diff --git a/midx.c b/midx.c
> index 3ff6e91e6e..40e520534c 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -619,6 +619,9 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
>
>                 if (m) {
>                         uint32_t start = 0, end;
> +                       int orig_preferred_pack = -1;
> +                       if (0 <= preferred_pack && preferred_pack < m->num_packs)
> +                               orig_preferred_pack = info[preferred_pack].orig_pack_int_id;
>
>                         if (cur_fanout)
>                                 start = ntohl(m->chunk_oid_fanout[cur_fanout - 1]);
> @@ -629,7 +632,7 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
>                                 nth_midxed_pack_midx_entry(m,
>                                                            &entries_by_fanout[nr_fanout],
>                                                            cur_object);
> -                               if (nth_midxed_pack_int_id(m, cur_object) == preferred_pack)
> +                               if (nth_midxed_pack_int_id(m, cur_object) == orig_preferred_pack)
>                                         entries_by_fanout[nr_fanout].preferred = 1;
>                                 else
>                                         entries_by_fanout[nr_fanout].preferred = 0;
>
> --- 8< ---

I am not able to understand this modification.
`info[preferred_pack].orig_pack_int_id` and `preferred_pack` have the
same value, right? I see `ctx.info` getting sorted only after calling
`get_sorted_entries()` function.

> But a more interesting problem arose when I took a closer look at the
> psuedo-pack order of objects generated according to
> `prepare_midx_packing_data()`. With Johannes' fixed $test_tick value, I
> was able to see the following in runs that succeeded:
>
>     27bb4ecd3e96cd0b3bc37d92a78cb5cbf34c418afa67f74cc52517ff7df418e1 (12 in pack-63c460f99a5c08f631396b1828c64006170a9d543b064506fd11b504a62acf52.idx)
>     c68154d69c19f010afce786c6debe926ae6e7decfb946a4549085a792cf9de7e (202 in pack-63c460f99a5c08f631396b1828c64006170a9d543b064506fd11b504a62acf52.idx)
>     a0b85b314ede46aa9f9b5796a284a4cf0b86ebb8fa32f87ae246e21b5378b11c (392 in pack-63c460f99a5c08f631396b1828c64006170a9d543b064506fd11b504a62acf52.idx)
>     [...]
>
> and the following in runs that failed:
>
>     46193a971f5045cb3ca6022957541f9ccddfbfe78591d8506e2d952f8113059b (221 in pack-3fc052de674e3d48096af7cc5125675c0ae1082aa798eb9358de357b2655f9ad.idx)
>     67df8a01ac84cf5f028855c48384eac3336bb02a52603bac285c4b31d66b3ab5 (12 in pack-2021cdedb33b542b244eacf3d009d1384471a53286b0c1235c91d124355dc818.idx)
>     1556b5f0ad7cb0c25a1fc47355fcffc00775e90d94ae8c511e5776b204796ce6 (200 in pack-2021cdedb33b542b244eacf3d009d1384471a53286b0c1235c91d124355dc818.idx)
>
> In the successful case, pack 63c460f99a... is preferred, and its objects
> appear in ascending order of their pack offsets. But in the other case,
> pack 3fc052de67... is preferred, but its first object starts at offset
> 221. Huh? That's not right:
>
>     $ git show-index <.git/objects/pack/pack-3fc052de674e3d48096af7cc5125675c0ae1082aa798eb9358de357b2655f9ad.idx
>     221 46193a971f5045cb3ca6022957541f9ccddfbfe78591d8506e2d952f8113059b (1f4bd28e)
>     12 4d332072f161629ffe4652ecd3ce377ef88447bec73f05ab0f3515f98bd061cf (fadf885b)
>
> Indeed, there is another object there at offset 12. Missing that object
> (since it comes from a preferred pack) is an invariant violation (since
> all objects from the preferred pack should be selected when multiple
> copies are available).
>
> It's missing because the existing MIDX selects that object from a
> different pack, and when we get to fanout 0x4d (the one which should
> include that object), we skip over seeing its copy in the preferred pack
> because that pack already appears in the existing MIDX, though it wasn't
> preferred.

ahh, now I understand what the problem was actually. Thanks :)

> I think there are a couple of ways to fix this. The easiest thing to do
> would be to force the identity of the preferred pack to be the same when
> generating a MIDX bitmap *while reusing an existing MIDX*, since that is
> the only time this bug can happen.
>
> But that's a little magical for my taste. I think a more reasonable fix
> would be to include copies of all objects from the preferred pack
> including in the case where that pack was non-preferred in an existing
> MIDX and at least one object in that pack was selected from a different
> pack in the existing MIDX.

I think the later approach makes the most sense to me. It might not be
a good idea to keep the same pack as `preferred` as a better candidate
would be ignored in that case.

> Abhradeep -- let me know if this is something you want to look into. I
> think it's a very worthwhile bug to fix, since it is definitely
> trigger-able in the wild (notably, only with `git multi-pack-index write
> --bitmap` without `--stdin-packs` and only under certain circumstances),
> and not just limited to SHA-256 mode.
>
> If you are busy experimenting with CRoaring, that's no problem and I can
> fix this up, too. Either way, it would be worth you and others weighing
> in on which fix you think is worth pursuing.

I will be happy to fix it but I can't work on it right now (neither on
CRoaring) because I am currently preparing for my exam. I can continue
my work after that (i.e. from 19 aug). If you feel it is getting too
late then you can do this too. I am also thinking of  writing a patch
for bitmap specific test dump tool (as Johannes proposed previously).

My exam dates are 18 Aug, 31 Aug, 1 Sep, 2 Sep and 3 Sep (I know the
dates are weird) The dates are adjusted on request for Smart India
Hackathon ( 24 Aug - 27 Aug).

Thanks :)

  reply	other threads:[~2022-08-17 10:04 UTC|newest]

Thread overview: 162+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-20 12:33 [PATCH 0/6] [GSoC] bitmap: integrate a lookup table extension to the bitmap format Abhradeep Chakraborty via GitGitGadget
2022-06-20 12:33 ` [PATCH 1/6] Documentation/technical: describe bitmap lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-06-20 16:56   ` Derrick Stolee
2022-06-20 17:09     ` Taylor Blau
2022-06-21  8:31       ` Abhradeep Chakraborty
2022-06-22 16:26         ` Taylor Blau
2022-06-21  8:23     ` Abhradeep Chakraborty
2022-06-20 17:21   ` Taylor Blau
2022-06-21  9:22     ` Abhradeep Chakraborty
2022-06-22 16:29       ` Taylor Blau
2022-06-22 16:45         ` Abhradeep Chakraborty
2022-06-20 20:21   ` Derrick Stolee
2022-06-21 10:08     ` Abhradeep Chakraborty
2022-06-22 16:30       ` Taylor Blau
2022-06-20 12:33 ` [PATCH 2/6] pack-bitmap: prepare to read " Abhradeep Chakraborty via GitGitGadget
2022-06-20 20:49   ` Derrick Stolee
2022-06-21 10:28     ` Abhradeep Chakraborty
2022-06-20 22:06   ` Taylor Blau
2022-06-21 11:52     ` Abhradeep Chakraborty
2022-06-22 16:49       ` Taylor Blau
2022-06-22 17:18         ` Abhradeep Chakraborty
2022-06-22 21:34           ` Taylor Blau
2022-06-20 12:33 ` [PATCH 3/6] pack-bitmap-write.c: write " Abhradeep Chakraborty via GitGitGadget
2022-06-20 22:16   ` Taylor Blau
2022-06-21 12:50     ` Abhradeep Chakraborty
2022-06-22 16:51       ` Taylor Blau
2022-06-20 12:33 ` [PATCH 4/6] builtin/pack-objects.c: learn pack.writeBitmapLookupTable Taylor Blau via GitGitGadget
2022-06-20 22:18   ` Taylor Blau
2022-06-20 12:33 ` [PATCH 5/6] bitmap-commit-table: add tests for the bitmap lookup table Abhradeep Chakraborty via GitGitGadget
2022-06-22 16:54   ` Taylor Blau
2022-06-20 12:33 ` [PATCH 6/6] bitmap-lookup-table: add performance tests Abhradeep Chakraborty via GitGitGadget
2022-06-22 17:14   ` Taylor Blau
2022-06-26 13:10 ` [PATCH v2 0/6] [GSoC] bitmap: integrate a lookup table extension to the bitmap format Abhradeep Chakraborty via GitGitGadget
2022-06-26 13:10   ` [PATCH v2 1/6] Documentation/technical: describe bitmap lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-06-27 14:18     ` Derrick Stolee
2022-06-27 15:48       ` Taylor Blau
2022-06-27 16:51       ` Abhradeep Chakraborty
2022-06-26 13:10   ` [PATCH v2 2/6] pack-bitmap-write.c: write " Abhradeep Chakraborty via GitGitGadget
2022-06-27 14:35     ` Derrick Stolee
2022-06-27 16:12       ` Taylor Blau
2022-06-27 17:10       ` Abhradeep Chakraborty
2022-06-27 16:05     ` Taylor Blau
2022-06-27 18:29       ` Abhradeep Chakraborty
2022-06-26 13:10   ` [PATCH v2 3/6] pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests Abhradeep Chakraborty via GitGitGadget
2022-06-27 14:43     ` Derrick Stolee
2022-06-27 17:42       ` Abhradeep Chakraborty
2022-06-27 17:49         ` Taylor Blau
2022-06-27 17:47     ` Taylor Blau
2022-06-27 18:39       ` Abhradeep Chakraborty
2022-06-29 20:11         ` Taylor Blau
2022-06-26 13:10   ` [PATCH v2 4/6] pack-bitmap: prepare to read lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-06-27 15:12     ` Derrick Stolee
2022-06-27 18:06       ` [PATCH v2 4/6] pack-bitmap: prepare to read lookup table Abhradeep Chakraborty
2022-06-27 18:32         ` Derrick Stolee
2022-06-27 21:49       ` [PATCH v2 4/6] pack-bitmap: prepare to read lookup table extension Taylor Blau
2022-06-28  8:59         ` [PATCH v2 4/6] pack-bitmap: prepare to read lookup table Abhradeep Chakraborty
2022-06-29 20:22           ` Taylor Blau
2022-06-30  6:58             ` [PATCH v2 4/6] pack-bitmap: prepare to read lookup table extension Abhradeep Chakraborty
2022-06-27 21:38     ` Taylor Blau
2022-06-28 19:25       ` Abhradeep Chakraborty
2022-06-29 20:37         ` Taylor Blau
2022-06-29 20:41           ` Taylor Blau
2022-06-30  8:35           ` Abhradeep Chakraborty
2022-06-26 13:10   ` [PATCH v2 5/6] bitmap-lookup-table: add performance tests for lookup table Abhradeep Chakraborty via GitGitGadget
2022-06-27 21:53     ` Taylor Blau
2022-06-28  7:58       ` Abhradeep Chakraborty
2022-06-29 20:40         ` Taylor Blau
2022-06-26 13:10   ` [PATCH v2 6/6] p5310-pack-bitmaps.sh: enable pack.writeReverseIndex for testing Abhradeep Chakraborty via GitGitGadget
2022-06-27 21:50     ` Taylor Blau
2022-06-28  8:01       ` Abhradeep Chakraborty
2022-07-04  8:46   ` [PATCH v3 0/6] [GSoC] bitmap: integrate a lookup table extension to the bitmap format Abhradeep Chakraborty via GitGitGadget
2022-07-04  8:46     ` [PATCH v3 1/6] Documentation/technical: describe bitmap lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-07-08 16:38       ` Philip Oakley
2022-07-09  7:53         ` Abhradeep Chakraborty
2022-07-10 15:01           ` Philip Oakley
2022-07-14 23:15             ` Taylor Blau
2022-07-15 10:36               ` Philip Oakley
2022-07-15 18:48             ` Abhradeep Chakraborty
2022-07-04  8:46     ` [PATCH v3 2/6] pack-bitmap-write.c: write " Abhradeep Chakraborty via GitGitGadget
2022-07-14 23:26       ` Taylor Blau
2022-07-15  2:22       ` Taylor Blau
2022-07-15 15:58         ` Abhradeep Chakraborty
2022-07-15 22:15           ` Taylor Blau
2022-07-16 11:50             ` Abhradeep Chakraborty
2022-07-26  0:34               ` Taylor Blau
2022-07-18  8:59       ` Martin Ågren
2022-07-04  8:46     ` [PATCH v3 3/6] pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests Abhradeep Chakraborty via GitGitGadget
2022-07-04  8:46     ` [PATCH v3 4/6] pack-bitmap: prepare to read lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-07-15  2:46       ` Taylor Blau
2022-07-15 16:38         ` Abhradeep Chakraborty
2022-07-15 22:20           ` Taylor Blau
2022-07-18  9:06             ` Martin Ågren
2022-07-18 19:25               ` Abhradeep Chakraborty
2022-07-18 23:26                 ` Martin Ågren
2022-07-26  0:45               ` Taylor Blau
2022-07-04  8:46     ` [PATCH v3 5/6] bitmap-lookup-table: add performance tests for lookup table Abhradeep Chakraborty via GitGitGadget
2022-07-15  2:53       ` Taylor Blau
2022-07-15 18:23         ` Abhradeep Chakraborty
2022-07-04  8:46     ` [PATCH v3 6/6] p5310-pack-bitmaps.sh: remove pack.writeReverseIndex Abhradeep Chakraborty via GitGitGadget
2022-07-04 16:35     ` [PATCH v3 0/6] [GSoC] bitmap: integrate a lookup table extension to the bitmap format Abhradeep Chakraborty
2022-07-06 19:21     ` Junio C Hamano
2022-07-07  8:48       ` Abhradeep Chakraborty
2022-07-07 18:09         ` Kaartic Sivaraam
2022-07-07 18:42           ` Abhradeep Chakraborty
2022-07-20 14:05     ` [PATCH v4 " Abhradeep Chakraborty via GitGitGadget
2022-07-20 14:05       ` [PATCH v4 1/6] Documentation/technical: describe bitmap lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-07-20 14:05       ` [PATCH v4 2/6] pack-bitmap-write.c: write " Abhradeep Chakraborty via GitGitGadget
2022-07-20 14:05       ` [PATCH v4 3/6] pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests Abhradeep Chakraborty via GitGitGadget
2022-07-20 14:05       ` [PATCH v4 4/6] pack-bitmap: prepare to read lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-07-20 14:05       ` [PATCH v4 5/6] p5310-pack-bitmaps.sh: enable `pack.writeReverseIndex` Abhradeep Chakraborty via GitGitGadget
2022-07-20 14:05       ` [PATCH v4 6/6] bitmap-lookup-table: add performance tests for lookup table Abhradeep Chakraborty via GitGitGadget
2022-07-20 18:38       ` [PATCH v5 0/6] [GSoC] bitmap: integrate a lookup table extension to the bitmap format Abhradeep Chakraborty via GitGitGadget
2022-07-20 18:38         ` [PATCH v5 1/6] Documentation/technical: describe bitmap lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-07-20 18:38         ` [PATCH v5 2/6] pack-bitmap-write.c: write " Abhradeep Chakraborty via GitGitGadget
2022-07-26  0:52           ` Taylor Blau
2022-07-26 18:22             ` Abhradeep Chakraborty
2022-07-20 18:38         ` [PATCH v5 3/6] pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests Abhradeep Chakraborty via GitGitGadget
2022-07-28 19:22           ` Johannes Schindelin
2022-08-02 12:40             ` Abhradeep Chakraborty
2022-08-02 15:35               ` Johannes Schindelin
2022-08-02 17:44                 ` Abhradeep Chakraborty
2022-08-08 13:06                   ` Johannes Schindelin
2022-08-08 13:58                     ` Abhradeep Chakraborty
2022-08-09  9:03                       ` Johannes Schindelin
2022-08-09 12:03                         ` Abhradeep Chakraborty
2022-08-09 12:07                           ` Abhradeep Chakraborty
2022-08-10  9:09                           ` Johannes Schindelin
2022-08-10  9:20                             ` Johannes Schindelin
2022-08-10 10:04                               ` Abhradeep Chakraborty
2022-08-10 17:51                                 ` Derrick Stolee
2022-08-12 18:51                                   ` Abhradeep Chakraborty
2022-08-12 19:22                                     ` Derrick Stolee
2022-08-13 10:59                                       ` Abhradeep Chakraborty
2022-08-16 21:57                                         ` Taylor Blau
2022-08-17 10:02                                           ` Abhradeep Chakraborty [this message]
2022-08-17 20:38                                             ` Taylor Blau
2022-08-19 21:49                                               ` Taylor Blau
2022-08-13 11:05                               ` Abhradeep Chakraborty
2022-08-16 18:47                             ` Taylor Blau
2022-07-20 18:38         ` [PATCH v5 4/6] pack-bitmap: prepare to read lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-07-26  1:13           ` Taylor Blau
2022-07-26 18:56             ` Abhradeep Chakraborty
2022-07-26 19:36             ` Eric Sunshine
2022-07-20 18:38         ` [PATCH v5 5/6] p5310-pack-bitmaps.sh: enable `pack.writeReverseIndex` Abhradeep Chakraborty via GitGitGadget
2022-07-26  1:18           ` Taylor Blau
2022-07-26  7:15             ` Ævar Arnfjörð Bjarmason
2022-07-26 13:32               ` Derrick Stolee
2022-07-26 13:54                 ` Ævar Arnfjörð Bjarmason
2022-07-26 18:17                   ` Abhradeep Chakraborty
2022-07-20 18:38         ` [PATCH v5 6/6] bitmap-lookup-table: add performance tests for lookup table Abhradeep Chakraborty via GitGitGadget
2022-08-14 16:55         ` [PATCH v6 0/6] [GSoC] bitmap: integrate a lookup table extension to the bitmap format Abhradeep Chakraborty via GitGitGadget
2022-08-14 16:55           ` [PATCH v6 1/6] Documentation/technical: describe bitmap lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-08-14 16:55           ` [PATCH v6 2/6] bitmap: move `get commit positions` code to `bitmap_writer_finish` Abhradeep Chakraborty via GitGitGadget
2022-08-14 16:55           ` [PATCH v6 3/6] pack-bitmap-write.c: write lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-08-14 16:55           ` [PATCH v6 4/6] pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests Abhradeep Chakraborty via GitGitGadget
2022-08-14 16:55           ` [PATCH v6 5/6] pack-bitmap: prepare to read lookup table extension Abhradeep Chakraborty via GitGitGadget
2022-08-14 16:55           ` [PATCH v6 6/6] bitmap-lookup-table: add performance tests for lookup table Abhradeep Chakraborty via GitGitGadget
2022-08-19 21:21           ` [PATCH v6 0/6] [GSoC] bitmap: integrate a lookup table extension to the bitmap format Junio C Hamano
2022-08-22 14:42             ` Johannes Schindelin
2022-08-22 14:48               ` Taylor Blau
2022-08-25 22:16           ` Taylor Blau
2022-08-26 16:02             ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPOJW5znYngr4n4tBBCgqZY4Hr38NArHC7Go=ujDkmsFXY57mQ@mail.gmail.com' \
    --to=chakrabortyabhradeep79@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=kaartic.sivaraam@gmail.com \
    --cc=martin.agren@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=philipoakley@iee.email \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).