From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Taylor Blau <me@ttaylorr.com>, git@vger.kernel.org
Subject: Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
Date: Sat, 13 Jan 2024 23:51:57 +0100 [thread overview]
Message-ID: <20240113225157.GD3000857@szeder.dev> (raw)
In-Reply-To: <20240113183544.GA3000857@szeder.dev>
On Sat, Jan 13, 2024 at 07:35:44PM +0100, SZEDER Gábor wrote:
> On Wed, Jan 03, 2024 at 10:08:18AM -0800, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr.com> writes:
> >
> > >> * tb/path-filter-fix (2023-10-18) 17 commits
> > >> - bloom: introduce `deinit_bloom_filters()`
> > >> ...
> > >> - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
> > >>
> > >> The Bloom filter used for path limited history traversal was broken
> > >> on systems whose "char" is unsigned; update the implementation and
> > >> bump the format version to 2.
> > >>
> > >> Expecting a reroll.
> > >> cf. <20231023202212.GA5470@szeder.dev>
> > >> source: <cover.1697653929.git.me@ttaylorr.com>
> > >
> > > I was confused by this one, since I couldn't figure out which tests
> > > Gábor was referring to here. I responded in [1], but haven't heard back
> > > since the end of October.
> > > ...
> > > [1]: https://lore.kernel.org/git/ZUARCJ1MmqgXfS4i@nand.local/
>
> I keep referring to the test in:
>
> https://public-inbox.org/git/20230830200218.GA5147@szeder.dev/
>
> which, rather disappointingly, is still the only test out there
> exercising the interaction between split commit graphs and different
> modified path Bloom filter versions. Note that in that message I
> mentioned that merging layers with differenet Bloom filter versions
> seemed to work, but that, alas, is no longer the case, because it's
> now broken in Taylor's recent iterations of the patch series.
>
> At the risk of sounding like a broken record: the interaction of split
> commit graphs and different Bloom filter versions should be thoroughly
> tested.
On a related note, if current git (I tried current master and v2.43.0)
encounters a commit graph layer containing v2 Bloom filters (created
by current seen) while writing a new commit graph, then it segfaults
dereferencing a NULL 'settings' pointer in
get_or_compute_bloom_filter().
The test below demonstrates this, but it's quite hacky using two
different git versions: it has to be run by an old git version not yet
supporting v2 Bloom filters, and a new git version already supporting
them should be installed at /tmp/git-new/.
diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
index 2ba0324a69..0464dd68d5 100755
--- a/t/t4216-log-bloom.sh
+++ b/t/t4216-log-bloom.sh
@@ -454,4 +454,33 @@ test_expect_success 'Bloom reader notices out-of-order index offsets' '
test_cmp expect.err err
'
+CENT=$(printf "\302\242")
+test_expect_success 'split commit graph vs changed paths Bloom filter v2 vs old git' '
+ git init split-v2-old &&
+ (
+ cd split-v2-old &&
+ git commit --allow-empty -m "Bloom filters are written but still ignored for root commits :(" &&
+ for i in 1 2 3
+ do
+ echo $i >$CENT &&
+ git add $CENT &&
+ git commit -m "$i" || return 1
+ done &&
+ git log --oneline -- $CENT >expect &&
+
+ # Here we write a commit graph layer containing v2 changed
+ # path Bloom filters using a git binary built from current
+ # 'seen' branch.
+ git rev-parse HEAD^ |
+ /tmp/git-new/bin/git -c commitgraph.changedPathsVersion=2 \
+ commit-graph write --stdin-commits --changed-paths --split &&
+
+ # This is current master, and segfaults.
+ git commit-graph write --reachable --changed-paths &&
+
+ git log --oneline -- $CENT >actual &&
+ test_cmp expect actual
+ )
+'
+
test_done
(gdb) r
Starting program: /home/szeder/src/git/git commit-graph write --reachable --changed-paths
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00005555556b8714 in get_or_compute_bloom_filter (r=0x555555a0e5a0 <the_repo>, c=0x555555a1cdd0, compute_if_not_present=1, settings=0x0, computed=0x7fffffffd634) at bloom.c:253
253 diffopt.max_changes = settings->max_changed_paths;
(gdb) l
248 return NULL;
249
250 repo_diff_setup(r, &diffopt);
251 diffopt.flags.recursive = 1;
252 diffopt.detect_rename = 0;
253 diffopt.max_changes = settings->max_changed_paths;
254 diff_setup_done(&diffopt);
255
256 /* ensure commit is parsed so we have parent information */
257 repo_parse_commit(r, c);
(gdb) bt
#0 0x00005555556b8714 in get_or_compute_bloom_filter (
r=0x555555a0e5a0 <the_repo>, c=0x555555a1cdd0, compute_if_not_present=1,
settings=0x0, computed=0x7fffffffd634) at bloom.c:253
#1 0x00005555556d16d5 in compute_bloom_filters (ctx=0x555555a1c200)
at commit-graph.c:1783
#2 0x00005555556d4329 in write_commit_graph (odb=0x555555a19210,
pack_indexes=0x0, commits=0x7fffffffd7c0,
flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_BLOOM_FILTERS),
opts=0x5555559ef8d0 <write_opts>) at commit-graph.c:2603
#3 0x00005555556d19ab in write_commit_graph_reachable (odb=0x555555a19210,
flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_BLOOM_FILTERS),
opts=0x5555559ef8d0 <write_opts>) at commit-graph.c:1849
#4 0x00005555555a63c2 in graph_write (argc=0, argv=0x7fffffffde20, prefix=0x0)
at builtin/commit-graph.c:294
#5 0x00005555555a66f4 in cmd_commit_graph (argc=3, argv=0x7fffffffde20,
prefix=0x0) at builtin/commit-graph.c:353
#6 0x0000555555575b43 in run_builtin (p=0x5555559da260 <commands+576>,
argc=4, argv=0x7fffffffde20) at git.c:469
#7 0x0000555555575fcb in handle_builtin (argc=4, argv=0x7fffffffde20)
at git.c:724
#8 0x0000555555576272 in run_argv (argcp=0x7fffffffdc7c, argv=0x7fffffffdc70)
at git.c:788
#9 0x000055555557685d in cmd_main (argc=4, argv=0x7fffffffde20) at git.c:923
#10 0x000055555568ad55 in main (argc=5, argv=0x7fffffffde18)
at common-main.c:62
next prev parent reply other threads:[~2024-01-13 22:52 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-03 1:02 What's cooking in git.git (Jan 2024, #01; Tue, 2) Junio C Hamano
2024-01-03 5:53 ` ps/refstorage-extension (was: What's cooking in git.git (Jan 2024, #01; Tue, 2)) Patrick Steinhardt
2024-01-03 9:01 ` What's cooking in git.git (Jan 2024, #01; Tue, 2) Jeff King
2024-01-03 16:37 ` Junio C Hamano
2024-01-05 8:59 ` Jeff King
2024-01-05 16:34 ` Junio C Hamano
2024-01-03 17:14 ` René Scharfe
2024-01-03 16:43 ` Taylor Blau
2024-01-03 18:08 ` Junio C Hamano
2024-01-13 18:35 ` SZEDER Gábor
2024-01-13 22:06 ` Taylor Blau
2024-01-13 23:41 ` SZEDER Gábor
2024-01-16 20:37 ` Taylor Blau
2024-02-25 22:59 ` SZEDER Gábor
2024-02-26 14:44 ` Taylor Blau
2024-01-13 22:51 ` SZEDER Gábor [this message]
2024-01-16 20:49 ` Taylor Blau
2024-01-16 22:45 ` Junio C Hamano
2024-01-16 23:31 ` Taylor Blau
2024-01-16 23:42 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240113225157.GD3000857@szeder.dev \
--to=szeder.dev@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).