git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Taylor Blau <me@ttaylorr.com>, git@vger.kernel.org
Subject: Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
Date: Sat, 13 Jan 2024 23:51:57 +0100	[thread overview]
Message-ID: <20240113225157.GD3000857@szeder.dev> (raw)
In-Reply-To: <20240113183544.GA3000857@szeder.dev>

On Sat, Jan 13, 2024 at 07:35:44PM +0100, SZEDER Gábor wrote:
> On Wed, Jan 03, 2024 at 10:08:18AM -0800, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr.com> writes:
> > 
> > >> * tb/path-filter-fix (2023-10-18) 17 commits
> > >>  - bloom: introduce `deinit_bloom_filters()`
> > >>  ...
> > >>  - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
> > >>
> > >>  The Bloom filter used for path limited history traversal was broken
> > >>  on systems whose "char" is unsigned; update the implementation and
> > >>  bump the format version to 2.
> > >>
> > >>  Expecting a reroll.
> > >>  cf. <20231023202212.GA5470@szeder.dev>
> > >>  source: <cover.1697653929.git.me@ttaylorr.com>
> > >
> > > I was confused by this one, since I couldn't figure out which tests
> > > Gábor was referring to here. I responded in [1], but haven't heard back
> > > since the end of October.
> > > ...
> > > [1]: https://lore.kernel.org/git/ZUARCJ1MmqgXfS4i@nand.local/
> 
> I keep referring to the test in:
> 
>   https://public-inbox.org/git/20230830200218.GA5147@szeder.dev/
> 
> which, rather disappointingly, is still the only test out there
> exercising the interaction between split commit graphs and different
> modified path Bloom filter versions.  Note that in that message I
> mentioned that merging layers with differenet Bloom filter versions
> seemed to work, but that, alas, is no longer the case, because it's
> now broken in Taylor's recent iterations of the patch series.
> 
> At the risk of sounding like a broken record: the interaction of split
> commit graphs and different Bloom filter versions should be thoroughly
> tested.

On a related note, if current git (I tried current master and v2.43.0)
encounters a commit graph layer containing v2 Bloom filters (created
by current seen) while writing a new commit graph, then it segfaults
dereferencing a NULL 'settings' pointer in
get_or_compute_bloom_filter().

The test below demonstrates this, but it's quite hacky using two
different git versions: it has to be run by an old git version not yet
supporting v2 Bloom filters, and a new git version already supporting
them should be installed at /tmp/git-new/.


diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
index 2ba0324a69..0464dd68d5 100755
--- a/t/t4216-log-bloom.sh
+++ b/t/t4216-log-bloom.sh
@@ -454,4 +454,33 @@ test_expect_success 'Bloom reader notices out-of-order index offsets' '
 	test_cmp expect.err err
 '
 
+CENT=$(printf "\302\242")
+test_expect_success 'split commit graph vs changed paths Bloom filter v2 vs old git' '
+	git init split-v2-old &&
+	(
+		cd split-v2-old &&
+		git commit --allow-empty -m "Bloom filters are written but still ignored for root commits :(" &&
+		for i in 1 2 3
+		do
+			echo $i >$CENT &&
+			git add $CENT &&
+			git commit -m "$i" || return 1
+		done &&
+		git log --oneline -- $CENT >expect &&
+
+		# Here we write a commit graph layer containing v2 changed
+		# path Bloom filters using a git binary built from current
+		# 'seen' branch.
+		git rev-parse HEAD^ |
+		/tmp/git-new/bin/git -c commitgraph.changedPathsVersion=2 \
+			commit-graph write --stdin-commits --changed-paths --split &&
+
+		# This is current master, and segfaults.
+		git commit-graph write --reachable --changed-paths &&
+
+		git log --oneline -- $CENT >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done



(gdb) r
Starting program: /home/szeder/src/git/git commit-graph write --reachable --changed-paths
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00005555556b8714 in get_or_compute_bloom_filter (r=0x555555a0e5a0 <the_repo>, c=0x555555a1cdd0, compute_if_not_present=1, settings=0x0, computed=0x7fffffffd634) at bloom.c:253
253		diffopt.max_changes = settings->max_changed_paths;
(gdb) l
248			return NULL;
249	
250		repo_diff_setup(r, &diffopt);
251		diffopt.flags.recursive = 1;
252		diffopt.detect_rename = 0;
253		diffopt.max_changes = settings->max_changed_paths;
254		diff_setup_done(&diffopt);
255	
256		/* ensure commit is parsed so we have parent information */
257		repo_parse_commit(r, c);
(gdb) bt
#0  0x00005555556b8714 in get_or_compute_bloom_filter (
    r=0x555555a0e5a0 <the_repo>, c=0x555555a1cdd0, compute_if_not_present=1, 
    settings=0x0, computed=0x7fffffffd634) at bloom.c:253
#1  0x00005555556d16d5 in compute_bloom_filters (ctx=0x555555a1c200)
    at commit-graph.c:1783
#2  0x00005555556d4329 in write_commit_graph (odb=0x555555a19210, 
    pack_indexes=0x0, commits=0x7fffffffd7c0, 
    flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_BLOOM_FILTERS), 
    opts=0x5555559ef8d0 <write_opts>) at commit-graph.c:2603
#3  0x00005555556d19ab in write_commit_graph_reachable (odb=0x555555a19210, 
    flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_BLOOM_FILTERS), 
    opts=0x5555559ef8d0 <write_opts>) at commit-graph.c:1849
#4  0x00005555555a63c2 in graph_write (argc=0, argv=0x7fffffffde20, prefix=0x0)
    at builtin/commit-graph.c:294
#5  0x00005555555a66f4 in cmd_commit_graph (argc=3, argv=0x7fffffffde20, 
    prefix=0x0) at builtin/commit-graph.c:353
#6  0x0000555555575b43 in run_builtin (p=0x5555559da260 <commands+576>, 
    argc=4, argv=0x7fffffffde20) at git.c:469
#7  0x0000555555575fcb in handle_builtin (argc=4, argv=0x7fffffffde20)
    at git.c:724
#8  0x0000555555576272 in run_argv (argcp=0x7fffffffdc7c, argv=0x7fffffffdc70)
    at git.c:788
#9  0x000055555557685d in cmd_main (argc=4, argv=0x7fffffffde20) at git.c:923
#10 0x000055555568ad55 in main (argc=5, argv=0x7fffffffde18)
    at common-main.c:62




  parent reply	other threads:[~2024-01-13 22:52 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-03  1:02 What's cooking in git.git (Jan 2024, #01; Tue, 2) Junio C Hamano
2024-01-03  5:53 ` ps/refstorage-extension (was: What's cooking in git.git (Jan 2024, #01; Tue, 2)) Patrick Steinhardt
2024-01-03  9:01 ` What's cooking in git.git (Jan 2024, #01; Tue, 2) Jeff King
2024-01-03 16:37   ` Junio C Hamano
2024-01-05  8:59     ` Jeff King
2024-01-05 16:34       ` Junio C Hamano
2024-01-03 17:14   ` René Scharfe
2024-01-03 16:43 ` Taylor Blau
2024-01-03 18:08   ` Junio C Hamano
2024-01-13 18:35     ` SZEDER Gábor
2024-01-13 22:06       ` Taylor Blau
2024-01-13 23:41         ` SZEDER Gábor
2024-01-16 20:37           ` Taylor Blau
2024-02-25 22:59             ` SZEDER Gábor
2024-02-26 14:44               ` Taylor Blau
2024-01-13 22:51       ` SZEDER Gábor [this message]
2024-01-16 20:49         ` Taylor Blau
2024-01-16 22:45           ` Junio C Hamano
2024-01-16 23:31             ` Taylor Blau
2024-01-16 23:42               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240113225157.GD3000857@szeder.dev \
    --to=szeder.dev@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).