From: Taylor Blau <me@ttaylorr.com>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
Date: Tue, 16 Jan 2024 15:49:24 -0500
Message-ID: <Zabr1Glljjgl/UUB@nand.local>
In-Reply-To: <20240113225157.GD3000857@szeder.dev>
On Sat, Jan 13, 2024 at 11:51:57PM +0100, SZEDER Gábor wrote:
> On a related note, if current git (I tried current master and v2.43.0)
> encounters a commit graph layer containing v2 Bloom filters (created
> by current seen) while writing a new commit graph, then it segfaults
> dereferencing a NULL 'settings' pointer in
> get_or_compute_bloom_filter().
>
> The test below demonstrates this, but it's quite hacky using two
> different git versions: it has to be run by an old git version not yet
> supporting v2 Bloom filters, and a new git version already supporting
> them should be installed at /tmp/git-new/.
>
> diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
> index 2ba0324a69..0464dd68d5 100755
> --- a/t/t4216-log-bloom.sh
> +++ b/t/t4216-log-bloom.sh
> @@ -454,4 +454,33 @@ test_expect_success 'Bloom reader notices out-of-order index offsets' '
> test_cmp expect.err err
> '
>
> +CENT=$(printf "\302\242")
> +test_expect_success 'split commit graph vs changed paths Bloom filter v2 vs old git' '
> + git init split-v2-old &&
> + (
> + cd split-v2-old &&
> + git commit --allow-empty -m "Bloom filters are written but still ignored for root commits :(" &&
> + for i in 1 2 3
> + do
> + echo $i >$CENT &&
> + git add $CENT &&
> + git commit -m "$i" || return 1
> + done &&
> + git log --oneline -- $CENT >expect &&
> +
> + # Here we write a commit graph layer containing v2 changed
> + # path Bloom filters using a git binary built from current
> + # 'seen' branch.
> + git rev-parse HEAD^ |
> + /tmp/git-new/bin/git -c commitgraph.changedPathsVersion=2 \
> + commit-graph write --stdin-commits --changed-paths --split &&
> +
> + # This is current master, and segfaults.
> + git commit-graph write --reachable --changed-paths &&
> +
> + git log --oneline -- $CENT >actual &&
> + test_cmp expect actual
> + )
> +'
> +
> test_done
Thanks. The segfault is reproducible on my end, but I don't think that
it is possible to fix this for existing versions of Git. The problem (as
you note in your backtrace) is here:
    #0  0x000055555569c842 in get_or_compute_bloom_filter (
        r=0x5555559c9ce0 <the_repo>, c=0x5555559dffd0, compute_if_not_present=1,
        settings=0x0, computed=0x7fffffffe0f4) at bloom.c:253
Which tries to dereference ctx->bloom_settings, which is NULL. Note that
we initialize some sensible defaults for ctx->bloom_settings in
commit-graph.c::write_commit_graph():
    struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS;
    /* ... */
    bloom_settings.bits_per_entry = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY",
                                                  bloom_settings.bits_per_entry);
    bloom_settings.num_hashes = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_NUM_HASHES",
                                              bloom_settings.num_hashes);
    bloom_settings.max_changed_paths = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS",
                                                     bloom_settings.max_changed_paths);
    ctx->bloom_settings = &bloom_settings;
...but we'll throw those away in favor of whatever is in the topmost
layer of the existing commit-graph chain later on in that same function:
    if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) {
        struct commit_graph *g;
        g = ctx->r->objects->commit_graph;
        /* We have changed-paths already. Keep them in the next graph */
        if (g && g->chunk_bloom_data) {
            ctx->changed_paths = 1;
            ctx->bloom_settings = g->bloom_filter_settings;
        }
    }
OK, everything seems fine thus far, until we inspect the value of
g->bloom_filter_settings, which is NULL because of this hunk from
commit-graph.c::graph_read_bloom_data():
    if (hash_version != 1)
        return 0;
which terminates the function before we assign g->bloom_filter_settings
for the existing (written with v2 Bloom filters) graph layer.
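
To make that chain concrete, here is a minimal, self-contained C sketch of
the two steps described above. It is a model, not Git's code: the structs
are cut-down stand-ins, and model_read_bloom_data() and
model_pick_settings() are hypothetical names. It assumes (as the behavior
above implies) that the chunk pointer is recorded before the hash version
is checked, so the layer ends up with Bloom data but no settings.

```c
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins for Git's structures (assumption: only the
 * fields relevant to this discussion are modeled). */
struct bloom_filter_settings {
	uint32_t hash_version;
	uint32_t num_hashes;
	uint32_t bits_per_entry;
};

struct commit_graph {
	const unsigned char *chunk_bloom_data;
	struct bloom_filter_settings *bloom_filter_settings;
};

/* Hypothetical model of the reader: the chunk is remembered, but
 * settings are only filled in for hash version 1. */
int model_read_bloom_data(struct commit_graph *g,
			  const unsigned char *chunk,
			  uint32_t hash_version,
			  struct bloom_filter_settings *storage)
{
	g->chunk_bloom_data = chunk;

	if (hash_version != 1)
		return 0; /* settings pointer is left untouched (NULL) */

	storage->hash_version = hash_version;
	g->bloom_filter_settings = storage;
	return 0;
}

/* Hypothetical model of the writer: the defaults are replaced by
 * whatever the existing layer carries, even if that is NULL. */
struct bloom_filter_settings *
model_pick_settings(struct bloom_filter_settings *defaults,
		    const struct commit_graph *g)
{
	if (g && g->chunk_bloom_data)
		return g->bloom_filter_settings;
	return defaults;
}
```

Feeding a layer with hash version 2 through model_read_bloom_data() and
then calling model_pick_settings() yields NULL in place of the defaults,
which is the pointer that get_or_compute_bloom_filter() later
dereferences.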
I don't think that there is a way to fix this in a backwards compatible
way, but I'm comfortable with that in this instance since we don't
expect users to upgrade to v2 Bloom filters and then write new graph
layers using a version of Git that does not support v2.
We can add a warning in the series that I'm working on indicating this,
but I don't think there's much more we can do besides changing this code
path to emit a warning and bail instead of segfaulting.
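
As a rough illustration of the "warn and bail" idea, the following
self-contained C sketch checks for a layer that has Bloom data but no
readable settings before the settings are used. It is only a sketch under
stated assumptions: the structs are simplified stand-ins, and
check_bloom_settings_readable() is a hypothetical helper, not a function
in Git or in the series mentioned above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for Git's structures (assumption). */
struct bloom_filter_settings {
	uint32_t hash_version;
	uint32_t num_hashes;
	uint32_t bits_per_entry;
};

struct commit_graph {
	const unsigned char *chunk_bloom_data;
	struct bloom_filter_settings *bloom_filter_settings;
};

/*
 * Hypothetical guard: returns 0 when it is safe to reuse the layer's
 * settings (or when there is nothing to reuse), and -1 after printing
 * a warning when the layer has Bloom data whose settings could not be
 * read, for example because of an unsupported hash version.
 */
int check_bloom_settings_readable(const struct commit_graph *g)
{
	if (g && g->chunk_bloom_data && !g->bloom_filter_settings) {
		fprintf(stderr,
			"warning: existing commit-graph layer has changed-path "
			"Bloom filters in an unsupported format\n");
		return -1;
	}
	return 0;
}
```

A caller would check the return value before assigning the layer's
settings and stop writing on -1, so the NULL pointer never reaches the
code that dereferences it.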
Thanks,
Taylor