git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Josh Steadmon <steadmon@google.com>
Cc: git@vger.kernel.org, me@ttaylorr.com, derrickstolee@github.com,
	lessleydennington@gmail.com, vdye@github.com, avarab@gmail.com,
	jonathantanmy@google.com
Subject: Re: [PATCH v3] commit-graph: refactor to avoid prepare_repo_settings
Date: Thu, 23 Jun 2022 14:59:20 -0700	[thread overview]
Message-ID: <xmqq1qvfyrbb.fsf@gitster.g> (raw)
In-Reply-To: <9b56496b0809cc8a25af877ea97042e2cb7f2af6.1655246092.git.steadmon@google.com> (Josh Steadmon's message of "Tue, 14 Jun 2022 15:37:21 -0700")

Josh Steadmon <steadmon@google.com> writes:

> This series of changes broke fuzz-commit-graph, which attempts to parse
> arbitrary fuzzing-engine-provided bytes as a commit graph file.
> commit-graph.c:parse_commit_graph() calls prepare_repo_settings(), but
> since we run the fuzz tests without a valid repository, we are hitting
> the BUG() from 44c7e62 for every test case.
>
> Fix this by moving the majority of the implementaiton of
> `parse_commit_graph()` into a new function,
> `parse_commit_graph_settings()` that accepts a repo_settings pointer.
> This allows fuzz-commit-graph to continue to test the commit-graph
> parser implementation without relying on prepare_repo_settings().

It sounds like this is not a "fix" but a workaround to bend the
production code so that a non-production test shim can be inserted
more easily.

I am OK with the idea, but have a huge problem with the new name.

Is it just me who thinks parse_commit_graph_settings() is a function
that parses some kind of settings that affects the way the commit
graph gets used or written?

Stepping back a bit, why can't fuzz-commit-graph prepare a
repository object that looks sufficiently real?  Something along the
lines of...

                struct repository fake_repo;

                fake_repo.settings.initialized = 1;
                fake_repo.gitdir = ".";
                parse_commit_graph(&fake_repo, (void *)data, size);
		...

Also, I feel somewhat uneasy to see these changes:

> -	if (get_configured_generation_version(r) >= 2) {
> +	if (s->commit_graph_generation_version >= 2) {
> -	if (r->settings.commit_graph_read_changed_paths) {
> +	if (s->commit_graph_read_changed_paths) {
> -	ctx->write_generation_data = (get_configured_generation_version(r) == 2);
> +	ctx->write_generation_data = (r->settings.commit_graph_generation_version == 2);
>  	ctx->num_generation_data_overflows = 0;

that makes the production code bend over backwards to _avoid_
referencing 'r', only to cater to the test shim.  That's an
artificial limitation we are forcing on our developers who works on
this code.  It might be that what is in the repository settings is
sufficient for today's code to work, but I do not think needs for
fuzz tests should tie the hand of this production code by forbidding
it to look at other things in the repository in the future.  After
all, tests are to serve the production code, not the other way
around.

On the other hand, I think a change that is slightly smaller than
the posted patch, which justifies itself with a completely different
rationale, would be totally acceptable.  You can justify this change
with NO mention of fuzzers.

    The parse_commit_graph() function takes a "struct repository *"
    pointer, but all it cares about is the .settings member of it.

    Update the function and all its existing callers so that it
    takes "struct repo_settings *" instead.

Now, in the future, some developers _might_ find it necessary to
access stuff other than the repository settings to validate the
contents of the graph file, and we may need to change it to take a
full repository structure again.  The test should adjust to such
needs of the production code, not the other way around.  But until
then...

Thanks.

  parent reply	other threads:[~2022-06-23 21:59 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-23 18:03 [RFC PATCH] repo-settings: set defaults even when not in a repo Josh Steadmon
2022-03-23 19:22 ` Derrick Stolee
2022-03-23 19:52   ` Taylor Blau
2022-03-28 19:15     ` Josh Steadmon
2022-03-29  1:21       ` Taylor Blau
2022-03-28 19:53     ` Josh Steadmon
2022-03-29  1:22       ` Taylor Blau
2022-03-29  9:03     ` Ævar Arnfjörð Bjarmason
2022-03-30  2:26       ` Taylor Blau
2022-04-09  6:33         ` Josh Steadmon
2022-03-29  9:04     ` Ævar Arnfjörð Bjarmason
2022-03-30  2:34       ` Taylor Blau
2022-03-30 17:38         ` Ævar Arnfjörð Bjarmason
2022-03-30 20:14           ` Junio C Hamano
2022-04-09  6:52     ` [RFC PATCH v2] commit-graph: refactor to avoid prepare_repo_settings Josh Steadmon
2022-06-07 20:02       ` Jonathan Tan
2022-06-14 22:38         ` Josh Steadmon
2022-06-14 22:37     ` [PATCH v3] " Josh Steadmon
2022-06-14 23:32       ` Taylor Blau
2022-06-23 21:59       ` Junio C Hamano [this message]
2022-07-14 21:44         ` Josh Steadmon
2022-07-14 21:43     ` [PATCH v4] commit-graph: pass repo_settings instead of repository Josh Steadmon
2022-07-14 22:48       ` Junio C Hamano
2022-03-23 20:11 ` [RFC PATCH] repo-settings: set defaults even when not in a repo Victoria Dye
2022-03-23 20:54   ` Junio C Hamano
2022-03-23 21:19     ` Victoria Dye
2022-03-23 20:51 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq1qvfyrbb.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=avarab@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=lessleydennington@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=steadmon@google.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).