From: Junio C Hamano <gitster@pobox.com>
To: "Arthur Chan via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Arthur Chan <arthur.chan@adalogics.com>
Subject: Re: [PATCH] fuzz: add basic fuzz testing for git command
Date: Tue, 13 Sep 2022 09:13:32 -0700
Message-ID: <xmqqv8pr9rrn.fsf@gitster.g>
In-Reply-To: <pull.1351.git.1663078962231.gitgitgadget@gmail.com> (Arthur Chan via GitGitGadget's message of "Tue, 13 Sep 2022 14:22:42 +0000")
"Arthur Chan via GitGitGadget" <gitgitgadget@gmail.com> writes:
> .gitignore | 2 +
> Makefile | 2 +
> fuzz-cmd-base.c | 117 ++++++++++++++++++++++++++++++++++++++++++++++
> fuzz-cmd-base.h | 13 ++++++
> fuzz-cmd-status.c | 68 +++++++++++++++++++++++++++
> 5 files changed, 202 insertions(+)
> create mode 100644 fuzz-cmd-base.c
> create mode 100644 fuzz-cmd-base.h
> create mode 100644 fuzz-cmd-status.c
Just like we have t/ hierarchy for testing, if we plan to add more
fuzz-* related things on top of what we already have (like those
that can be seen in the context of this patch), I would prefer to
see the creation of a fuzz/ hierarchy, with the existing stuff moved
there, as the first step before adding more.
And more fuzzing is good, if we can afford it ;-)
Thanks.
Even though I am not taking this patch as-is, let's give a cursory
look to make sure the future iteration can be more reviewable by
pointing out various CodingGuidelines issues.
> diff --git a/fuzz-cmd-base.c b/fuzz-cmd-base.c
> new file mode 100644
> index 00000000000..98f05c78372
> --- /dev/null
> +++ b/fuzz-cmd-base.c
> @@ -0,0 +1,117 @@
> +#include "cache.h"
Good to have this as the first thing.
> +#include "fuzz-cmd-base.h"
> +
> +
> +/*
> + * This function is used to randomize the content of a file with the
> + * random data. The random data normally come from the fuzzing engine
> + * LibFuzzer in order to create randomization of the git file worktree
> + * and possibly messing up of certain git config file to fuzz different
> + * git command execution logic.
> + */
> +void randomize_git_file(char *dir, char *name, char *data_chunk, int data_size) {
Unlike control structures with multiple statements in a block,
the braces {} around a function body sit on their own lines.
I.e.
    void randomize_git_file(char *dir, char *name, char *data_chunk, int data_size)
    {
> + char fname[256];
In our codebase, tab-width is 8 and we indent with tabs.
Use <strbuf.h> and avoid snprintf(), e.g.
    struct strbuf fname = STRBUF_INIT;
    strbuf_addf(&fname, "%s/%s", dir, name);
    ... use fname.buf ...
    strbuf_release(&fname);
> + FILE *fp;
> +
Good that you leave a blank between the end of decl and the
beginning of the statements.
> + snprintf(fname, 255, "%s/%s", dir, name);
> +
> + fp = fopen(fname, "wb");
> + if (fp) {
> + fwrite(data_chunk, 1, data_size, fp);
> + fclose(fp);
> + }
> +}
Why doesn't this care about errors at all? Not even fopen errors?
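To illustrate the kind of error checking meant here, a minimal
sketch in plain C follows. It is not from the patch and not an
existing helper in our tree: write_chunk() is a hypothetical name,
and real in-tree code would more likely go through the project's own
wrappers. It only shows fopen(), fwrite() and fclose() failures being
reported to the caller instead of being dropped silently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the patch's randomize_git_file():
 * write "size" bytes from "data" to "path".  Returns 0 on
 * success and -1 if opening, writing or closing fails.
 */
static int write_chunk(const char *path, const void *data, size_t size)
{
	FILE *fp;
	int ret = 0;

	assert(path && data);

	fp = fopen(path, "wb");
	if (!fp)
		return -1;
	if (fwrite(data, 1, size, fp) != size)
		ret = -1;
	/* fclose() flushes buffered data, so its result matters too */
	if (fclose(fp))
		ret = -1;
	return ret;
}
```

A caller can then decide whether a failed write should abort the
fuzzing run or merely skip that input.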
> +/*
> + * This function is the variants of the above functions which takes
> + * in a set of target files to be processed. These target file are
"... is a variant of the above function, which takes a set of ..."
> + * passing to the above function one by one for content rewrite.
> + */
> +void randomize_git_files(char *dir, char *name_set[], int files_count, char *data, int size) {
> + int data_size = size / files_count;
> +
> + for(int i=0; i<files_count; i++) {
We do not yet officially allow variable decl for for() statement
like this. We'll start allowing it later this year but we are
waiting for oddball platform/compiler folks to scream right now.
IOW, we write the above more like so:
    int data_size = size / files_count;
    int i;
    for (i = 0; i < files_count; i++) {
Also take note of how we use whitespace around non-unary operators.
> + char *data_chunk = malloc(data_size);
> + memcpy(data_chunk, data + (i * data_size), data_size);
> + randomize_git_file(dir, name_set[i], data_chunk, data_size);
> +
> + free(data_chunk);
> + }
As data_size does not change in this loop and the contents of
data_chunk from each round are discardable, allocating once outside
may make more sense. Actually, as the called function makes only
read-only accesses of data_chunk, I do not quite see why you need to
make a copy in the first place.
We do not use malloc() etc. directly out of the system; study wrapper.c
and find xmalloc() and friends.
What if size is not a multiple of files_count, by the way?
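One possible way to deal with both points (no copy, and no bytes
lost to the integer division) is sketched below in plain C.
nth_chunk() is a hypothetical helper, not something in the patch or
in our tree. It hands the leftover size % files_count bytes to the
first few chunks, one byte each, so that every input byte ends up in
exactly one file.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: compute where the i-th of "count" slices
 * of a "size"-byte buffer begins and how long it is.  The first
 * (size % count) slices are one byte longer than the rest, so
 * the slices cover the whole buffer without overlapping.
 */
static void nth_chunk(size_t size, size_t count, size_t i,
		      size_t *offset, size_t *len)
{
	size_t base, extra;

	assert(count > 0 && i < count);
	base = size / count;
	extra = size % count;

	*len = base + (i < extra ? 1 : 0);
	*offset = i * base + (i < extra ? i : extra);
}
```

The caller can pass data + offset and len straight to the function
that writes the file; since that function only reads the buffer,
there is nothing to allocate or free inside the loop.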
I'll stop here as we already have plenty above (read: it is not "I
didn't spot any problems in the patch after this point").
Thanks.