git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 04/20] commit-graph: check size of oid fanout chunk
Date: Tue, 10 Oct 2023 20:08:05 -0400	[thread overview]
Message-ID: <ZSXnZXglgbfK3VYd@nand.local> (raw)
In-Reply-To: <20231009205951.GD3282181@coredump.intra.peff.net>

On Mon, Oct 09, 2023 at 04:59:51PM -0400, Jeff King wrote:
> We load the oid fanout chunk with pair_chunk(), which means we never see
> the size of the chunk. We just assume the on-disk file uses the
> appropriate size, and if it's too small we'll access random memory.
>
> It's easy to check this up-front; the fanout always consists of 256
> uint32's, since it is a fanout of the first byte of the hash pointing
> into the oid index. These parameters can't be changed without
> introducing a new chunk type.

Cool, this is the first patch that should start reducing our usage of
the new pair_chunk_unsafe() and hardening these reads. Let's take a
look...

> This matches the similar check in the midx OIDF chunk (but note that
> rather than checking for the error immediately, the graph code just
> leaves parts of the struct NULL and checks for required fields later).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  commit-graph.c          | 13 +++++++++++--
>  t/t5318-commit-graph.sh | 26 ++++++++++++++++++++++++++
>  2 files changed, 37 insertions(+), 2 deletions(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index a689a55b79..9b3b01da61 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -305,6 +305,16 @@ static int verify_commit_graph_lite(struct commit_graph *g)
>  	return 0;
>  }
>
> +static int graph_read_oid_fanout(const unsigned char *chunk_start,
> +				 size_t chunk_size, void *data)
> +{
> +	struct commit_graph *g = data;
> +	if (chunk_size != 256 * sizeof(uint32_t))
> +		return error("commit-graph oid fanout chunk is wrong size");

Should we mark this string for translation?

> +	g->chunk_oid_fanout = (const uint32_t *)chunk_start;
> +	return 0;
> +}
> +

Nice. This makes sense and seems like an obvious improvement over the
existing code.

I wonder how common this pattern is. We have read_chunk() which is for
handling more complex scenarios than this. But the safe version of
pair_chunk() really just wants to check that the size of the chunk is as
expected and assign the location in the mmap to some pointer.

Do you think it would be worth changing pair_chunk() to take an expected
size_t and handle this generically? I.e. have a version of
chunk-format::pair_chunk_fn() that looks something like:

    static int pair_chunk_fn(const unsigned char *chunk_start,
                             size_t chunk_size, void *data)
    {
        const unsigned char **p = data;
        if (chunk_size != data->size)
            return -1;
        *p = chunk_start;
        return 0;
    }

and then our call here would be:

  if (pair_chunk(cf, GRAPH_CHUNKID_OIDFANOUT,
                 (const unsigned char **)&graph->chunk_oid_fanout,
                 256 * sizeof(uint32_t)) < 0)
      return error("commit-graph oid fanout chunk is wrong size");

I dunno. It's hard to have a more concrete recomendation without having
read the rest of the series. So it's possible that this is just complete
nonsense ;-). But my hunch is that there are a number of callers that
would benefit from having this built in.

Thanks,
Taylor

  reply	other threads:[~2023-10-11  0:08 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-09 20:55 [PATCH 0/20] bounds-checks for chunk-based files Jeff King
2023-10-09 20:58 ` [PATCH 01/20] chunk-format: note that pair_chunk() is unsafe Jeff King
2023-10-10 23:45   ` Taylor Blau
2023-10-11 22:49     ` Jeff King
2023-10-09 20:58 ` [PATCH 02/20] t: add library for munging chunk-format files Jeff King
2023-10-10 23:47   ` Taylor Blau
2023-10-09 20:59 ` [PATCH 03/20] midx: stop ignoring malformed oid fanout chunk Jeff King
2023-10-10 23:50   ` Taylor Blau
2023-10-11 22:52     ` Jeff King
2023-10-09 20:59 ` [PATCH 04/20] commit-graph: check size of " Jeff King
2023-10-11  0:08   ` Taylor Blau [this message]
2023-10-11  1:24     ` Taylor Blau
2023-10-11 23:01     ` Jeff King
2023-10-09 21:02 ` [PATCH 05/20] midx: check size of oid lookup chunk Jeff King
2023-10-09 21:04 ` [PATCH 06/20] commit-graph: check consistency of fanout table Jeff King
2023-10-11 14:45   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 07/20] midx: check size of pack names chunk Jeff King
2023-10-11 14:52   ` Taylor Blau
2023-10-11 23:06     ` Jeff King
2023-10-09 21:05 ` [PATCH 08/20] midx: enforce chunk alignment on reading Jeff King
2023-10-11 14:56   ` Taylor Blau
2023-10-11 15:01   ` Taylor Blau
2023-10-11 23:09     ` Jeff King
2023-10-09 21:05 ` [PATCH 09/20] midx: check size of object offset chunk Jeff King
2023-10-11 18:31   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 10/20] midx: bounds-check large " Jeff King
2023-10-11 18:38   ` Taylor Blau
2023-10-11 23:18     ` Jeff King
2023-10-09 21:05 ` [PATCH 11/20] midx: check size of revindex chunk Jeff King
2023-10-11 18:41   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 12/20] commit-graph: check size of commit data chunk Jeff King
2023-10-11 18:46   ` Taylor Blau
2023-10-11 23:22     ` Jeff King
2023-10-09 21:05 ` [PATCH 13/20] commit-graph: detect out-of-bounds extra-edges pointers Jeff King
2023-10-11 19:02   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 14/20] commit-graph: bounds-check base graphs chunk Jeff King
2023-10-11 19:05   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 15/20] commit-graph: check size of generations chunk Jeff King
2023-10-09 21:05 ` [PATCH 16/20] commit-graph: bounds-check generation overflow chunk Jeff King
2023-10-09 21:05 ` [PATCH 17/20] commit-graph: check bounds when accessing BDAT chunk Jeff King
2023-10-11 19:11   ` Taylor Blau
2023-10-11 23:27     ` Jeff King
2023-10-09 21:05 ` [PATCH 18/20] commit-graph: check bounds when accessing BIDX chunk Jeff King
2023-10-11 19:15   ` Taylor Blau
2023-10-09 21:05 ` [PATCH 19/20] commit-graph: detect out-of-order BIDX offsets Jeff King
2023-10-11 19:16   ` Taylor Blau
2023-10-09 21:06 ` [PATCH 20/20] chunk-format: drop pair_chunk_unsafe() Jeff King
2023-10-11 19:19 ` [PATCH 0/20] bounds-checks for chunk-based files Taylor Blau
2023-10-11 23:31   ` Jeff King
2023-10-13 19:25 ` [PATCH 0/8] chunk-format: introduce `pair_chunk_expect()` convenience API Taylor Blau
2023-10-13 19:25   ` [PATCH 1/8] chunk-format: introduce `pair_chunk_expect()` helper Taylor Blau
2023-10-13 19:25   ` [PATCH 2/8] commit-graph: read `OIDF` chunk with `pair_chunk_expect()` Taylor Blau
2023-10-13 19:25   ` [PATCH 3/8] commit-graph: read `CDAT` " Taylor Blau
2023-10-13 19:25   ` [PATCH 4/8] commit-graph: read `GDAT` " Taylor Blau
2023-10-13 19:25   ` [PATCH 5/8] commit-graph: read `BIDX` " Taylor Blau
2023-10-13 19:49     ` Taylor Blau
2023-10-14 16:10     ` Junio C Hamano
2023-10-20 10:31       ` Jeff King
2023-10-13 19:25   ` [PATCH 6/8] midx: read `OIDF` " Taylor Blau
2023-10-13 21:04     ` Junio C Hamano
2023-10-13 19:25   ` [PATCH 7/8] midx: read `OIDL` " Taylor Blau
2023-10-13 19:25   ` [PATCH 8/8] midx: read `OOFF` " Taylor Blau
2023-10-20 10:23   ` [PATCH 0/8] chunk-format: introduce `pair_chunk_expect()` convenience API Jeff King
2023-10-14  0:43 ` [PATCH 21/20] t5319: make corrupted large-offset test more robust Jeff King
2023-10-14 19:42   ` Junio C Hamano
2023-10-15  3:17     ` Jeff King
2023-10-15 17:04       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZSXnZXglgbfK3VYd@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).