git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH] strvec: use size_t to store nr and alloc
Date: Sun, 12 Sep 2021 17:58:32 -0400	[thread overview]
Message-ID: <YT54CNYgtGcqexwq@coredump.intra.peff.net> (raw)
In-Reply-To: <87o88z82pc.fsf@evledraar.gmail.com>

On Sat, Sep 11, 2021 at 06:13:18PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I don't really think any of that needs to go into the commit message,
> > but if that's a hold-up, I can try to summarize it (though I think
> > referring to the commit which _already_ did this and was accidentally
> > reverted would be sufficient).
> 
> Thanks, I have a WIP version of this outstanding starting with this
> patch that I was planning to submit sometime, but I'm happy to have you
> pursue it, especially with the ~100 outstanding patches I have in
> master..seen.
> 
> It does feel somewhere between iffy and a landmine waiting to be stepped
> on to only convert the member itself, and not any of the corresponding
> "int" variables that track it to "size_t".

I don't think it's a landmine. If those other places are using "int" and
have trouble with a too-big strvec after the patch, then they were
generally already buggy before, too. I have poked around before for
cases where half-converting some code from size_t to int can possibly
make things worse, but it's pretty rare.

Meanwhile, strvec using an int has obvious and easy-to-trigger overflow
problems. Try this:

  yes '000aagent' | GIT_PROTOCOL=version=2 ./git-upload-pack .

where we will happily allocate an arbitrarily-large strvec. If you have
the patience and the RAM, this will overflow the alloc field. Luckily we
catch it before under-allocating the array. Because our ALLOC_GROW()
growth pattern is fixed, we'll always end up with a negative 32-bit int,
which gets cast to a very large size_t, and an st_mult() invocation
complains.

Much scarier to me is that string_list seems to be using "unsigned int"
for its counter, so it never goes negative! Try this:

  yes | git pack-objects --stdin-packs foo

which reads into a string_list. It takes even more RAM to overflow
because string_list is so horribly inefficient. But here once again
we're saved by a quirk of ALLOC_GROW() dating back to 4234a76167 (Extend
--pretty=oneline to cover the first paragraph,, 2007-06-11).

alloc_nr() will overflow to a small number when it hits around 2^32 / 3.
Before that patch, we'd then realloc a too-small buffer and have a heap
overflow. After that patch (I think in order to catch growth by more
than 1 element), we notice the case that alloc_nr() didn't give us a big
enough size, and extend it exactly to what was asked for. So no heap
overflow, though we do enter a quadratic growth loop. :-/

So I don't think we have any heap overflows here, which is good. But we
are protected by a fair bit of luck, and none of this would get nearly
as close to danger if we just used a sensible type for our allocation
sizes.

I do think we should separately fix the v2 protocol not to just allocate
arbitrary-sized strvecs on behalf of the user. I took a stab at that
this weekend and have a series I'm pretty happy with, but of course it
conflicts with your ab/serve-cleanup (and in fact, I ended up doing some
of those same cleanups, like not passing "keys" around). I'll see if I
can re-build it on top.

> If you do the change I suggested in
> https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/ you'll
> find that there's at least one first-order reference to this that now
> uses "int" that if converted to "size_t" will result in a wrap-around
> error, we're lucky that one has a test failure.
> 
> I can tell you what that bug is, but maybe it's better if you find it
> yourself :) I.e. I found *that* one, but I'm not sure I found them
> all. I just s/int nr/size_t *nr/ and eyeballed the wall off compiler
> errors & the code context (note: pointer, obviously broken, but makes
> the compiler yell).

Yes, having converted s/int/size_t/ for other structures before, you
have to be very careful of signedness. A safer conversion is ssize_t
(which similarly avoids overflow problems on LP64 systems).

I'm sure there are other overflow bugs lurking around use of strvecs, if
you really push them hard. But I care a lot less about something like
"oops, I accidentally overwrite list[0] instead of list[2^32-1]". Those
are bugs that normal people will never see, because they aren't doing
stupid or malicious things. I care a lot more about our allocating
functions being vulnerable to heap overflows from malicious attackers.

So I was really hoping to do the size_t fix separately. It's hard for it
to screw anything up, and it addresses the most vulnerable and important
use.

-Peff

      parent reply	other threads:[~2021-09-12 21:58 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-11 15:01 [PATCH] strvec: use size_t to store nr and alloc Jeff King
2021-09-11 16:13 ` Ævar Arnfjörð Bjarmason
2021-09-11 22:48   ` Philip Oakley
2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair Ævar Arnfjörð Bjarmason
2021-09-12  0:36         ` Carlo Arenas
2021-09-13  3:56           ` Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 2/7] pack-objects: " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 3/7] sequencer.[ch]: " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 4/7] upload-pack.c: " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 5/7] rebase: don't have loop over "struct strvec" depend on signed "nr" Ævar Arnfjörð Bjarmason
2021-09-12  2:57         ` Eric Sunshine
2021-09-12  0:15       ` [PATCH v2 6/7] strvec: use size_t to store nr and alloc Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 7/7] strvec API users: change some "int" tracking "nr" to "size_t" Ævar Arnfjörð Bjarmason
2021-09-12  3:00         ` Eric Sunshine
2021-09-12 22:19       ` [PATCH v2 0/7] strvec: use size_t to store nr and alloc Jeff King
2021-09-13  5:38         ` Junio C Hamano
2021-09-13 12:29           ` Ævar Arnfjörð Bjarmason
2021-09-13 17:20             ` Jeff King
2021-09-13 10:47       ` Philip Oakley
2021-09-12 22:00     ` [PATCH] " Jeff King
2021-09-13 11:42       ` Philip Oakley
2021-09-12 21:58   ` Jeff King [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YT54CNYgtGcqexwq@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).