git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
@ 2020-09-01 11:18 Jeff King
  2020-09-01 13:04 ` Derrick Stolee
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jeff King @ 2020-09-01 11:18 UTC (permalink / raw)
  To: git

This patch fixes a bug where xrealloc(ptr, 0) can double-free and
corrupt the heap on some platforms (including at least glibc).

The C99 standard says of malloc (section 7.20.3):

  If the size of the space requested is zero, the behavior is
  implementation-defined: either a null pointer is returned, or the
  behavior is as if the size were some nonzero value, except that the
  returned pointer shall not be used to access an object.

So we might get NULL back, or we might get an actual pointer (but we're
not allowed to look at its contents). To simplify our code, our
xmalloc() handles a NULL return by converting it into a single-byte
allocation. That way callers get consistent behavior. This was done way
back in 4e7a2eccc2 (?alloc: do not return NULL when asked for zero
bytes, 2005-12-29).

We also gave xcalloc() and xrealloc() the same treatment. And according
to C99, that is fine; the text above is in a paragraph that applies to
all three. But what happens to the memory we passed to realloc() in such
a case? I.e., if we do:

  ret = realloc(ptr, 0);

and "ptr" is non-NULL, but we get NULL back, is "ptr" still valid? C99
doesn't cover this case specifically, but says (section 7.20.3.4):

  The realloc function deallocates the old object pointed to by ptr and
  returns a pointer to a new object that has the size specified by size.

So "ptr" is now deallocated, and we must only look at "ret". And since
"ret" is NULL, that means we have no allocated object at all. But that's
not quite the whole story. It also says:

  If memory for the new object cannot be allocated, the old object is
  not deallocated and its value is unchanged.
  [...]
  The realloc function returns a pointer to the new object (which may
  have the same value as a pointer to the old object), or a null pointer
  if the new object could not be allocated.

So if we see a NULL return with a non-zero size, we can expect that the
original object _is_ still valid. But with a zero size, it's ambiguous.
The NULL return might mean a failure (in which case the object is
valid), or it might mean that we successfully allocated nothing, and
used NULL to represent that.

The glibc manpage for realloc() explicitly says:

  [...]if size is equal to zero, and ptr is not NULL, then the call is
  equivalent to free(ptr).

Likewise, this StackOverflow answer:

  https://stackoverflow.com/a/2135302

claims that C89 gave similar guidance (but I don't have a copy to verify
it). A comment on this answer:

  https://stackoverflow.com/a/2022410

claims that Microsoft's CRT behaves the same.

But our current "retry with 1 byte" code passes the original pointer
again. So on glibc, we effectively free() the pointer and then try to
realloc() it again, which is undefined behavior.

The simplest fix here is to just pass "ret" (which we know to be NULL)
to the follow-up realloc(). That does mean that a system which _doesn't_
free the original pointer would leak it. But that interpretation of the
standard seems unlikely (if a system didn't deallocate in this case, I'd
expect it to simply return the original pointer). If it turns out to be
an issue, we can handle the "!size" case up front instead, before we
call realloc() at all.

Signed-off-by: Jeff King <peff@peff.net>
---
 wrapper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/wrapper.c b/wrapper.c
index 4ff4a9c3db..b0d375beee 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -120,7 +120,7 @@ void *xrealloc(void *ptr, size_t size)
 	memory_limit_check(size, 0);
 	ret = realloc(ptr, size);
 	if (!ret && !size)
-		ret = realloc(ptr, 1);
+		ret = realloc(ret, 1);
 	if (!ret)
 		die("Out of memory, realloc failed");
 	return ret;
-- 
2.28.0.844.g468840f815

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
  2020-09-01 11:18 [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc() Jeff King
@ 2020-09-01 13:04 ` Derrick Stolee
  2020-09-01 13:51   ` Jeff King
  2020-09-01 15:20 ` Andreas Schwab
  2020-09-01 15:56 ` Junio C Hamano
  2 siblings, 1 reply; 10+ messages in thread
From: Derrick Stolee @ 2020-09-01 13:04 UTC (permalink / raw)
  To: Jeff King, git

On 9/1/2020 7:18 AM, Jeff King wrote:
> This patch fixes a bug where xrealloc(ptr, 0) can double-free and
> corrupt the heap on some platforms (including at least glibc).

!!! Good find !!!

> The simplest fix here is to just pass "ret" (which we know to be NULL)
> to the follow-up realloc(). That does mean that a system which _doesn't_
> free the original pointer would leak it. But that interpretation of the
> standard seems unlikely (if a system didn't deallocate in this case, I'd
> expect it to simply return the original pointer). If it turns out to be
> an issue, we can handle the "!size" case up front instead, before we
> call realloc() at all.

Adding an `if (!size) {free(ptr); return NULL;}` block was what I
expected. Was that chosen just so we can rely more on the system
realloc(), or is there a performance implication that I'm not
seeing?

> @@ -120,7 +120,7 @@ void *xrealloc(void *ptr, size_t size)
>  	memory_limit_check(size, 0);
>  	ret = realloc(ptr, size);
>  	if (!ret && !size)
> -		ret = realloc(ptr, 1);
> +		ret = realloc(ret, 1);

I appreciate all the additional context for such a small change.

LGTM.
-Stolee

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
  2020-09-01 13:04 ` Derrick Stolee
@ 2020-09-01 13:51   ` Jeff King
  2020-09-01 14:24     ` Derrick Stolee
  2020-09-01 15:58     ` Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: Jeff King @ 2020-09-01 13:51 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On Tue, Sep 01, 2020 at 09:04:36AM -0400, Derrick Stolee wrote:

> > The simplest fix here is to just pass "ret" (which we know to be NULL)
> > to the follow-up realloc(). That does mean that a system which _doesn't_
> > free the original pointer would leak it. But that interpretation of the
> > standard seems unlikely (if a system didn't deallocate in this case, I'd
> > expect it to simply return the original pointer). If it turns out to be
> > an issue, we can handle the "!size" case up front instead, before we
> > call realloc() at all.
> 
> Adding an `if (!size) {free(ptr); return NULL;}` block was what I
> expected. Was that chosen just so we can rely more on the system
> realloc(), or is there a performance implication that I'm not
> seeing?

I went back and forth on whether to do that or not. This case should
basically never happen, so I like both the performance and readability
of only triggering it when realloc() returns NULL. But it would get rid
of the hand-waving above, and I doubt the performance is measurable.

If we do handle it up-front, then I think we'd actually want:

  if (!size) {
          free(ptr);
	  return xmalloc(0);
  }

(i.e., to never return NULL for consistency with xmalloc() and
xcalloc()).

> > @@ -120,7 +120,7 @@ void *xrealloc(void *ptr, size_t size)
> >  	memory_limit_check(size, 0);
> >  	ret = realloc(ptr, size);
> >  	if (!ret && !size)
> > -		ret = realloc(ptr, 1);
> > +		ret = realloc(ret, 1);
> 
> I appreciate all the additional context for such a small change.

Somebody's got to complete with you for ratio of commit message to diff
lines. :)

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
  2020-09-01 13:51   ` Jeff King
@ 2020-09-01 14:24     ` Derrick Stolee
  2020-09-01 15:58     ` Junio C Hamano
  1 sibling, 0 replies; 10+ messages in thread
From: Derrick Stolee @ 2020-09-01 14:24 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On 9/1/2020 9:51 AM, Jeff King wrote:
> On Tue, Sep 01, 2020 at 09:04:36AM -0400, Derrick Stolee wrote:
>> Adding an `if (!size) {free(ptr); return NULL;}` block was what I
>> expected. Was that chosen just so we can rely more on the system
>> realloc(), or is there a performance implication that I'm not
>> seeing?
> 
> I went back and forth on whether to do that or not. This case should
> basically never happen, so I like both the performance and readability
> of only triggering it when realloc() returns NULL. But it would get rid
> of the hand-waving above, and I doubt the performance is measurable.
> 
> If we do handle it up-front, then I think we'd actually want:
> 
>   if (!size) {
>           free(ptr);
> 	  return xmalloc(0);
>   }
> 
> (i.e., to never return NULL for consistency with xmalloc() and
> xcalloc()).

Good point. In that sense, your change makes a lot more
sense for staying consistent without strangeness like xmalloc(0).

>>> @@ -120,7 +120,7 @@ void *xrealloc(void *ptr, size_t size)
>>>  	memory_limit_check(size, 0);
>>>  	ret = realloc(ptr, size);
>>>  	if (!ret && !size)
>>> -		ret = realloc(ptr, 1);
>>> +		ret = realloc(ret, 1);
>>
>> I appreciate all the additional context for such a small change.
> 
> Somebody's got to complete with you for ratio of commit message to diff
> lines. :)

Pretty sure I have a long way to match, but it is important to
have goals.

-Stolee

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
  2020-09-01 11:18 [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc() Jeff King
  2020-09-01 13:04 ` Derrick Stolee
@ 2020-09-01 15:20 ` Andreas Schwab
  2020-09-01 15:56 ` Junio C Hamano
  2 siblings, 0 replies; 10+ messages in thread
From: Andreas Schwab @ 2020-09-01 15:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Sep 01 2020, Jeff King wrote:

> diff --git a/wrapper.c b/wrapper.c
> index 4ff4a9c3db..b0d375beee 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -120,7 +120,7 @@ void *xrealloc(void *ptr, size_t size)
>  	memory_limit_check(size, 0);
>  	ret = realloc(ptr, size);
>  	if (!ret && !size)
> -		ret = realloc(ptr, 1);
> +		ret = realloc(ret, 1);

Since you already know ret == NULL, you could just say malloc(1), As
written, it looks like a typo (why use a variable to pass a constant?).

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
  2020-09-01 11:18 [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc() Jeff King
  2020-09-01 13:04 ` Derrick Stolee
  2020-09-01 15:20 ` Andreas Schwab
@ 2020-09-01 15:56 ` Junio C Hamano
  2 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2020-09-01 15:56 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> The simplest fix here is to just pass "ret" (which we know to be NULL)
> to the follow-up realloc(). That does mean that a system which _doesn't_
> free the original pointer would leak it. But that interpretation of the
> standard seems unlikely (if a system didn't deallocate in this case, I'd
> expect it to simply return the original pointer). If it turns out to be
> an issue, we can handle the "!size" case up front instead, before we
> call realloc() at all.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  wrapper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/wrapper.c b/wrapper.c
> index 4ff4a9c3db..b0d375beee 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -120,7 +120,7 @@ void *xrealloc(void *ptr, size_t size)
>  	memory_limit_check(size, 0);
>  	ret = realloc(ptr, size);
>  	if (!ret && !size)
> -		ret = realloc(ptr, 1);
> +		ret = realloc(ret, 1);

The original does look bogus.

It however may be easier to reason about if we used malloc(1) in the
fallback path for "we got NULL after asking for 0-byte" instead.  I
would have a hard time guessing the reason why we are reallocating
NULL without going back to this commit, reading the log and seeing
the original to see that the reason why we didn't use malloc() but
realloc() is we aimed for a minimum change, if I encounter this code
after I forgot this discussion.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
  2020-09-01 13:51   ` Jeff King
  2020-09-01 14:24     ` Derrick Stolee
@ 2020-09-01 15:58     ` Junio C Hamano
  2020-09-02  7:54       ` Jeff King
  2020-09-03  3:50       ` Jonathan Nieder
  1 sibling, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2020-09-01 15:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Derrick Stolee, git

Jeff King <peff@peff.net> writes:

> If we do handle it up-front, then I think we'd actually want:
>
>   if (!size) {
>           free(ptr);
> 	  return xmalloc(0);
>   }
>
> (i.e., to never return NULL for consistency with xmalloc() and
> xcalloc()).

Makes sense.  I suspect that this is optimizing for a wrong case,
but in practice that should not matter.  Not having to worry about
a request to resize to 0-byte in the remainder of the function is
actually a plus for readability, I would say.

>> > @@ -120,7 +120,7 @@ void *xrealloc(void *ptr, size_t size)
>> >  	memory_limit_check(size, 0);
>> >  	ret = realloc(ptr, size);
>> >  	if (!ret && !size)
>> > -		ret = realloc(ptr, 1);
>> > +		ret = realloc(ret, 1);
>> 
>> I appreciate all the additional context for such a small change.
>
> Somebody's got to complete with you for ratio of commit message to diff
> lines. :)
>
> -Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
  2020-09-01 15:58     ` Junio C Hamano
@ 2020-09-02  7:54       ` Jeff King
  2020-09-02 19:19         ` Junio C Hamano
  2020-09-03  3:50       ` Jonathan Nieder
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff King @ 2020-09-02  7:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git

On Tue, Sep 01, 2020 at 08:58:54AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > If we do handle it up-front, then I think we'd actually want:
> >
> >   if (!size) {
> >           free(ptr);
> > 	  return xmalloc(0);
> >   }
> >
> > (i.e., to never return NULL for consistency with xmalloc() and
> > xcalloc()).
> 
> Makes sense.  I suspect that this is optimizing for a wrong case,
> but in practice that should not matter.  Not having to worry about
> a request to resize to 0-byte in the remainder of the function is
> actually a plus for readability, I would say.

OK. Then here it is like that (and a slightly updated final paragraph in
the commit message).

There are other variants, too:

  - we could use malloc(1) versus xmalloc(0). Maybe more
    readable/obvious? But also potentially allocates an extra byte when
    the platform malloc(0) would not need to.

  - we could return a non-NULL "ptr" without shrinking it at all (nor
    allocating anything new). This is perfectly legal, and the
    underlying realloc() would still know the original size if anybody
    ever asked to grow it back again.

I have to admit I don't overly care between them. I suspect one of the
reasons we never ran into this 15-year-old bug is that it's quite hard
to convince Git to call realloc(0) in the first place. I only saw it
when investigating a bug in another series, and there the problem turned
out to be reading garbage bytes off the end of a buffer (which we
interpreted as a serialized ewah bitmap which happened to have a zero in
its length byte).

-- >8 --
Subject: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()

This patch fixes a bug where xrealloc(ptr, 0) can double-free and
corrupt the heap on some platforms (including at least glibc).

The C99 standard says of malloc (section 7.20.3):

  If the size of the space requested is zero, the behavior is
  implementation-defined: either a null pointer is returned, or the
  behavior is as if the size were some nonzero value, except that the
  returned pointer shall not be used to access an object.

So we might get NULL back, or we might get an actual pointer (but we're
not allowed to look at its contents). To simplify our code, our
xmalloc() handles a NULL return by converting it into a single-byte
allocation. That way callers get consistent behavior. This was done way
back in 4e7a2eccc2 (?alloc: do not return NULL when asked for zero
bytes, 2005-12-29).

We also gave xcalloc() and xrealloc() the same treatment. And according
to C99, that is fine; the text above is in a paragraph that applies to
all three. But what happens to the memory we passed to realloc() in such
a case? I.e., if we do:

  ret = realloc(ptr, 0);

and "ptr" is non-NULL, but we get NULL back, is "ptr" still valid? C99
doesn't cover this case specifically, but says (section 7.20.3.4):

  The realloc function deallocates the old object pointed to by ptr and
  returns a pointer to a new object that has the size specified by size.

So "ptr" is now deallocated, and we must only look at "ret". And since
"ret" is NULL, that means we have no allocated object at all. But that's
not quite the whole story. It also says:

  If memory for the new object cannot be allocated, the old object is
  not deallocated and its value is unchanged.
  [...]
  The realloc function returns a pointer to the new object (which may
  have the same value as a pointer to the old object), or a null pointer
  if the new object could not be allocated.

So if we see a NULL return with a non-zero size, we can expect that the
original object _is_ still valid. But with a non-zero size, it's
ambiguous. The NULL return might mean a failure (in which case the
object is valid), or it might mean that we successfully allocated
nothing, and used NULL to represent that.

The glibc manpage for realloc() explicitly says:

  [...]if size is equal to zero, and ptr is not NULL, then the call is
  equivalent to free(ptr).

Likewise, this StackOverflow answer:

  https://stackoverflow.com/a/2135302

claims that C89 gave similar guidance (but I don't have a copy to verify
it). A comment on this answer:

  https://stackoverflow.com/a/2022410

claims that Microsoft's CRT behaves the same.

But our current "retry with 1 byte" code passes the original pointer
again. So on glibc, we effectively free() the pointer and then try to
realloc() it again, which is undefined behavior.

The simplest fix here is to just pass "ret" (which we know to be NULL)
to the follow-up realloc(). But that means that a system which _doesn't_
free the original pointer would leak it. It's not clear if any such
systems exist, and that interpretation of the standard seems unlikely
(I'd expect a system that doesn't deallocate to simply return the
original pointer in this case). But it's easy enough to err on the safe
side, and just never pass a zero size to realloc() at all.

Signed-off-by: Jeff King <peff@peff.net>
---
 wrapper.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/wrapper.c b/wrapper.c
index 4ff4a9c3db..bcda41e374 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -117,10 +117,13 @@ void *xrealloc(void *ptr, size_t size)
 {
 	void *ret;
 
+	if (!size) {
+		free(ptr);
+		return xmalloc(0);
+	}
+
 	memory_limit_check(size, 0);
 	ret = realloc(ptr, size);
-	if (!ret && !size)
-		ret = realloc(ptr, 1);
 	if (!ret)
 		die("Out of memory, realloc failed");
 	return ret;
-- 
2.28.0.844.g468840f815


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
  2020-09-02  7:54       ` Jeff King
@ 2020-09-02 19:19         ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2020-09-02 19:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Derrick Stolee, git

Jeff King <peff@peff.net> writes:

> There are other variants, too:
>
>   - we could use malloc(1) versus xmalloc(0). Maybe more
>     readable/obvious? But also potentially allocates an extra byte when
>     the platform malloc(0) would not need to.
>
>   - we could return a non-NULL "ptr" without shrinking it at all (nor
>     allocating anything new). This is perfectly legal, and the
>     underlying realloc() would still know the original size if anybody
>     ever asked to grow it back again.
>
> I have to admit I don't overly care between them.

I don't either.  I admit that the latter I didn't think of---it
feels tricky and harder to reason about than any other variants.

> I suspect one of the
> reasons we never ran into this 15-year-old bug is that it's quite hard
> to convince Git to call realloc(0) in the first place. I only saw it
> when investigating a bug in another series, and there the problem turned
> out to be reading garbage bytes off the end of a buffer (which we
> interpreted as a serialized ewah bitmap which happened to have a zero in
> its length byte).

Thanks.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
  2020-09-01 15:58     ` Junio C Hamano
  2020-09-02  7:54       ` Jeff King
@ 2020-09-03  3:50       ` Jonathan Nieder
  1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Nieder @ 2020-09-03  3:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Derrick Stolee, git

Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:

>> If we do handle it up-front, then I think we'd actually want:
>>
>>   if (!size) {
>>           free(ptr);
>>           return xmalloc(0);
>>   }
>>
>> (i.e., to never return NULL for consistency with xmalloc() and
>> xcalloc()).
>
> Makes sense.  I suspect that this is optimizing for a wrong case,
> but in practice that should not matter.  Not having to worry about
> a request to resize to 0-byte in the remainder of the function is
> actually a plus for readability, I would say.

I agree with both points: if we were repeatedly shrinking and growing
a buffer and cared about its performance, then we'd want the first
version, and since we aren't, we should prefer this version that is
more readable.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-09-03  3:50 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-01 11:18 [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc() Jeff King
2020-09-01 13:04 ` Derrick Stolee
2020-09-01 13:51   ` Jeff King
2020-09-01 14:24     ` Derrick Stolee
2020-09-01 15:58     ` Junio C Hamano
2020-09-02  7:54       ` Jeff King
2020-09-02 19:19         ` Junio C Hamano
2020-09-03  3:50       ` Jonathan Nieder
2020-09-01 15:20 ` Andreas Schwab
2020-09-01 15:56 ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).