git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] sha1_loose_object_info: handle errors from unpack_sha1_rest
@ 2017-10-05  5:59 Jeff King
  2017-10-06  4:19 ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2017-10-05  5:59 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, Jonathan Nieder

When a caller of sha1_object_info_extended() sets the
"contentp" field in object_info, we call unpack_sha1_rest()
but do not check whether it signaled an error.

This causes two problems:

  1. We pass back NULL to the caller via the contentp field,
     but the function returns "0" for success. A caller
     might reasonably expect after a successful return that
     it can access contentp without a NULL check and
     segfault.

     As it happens, this is impossible to trigger in the
     current code. There is exactly one caller which uses
     contentp, read_object(). And the only thing it does
     after a successful call is to return the content
     pointer to its caller, using NULL as a sentinel for
     errors. So in effect it converts the success code from
     sha1_object_info_extended() back into an error!

     But this is still worth addressing avoid problems for
     future users of "contentp".

  2. Callers of unpack_sha1_rest() are expected to close the
     zlib stream themselves on error. Which means that we're
     leaking the stream.

The problem in (1) comes from from c84a1f3ed4 (sha1_file:
refactor read_object, 2017-06-21), which added the contentp
field.  Before that, we called unpack_sha1_rest() via
unpack_sha1_file(), which directly used the NULL to signal
an error.

But note that the leak in (2) is actually older than that.
The original unpack_sha1_file() directly returned the result
of unpack_sha1_rest() to its caller, when it should have
been closing the zlib stream itself on error.

Signed-off-by: Jeff King <peff@peff.net>
---
 sha1_file.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 09ad64ce55..10c3a0083d 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1124,10 +1124,14 @@ static int sha1_loose_object_info(const unsigned char *sha1,
 	} else if ((status = parse_sha1_header_extended(hdr, oi, flags)) < 0)
 		status = error("unable to parse %s header", sha1_to_hex(sha1));
 
-	if (status >= 0 && oi->contentp)
+	if (status >= 0 && oi->contentp) {
 		*oi->contentp = unpack_sha1_rest(&stream, hdr,
 						 *oi->sizep, sha1);
-	else
+		if (!*oi->contentp) {
+			git_inflate_end(&stream);
+			status = -1;
+		}
+	} else
 		git_inflate_end(&stream);
 
 	munmap(map, mapsize);
-- 
2.14.2.1117.g65a3442612

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] sha1_loose_object_info: handle errors from unpack_sha1_rest
  2017-10-05  5:59 [PATCH] sha1_loose_object_info: handle errors from unpack_sha1_rest Jeff King
@ 2017-10-06  4:19 ` Junio C Hamano
  2017-10-06  4:30   ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2017-10-06  4:19 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Jonathan Tan, Jonathan Nieder

Jeff King <peff@peff.net> writes:

> When a caller of sha1_object_info_extended() sets the
> "contentp" field in object_info, we call unpack_sha1_rest()
> but do not check whether it signaled an error.
>
> This causes two problems:
>
>   1. We pass back NULL to the caller via the contentp field,
>      but the function returns "0" for success. A caller
>      might reasonably expect after a successful return that
>      it can access contentp without a NULL check and
>      segfault.
>
>      As it happens, this is impossible to trigger in the
>      current code. There is exactly one caller which uses
>      contentp, read_object(). And the only thing it does
>      after a successful call is to return the content
>      pointer to its caller, using NULL as a sentinel for
>      errors. So in effect it converts the success code from
>      sha1_object_info_extended() back into an error!
>
>      But this is still worth addressing avoid problems for
>      future users of "contentp".
>
>   2. Callers of unpack_sha1_rest() are expected to close the
>      zlib stream themselves on error. Which means that we're
>      leaking the stream.
>
> The problem in (1) comes from from c84a1f3ed4 (sha1_file:
> refactor read_object, 2017-06-21), which added the contentp
> field.  Before that, we called unpack_sha1_rest() via
> unpack_sha1_file(), which directly used the NULL to signal
> an error.
>
> But note that the leak in (2) is actually older than that.
> The original unpack_sha1_file() directly returned the result
> of unpack_sha1_rest() to its caller, when it should have
> been closing the zlib stream itself on error.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---

Obviously correct.  (2) is as old as Git itself; it eventually
blames down to e83c5163 ("Initial revision of "git", the information
manager from hell", 2005-04-07), where read-cache.c::unpack_sha1_file()
liberally returns NULL without cleaning up the zstream.

Thanks.

>  sha1_file.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index 09ad64ce55..10c3a0083d 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1124,10 +1124,14 @@ static int sha1_loose_object_info(const unsigned char *sha1,
>  	} else if ((status = parse_sha1_header_extended(hdr, oi, flags)) < 0)
>  		status = error("unable to parse %s header", sha1_to_hex(sha1));
>  
> -	if (status >= 0 && oi->contentp)
> +	if (status >= 0 && oi->contentp) {
>  		*oi->contentp = unpack_sha1_rest(&stream, hdr,
>  						 *oi->sizep, sha1);
> -	else
> +		if (!*oi->contentp) {
> +			git_inflate_end(&stream);
> +			status = -1;
> +		}
> +	} else
>  		git_inflate_end(&stream);
>  
>  	munmap(map, mapsize);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sha1_loose_object_info: handle errors from unpack_sha1_rest
  2017-10-06  4:19 ` Junio C Hamano
@ 2017-10-06  4:30   ` Jeff King
  2017-10-06  4:38     ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2017-10-06  4:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan, Jonathan Nieder

On Fri, Oct 06, 2017 at 01:19:21PM +0900, Junio C Hamano wrote:

> > But note that the leak in (2) is actually older than that.
> > The original unpack_sha1_file() directly returned the result
> > of unpack_sha1_rest() to its caller, when it should have
> > been closing the zlib stream itself on error.
> >
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> 
> Obviously correct.  (2) is as old as Git itself; it eventually
> blames down to e83c5163 ("Initial revision of "git", the information
> manager from hell", 2005-04-07), where read-cache.c::unpack_sha1_file()
> liberally returns NULL without cleaning up the zstream.

Thanks, I as too lazy to dig down further, but I'm always interested to
see the roots of these things (especially "bug in the original" versus
"introduced by a careless refactor").

I have a feeling that the world would be a better place if
unpack_sha1_rest() just always promised to close the zstream, since no
callers seem to want to look at it in the error case. But I wanted to go
for the minimal fix first.

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sha1_loose_object_info: handle errors from unpack_sha1_rest
  2017-10-06  4:30   ` Jeff King
@ 2017-10-06  4:38     ` Jeff King
  2017-10-10  1:34       ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2017-10-06  4:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jonathan Tan, Jonathan Nieder

On Fri, Oct 06, 2017 at 12:30:08AM -0400, Jeff King wrote:

> On Fri, Oct 06, 2017 at 01:19:21PM +0900, Junio C Hamano wrote:
> 
> > > But note that the leak in (2) is actually older than that.
> > > The original unpack_sha1_file() directly returned the result
> > > of unpack_sha1_rest() to its caller, when it should have
> > > been closing the zlib stream itself on error.
> > >
> > > Signed-off-by: Jeff King <peff@peff.net>
> > > ---
> > 
> > Obviously correct.  (2) is as old as Git itself; it eventually
> > blames down to e83c5163 ("Initial revision of "git", the information
> > manager from hell", 2005-04-07), where read-cache.c::unpack_sha1_file()
> > liberally returns NULL without cleaning up the zstream.
> 
> Thanks, I as too lazy to dig down further, but I'm always interested to
> see the roots of these things (especially "bug in the original" versus
> "introduced by a careless refactor").
> 
> I have a feeling that the world would be a better place if
> unpack_sha1_rest() just always promised to close the zstream, since no
> callers seem to want to look at it in the error case. But I wanted to go
> for the minimal fix first.

Actually, there are only two callers left these days. One of them leaks,
and the other immediately closes the zstream. So something like:

diff --git a/sha1_file.c b/sha1_file.c
index 09ad64ce55..cea003d182 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -978,10 +978,10 @@ static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long s
 		while (status == Z_OK)
 			status = git_inflate(stream, Z_FINISH);
 	}
-	if (status == Z_STREAM_END && !stream->avail_in) {
-		git_inflate_end(stream);
+	git_inflate_end(stream);
+
+	if (status == Z_STREAM_END && !stream->avail_in)
 		return buf;
-	}
 
 	if (status < 0)
 		error("corrupt loose object '%s'", sha1_to_hex(sha1));
@@ -2107,7 +2107,6 @@ int read_loose_object(const char *path,
 		*contents = unpack_sha1_rest(&stream, hdr, *size, expected_sha1);
 		if (!*contents) {
 			error("unable to unpack contents of %s", path);
-			git_inflate_end(&stream);
 			goto out;
 		}
 		if (check_sha1_signature(expected_sha1, *contents,

seems reasonable. Doing it that (with my other patch on top) splits the
leak-fix and the not-yet-a-bug-but-confusing-error-return problems into
two separate patches.

I dunno. There aren't that many callers of unpack_sha1_rest(), so it may
not matter that much, but while we're here...

-Peff

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] sha1_loose_object_info: handle errors from unpack_sha1_rest
  2017-10-06  4:38     ` Jeff King
@ 2017-10-10  1:34       ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-10-10  1:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Jonathan Tan, Jonathan Nieder

Jeff King <peff@peff.net> writes:

> Actually, there are only two callers left these days. One of them leaks,
> and the other immediately closes the zstream. So something like:
>
> diff --git a/sha1_file.c b/sha1_file.c
> index 09ad64ce55..cea003d182 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -978,10 +978,10 @@ static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long s
>  		while (status == Z_OK)
>  			status = git_inflate(stream, Z_FINISH);
>  	}
> -	if (status == Z_STREAM_END && !stream->avail_in) {
> -		git_inflate_end(stream);
> +	git_inflate_end(stream);
> +
> +	if (status == Z_STREAM_END && !stream->avail_in)
>  		return buf;
> -	}
>  
>  	if (status < 0)
>  		error("corrupt loose object '%s'", sha1_to_hex(sha1));
> @@ -2107,7 +2107,6 @@ int read_loose_object(const char *path,
>  		*contents = unpack_sha1_rest(&stream, hdr, *size, expected_sha1);
>  		if (!*contents) {
>  			error("unable to unpack contents of %s", path);
> -			git_inflate_end(&stream);
>  			goto out;
>  		}
>  		if (check_sha1_signature(expected_sha1, *contents,
>
> seems reasonable. Doing it that (with my other patch on top) splits the
> leak-fix and the not-yet-a-bug-but-confusing-error-return problems into
> two separate patches.
>
> I dunno. There aren't that many callers of unpack_sha1_rest(), so it may
> not matter that much, but while we're here...

Yeah, I agree that it is a reasonable thing to do.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-10-10  1:34 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-05  5:59 [PATCH] sha1_loose_object_info: handle errors from unpack_sha1_rest Jeff King
2017-10-06  4:19 ` Junio C Hamano
2017-10-06  4:30   ` Jeff King
2017-10-06  4:38     ` Jeff King
2017-10-10  1:34       ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).