git@vger.kernel.org mailing list mirror (one of many)
From: Jeremy Linton <lintonrjeremy@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Duy Nguyen <pclouds@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH] packfile: Correct zlib buffer handling
Date: Tue, 12 Jun 2018 20:04:38 -0500
Message-ID: <CAEFTgiw03eDXAobWrP4J_CM1uGHoAEUkguV1_agAiNkssCpwyg@mail.gmail.com>
In-Reply-To: <xmqqk1rolcxg.fsf@gitster-ct.c.googlers.com>

Hi,

Sorry about the delay here (bit of a mix-up and didn't reply to the list).

(see inline)

On Sun, May 27, 2018 at 9:41 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Duy Nguyen <pclouds@gmail.com> writes:
>>
>>> On Sun, May 27, 2018 at 1:57 AM, Junio C Hamano <gitster@pobox.com> wrote:

(trimming)

>
> Specifically, I was worried about this assertion:
>
>     Lets rely on the fact that the source buffer will only be fully
>     consumed when the when the destination buffer is inflated to the
>     correct size.
>
> which I think is the exact bad thinking that caused troubles for us
> in the past; isn't the explanation in 456cdf6e ("Fix loose object
> uncompression check.", 2007-03-19) relevant here?
>
> -       stream.avail_out = size + 1;
> +       stream.avail_out = size;
>         ...
>                 stream.next_in = in;
>                 st = git_inflate(&stream, Z_FINISH);
>                 if (!stream.avail_out)
> -                       break; /* the payload is larger than it should be */
> +                       break; /* done, st indicates if source fully consumed */
>                 curpos += stream.next_in - in;
>         } while (st == Z_OK || st == Z_BUF_ERROR);
>         git_inflate_end(&stream);
>         if ((st != Z_STREAM_END) || stream.total_out != size) {
>                 free(buffer);
>                 return NULL;
>         }
>
> With minimum stream.avail_out without slack, when !avail_out, i.e.
> when we fully filled the output buffer, it could be that we had
> correct input that deflates to the correct size, in which case we
> are happy---st would say Z_STREAM_END, we would leave the loop
> because it is neither OK nor BUF_ERROR, and total_out would report
> the size we expected.  Or the input zlib stream may have ended with
> bytes that express "this concludes the stream", and the input bytes
> before that was sufficient to construct the original payload fully,
> and we may have just fed the bytes before that "this concludes the
> stream" to git_inflate().
>
> In such a case, we haven't consumed all the avail_in.  We may
> already have all the correct output, i.e. !avail_out, but because we
> haven't consumed the "this concludes the stream", st is not
> STREAM_END in such a case.

If I understand correctly, you're concerned that avail_in is longer
than what is required to fill the output buffer.

I'm fairly sure that won't result in a Z_STREAM_END, as you rightly
point out, but the loop _will_ terminate because the output buffer is
full, and then, since the status is not Z_STREAM_END,
unpack_compressed_entry fails, as it should.
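
To make that control flow concrete, here is a minimal sketch of the
patched loop condition and the final check from the hunk quoted above.
It models only the branching, not zlib itself: the enum values are
hypothetical stand-ins for Z_OK, Z_STREAM_END and Z_BUF_ERROR, and
keep_inflating() and entry_ok() are names made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for zlib's Z_OK, Z_STREAM_END and Z_BUF_ERROR. */
enum inflate_status { ST_OK, ST_STREAM_END, ST_BUF_ERROR };

/*
 * Patched loop condition: stop as soon as the output buffer is full,
 * otherwise keep going while the status is OK or BUF_ERROR.
 */
static bool keep_inflating(enum inflate_status st, size_t avail_out)
{
	if (!avail_out)
		return false;
	return st == ST_OK || st == ST_BUF_ERROR;
}

/* Final check: the stream must have ended at exactly the expected size. */
static bool entry_ok(enum inflate_status st, size_t total_out, size_t size)
{
	return st == ST_STREAM_END && total_out == size;
}
```

With a full output buffer and a status other than stream end, the loop
stops and entry_ok() rejects the entry. Whether zlib can actually
report that combination for a valid stream is the open question in
this thread, and the sketch does not settle it.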

>
> Our existing while() loop, with one-byte slack in avail_out, would
> have let us continue and the next iteration of the loop would have
> consumed the input without producing any more output (i.e. avail_out
> would have been left to 1 in both of these final two rounds) and we
> would have exited the loop.  After calling inflate_end(), we would
> have noticed STREAM_END and correct size and we would have been
> happy.

You're assuming that zlib will terminate with an error, but with a
fully decompressed buffer, because it hasn't consumed the entire input
buffer. I don't think that is how it works (it's not how the
documentation is written, nor how the bits of code I've looked at seem
to work, though granted I'm not a zlib maintainer).


>
> The updated code would handle this latter case rather badly, no?  We
> leave the loop early, notice st is not STREAM_END, and be very
> unhappy, because this patch did not give us to consume the very end
> of the input stream and left the loop early.

You're correct: if the above case is valid zlib behavior, then there
would be a problem. But I don't think termination is dictated by
insufficient output space until there is an attempt to use that
space.


>
>>> This yields two problems, first a single byte overrun won't be detected
>>> properly because the Z_STREAM_END will then be set, but the null
>>> terminator will have been overwritten.
>
> Because we compare total_out and size at the end, we would detect it
> as an error in this function, no?  Then zlib overwriting NUL would
> not be a problem, as we would free the buffer and return NULL, no?
>
>>> The other problem is that
>>> more recent zlib patches have been poisoning the unconsumed portions
>>> of the buffers which also overwrites the null, while correctly
>>> returning length and status.
>
> Isn't that a bug in zlib, though?  Or do they do that deliberately?
>
> I think a workaround with lower impact would be to manually restore
> NUL at the end of the buffer.

I agree, just resetting the NUL is likely safer, and I will repost a
patch soon which, if nothing else, makes git more robust to errant
zlib behavior.
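
Roughly, the workaround would look like the sketch below. This is not
the reposted patch, only an illustration under stated assumptions:
fake_inflate_with_poison() is a hypothetical stand-in for git_inflate()
that imitates a zlib which scribbles on unused output space, and
unpack_with_nul() is a made-up name for the caller.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for git_inflate(): copies `size` payload bytes
 * and then scribbles on the slack byte, imitating a zlib that poisons
 * unused output space. Not a real zlib or git function.
 */
static void fake_inflate_with_poison(unsigned char *out,
				     const unsigned char *payload,
				     size_t size)
{
	memcpy(out, payload, size);
	out[size] = 0xAA; /* clobbers the terminator the caller set up */
}

/* Allocate size + 1 bytes, "inflate" into them, then restore the NUL. */
static unsigned char *unpack_with_nul(const unsigned char *payload,
				      size_t size)
{
	unsigned char *buffer = calloc(1, size + 1);

	if (!buffer)
		return NULL;
	fake_inflate_with_poison(buffer, payload, size);
	buffer[size] = '\0'; /* the workaround: re-terminate after inflating */
	return buffer;
}
```

The point is only that the terminator is written after inflation
rather than before it, so it survives whatever the library does to the
slack byte.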


