git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: Git List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>
Subject: Re: [PATCH] tree-walk: disallow overflowing modes
Date: Mon, 23 Jan 2023 09:33:57 +0100
Message-ID: <230123.86a629tzgc.gmgdl@evledraar.gmail.com>
In-Reply-To: <d673fde7-7eb2-6306-86b6-1c1a4c988ee8@web.de>


On Sat, Jan 21 2023, René Scharfe wrote:

> When parsing tree entries, reject mode values that don't fit into an
> unsigned int.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  tree-walk.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/tree-walk.c b/tree-walk.c
> index 74f4d710e8..5e7bc38600 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -17,6 +17,8 @@ static const char *get_mode(const char *str, unsigned int *modep)
>  	while ((c = *str++) != ' ') {
>  		if (c < '0' || c > '7')
>  			return NULL;
> +		if ((mode << 3) >> 3 != mode)
> +			return NULL;
>  		mode = (mode << 3) + (c - '0');
>  	}
>  	*modep = mode;
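
For anyone who wants to poke at the quoted check outside of git, here is
a minimal standalone sketch modeled on get_mode(). The function name
parse_mode_checked() is hypothetical and not part of git; the body just
mirrors the loop and the added check from the patch above.

```c
#include <stddef.h>

/*
 * Hypothetical standalone variant of get_mode() (not git's actual code).
 * Parses an octal mode terminated by a space. Returns a pointer just past
 * the space, or NULL if the mode is empty, contains a non-octal character,
 * or would not fit into an unsigned int.
 */
const char *parse_mode_checked(const char *str, unsigned int *modep)
{
	unsigned char c;
	unsigned int mode = 0;

	if (*str == ' ')
		return NULL;

	while ((c = *str++) != ' ') {
		/* A NUL byte also lands here, so we never read past the string. */
		if (c < '0' || c > '7')
			return NULL;
		/* If any of the top 3 bits are set, the next shift would drop them. */
		if ((mode << 3) >> 3 != mode)
			return NULL;
		mode = (mode << 3) + (c - '0');
	}
	*modep = mode;
	return str;
}
```

With this, "100644 foo" parses to 0100644, while a mode made of 40
consecutive '7' digits is rejected on platforms where unsigned int is 64
bits or narrower.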

There was a discussion about this on git-security last August, in a
report that turned out to be uninteresting for the security aspects.

I'll just quote my own reply here out of context
(<220811.86mtcbqd5x.gmgdl@evledraar.gmail.com> for those with access).

For context, one of the "two issues" below is the one I'm working
towards fixing in
https://lore.kernel.org/git/cover-0.4-00000000000-20221118T113442Z-avarab@gmail.com/;
the other is this file mode overflow.

As noted at the end below, I'm conflicted about whether we should "fix"
this in some way, or just intentionally leave it alone, because while
it's entirely accidental, this is the one part of git's object format
I'm aware of where we could sneak in some extension in the future
without entirely breaking backwards compatibility.

BEGIN QUOTE

There are really two issues being raised here: how we validate the "mode"
in tree entries, and how we sometimes misreport object type X as type Y.

I replied on-list just now noting that the "mode" thing is something I
was working towards fixing a while ago, but am happy to see a more
isolated fix for:
https://lore.kernel.org/git/220811.86r11nqfi2.gmgdl@evledraar.gmail.com/

I have local patches for the misreporting of object types; it's not
*that* hard to get right. Basically we just need to be more careful
about how we populate the "struct object". The most common case is when
we parse tags that say on their envelope that we refer to a type X, but
it's really a type Y.

In that case we haven't .parsed=1 the object, but we note the wrong type
down. My local patches are basically just deferring that, so we don't
trust such type claims, and take our *actual* parsing of the object over
a previous reference in the object store.

I don't think that leaves any difficult edge cases related to tree
parsing, as Xavier notes we'd have modes claiming that an X is Y, but we
can resolve it using the same principle.

But regarding the "mode parsing", I think
https://lore.kernel.org/git/YvQdR3sDqDMCIjIE@coredump.intra.peff.net/ is
taking us in the wrong direction, but I wasn't comfortable saying so
on-list, because this is "exploitable" in the following way:

	$ perl -wE 'say unpack "B*", "hello world"'
	0110100001100101011011000110110001101111001000000111011101101111011100100110110001100100

Combined with Jeff's series on-list, you can try e.g.:
	
	diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
	index ac4099ca893..d435dfd64c5 100755
	--- a/t/t5504-fetch-receive-strict.sh
	+++ b/t/t5504-fetch-receive-strict.sh
	@@ -357,16 +357,16 @@ test_expect_success 'badFilemode is not a strict error' '
	 	tree=$(
	 		cd badmode.git &&
	 		blob=$(echo blob | git hash-object -w --stdin | hex2oct) &&
	-		printf "123456 foo\0${blob}" |
	+		printf "106640110100001100101011011000110110001101111001000000111011101101111011100100110110001100100664 foo\0${blob}" |
	 		git hash-object -t tree --stdin -w --literally
	 	) &&
	 
	 	rm -rf dst.git &&
	 	git init --bare dst.git &&
	 	git -C dst.git config transfer.fsckObjects true &&
	-
	-	git -C badmode.git push ../dst.git $tree:refs/tags/tree 2>err &&
	-	grep "$tree: badFilemode" err
	+	git -C badmode.git push ../dst.git $tree:refs/tags/tree &&
	+	git -C badmode.git fsck &&
	+	git -C dst.git fsck
	 '
	 
	 test_done
	diff --git a/tree-walk.c b/tree-walk.c
	index 74f4d710e8f..2fb9f2e6cbe 100644
	--- a/tree-walk.c
	+++ b/tree-walk.c
	@@ -15,6 +15,7 @@ static const char *get_mode(const char *str, unsigned int *modep)
	 		return NULL;
	 
	 	while ((c = *str++) != ' ') {
	+		fprintf(stderr, "%c\n", c);
	 		if (c < '0' || c > '7')
	 			return NULL;
	 		mode = (mode << 3) + (c - '0');

Which gets you all tests passing, i.e. the particulars of the mode check
are such that we'll accept a very long mode (but if you play with it
you'll find it's not infinite, we'll run into other buffers elsewhere
pretty soon).

This is all assuming you're on a system whose "unsigned int" is 64 bit.

This part of it is also something I discovered in the beginning of 2021
(and there's some old off-list thread between myself & Elijah on the
topic I could dig up), but didn't report here at the time.

So, we could just "fix" it. And regarding the "wrong direction" above:
downgrading the "bad mode" check to a mere warning is going a bit too
far given the above. We have some odd modes in the wild, but we don't
have such "overflowing modes".

On the other hand this edge case is also a golden opportunity we're not
likely to ever have again. We can't really change the git object format
at this point without *major* headaches, but in this case we have the
ability to encode arbitrary data into tree entries (e.g. file metadata)
as long as the writer makes sure they overflow back to the valid
filemode :)
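
The "overflow back to the valid filemode" arithmetic can be sketched
roughly as follows. This is only an illustration of unsigned wraparound,
not something any git version writes: parse_mode_unchecked() and
build_overflowing_mode() are hypothetical names, the payload is an
arbitrary run of octal digits, and the padding is computed from whatever
width "unsigned int" has on the build platform.

```c
#include <limits.h>
#include <string.h>

/* get_mode()-style loop with no overflow check: high bits silently fall off. */
const char *parse_mode_unchecked(const char *str, unsigned int *modep)
{
	unsigned char c;
	unsigned int mode = 0;

	if (*str == ' ')
		return NULL;
	while ((c = *str++) != ' ') {
		if (c < '0' || c > '7')
			return NULL;
		mode = (mode << 3) + (c - '0');
	}
	*modep = mode;
	return str;
}

/*
 * Hypothetical writer side: emit "<payload><zero padding>100644 " into buf.
 * The padding plus the 6-digit tail cover every bit of an unsigned int, so
 * the payload digits are shifted out entirely while parsing.
 * Returns 0 on success, -1 if buf is too small.
 */
int build_overflowing_mode(char *buf, size_t len, const char *payload)
{
	const char *tail = "100644";
	size_t tlen = strlen(tail);
	size_t plen = strlen(payload);
	size_t bits = sizeof(unsigned int) * CHAR_BIT;
	size_t digits = (bits + 2) / 3;	/* octal digits needed to cover all bits */
	size_t pad = digits > tlen ? digits - tlen : 0;

	if (plen + pad + tlen + 2 > len)	/* + trailing space and NUL */
		return -1;
	memcpy(buf, payload, plen);
	memset(buf + plen, '0', pad);
	memcpy(buf + plen + pad, tail, tlen);
	buf[plen + pad + tlen] = ' ';
	buf[plen + pad + tlen + 1] = '\0';
	return 0;
}
```

Feeding the output of build_overflowing_mode() for a payload such as
"1101000110010101" to parse_mode_unchecked() yields 0100644, i.e. the
payload leaves no trace in the parsed mode. A parser with an overflow
check would reject the same string.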

As hacky as that is, I think it would be regrettable to forever close the
door on that to fix a hypothetical security bug; hypothetical because no
version of git.git has an issue with it. But it is a *potential* edge
case that could overflow other tree parsing code out there in the wild
(which might otherwise be guarded by a stricter fsck check of ours).

END QUOTE
