git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Phillip Wood" <phillip.wood123@gmail.com>,
	"Thomas Bock" <bockthom@cs.uni-saarland.de>,
	"Derrick Stolee" <derrickstolee@github.com>,
	git@vger.kernel.org, "René Scharfe" <l.s.r@web.de>
Subject: Re: [PATCH v2 3/4] parse_commit(): handle broken whitespace-only timestamp
Date: Wed, 26 Apr 2023 07:36:58 -0400
Message-ID: <20230426113658.GC130148@coredump.intra.peff.net>
In-Reply-To: <xmqqttx43q08.fsf@gitster.g>

On Tue, Apr 25, 2023 at 09:06:47AM -0700, Junio C Hamano wrote:

> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
> > This probably doesn't matter in practice but we define our own
> > isspace() that does not treat '\v' and '\f' as whitespace. However
> > parse_timestamp() (which is just strtoumax()) uses the standard
> > library's isspace() which does treat those characters as whitespace
> > and is locale dependent. This means we can potentially stop at a
> > character that parse_timestamp() treats as whitespace and if there are
> > no digits after it we'll still walk past the end of the line. Using
> > Rene's suggestion of testing the character with isdigit() would fix
> > that. It would also avoid parsing negative timestamps as positive
> > numbers and reject any timestamps that begin with a locale dependent
> > digit.
> 
> A very interesting observation.  I wonder if a curious person can
> craft a malformed timestamp with "hash-object --literally" to do
> more than DoS themselves?

I think the answer is no, because the worst case is that they read to
the trailing NUL that we stick after any object content we read into
memory. So we'd mis-parse:

  committer name <email> \v\n

  123456 in the subject line

to read "123456" as the commit timestamp (so basically the same bug my
patch was trying to fix). But we'd never read out-of-bounds memory.
Still, it does not give me warm fuzzies, and I think it is worth fixing.
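
To make the mis-parse concrete, here is a minimal sketch (not git's actual
code). The helper name naive_timestamp() is made up for illustration; it
just hands strtoumax() whatever follows the '>' of the ident line, and it
assumes the buffer is NUL-terminated and that we are in the default "C"
locale:

```c
#include <inttypes.h>
#include <stddef.h>

/*
 * Hypothetical helper, not git's code: parse a timestamp the naive way,
 * by passing the text after the ident's '>' straight to strtoumax().
 * The input must be NUL-terminated.
 */
uintmax_t naive_timestamp(const char *after_ident)
{
	/*
	 * strtoumax() first skips everything the C library's isspace()
	 * accepts, which includes '\v', '\f' and '\n', and only then
	 * looks for digits.
	 */
	return strtoumax(after_ident, NULL, 10);
}
```

Fed " \v\n\n123456 in the subject line", this returns 123456: the newlines
count as whitespace to strtoumax(), so it walks into the subject line. With
no digits anywhere it stops at the first non-whitespace byte (at worst the
trailing NUL) and returns 0.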

> We are not going to put anything other than [ 0-9+-] after the '>'
> we scan for, and making sure '>' is followed by SP and then [0-9]
> would be sufficient to ensure strtoumax() to stop before the '\n'
> but does not ensure that the "signal a bad timestamp with 0"
> happens.  Perhaps that would be sufficient.  I dunno.

Any single non-whitespace character at all would be sufficient to avoid
the problem. And that's what the current iteration of the patch is
trying to do. It's just that our definition of "whitespace" has to agree
with strtoumax()'s for it to work. And as Phillip notes, that may even
include locale-dependent characters. So I don't think we want to get
into trying to match them all (i.e., an "allow known" strategy).

Instead, we should go back to what the original iteration of the series
was doing, and make sure there is at least one digit (i.e., a "forbid
unknown" strategy), assuming that there is no locale where ASCII "1" is
considered whitespace. ;)

Note that this will exclude a few cases that we do allow now, like:

  committer name <email> \v123456 +0000\n

Right now that parses as "123456", but we'd reject it as "0" after such
a patch.
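
A rough sketch of that digit-first check, to show the behavior change. This
is an illustration, not the actual patch: strict_timestamp() and its two
helpers are hypothetical names, and the narrow whitespace set below is my
assumption about what "our own isspace()" accepts.

```c
#include <inttypes.h>
#include <stddef.h>
#include <string.h>

/* Assumed narrow whitespace set: no '\v' or '\f', no locale dependence. */
static int is_plain_space(char c)
{
	return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

static int is_ascii_digit(char c)
{
	return c >= '0' && c <= '9';
}

/*
 * Hypothetical sketch: parse the timestamp that follows the ident's '>'.
 * Returns 0 unless the first character after plain whitespace is an
 * ASCII digit on the same line.
 */
uintmax_t strict_timestamp(const char *p)
{
	const char *eol = p + strcspn(p, "\n");

	while (p < eol && is_plain_space(*p))
		p++;
	/* Nothing is left for strtoumax() to skip, so it cannot pass eol. */
	if (p == eol || !is_ascii_digit(*p))
		return 0;
	return strtoumax(p, NULL, 10);
}
```

With this, " 123456 +0000\n" still yields 123456, while both " \v123456
+0000\n" and the whitespace-only line from earlier come back as 0. A
leading '-' is rejected as well, since it is not a digit.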

The alternative is to check _all_ of the characters between ">" and the
newline and make sure there is some digit somewhere, which would be
sufficient to prevent strtoumax() from walking past the newline.
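
That alternative might look something like the sketch below. Again, the
name lenient_timestamp() is invented for illustration and this is not code
from the series; it assumes a NUL-terminated buffer and the default "C"
locale.

```c
#include <inttypes.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: only call strtoumax() if some ASCII digit appears
 * before the end of the line. That digit is a non-whitespace character,
 * so strtoumax() must stop at or before it and cannot cross the newline.
 */
uintmax_t lenient_timestamp(const char *p)
{
	const char *eol = p + strcspn(p, "\n");
	const char *q;

	for (q = p; q < eol; q++) {
		if (*q >= '0' && *q <= '9')
			return strtoumax(p, NULL, 10);
	}
	return 0; /* no digit on this line: signal a bad timestamp */
}
```

Here " \v123456 +0000\n" keeps parsing as 123456, because strtoumax() does
its own whitespace skipping, while a line with no digits yields 0 without
ever looking past the newline.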

I guess it's not even any more expensive in the normal case (since the
very first non-whitespace entry should be a digit!). I'm not sure it's
worth caring about too much either way. Garbage making it into
name/email is an easy mistake to make (for users and implementations).
Putting whitespace control codes into your timestamp is not, and marking
them as "0" is an OK outcome.

-Peff
