From: Jeff King <peff@peff.net>
To: Thomas Bock <bockthom@cs.uni-saarland.de>
Cc: Derrick Stolee <derrickstolee@github.com>,
Junio C Hamano <gitster@pobox.com>,
git@vger.kernel.org
Subject: Re: [PATCH v2 0/3] fixing some parse_commit() timestamp corner cases
Date: Tue, 25 Apr 2023 01:54:42 -0400
Message-ID: <20230425055442.GA4015600@coredump.intra.peff.net>
In-Reply-To: <20230425055244.GA4014505@coredump.intra.peff.net>
On Tue, Apr 25, 2023 at 01:52:45AM -0400, Jeff King wrote:
> Here's a v2 of my series. The behavior should be identical, but I've
> incorporated some comment and small code tweaks based on feedback from
> the first round.
>
> I also added a fourth patch which adds a new comment explaining some of
> the cases that were alluded to in the earlier round's patch 3.
>
> [1/4]: t4212: avoid putting git on left-hand side of pipe
> [2/4]: parse_commit(): parse timestamp from end of line
> [3/4]: parse_commit(): handle broken whitespace-only timestamp
> [4/4]: parse_commit(): describe more date-parsing failure modes
>
> commit.c | 47 +++++++++++++++++++++++++++++++++++-------
> t/t4212-log-corrupt.sh | 39 +++++++++++++++++++++++++++++++++--
> 2 files changed, 76 insertions(+), 10 deletions(-)

Whoops, forgot my range-diff (though nothing should be too surprising
based on the round 1 discussion):

1: 07932cf666 = 1: ac38ce133d t4212: avoid putting git on left-hand side of pipe
2: 7ee34c7d5f ! 2: f59e61262d parse_commit(): parse timestamp from end of line
@@ Commit message
parse back to the final ">". In theory we could use split_ident_line()
here, but it's actually a bit more strict. In particular, it requires a
valid time-zone token, too. That should be present, of course, but we
- wouldn't want to break --until for malformed cases that are working
- currently.
+ wouldn't want to break --until for cases that are working currently.
We might want to teach split_ident_line() to become more lenient there,
but it would require checking its many callers (since right now they can
@@ commit.c: static timestamp_t parse_commit_date(const char *buf, const char *tail
- if (buf >= tail)
+
+ /*
-+ * parse to end-of-line and then walk backwards, which
-+ * handles some malformed cases.
++ * Jump to end-of-line so that we can walk backwards to find the
++ * end-of-email ">". This is more forgiving of malformed cases
++ * because unexpected characters tend to be in the name and email
++ * fields.
+ */
+ eol = memchr(buf, '\n', tail - buf);
+ if (!eol)
return 0;
- dateptr = buf;
- while (buf < tail && *buf++ != '\n')
-+ for (dateptr = eol; dateptr > buf && dateptr[-1] != '>'; dateptr--)
- /* nada */;
+- /* nada */;
- if (buf >= tail)
++ dateptr = eol;
++ while (dateptr > buf && dateptr[-1] != '>')
++ dateptr--;
+ if (dateptr == buf || dateptr == eol)
return 0;
- /* dateptr < buf && buf[-1] == '\n', so parsing will stop at buf-1 */
3: e8e94083f5 ! 3: c62fc59bf1 parse_commit(): handle broken whitespace-only timestamp
@@ Commit message
It's not subject to the same bug, because it insists that there be one
or more digits in the timestamp.
- We can use the same logic here. If there's a non-whitespace but
- non-digit value (say "committer name <email> foo"), then
- parse_timestamp() would already have returned 0 anyway. So the only
- change should be for this "whitespace only" case.
-
Signed-off-by: Jeff King <peff@peff.net>
## commit.c ##
@@ commit.c: static timestamp_t parse_commit_date(const char *buf, const char *tail)
- if (dateptr == buf || dateptr == eol)
+ dateptr = eol;
+ while (dateptr > buf && dateptr[-1] != '>')
+ dateptr--;
+- if (dateptr == buf || dateptr == eol)
++ if (dateptr == buf)
return 0;
+- /* dateptr < eol && *eol == '\n', so parsing will stop at eol */
+ /*
-+ * trim leading whitespace; parse_timestamp() will do this itself, but
-+ * it will walk past the newline at eol while doing so. So we insist
-+ * that there is at least one digit here.
++ * Trim leading whitespace; parse_timestamp() will do this itself, but
++ * if we have _only_ whitespace, it will walk right past the newline
++ * while doing so.
+ */
+ while (dateptr < eol && isspace(*dateptr))
+ dateptr++;
-+ if (!strchr("0123456789", *dateptr))
++ if (dateptr == eol)
+ return 0;
+
- /* dateptr < eol && *eol == '\n', so parsing will stop at eol */
++ /*
++ * We know there is at least one non-whitespace character, so we'll
++ * begin parsing there and stop at worst case at eol.
++ */
return parse_timestamp(dateptr, NULL, 10);
}
+
## t/t4212-log-corrupt.sh ##
@@ t/t4212-log-corrupt.sh: test_expect_success 'absurdly far-in-future date' '
-: ---------- > 4: 28ed51a2ca parse_commit(): describe more date-parsing failure modes
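
For readers following the range-diff above, the post-image logic of patches 2 and 3 can be approximated outside of git as a small stand-alone function. This is only a sketch under stated assumptions: the name sketch_parse_ident_date() is hypothetical, it takes a single header line rather than a whole commit buffer, and it calls the standard strtoumax() where the patch calls parse_timestamp(). It is not the code in commit.c.

```c
#include <ctype.h>
#include <inttypes.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-alone helper (not git's parse_commit_date()):
 * given one header line in [line, tail), such as
 * "committer Name <email> 1234567890 +0000\n", return the timestamp,
 * or 0 if none can be found.
 */
static uintmax_t sketch_parse_ident_date(const char *line, const char *tail)
{
	const char *eol, *dateptr;

	/* Without a newline inside the buffer there is no safe stopping point. */
	eol = memchr(line, '\n', tail - line);
	if (!eol)
		return 0;

	/* Walk backwards from end-of-line to just past the last ">". */
	dateptr = eol;
	while (dateptr > line && dateptr[-1] != '>')
		dateptr--;
	if (dateptr == line)
		return 0; /* no ">" on this line */

	/* Skip whitespace ourselves so that we never step past eol. */
	while (dateptr < eol && isspace((unsigned char)*dateptr))
		dateptr++;
	if (dateptr == eol)
		return 0; /* nothing but whitespace after ">" */

	/*
	 * At least one non-whitespace byte sits before eol, so the
	 * conversion stops at the newline at the latest; a non-digit
	 * value simply yields 0.
	 */
	return strtoumax(dateptr, NULL, 10);
}
```

Searching backwards from the end of the line means a stray ">" in the name field does not confuse the search, and doing the whitespace skip by hand is what keeps a whitespace-only timestamp from being read out of the following line.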