git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Thomas Bock <bockthom@cs.uni-saarland.de>,
	Derrick Stolee <derrickstolee@github.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 2/3] parse_commit(): parse timestamp from end of line
Date: Tue, 25 Apr 2023 01:23:59 -0400	[thread overview]
Message-ID: <20230425052359.GA4007491@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqqcz3tfbx5.fsf@gitster.g>

On Mon, Apr 24, 2023 at 10:05:42AM -0700, Junio C Hamano wrote:

> > +	/*
> > +	 * parse to end-of-line and then walk backwards, which
> > +	 * handles some malformed cases.
> > +	 */
> 
> I would say "parse to" -> "jump to", but technically moving forward
> looking for a LF byte is still "parsing".  "some" malformed cases
> being "most plausible" ones (due to how ident.c::fmt_ident() is what
> writes '>' after the string end-user gave as e-mail) may be worth
> mentioning.

I'll expand this to:

  /*
   * Jump to end-of-line so that we can walk backwards to find the
   * end-of-email ">". This is more forgiving of malformed cases
   * because unexpected characters tend to be in the name and email
   * fields.
   */

> > -	dateptr = buf;
> > -	while (buf < tail && *buf++ != '\n')
> > +	for (dateptr = eol; dateptr > buf && dateptr[-1] != '>'; dateptr--)
> >  		/* nada */;
> 
> OK.  Just a style thing, but I found that "; /* nada */" is easier
> to spot that there is an empty statement there.

I found it ugly, too, but it's from earlier. Since the point is to
advance "dateptr", it may be better to turn it into a while loop anyway,
which side-steps the empty statement altogether:

  dateptr = eol;
  while (dateptr > buf && dateptr[-1] != '>')
	dateptr--;

> > -	if (buf >= tail)
> > +	if (dateptr == buf || dateptr == eol)
> >  		return 0;
> 
> Curious when dateptr that wanted to scan back from eol is still at
> eol after the loop.  It is when the ident line ends with ">" without
> any timestamp/tz info.   And the reason why we need to check that
> here is ...

Yeah, though as you saw in the next patch, it is not really sufficient. :)

I think it may be redundant, though.

In the original we already bailed earlier if we didn't find a ">"
(because "buf >= tail" after we advanced it looking for ">"). Here
"dateptr == buf" is checking the same thing (because we walked
backwards).

In the original we'd bail if we failed to find a newline as part of this
loop (because "buf >= tail" after looking for a newline). But that can't
happen here; we've already bailed after memchr() failed to find a
newline). So "dateptr == eol" only triggers if there were no characters
in "buf" to look at.

So I added it only to keep this comment trivially true:

> > -	/* dateptr < buf && buf[-1] == '\n', so parsing will stop at buf-1 */
> > +
> > +	/* dateptr < eol && *eol == '\n', so parsing will stop at eol */
> >  	return parse_timestamp(dateptr, NULL, 10);
> 
> ... because parse_timestamp() is merely strtoumax() and would
> happily skip over arbitrary number of leading "whitespace" without
> stopping if (dateptr == eol && *eol == '\n').  OK, sad but correct.

But it could also read "dateptr <= eol" and still be true (which is to
say it is mostly accurate, but not quite because of the "soaking up
whitespace" problem fixed by the next patch.

I'll leave the extra condition in this patch, since it's orthogonal to
what this patch is fixing. But in the next one I'll remove it and expand
the comment to explain a bit more.

-Peff

  reply	other threads:[~2023-04-25  5:24 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-14 11:37 Weird behavior of 'git log --before' or 'git log --date-order': Commits from 2011 are treated to be before 1980 Thomas Bock
2023-04-15  8:52 ` Jeff King
2023-04-15  8:59   ` Jeff King
2023-04-15 14:10   ` Kristoffer Haugsbakk
2023-04-17  5:40     ` Jeff King
2023-04-17  6:20       ` Kristoffer Haugsbakk
2023-04-17  7:41         ` Jeff King
2023-04-27 22:32           ` Kristoffer Haugsbakk
2023-04-17  9:51   ` Junio C Hamano
2023-04-18  4:12     ` Jeff King
2023-04-18 14:02       ` Derrick Stolee
2023-04-21 14:51         ` Thomas Bock
2023-04-22 13:41           ` [PATCH 0/3] fixing some parse_commit() timestamp corner cases Jeff King
2023-04-22 13:42             ` [PATCH 1/3] t4212: avoid putting git on left-hand side of pipe Jeff King
2023-04-22 13:47             ` [PATCH 2/3] parse_commit(): parse timestamp from end of line Jeff King
2023-04-24 17:05               ` Junio C Hamano
2023-04-25  5:23                 ` Jeff King [this message]
2023-04-24 16:39             ` [PATCH 0/3] fixing some parse_commit() timestamp corner cases Junio C Hamano
2023-04-25  5:52             ` [PATCH v2 " Jeff King
2023-04-25  5:54               ` Jeff King
2023-04-25  5:54               ` [PATCH v2 1/4] t4212: avoid putting git on left-hand side of pipe Jeff King
2023-04-25  5:54               ` [PATCH v2 2/4] parse_commit(): parse timestamp from end of line Jeff King
2023-04-25  5:54               ` [PATCH v2 3/4] parse_commit(): handle broken whitespace-only timestamp Jeff King
2023-04-25 10:11                 ` Phillip Wood
2023-04-25 16:06                   ` Junio C Hamano
2023-04-26 11:36                     ` Jeff King
2023-04-26 15:32                       ` Junio C Hamano
2023-04-27  8:13                         ` [PATCH v3 0/4] fixing some parse_commit() timestamp corner cases Jeff King
2023-04-27  8:14                           ` [PATCH v3 1/4] t4212: avoid putting git on left-hand side of pipe Jeff King
2023-04-27  8:14                           ` [PATCH v3 2/4] parse_commit(): parse timestamp from end of line Jeff King
2023-04-27  8:17                           ` [PATCH v3 3/4] parse_commit(): handle broken whitespace-only timestamp Jeff King
2023-04-27 10:11                             ` Phillip Wood
2023-04-27 11:55                               ` Phillip Wood
2023-04-27 16:46                                 ` Jeff King
2023-04-27 16:20                               ` Junio C Hamano
2023-04-27 16:55                                 ` Jeff King
2023-04-27 16:25                             ` Junio C Hamano
2023-04-27 16:57                               ` Jeff King
2023-04-27  8:17                           ` [PATCH v3 4/4] parse_commit(): describe more date-parsing failure modes Jeff King
2023-04-27  8:18                           ` [PATCH v3 0/4] fixing some parse_commit() timestamp corner cases Jeff King
2023-04-27 16:32                           ` Junio C Hamano
2023-04-26 14:06                     ` [PATCH v2 3/4] parse_commit(): handle broken whitespace-only timestamp Phillip Wood
2023-04-26 14:31                       ` Andreas Schwab
2023-04-26 14:44                         ` Phillip Wood
2023-04-25  5:55               ` [PATCH v2 4/4] parse_commit(): describe more date-parsing failure modes Jeff King
2023-04-22 13:52         ` Weird behavior of 'git log --before' or 'git log --date-order': Commits from 2011 are treated to be before 1980 Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230425052359.GA4007491@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=bockthom@cs.uni-saarland.de \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).