git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* Mishandling of fractional seconds in ISO 8601 format
@ 2020-04-14  0:03 brian m. carlson
  2020-04-14  9:31 ` [PATCH 0/2] More ISO-8601 support Đoàn Trần Công Danh
  0 siblings, 1 reply; 49+ messages in thread
From: brian m. carlson @ 2020-04-14  0:03 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1116 bytes --]

git-commit(1) claims we support ISO 8601 format.  However, our
approxidate code misparses some dates with fractional seconds.

Reproduction:

  git init foo
  cd foo
  echo abc > abc.txt
  git add .
  git commit --date="2020-04-03T12:43:55.019-04:00”

This should produce Fri Apr 3 12:43:55 2020 -0400, but actually produces
Sun Apr 19 12:43:55 2020 +0000 (at least on my system, which uses UTC).
Note the different date, which is 16 days away from what is expected.

Since we claim to support ISO 8601, we need to either reject fractional
seconds with an error, or accept and ignore them.  If what we really
support is RFC 3339 (which I suspect it is), we need to do the latter,
since that profile explicitly permits them in the syntax, as well as
update the documentation accordingly.

This was originally reported at
https://stackoverflow.com/questions/61193896/how-does-git-parse-date-string/61197722.
I don't plan to send a patch for this right now, but I wanted to make
sure it was reported to the list.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 0/2] More ISO-8601 support
  2020-04-14  0:03 Mishandling of fractional seconds in ISO 8601 format brian m. carlson
@ 2020-04-14  9:31 ` Đoàn Trần Công Danh
  2020-04-14  9:31   ` [PATCH 1/2] date.c: allow fractional second part of ISO-8601 Đoàn Trần Công Danh
                     ` (6 more replies)
  0 siblings, 7 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-14  9:31 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, brian m. carlson

On 2020-04-14 00:03:24+0000, "brian m. carlson" <sandals@crustytoothpaste.net> wrote:
> git-commit(1) claims we support ISO 8601 format.  However, our
> approxidate code misparses some dates with fractional seconds.

We have never fully supported ISO-8601, those representation are
perfectly valid ISO-8601 (for 2020-04-14) but it would require
more code to parse correctly, perhap "tm_wday" and "tm_yday",
and mktime may help.

	2020-W16-2
	--04-14
	2020-105

I don't think not many people interested in those formats.
If someone really need them, we can get back to them later.

> Reproduction:
> 
>   git init foo
>   cd foo
>   echo abc > abc.txt
>   git add .
>   git commit --date="2020-04-03T12:43:55.019-04:00”
> 
> This should produce Fri Apr 3 12:43:55 2020 -0400, but actually produces
> Sun Apr 19 12:43:55 2020 +0000 (at least on my system, which uses UTC).
> Note the different date, which is 16 days away from what is expected.
> 
> Since we claim to support ISO 8601, we need to either reject fractional
> seconds with an error, or accept and ignore them.  If what we really
> support is RFC 3339 (which I suspect it is), we need to do the latter,
> since that profile explicitly permits them in the syntax, as well as
> update the documentation accordingly.
> 
> This was originally reported at
> https://stackoverflow.com/questions/61193896/how-does-git-parse-date-string/61197722.
> I don't plan to send a patch for this right now, but I wanted to make

If we plan on supporting more format from ISO-8601, these 2 patches
can be used.

Đoàn Trần Công Danh (2):
  date.c: allow fractional second part of ISO-8601
  date.c: allow compact version of ISO-8601 datetime

 date.c          | 20 ++++++++++++++++++++
 t/t0006-date.sh |  4 ++++
 2 files changed, 24 insertions(+)

-- 
2.26.0.485.g518673e55f


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 1/2] date.c: allow fractional second part of ISO-8601
  2020-04-14  9:31 ` [PATCH 0/2] More ISO-8601 support Đoàn Trần Công Danh
@ 2020-04-14  9:31   ` Đoàn Trần Công Danh
  2020-04-14 20:16     ` Jeff King
  2020-04-14 20:17     ` Jeff King
  2020-04-14  9:31   ` [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
                     ` (5 subsequent siblings)
  6 siblings, 2 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-14  9:31 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, Brian M . Carlson

git-commit(1) says ISO-8601 is one of our supported date format.
However, we only support RFC-3339 date format.

We can either:
- Update documentation from ISO-8601 to RFC-3339
- Add full support for ISO-8601

This series will try to add full support for ISO-8601.

Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 date.c          | 2 ++
 t/t0006-date.sh | 1 +
 2 files changed, 3 insertions(+)

diff --git a/date.c b/date.c
index b0d9a8421d..2f37397beb 100644
--- a/date.c
+++ b/date.c
@@ -556,6 +556,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 	case ':':
 		if (num3 < 0)
 			num3 = 0;
+		else if (*end == '.' && isdigit(end[1]))
+			strtol(end + 1, &end, 10);
 		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
 			tm->tm_hour = num;
 			tm->tm_min = num2;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index d9fcc829a9..05c914a200 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -81,6 +81,7 @@ check_parse 2008-02 bad
 check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -5:' '2008-02-14 20:30:45 +0000'
-- 
2.26.0.485.g518673e55f


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-14  9:31 ` [PATCH 0/2] More ISO-8601 support Đoàn Trần Công Danh
  2020-04-14  9:31   ` [PATCH 1/2] date.c: allow fractional second part of ISO-8601 Đoàn Trần Công Danh
@ 2020-04-14  9:31   ` Đoàn Trần Công Danh
  2020-04-14 20:24     ` Jeff King
  2020-04-14 23:45   ` [PATCH 0/2] More ISO-8601 support brian m. carlson
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-14  9:31 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 date.c          | 18 ++++++++++++++++++
 t/t0006-date.sh |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/date.c b/date.c
index 2f37397beb..e02d8e191a 100644
--- a/date.c
+++ b/date.c
@@ -666,6 +666,24 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
 		n++;
 	} while (isdigit(date[n]));
 
+	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
+	if (n == 8) {
+		tm->tm_year = num / 10000 - 1900;
+		tm->tm_mon = (num % 10000) / 100 - 1;
+		tm->tm_mday = num % 100;
+		return n;
+	}
+
+	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
+	if (n == 6) {
+		tm->tm_hour = num / 10000;
+		tm->tm_min = (num % 10000) / 100;
+		tm->tm_sec = num % 100;
+		if (*end == '.' && isdigit(end[1]))
+			strtoul(end + 1, &end, 10);
+		return end - date;
+	}
+
 	/* Four-digit year or a timezone? */
 	if (n == 4) {
 		if (num <= 1400 && *offset == -1) {
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 05c914a200..eeb11070a6 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -81,6 +81,9 @@ check_parse 2008-02 bad
 check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '20080214T203045-04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045 -04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
-- 
2.26.0.485.g518673e55f


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/2] date.c: allow fractional second part of ISO-8601
  2020-04-14  9:31   ` [PATCH 1/2] date.c: allow fractional second part of ISO-8601 Đoàn Trần Công Danh
@ 2020-04-14 20:16     ` Jeff King
  2020-04-15  2:15       ` Danh Doan
  2020-04-14 20:17     ` Jeff King
  1 sibling, 1 reply; 49+ messages in thread
From: Jeff King @ 2020-04-14 20:16 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git, Brian M . Carlson

On Tue, Apr 14, 2020 at 04:31:54PM +0700, Đoàn Trần Công Danh wrote:

> git-commit(1) says ISO-8601 is one of our supported date format.
> However, we only support RFC-3339 date format.
> 
> We can either:
> - Update documentation from ISO-8601 to RFC-3339
> - Add full support for ISO-8601
> 
> This series will try to add full support for ISO-8601.

That seems more like a cover letter for the full series. The important
thing in this patch is more like:

  ISO-8601 allows timestamps to have a fractional number of seconds. We
  represent time only in terms of whole seconds, so we never bothered
  parsing fractional seconds. However, it's better for us to parse and
  throw away the fractional part than to refuse to parse the timestamp
  at all.

> diff --git a/date.c b/date.c
> index b0d9a8421d..2f37397beb 100644
> --- a/date.c
> +++ b/date.c
> @@ -556,6 +556,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
>  	case ':':
>  		if (num3 < 0)
>  			num3 = 0;
> +		else if (*end == '.' && isdigit(end[1]))
> +			strtol(end + 1, &end, 10);

I was a bit worried about this change hurting other cases where people
might use dots to separate numbers (but not mean a fraction). But the
two common ones I can think of should be OK:

  - 5.seconds.ago would never match because you look for a digit after
    the dot

  - a date like 2019.10.12 wouldn't match, because we're only looking
    after a ":", which heavily implies a time.

It might be worth covering those cases in tests if we're not already
doing so elsewhere.

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/2] date.c: allow fractional second part of ISO-8601
  2020-04-14  9:31   ` [PATCH 1/2] date.c: allow fractional second part of ISO-8601 Đoàn Trần Công Danh
  2020-04-14 20:16     ` Jeff King
@ 2020-04-14 20:17     ` Jeff King
  2020-04-14 23:49       ` brian m. carlson
  1 sibling, 1 reply; 49+ messages in thread
From: Jeff King @ 2020-04-14 20:17 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git, Brian M . Carlson

On Tue, Apr 14, 2020 at 04:31:54PM +0700, Đoàn Trần Công Danh wrote:

> diff --git a/date.c b/date.c
> index b0d9a8421d..2f37397beb 100644
> --- a/date.c
> +++ b/date.c
> @@ -556,6 +556,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
>  	case ':':
>  		if (num3 < 0)
>  			num3 = 0;
> +		else if (*end == '.' && isdigit(end[1]))
> +			strtol(end + 1, &end, 10);

This just throws away the fractional part. I doubt anybody cares much
either way, but another option would be to round num3 up or down.

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-14  9:31   ` [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
@ 2020-04-14 20:24     ` Jeff King
  2020-04-15  2:12       ` Danh Doan
  2020-04-15 15:03       ` Junio C Hamano
  0 siblings, 2 replies; 49+ messages in thread
From: Jeff King @ 2020-04-14 20:24 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

On Tue, Apr 14, 2020 at 04:31:55PM +0700, Đoàn Trần Công Danh wrote:

> @@ -666,6 +666,24 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
>  		n++;
>  	} while (isdigit(date[n]));
>  
> +	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
> +	if (n == 8) {
> +		tm->tm_year = num / 10000 - 1900;
> +		tm->tm_mon = (num % 10000) / 100 - 1;
> +		tm->tm_mday = num % 100;
> +		return n;
> +	}

I worry a little this may conflict with other approxidate heuristics.
The only one I can think of is an actual unix timestamp, though, and we
already require that to have at least 9 digits (plus anybody wanting to
use one robustly really should be using @12345678).

And it looks like we'd exit early from the function for anything longer
than 4 digits anyway, ignoring the value.

We could probably tighten the heuristics a bit by insisting that the
month and day be sensible. Or even year (see the 1900 to 2100 magic for
the 4-digit year guess).

> +	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
> +	if (n == 6) {
> +		tm->tm_hour = num / 10000;
> +		tm->tm_min = (num % 10000) / 100;
> +		tm->tm_sec = num % 100;
> +		if (*end == '.' && isdigit(end[1]))
> +			strtoul(end + 1, &end, 10);
> +		return end - date;
> +	}

And likewise here that the hour, minute, and second be reasonable.

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/2] More ISO-8601 support
  2020-04-14  9:31 ` [PATCH 0/2] More ISO-8601 support Đoàn Trần Công Danh
  2020-04-14  9:31   ` [PATCH 1/2] date.c: allow fractional second part of ISO-8601 Đoàn Trần Công Danh
  2020-04-14  9:31   ` [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
@ 2020-04-14 23:45   ` brian m. carlson
  2020-04-15  3:31   ` [PATCH v2 " Đoàn Trần Công Danh
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 49+ messages in thread
From: brian m. carlson @ 2020-04-14 23:45 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1105 bytes --]

On 2020-04-14 at 09:31:53, Đoàn Trần Công Danh wrote:
> On 2020-04-14 00:03:24+0000, "brian m. carlson" <sandals@crustytoothpaste.net> wrote:
> > git-commit(1) claims we support ISO 8601 format.  However, our
> > approxidate code misparses some dates with fractional seconds.
> 
> We have never fully supported ISO-8601, those representation are
> perfectly valid ISO-8601 (for 2020-04-14) but it would require
> more code to parse correctly, perhap "tm_wday" and "tm_yday",
> and mktime may help.
> 
> 	2020-W16-2
> 	--04-14
> 	2020-105
> 
> I don't think not many people interested in those formats.
> If someone really need them, we can get back to them later.

Yeah, I agree we don't need to support those.  I would be happy if we
said "RFC 3339", which is a more limited profile, and implemented that.
The problem, as you alluded to (and I did as well in my original
report), is that ISO 8601 provides a huge number of potential formats,
most of which are not interesting for typical software.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/2] date.c: allow fractional second part of ISO-8601
  2020-04-14 20:17     ` Jeff King
@ 2020-04-14 23:49       ` brian m. carlson
  2020-04-15  2:17         ` Danh Doan
  0 siblings, 1 reply; 49+ messages in thread
From: brian m. carlson @ 2020-04-14 23:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Đoàn Trần Công Danh, git

[-- Attachment #1: Type: text/plain, Size: 1037 bytes --]

On 2020-04-14 at 20:17:33, Jeff King wrote:
> On Tue, Apr 14, 2020 at 04:31:54PM +0700, Đoàn Trần Công Danh wrote:
> 
> > diff --git a/date.c b/date.c
> > index b0d9a8421d..2f37397beb 100644
> > --- a/date.c
> > +++ b/date.c
> > @@ -556,6 +556,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
> >  	case ':':
> >  		if (num3 < 0)
> >  			num3 = 0;
> > +		else if (*end == '.' && isdigit(end[1]))
> > +			strtol(end + 1, &end, 10);
> 
> This just throws away the fractional part. I doubt anybody cares much
> either way, but another option would be to round num3 up or down.

I'm happy with truncation.  It's simple and straightforward and will be
widely and intuitively understood by programmers, who are probably the
primary users of our ISO 8601 handling.

I do appreciate that you pointed this out, because it's a good point.

And overall, I agree that this seems like a good fix for the problem.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-14 20:24     ` Jeff King
@ 2020-04-15  2:12       ` Danh Doan
  2020-04-15 15:03       ` Junio C Hamano
  1 sibling, 0 replies; 49+ messages in thread
From: Danh Doan @ 2020-04-15  2:12 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On 2020-04-14 16:24:01-0400, Jeff King <peff@peff.net> wrote:
> On Tue, Apr 14, 2020 at 04:31:55PM +0700, Đoàn Trần Công Danh wrote:
> 
> > @@ -666,6 +666,24 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
> >  		n++;
> >  	} while (isdigit(date[n]));
> >  
> > +	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
> > +	if (n == 8) {
> > +		tm->tm_year = num / 10000 - 1900;
> > +		tm->tm_mon = (num % 10000) / 100 - 1;
> > +		tm->tm_mday = num % 100;
> > +		return n;
> > +	}
> 
> I worry a little this may conflict with other approxidate heuristics.
> The only one I can think of is an actual unix timestamp, though, and we
> already require that to have at least 9 digits (plus anybody wanting to
> use one robustly really should be using @12345678).
> 
> And it looks like we'd exit early from the function for anything longer
> than 4 digits anyway, ignoring the value.
> 
> We could probably tighten the heuristics a bit by insisting that the
> month and day be sensible. Or even year (see the 1900 to 2100 magic for
> the 4-digit year guess).

Yeah, It's make sense to tighten the heuristics.
While 1900 is lower bound for year makes sense to me,
but I don't think we should limit upper bound for tm_year.

> > +	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
> > +	if (n == 6) {
> > +		tm->tm_hour = num / 10000;
> > +		tm->tm_min = (num % 10000) / 100;
> > +		tm->tm_sec = num % 100;
> > +		if (*end == '.' && isdigit(end[1]))
> > +			strtoul(end + 1, &end, 10);
> > +		return end - date;
> > +	}
> 
> And likewise here that the hour, minute, and second be reasonable.
> 
> -Peff

-- 
Danh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/2] date.c: allow fractional second part of ISO-8601
  2020-04-14 20:16     ` Jeff King
@ 2020-04-15  2:15       ` Danh Doan
  0 siblings, 0 replies; 49+ messages in thread
From: Danh Doan @ 2020-04-15  2:15 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Brian M . Carlson

On 2020-04-14 16:16:06-0400, Jeff King <peff@peff.net> wrote:
> On Tue, Apr 14, 2020 at 04:31:54PM +0700, Đoàn Trần Công Danh wrote:
> 
> > git-commit(1) says ISO-8601 is one of our supported date format.
> > However, we only support RFC-3339 date format.
> > 
> > We can either:
> > - Update documentation from ISO-8601 to RFC-3339
> > - Add full support for ISO-8601
> > 
> > This series will try to add full support for ISO-8601.
> 
> That seems more like a cover letter for the full series. The important
> thing in this patch is more like:
> 
>   ISO-8601 allows timestamps to have a fractional number of seconds. We
>   represent time only in terms of whole seconds, so we never bothered
>   parsing fractional seconds. However, it's better for us to parse and
>   throw away the fractional part than to refuse to parse the timestamp
>   at all.

Yeah, this sounds better. I intended write only this patch.
I made the second patch when the comment about 8 digits for YYYYMMDD
come to my eyes, hence this clumsy.

> > diff --git a/date.c b/date.c
> > index b0d9a8421d..2f37397beb 100644
> > --- a/date.c
> > +++ b/date.c
> > @@ -556,6 +556,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
> >  	case ':':
> >  		if (num3 < 0)
> >  			num3 = 0;
> > +		else if (*end == '.' && isdigit(end[1]))
> > +			strtol(end + 1, &end, 10);
> 
> I was a bit worried about this change hurting other cases where people
> might use dots to separate numbers (but not mean a fraction). But the
> two common ones I can think of should be OK:
> 
>   - 5.seconds.ago would never match because you look for a digit after
>     the dot

This is covered.

>   - a date like 2019.10.12 wouldn't match, because we're only looking
>     after a ":", which heavily implies a time.

Yeah, It's worth to add this test.


-- 
Danh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/2] date.c: allow fractional second part of ISO-8601
  2020-04-14 23:49       ` brian m. carlson
@ 2020-04-15  2:17         ` Danh Doan
  0 siblings, 0 replies; 49+ messages in thread
From: Danh Doan @ 2020-04-15  2:17 UTC (permalink / raw)
  To: brian m. carlson, Jeff King, git

On 2020-04-14 23:49:58+0000, "brian m. carlson" <sandals@crustytoothpaste.net> wrote:
> On 2020-04-14 at 20:17:33, Jeff King wrote:
> > On Tue, Apr 14, 2020 at 04:31:54PM +0700, Đoàn Trần Công Danh wrote:
> > 
> > > diff --git a/date.c b/date.c
> > > index b0d9a8421d..2f37397beb 100644
> > > --- a/date.c
> > > +++ b/date.c
> > > @@ -556,6 +556,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
> > >  	case ':':
> > >  		if (num3 < 0)
> > >  			num3 = 0;
> > > +		else if (*end == '.' && isdigit(end[1]))
> > > +			strtol(end + 1, &end, 10);
> > 
> > This just throws away the fractional part. I doubt anybody cares much
> > either way, but another option would be to round num3 up or down.
> 
> I'm happy with truncation.  It's simple and straightforward and will be
> widely and intuitively understood by programmers, who are probably the
> primary users of our ISO 8601 handling.
> 
> I do appreciate that you pointed this out, because it's a good point.
> 
> And overall, I agree that this seems like a good fix for the problem.

I'll reword the subject to "throw away fractional second part of ISO-8601"

1 second is not that big, and and the logic is simpler.

-- 
Danh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2 0/2] More ISO-8601 support
  2020-04-14  9:31 ` [PATCH 0/2] More ISO-8601 support Đoàn Trần Công Danh
                     ` (2 preceding siblings ...)
  2020-04-14 23:45   ` [PATCH 0/2] More ISO-8601 support brian m. carlson
@ 2020-04-15  3:31   ` Đoàn Trần Công Danh
  2020-04-15  3:31     ` [PATCH v2 1/2] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
  2020-04-15  3:31     ` [PATCH v2 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
  2020-04-22 13:15   ` [PATCH v3 0/2] More ISO-8601 support Đoàn Trần Công Danh
                     ` (2 subsequent siblings)
  6 siblings, 2 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-15  3:31 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, brian m. carlson, Jeff King

This series aims to extend support for ISO-8601 datetime format
to allow compact version, and fractional part of ISO-8601.

Đoàn Trần Công Danh (2):
  date.c: skip fractional second part of ISO-8601
  date.c: allow compact version of ISO-8601 datetime

 Documentation/date-formats.txt |  2 +-
 date.c                         | 24 ++++++++++++++++++++++++
 t/t0006-date.sh                |  5 +++++
 3 files changed, 30 insertions(+), 1 deletion(-)

Range-diff against v1:
1:  d7d74b03cc ! 1:  03f3e9968b date.c: allow fractional second part of ISO-8601
    @@ Metadata
     Author: Đoàn Trần Công Danh <congdanhqx@gmail.com>
     
      ## Commit message ##
    -    date.c: allow fractional second part of ISO-8601
    +    date.c: skip fractional second part of ISO-8601
     
         git-commit(1) says ISO-8601 is one of our supported date format.
    -    However, we only support RFC-3339 date format.
     
    -    We can either:
    -    - Update documentation from ISO-8601 to RFC-3339
    -    - Add full support for ISO-8601
    +    ISO-8601 allows timestamps to have a fractional number of seconds.
    +    We represent time only in terms of whole seconds, so we never bothered
    +    parsing fractional seconds. However, it's better for us to parse and
    +    throw away the fractional part than to refuse to parse the timestamp
    +    at all.
     
    -    This series will try to add full support for ISO-8601.
    +    And refusing parsing fractional second part may confuse the parse to
    +    think fractional and timezone as day and month in this example:
    +
    +            2008-02-14 20:30:45.019-04:00
     
         Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
         Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
     
    + ## Documentation/date-formats.txt ##
    +@@ Documentation/date-formats.txt: RFC 2822::
    + ISO 8601::
    + 	Time and date specified by the ISO 8601 standard, for example
    + 	`2005-04-07T22:13:13`. The parser accepts a space instead of the
    +-	`T` character as well.
    ++	`T` character as well. The fractional part will be ignored.
    + +
    + NOTE: In addition, the date part is accepted in the following formats:
    + `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
    +
      ## date.c ##
     @@ date.c: static int match_multi_number(timestamp_t num, char c, const char *date,
      	case ':':
    @@ t/t0006-date.sh: check_parse 2008-02 bad
      check_parse 2008-02-14 bad
      check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
      check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
    ++check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
     +check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
      check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
      check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
2:  48284386c9 < -:  ---------- date.c: allow compact version of ISO-8601 datetime
-:  ---------- > 2:  36517af872 date.c: allow compact version of ISO-8601 datetime
-- 
2.26.0.485.g42af89717d


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-15  3:31   ` [PATCH v2 " Đoàn Trần Công Danh
@ 2020-04-15  3:31     ` Đoàn Trần Công Danh
  2020-04-15 10:17       ` Torsten Bögershausen
  2020-04-15  3:31     ` [PATCH v2 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
  1 sibling, 1 reply; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-15  3:31 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, Brian M . Carlson

git-commit(1) says ISO-8601 is one of our supported date format.

ISO-8601 allows timestamps to have a fractional number of seconds.
We represent time only in terms of whole seconds, so we never bothered
parsing fractional seconds. However, it's better for us to parse and
throw away the fractional part than to refuse to parse the timestamp
at all.

And refusing parsing fractional second part may confuse the parse to
think fractional and timezone as day and month in this example:

	2008-02-14 20:30:45.019-04:00

Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 Documentation/date-formats.txt | 2 +-
 date.c                         | 2 ++
 t/t0006-date.sh                | 2 ++
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/date-formats.txt b/Documentation/date-formats.txt
index 6926e0a4c8..6f69ba2ddd 100644
--- a/Documentation/date-formats.txt
+++ b/Documentation/date-formats.txt
@@ -20,7 +20,7 @@ RFC 2822::
 ISO 8601::
 	Time and date specified by the ISO 8601 standard, for example
 	`2005-04-07T22:13:13`. The parser accepts a space instead of the
-	`T` character as well.
+	`T` character as well. The fractional part will be ignored.
 +
 NOTE: In addition, the date part is accepted in the following formats:
 `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
diff --git a/date.c b/date.c
index b0d9a8421d..2f37397beb 100644
--- a/date.c
+++ b/date.c
@@ -556,6 +556,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 	case ':':
 		if (num3 < 0)
 			num3 = 0;
+		else if (*end == '.' && isdigit(end[1]))
+			strtol(end + 1, &end, 10);
 		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
 			tm->tm_hour = num;
 			tm->tm_min = num2;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index d9fcc829a9..d582dea5c5 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -81,6 +81,8 @@ check_parse 2008-02 bad
 check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -5:' '2008-02-14 20:30:45 +0000'
-- 
2.26.0.485.g42af89717d


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-15  3:31   ` [PATCH v2 " Đoàn Trần Công Danh
  2020-04-15  3:31     ` [PATCH v2 1/2] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
@ 2020-04-15  3:31     ` Đoàn Trần Công Danh
  1 sibling, 0 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-15  3:31 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 date.c          | 22 ++++++++++++++++++++++
 t/t0006-date.sh |  3 +++
 2 files changed, 25 insertions(+)

diff --git a/date.c b/date.c
index 2f37397beb..fc42382cfa 100644
--- a/date.c
+++ b/date.c
@@ -666,6 +666,28 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
 		n++;
 	} while (isdigit(date[n]));
 
+	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
+	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
+	if (n == 8 || n == 6) {
+		unsigned int num1 = num / 10000;
+		unsigned int num2 = (num % 10000) / 100;
+		unsigned int num3 = num % 100;
+		if (n == 8 && num1 > 1900 &&
+		    num2 > 0 && num2 <= 12 &&
+		    num3 > 0  && num3 <= 31) {
+			tm->tm_year = num1 - 1900;
+			tm->tm_mon  = num2 - 1;
+			tm->tm_mday = num3;
+		} else if (n == 6 && num1 < 60 && num2 < 60 && num3 <= 60) {
+			tm->tm_hour = num1;
+			tm->tm_min  = num2;
+			tm->tm_sec  = num3;
+			if (*end == '.' && isdigit(end[1]))
+				strtoul(end + 1, &end, 10);
+		}
+		return end - date;
+	}
+
 	/* Four-digit year or a timezone? */
 	if (n == 4) {
 		if (num <= 1400 && *offset == -1) {
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index d582dea5c5..1e169c8d15 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -82,6 +82,9 @@ check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
 check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '20080214T203045-04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045 -04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
-- 
2.26.0.485.g42af89717d


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-15  3:31     ` [PATCH v2 1/2] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
@ 2020-04-15 10:17       ` Torsten Bögershausen
  2020-04-16 10:04         ` Danh Doan
  0 siblings, 1 reply; 49+ messages in thread
From: Torsten Bögershausen @ 2020-04-15 10:17 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git, Brian M . Carlson

On Wed, Apr 15, 2020 at 10:31:27AM +0700, Đoàn Trần Công Danh wrote:
> git-commit(1) says ISO-8601 is one of our supported date format.
>
> ISO-8601 allows timestamps to have a fractional number of seconds.
> We represent time only in terms of whole seconds, so we never bothered
> parsing fractional seconds. However, it's better for us to parse and
> throw away the fractional part than to refuse to parse the timestamp
> at all.
>
> And refusing parsing fractional second part may confuse the parse to
> think fractional and timezone as day and month in this example:
>
> 	2008-02-14 20:30:45.019-04:00
>
> Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
>  Documentation/date-formats.txt | 2 +-
>  date.c                         | 2 ++
>  t/t0006-date.sh                | 2 ++
>  3 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/date-formats.txt b/Documentation/date-formats.txt
> index 6926e0a4c8..6f69ba2ddd 100644
> --- a/Documentation/date-formats.txt
> +++ b/Documentation/date-formats.txt
> @@ -20,7 +20,7 @@ RFC 2822::
>  ISO 8601::
>  	Time and date specified by the ISO 8601 standard, for example
>  	`2005-04-07T22:13:13`. The parser accepts a space instead of the
> -	`T` character as well.
> +	`T` character as well. The fractional part will be ignored.

When somebody has read the whole patch series, it is clear what
"fractional part" means. Otherwise it is not clear, what fractional part
means. The following may be easier to understand ?

  ISO 8601::
  	Time and date specified by the ISO 8601 standard, for example
  	`2005-04-07T22:13:13`. The parser accepts a space instead of the
  	`T` character as well.
  	Fractional parts of a second like `2005-04-07T22:13:13.091`
  	will be ignored and treated as `2005-04-07T22:13:13`

>  +
>  NOTE: In addition, the date part is accepted in the following formats:
>  `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
> diff --git a/date.c b/date.c
> index b0d9a8421d..2f37397beb 100644
> --- a/date.c
> +++ b/date.c
> @@ -556,6 +556,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
>  	case ':':
>  		if (num3 < 0)
>  			num3 = 0;
> +		else if (*end == '.' && isdigit(end[1]))
> +			strtol(end + 1, &end, 10);
>  		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
>  			tm->tm_hour = num;
>  			tm->tm_min = num2;
> diff --git a/t/t0006-date.sh b/t/t0006-date.sh
> index d9fcc829a9..d582dea5c5 100755
> --- a/t/t0006-date.sh
> +++ b/t/t0006-date.sh
> @@ -81,6 +81,8 @@ check_parse 2008-02 bad
>  check_parse 2008-02-14 bad
>  check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
>  check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
> +check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
> +check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
>  check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
>  check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
>  check_parse '2008-02-14 20:30:45 -5:' '2008-02-14 20:30:45 +0000'
> --
> 2.26.0.485.g42af89717d
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-14 20:24     ` Jeff King
  2020-04-15  2:12       ` Danh Doan
@ 2020-04-15 15:03       ` Junio C Hamano
  2020-04-15 15:41         ` Jeff King
  1 sibling, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2020-04-15 15:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Đoàn Trần Công Danh, git

Jeff King <peff@peff.net> writes:

> On Tue, Apr 14, 2020 at 04:31:55PM +0700, Đoàn Trần Công Danh wrote:
>
>> @@ -666,6 +666,24 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
>>  		n++;
>>  	} while (isdigit(date[n]));
>>  
>> +	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
>> +	if (n == 8) {
>> +		tm->tm_year = num / 10000 - 1900;
>> +		tm->tm_mon = (num % 10000) / 100 - 1;
>> +		tm->tm_mday = num % 100;
>> +		return n;
>> +	}
>
> I worry a little this may conflict with other approxidate heuristics.
>
> The only one I can think of is an actual unix timestamp, though, and we
> already require that to have at least 9 digits (plus anybody wanting to
> use one robustly really should be using @12345678).

I am OK with 1/2, but I'd worry a LOT about this one, if we didn't
require 9 digits.  60/100 * 60/100 * 24/100 ~= 8.6% so limiting the
hour/min/sec to sensible values is not a useful protection against
ambiguity.  We'd essentially be declaring that a raw UNIX timestamp
without @ prefix is now hit-and-miss if we take this change, were
there not the "must be at least 9 digits" requirement.

Somebody was thinking when he wrote the very beginning part of the
match_digit() function ;-)

> We could probably tighten the heuristics a bit by insisting that the
> month and day be sensible. Or even year (see the 1900 to 2100 magic for
> the 4-digit year guess).

I do agree that we'd want 0 <= hr <= 24, 0 <= min <= 59, and 0 <=
sec <= 60 (or should this be 61?), but it is for correctness of the
new code.  It wouldn't have any value in disambiguation from other
formats, I would think.  So, from that point of view, I would buy
something like 1969 as a lower bound, but I am not sure if we
necessarily want an upper bound for the year.  What mistake are we
protecting us against?

Thanks.

>> +	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
>> +	if (n == 6) {
>> +		tm->tm_hour = num / 10000;
>> +		tm->tm_min = (num % 10000) / 100;
>> +		tm->tm_sec = num % 100;
>> +		if (*end == '.' && isdigit(end[1]))
>> +			strtoul(end + 1, &end, 10);
>> +		return end - date;
>> +	}
>
> And likewise here that the hour, minute, and second be reasonable.
>
> -Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-15 15:03       ` Junio C Hamano
@ 2020-04-15 15:41         ` Jeff King
  2020-04-15 15:58           ` Junio C Hamano
  2020-04-16 11:16           ` Danh Doan
  0 siblings, 2 replies; 49+ messages in thread
From: Jeff King @ 2020-04-15 15:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Đoàn Trần Công Danh, git

On Wed, Apr 15, 2020 at 08:03:11AM -0700, Junio C Hamano wrote:

> > I worry a little this may conflict with other approxidate heuristics.
> >
> > The only one I can think of is an actual unix timestamp, though, and we
> > already require that to have at least 9 digits (plus anybody wanting to
> > use one robustly really should be using @12345678).
> 
> I am OK with 1/2, but I'd worry a LOT about this one, if we didn't
> require 9 digits.  60/100 * 60/100 * 24/100 ~= 8.6% so limiting the
> hour/min/sec to sensible values is not a useful protection against
> ambiguity.  We'd essentially be declaring that a raw UNIX timestamp
> without @ prefix is now hit-and-miss if we take this change, were
> there not the "must be at least 9 digits" requirement.

Yeah, that was exactly why I poked at it.

Curiously, a1a5a6347b (Accept dates before 2000/01/01 when specified as
seconds since the epoch, 2007-06-06), the commit which introduced the
9-digit rule, specifically says:

  There is now still a limit, 100000000, which is 1973/03/03 09:46:40
  UTC, in order to allow that dates are represented as 8 digits.

but running "test-date 20100403" even back at that commit does not seem
to work (nor does it work now).

Doubly curious, some nearby code blames to 9f2b6d2936 (date/time: do not
get confused by fractional seconds, 2008-08-16). So why don't fractional
seconds work right now?

They sure don't seem to:

  [the 17 becomes a day-of-month]
  $ t/helper/test-tool date approxidate 12:43:50.17
  12:43:50.17 -> 2020-04-17 16:43:50 +0000

But I wonder if the new patch breaks the example from that commit
message, which does work now:

  [current version; the 17 attaches to "days ago"]
  $ t/helper/test-tool date approxidate 12:43:50.17.days.ago
  12:43:50.17.days.ago -> 2020-03-29 16:43:50 +0000

It looks like the answer is yes:

  [with patch 1/2 applied; now it's a fractional second]
  $ t/helper/test-tool date approxidate 12:43:50.17.days.ago
  12:43:50.17.days.ago -> 2020-04-15 16:43:50 +0000

TBH I'm not sure how big a deal it is. The input is rather ambiguous for
a strict left-to-right parser, and I'm not sure this case is worth doing
more look-ahead parsing. But it's worth making a conscious decision
about it.

One alternative would be to restrict the fractional second handling only
to the parse_date_basic(), which is quite a bit stricter. It shares
match_digit() with approxidate(), so we'd probably have to pass in an
extra flag to match_digit() to change the rules. It also means that:

  2020-04-14T12:43:50.17

would parse a fractional second, but:

  yesterday at 12:43:50.17

wouldn't.

> > We could probably tighten the heuristics a bit by insisting that the
> > month and day be sensible. Or even year (see the 1900 to 2100 magic for
> > the 4-digit year guess).
> 
> I do agree that we'd want 0 <= hr <= 24, 0 <= min <= 59, and 0 <=
> sec <= 60 (or should this be 61?), but it is for correctness of the
> new code.  It wouldn't have any value in disambiguation from other
> formats, I would think.  So, from that point of view, I would buy
> something like 1969 as a lower bound, but I am not sure if we
> necessarily want an upper bound for the year.  What mistake are we
> protecting us against?

No particular mistake. I was just suggesting to keep the heuristics in
the new code as tight as possible to leave room for later changes (since
approxidate by its nature is heuristics and hacks piled atop each
other).

I do agree that year bounds are a questionable restriction. Right now
Git's date code can only handle dates between 1970 and 2099, but it
would be nice to change that.

-Peff

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-15 15:41         ` Jeff King
@ 2020-04-15 15:58           ` Junio C Hamano
  2020-04-16 11:16           ` Danh Doan
  1 sibling, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2020-04-15 15:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Đoàn Trần Công Danh, git

Jeff King <peff@peff.net> writes:

> But I wonder if the new patch breaks the example from that commit
> message, which does work now:
>
>   [current version; the 17 attaches to "days ago"]
>   $ t/helper/test-tool date approxidate 12:43:50.17.days.ago
>   12:43:50.17.days.ago -> 2020-03-29 16:43:50 +0000
>
> It looks like the answer is yes:
>
>   [with patch 1/2 applied; now it's a fractional second]
>   $ t/helper/test-tool date approxidate 12:43:50.17.days.ago
>   12:43:50.17.days.ago -> 2020-04-15 16:43:50 +0000
>
> TBH I'm not sure how big a deal it is. The input is rather ambiguous for
> a strict left-to-right parser, and I'm not sure this case is worth doing
> more look-ahead parsing. But it's worth making a conscious decision
> about it.

True.  I do agree that this would qualify as a regression, but I do
not think we are breaking an important use case here.

> One alternative would be to restrict the fractional second handling only
> to the parse_date_basic(), which is quite a bit stricter. It shares
> match_digit() with approxidate(), so we'd probably have to pass in an
> extra flag to match_digit() to change the rules. It also means that:
>
>   2020-04-14T12:43:50.17
>
> would parse a fractional second, but:
>
>   yesterday at 12:43:50.17
>
> wouldn't.

That sounds less useful.  Somehow "yesterday at 12.43:50.17" looks a
lot more plausible real-user input that we would want to support
better than "12:43:50.17.day.ago".

> I do agree that year bounds are a questionable restriction. Right now
> Git's date code can only handle dates between 1970 and 2099, but it
> would be nice to change that.

Sure.  In both directions.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-15 10:17       ` Torsten Bögershausen
@ 2020-04-16 10:04         ` Danh Doan
  0 siblings, 0 replies; 49+ messages in thread
From: Danh Doan @ 2020-04-16 10:04 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git, Brian M . Carlson

On 2020-04-15 12:17:48+0200, Torsten Bögershausen <tboegi@web.de> wrote:
> On Wed, Apr 15, 2020 at 10:31:27AM +0700, Đoàn Trần Công Danh wrote:
> > git-commit(1) says ISO-8601 is one of our supported date format.
> >
> > ISO-8601 allows timestamps to have a fractional number of seconds.
> > We represent time only in terms of whole seconds, so we never bothered
> > parsing fractional seconds. However, it's better for us to parse and
> > throw away the fractional part than to refuse to parse the timestamp
> > at all.
> >
> > And refusing parsing fractional second part may confuse the parse to
> > think fractional and timezone as day and month in this example:
> >
> > 	2008-02-14 20:30:45.019-04:00
> >
> > Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
> > Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> > ---
> >  Documentation/date-formats.txt | 2 +-
> >  date.c                         | 2 ++
> >  t/t0006-date.sh                | 2 ++
> >  3 files changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/date-formats.txt b/Documentation/date-formats.txt
> > index 6926e0a4c8..6f69ba2ddd 100644
> > --- a/Documentation/date-formats.txt
> > +++ b/Documentation/date-formats.txt
> > @@ -20,7 +20,7 @@ RFC 2822::
> >  ISO 8601::
> >  	Time and date specified by the ISO 8601 standard, for example
> >  	`2005-04-07T22:13:13`. The parser accepts a space instead of the
> > -	`T` character as well.
> > +	`T` character as well. The fractional part will be ignored.
> 
> When somebody has read the whole patch series, it is clear what
> "fractional part" means. Otherwise it is not clear, what fractional part
> means. The following may be easier to understand ?
> 
>   ISO 8601::
>   	Time and date specified by the ISO 8601 standard, for example
>   	`2005-04-07T22:13:13`. The parser accepts a space instead of the
>   	`T` character as well.
>   	Fractional parts of a second like `2005-04-07T22:13:13.091`
>   	will be ignored and treated as `2005-04-07T22:13:13`

Yes, this read better.
I think I'll rephase to

	Fractional parts of a second will be ignored, for example
	`2005-04-07T22:13:13.019` will be treated as
	`2005-04-07T22:13:13`

-- 
Danh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-15 15:41         ` Jeff King
  2020-04-15 15:58           ` Junio C Hamano
@ 2020-04-16 11:16           ` Danh Doan
  1 sibling, 0 replies; 49+ messages in thread
From: Danh Doan @ 2020-04-16 11:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On 2020-04-15 11:41:57-0400, Jeff King <peff@peff.net> wrote:
> On Wed, Apr 15, 2020 at 08:03:11AM -0700, Junio C Hamano wrote:
> 
> > > I worry a little this may conflict with other approxidate heuristics.
> > >
> > > The only one I can think of is an actual unix timestamp, though, and we
> > > already require that to have at least 9 digits (plus anybody wanting to
> > > use one robustly really should be using @12345678).
> > 
> > I am OK with 1/2, but I'd worry a LOT about this one, if we didn't
> > require 9 digits.  60/100 * 60/100 * 24/100 ~= 8.6% so limiting the
> > hour/min/sec to sensible values is not a useful protection against
> > ambiguity.  We'd essentially be declaring that a raw UNIX timestamp
> > without @ prefix is now hit-and-miss if we take this change, were
> > there not the "must be at least 9 digits" requirement.
> 
> Yeah, that was exactly why I poked at it.
> 
> Curiously, a1a5a6347b (Accept dates before 2000/01/01 when specified as
> seconds since the epoch, 2007-06-06), the commit which introduced the
> 9-digit rule, specifically says:
> 
>   There is now still a limit, 100000000, which is 1973/03/03 09:46:40
>   UTC, in order to allow that dates are represented as 8 digits.
> 
> but running "test-date 20100403" even back at that commit does not seem
> to work (nor does it work now).

check-parse 2010-04-03 doesn't work either.
check-approxiate is going through different code path via
approxidate_digit, but I'm not really sure which action should be take

> Doubly curious, some nearby code blames to 9f2b6d2936 (date/time: do not
> get confused by fractional seconds, 2008-08-16). So why don't fractional
> seconds work right now?

It doesn't work in that time, either:

	$ git checkout 9f2b6d2936
	$ make
	$ ./git --version
	git version 1.6.0.3.g9f2b6
	$ GIT_AUTHOR_DATE='2019-04-16 00:02:03.19-04:00' ./git-commit --allow-empty -m msg
	$ ./git-log --pretty=format:%aD -1
	Fri, 19 Apr 2019 00:02:03 +0700

The different is whether we stick the timezone (west of UTC) into the
fractional or not. If we have both those conditions:

- Specify the timezone west of UTC
- Stick it with the fractional part
- The fractional part is parsed to  number less than 32

The the parser will think it's US style of month/day

Judging from the test code, it's likely people that write the code and
the test get used to separate time and timezone.

And it's unlikely anyone will  write fractional parts with only
1 digit (yield 100% positive with sticking timezone), or 2 digits
(which yields 32% positive with sticking timezone),
for 3 digits, the positive rate is 3.2%

> 
> They sure don't seem to:
> 
>   [the 17 becomes a day-of-month]
>   $ t/helper/test-tool date approxidate 12:43:50.17
>   12:43:50.17 -> 2020-04-17 16:43:50 +0000
> 
> But I wonder if the new patch breaks the example from that commit
> message, which does work now:
> 
>   [current version; the 17 attaches to "days ago"]
>   $ t/helper/test-tool date approxidate 12:43:50.17.days.ago
>   12:43:50.17.days.ago -> 2020-03-29 16:43:50 +0000
> 
> It looks like the answer is yes:
> 
>   [with patch 1/2 applied; now it's a fractional second]
>   $ t/helper/test-tool date approxidate 12:43:50.17.days.ago
>   12:43:50.17.days.ago -> 2020-04-15 16:43:50 +0000
> 
> TBH I'm not sure how big a deal it is. The input is rather ambiguous for
> a strict left-to-right parser, and I'm not sure this case is worth doing
> more look-ahead parsing. But it's worth making a conscious decision
> about it.

Hm, I don't think "12:43:50.17.days.ago" is a normal input from user
but I think "12:43:50.3.days.ago" on Monday wouldn't have much less
user when compare with "yesterday at 12:43:50.17"

> One alternative would be to restrict the fractional second handling only
> to the parse_date_basic(), which is quite a bit stricter. It shares
> match_digit() with approxidate(), so we'd probably have to pass in an
> extra flag to match_digit() to change the rules. It also means that:
> 
>   2020-04-14T12:43:50.17
> 
> would parse a fractional second, but:
> 
>   yesterday at 12:43:50.17
> 
> wouldn't.
> 
> > > We could probably tighten the heuristics a bit by insisting that the
> > > month and day be sensible. Or even year (see the 1900 to 2100 magic for
> > > the 4-digit year guess).
> > 
> > I do agree that we'd want 0 <= hr <= 24, 0 <= min <= 59, and 0 <=
> > sec <= 60 (or should this be 61?), but it is for correctness of the
> > new code.  It wouldn't have any value in disambiguation from other
> > formats, I would think.  So, from that point of view, I would buy
> > something like 1969 as a lower bound, but I am not sure if we
> > necessarily want an upper bound for the year.  What mistake are we
> > protecting us against?
> 
> No particular mistake. I was just suggesting to keep the heuristics in
> the new code as tight as possible to leave room for later changes (since
> approxidate by its nature is heuristics and hacks piled atop each
> other).
> 
> I do agree that year bounds are a questionable restriction. Right now
> Git's date code can only handle dates between 1970 and 2099, but it
> would be nice to change that.

2100 is used inside is_date, which is the place we try to guess what
     does those number could mean in multiple format.

I don't think we have ambiguity when parsing ISO-8601 compact date.

-- 
Danh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v3 0/2] More ISO-8601 support
  2020-04-14  9:31 ` [PATCH 0/2] More ISO-8601 support Đoàn Trần Công Danh
                     ` (3 preceding siblings ...)
  2020-04-15  3:31   ` [PATCH v2 " Đoàn Trần Công Danh
@ 2020-04-22 13:15   ` Đoàn Trần Công Danh
  2020-04-22 13:15     ` [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
  2020-04-22 13:15     ` [PATCH v3 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
  2020-04-23 13:52   ` [PATCH v4 0/4] More ISO-8601 support Đoàn Trần Công Danh
  2020-04-24 15:07   ` [PATCH v5 0/4] More ISO-8601 support Đoàn Trần Công Danh
  6 siblings, 2 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-22 13:15 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, brian m. carlson,
	Jeff King, Torsten Bögershausen, Junio C Hamano

This series aims to extend support for ISO-8601 datetime format
to allow compact version, and fractional part of ISO-8601.

Changes from v2:
* Add example for fractional parts of second in documentation
* Add/Fix regression test on 12:34:56.7.days.ago

Đoàn Trần Công Danh (2):
  date.c: skip fractional second part of ISO-8601
  date.c: allow compact version of ISO-8601 datetime

 Documentation/date-formats.txt |  5 ++++-
 date.c                         | 32 +++++++++++++++++++++++++++++++-
 t/t0006-date.sh                |  6 ++++++
 3 files changed, 41 insertions(+), 2 deletions(-)

Range-diff against v2:
1:  03f3e9968b ! 1:  c6d42785bb date.c: skip fractional second part of ISO-8601
    @@ Documentation/date-formats.txt: RFC 2822::
      	Time and date specified by the ISO 8601 standard, for example
      	`2005-04-07T22:13:13`. The parser accepts a space instead of the
     -	`T` character as well.
    -+	`T` character as well. The fractional part will be ignored.
    ++	`T` character as well. Fractional parts of a second will be ignored,
    ++	for example `2005-04-07T22:13:13.019` will be treated as
    ++	`2005-04-07T22:13:13`
    ++
      +
      NOTE: In addition, the date part is accepted in the following formats:
      `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
     
      ## date.c ##
     @@ date.c: static int match_multi_number(timestamp_t num, char c, const char *date,
    + 	/* Time? Date? */
    + 	switch (c) {
      	case ':':
    - 		if (num3 < 0)
    +-		if (num3 < 0)
    ++		if (num3 < 0) {
      			num3 = 0;
    -+		else if (*end == '.' && isdigit(end[1]))
    ++		} else if (*end == '.' && isdigit(end[1]) &&
    ++			   tm->tm_year != -1 && tm->tm_mon != -1 &&
    ++			   tm->tm_mday != -1) {
    ++			/* Attempt to guess meaning of <num> in HHMMSS.<num>
    ++			 * only interpret as fractional when %Y %m %d is known.
    ++			 */
     +			strtol(end + 1, &end, 10);
    ++		}
      		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
      			tm->tm_hour = num;
      			tm->tm_min = num2;
    @@ t/t0006-date.sh: check_parse 2008-02 bad
      check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
      check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
      check_parse '2008-02-14 20:30:45 -5:' '2008-02-14 20:30:45 +0000'
    +@@ t/t0006-date.sh: check_approxidate 5.seconds.ago '2009-08-30 19:19:55'
    + check_approxidate 10.minutes.ago '2009-08-30 19:10:00'
    + check_approxidate yesterday '2009-08-29 19:20:00'
    + check_approxidate 3.days.ago '2009-08-27 19:20:00'
    ++check_approxidate '12:34:56.3.days.ago' '2009-08-27 12:34:56'
    + check_approxidate 3.weeks.ago '2009-08-09 19:20:00'
    + check_approxidate 3.months.ago '2009-05-30 19:20:00'
    + check_approxidate 2.years.3.months.ago '2007-05-30 19:20:00'
2:  36517af872 = 2:  225b6401bd date.c: allow compact version of ISO-8601 datetime
-- 
2.26.2.303.gf8c07b1a78


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-22 13:15   ` [PATCH v3 0/2] More ISO-8601 support Đoàn Trần Công Danh
@ 2020-04-22 13:15     ` Đoàn Trần Công Danh
  2020-04-22 17:05       ` Junio C Hamano
  2020-04-22 13:15     ` [PATCH v3 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
  1 sibling, 1 reply; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-22 13:15 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, Brian M . Carlson

git-commit(1) says ISO-8601 is one of our supported date format.

ISO-8601 allows timestamps to have a fractional number of seconds.
We represent time only in terms of whole seconds, so we never bothered
parsing fractional seconds. However, it's better for us to parse and
throw away the fractional part than to refuse to parse the timestamp
at all.

And refusing parsing fractional second part may confuse the parse to
think fractional and timezone as day and month in this example:

	2008-02-14 20:30:45.019-04:00

Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 Documentation/date-formats.txt |  5 ++++-
 date.c                         | 10 +++++++++-
 t/t0006-date.sh                |  3 +++
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/Documentation/date-formats.txt b/Documentation/date-formats.txt
index 6926e0a4c8..7e7eaba643 100644
--- a/Documentation/date-formats.txt
+++ b/Documentation/date-formats.txt
@@ -20,7 +20,10 @@ RFC 2822::
 ISO 8601::
 	Time and date specified by the ISO 8601 standard, for example
 	`2005-04-07T22:13:13`. The parser accepts a space instead of the
-	`T` character as well.
+	`T` character as well. Fractional parts of a second will be ignored,
+	for example `2005-04-07T22:13:13.019` will be treated as
+	`2005-04-07T22:13:13`
+
 +
 NOTE: In addition, the date part is accepted in the following formats:
 `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
diff --git a/date.c b/date.c
index b0d9a8421d..62f23b4702 100644
--- a/date.c
+++ b/date.c
@@ -554,8 +554,16 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 	/* Time? Date? */
 	switch (c) {
 	case ':':
-		if (num3 < 0)
+		if (num3 < 0) {
 			num3 = 0;
+		} else if (*end == '.' && isdigit(end[1]) &&
+			   tm->tm_year != -1 && tm->tm_mon != -1 &&
+			   tm->tm_mday != -1) {
+			/* Attempt to guess meaning of <num> in HHMMSS.<num>
+			 * only interpret as fractional when %Y %m %d is known.
+			 */
+			strtol(end + 1, &end, 10);
+		}
 		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
 			tm->tm_hour = num;
 			tm->tm_min = num2;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index d9fcc829a9..80917c81c3 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -81,6 +81,8 @@ check_parse 2008-02 bad
 check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -5:' '2008-02-14 20:30:45 +0000'
@@ -103,6 +105,7 @@ check_approxidate 5.seconds.ago '2009-08-30 19:19:55'
 check_approxidate 10.minutes.ago '2009-08-30 19:10:00'
 check_approxidate yesterday '2009-08-29 19:20:00'
 check_approxidate 3.days.ago '2009-08-27 19:20:00'
+check_approxidate '12:34:56.3.days.ago' '2009-08-27 12:34:56'
 check_approxidate 3.weeks.ago '2009-08-09 19:20:00'
 check_approxidate 3.months.ago '2009-05-30 19:20:00'
 check_approxidate 2.years.3.months.ago '2007-05-30 19:20:00'
-- 
2.26.2.303.gf8c07b1a78


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v3 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-22 13:15   ` [PATCH v3 0/2] More ISO-8601 support Đoàn Trần Công Danh
  2020-04-22 13:15     ` [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
@ 2020-04-22 13:15     ` Đoàn Trần Công Danh
  2020-04-22 17:17       ` Junio C Hamano
  1 sibling, 1 reply; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-22 13:15 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 date.c          | 22 ++++++++++++++++++++++
 t/t0006-date.sh |  3 +++
 2 files changed, 25 insertions(+)

diff --git a/date.c b/date.c
index 62f23b4702..882242c2db 100644
--- a/date.c
+++ b/date.c
@@ -672,6 +672,28 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
 		n++;
 	} while (isdigit(date[n]));
 
+	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
+	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
+	if (n == 8 || n == 6) {
+		unsigned int num1 = num / 10000;
+		unsigned int num2 = (num % 10000) / 100;
+		unsigned int num3 = num % 100;
+		if (n == 8 && num1 > 1900 &&
+		    num2 > 0 && num2 <= 12 &&
+		    num3 > 0  && num3 <= 31) {
+			tm->tm_year = num1 - 1900;
+			tm->tm_mon  = num2 - 1;
+			tm->tm_mday = num3;
+		} else if (n == 6 && num1 < 60 && num2 < 60 && num3 <= 60) {
+			tm->tm_hour = num1;
+			tm->tm_min  = num2;
+			tm->tm_sec  = num3;
+			if (*end == '.' && isdigit(end[1]))
+				strtoul(end + 1, &end, 10);
+		}
+		return end - date;
+	}
+
 	/* Four-digit year or a timezone? */
 	if (n == 4) {
 		if (num <= 1400 && *offset == -1) {
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 80917c81c3..75ee9a96b8 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -82,6 +82,9 @@ check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
 check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '20080214T203045-04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045 -04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
-- 
2.26.2.303.gf8c07b1a78


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-22 13:15     ` [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
@ 2020-04-22 17:05       ` Junio C Hamano
  2020-04-23  1:18         ` Danh Doan
  0 siblings, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2020-04-22 17:05 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git, Brian M . Carlson

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

>  	/* Time? Date? */
>  	switch (c) {
>  	case ':':
> -		if (num3 < 0)
> +		if (num3 < 0) {
>  			num3 = 0;
> +		} else if (*end == '.' && isdigit(end[1]) &&
> +			   tm->tm_year != -1 && tm->tm_mon != -1 &&
> +			   tm->tm_mday != -1) {
> +			/* Attempt to guess meaning of <num> in HHMMSS.<num>
> +			 * only interpret as fractional when %Y %m %d is known.
> +			 */
> +			strtol(end + 1, &end, 10);

OK, so we saw ':' and parsed <num2> after it, and then saw ':' and
<num3>, which we know is a good positive number.  We haven't checked
what <num2> is at this point, but it has to be 0 or more digits
(because we wouldn't have parsed for <num3> if it weren't terminated
with the same c, i.e. ':').

*end points at the byte that stopped <num3> and we make sure <num3>
is followed by "." and a digit.

Regardless of what <num2> is, we just discard the "fractional part
of seconds" (assuming that <num3> is the "seconds" part).

> +		}
>  		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
>  			tm->tm_hour = num;
>  			tm->tm_min = num2;

And after all that is done, if <num2> (and others) are within a
reasonable range, we use that as HH:MM:SS.  

OK.  If <num2> (or <num3>, or even <num> for that matter) weren't
reasonable, is it still sensible to discard the fractional part?
The answer is not immediately obvious to me.

To be safe, it might make sense to extract a helper function from
the next conditional, i.e.

static int is_hms(long num1, long num2, long num3)
{
	return (0 <= num1 && num1 < 25 &&
		0 <= num2 && num2 < 60 &&
		0 <= num3 && num3 <= 60);
}

and use it in the new "else if" block like so?


	} else if (*end == '.' && isdigit(end[1]) &&
		   is_date(tm->tm_year, tm->tm_mon, tm->tm_mday, NULL, now, tm) &&
		   is_hms(num, num2, num3)) {
		/* Discard ".<num4>" from "HH:MM:SS.<num4>" */
		(void) strtol(end + 1, &end, 10);
	}

	if (is_hms(num, num2, num3)) {
		...



> diff --git a/t/t0006-date.sh b/t/t0006-date.sh
> index d9fcc829a9..80917c81c3 100755
> --- a/t/t0006-date.sh
> +++ b/t/t0006-date.sh
> @@ -81,6 +81,8 @@ check_parse 2008-02 bad
>  check_parse 2008-02-14 bad
>  check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
>  check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
> +check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
> +check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
>  check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
>  check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
>  check_parse '2008-02-14 20:30:45 -5:' '2008-02-14 20:30:45 +0000'
> @@ -103,6 +105,7 @@ check_approxidate 5.seconds.ago '2009-08-30 19:19:55'
>  check_approxidate 10.minutes.ago '2009-08-30 19:10:00'
>  check_approxidate yesterday '2009-08-29 19:20:00'
>  check_approxidate 3.days.ago '2009-08-27 19:20:00'
> +check_approxidate '12:34:56.3.days.ago' '2009-08-27 12:34:56'
>  check_approxidate 3.weeks.ago '2009-08-09 19:20:00'
>  check_approxidate 3.months.ago '2009-05-30 19:20:00'
>  check_approxidate 2.years.3.months.ago '2007-05-30 19:20:00'

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-22 13:15     ` [PATCH v3 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
@ 2020-04-22 17:17       ` Junio C Hamano
  2020-04-23  1:20         ` Danh Doan
  0 siblings, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2020-04-22 17:17 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
>  date.c          | 22 ++++++++++++++++++++++
>  t/t0006-date.sh |  3 +++
>  2 files changed, 25 insertions(+)
>
> diff --git a/date.c b/date.c
> index 62f23b4702..882242c2db 100644
> --- a/date.c
> +++ b/date.c
> @@ -672,6 +672,28 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
>  		n++;
>  	} while (isdigit(date[n]));
>  
> +	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
> +	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
> +	if (n == 8 || n == 6) {
> +		unsigned int num1 = num / 10000;
> +		unsigned int num2 = (num % 10000) / 100;
> +		unsigned int num3 = num % 100;
> +		if (n == 8 && num1 > 1900 &&
> +		    num2 > 0 && num2 <= 12 &&
> +		    num3 > 0  && num3 <= 31) {
> +			tm->tm_year = num1 - 1900;
> +			tm->tm_mon  = num2 - 1;
> +			tm->tm_mday = num3;
> +		} else if (n == 6 && num1 < 60 && num2 < 60 && num3 <= 60) {
> +			tm->tm_hour = num1;
> +			tm->tm_min  = num2;
> +			tm->tm_sec  = num3;
> +			if (*end == '.' && isdigit(end[1]))
> +				strtoul(end + 1, &end, 10);
> +		}
> +		return end - date;
> +	}
> +

Looks sensible except that on our planet, one day has only 24 hours
;-).

I think we should try to reuse existing helpers as much as possible
in date.c to avoid such stupid errors.  During my review of [1/2] I
found is_date() would be a good thing to try reusing and also
extracted is_hms() as another candidate we could reuse.

>  	/* Four-digit year or a timezone? */
>  	if (n == 4) {
>  		if (num <= 1400 && *offset == -1) {
> diff --git a/t/t0006-date.sh b/t/t0006-date.sh
> index 80917c81c3..75ee9a96b8 100755
> --- a/t/t0006-date.sh
> +++ b/t/t0006-date.sh
> @@ -82,6 +82,9 @@ check_parse 2008-02-14 bad
>  check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
>  check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
>  check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
> +check_parse '20080214T203045-04:00' '2008-02-14 20:30:45 -0400'
> +check_parse '20080214T203045 -04:00' '2008-02-14 20:30:45 -0400'
> +check_parse '20080214T203045.019-04:00' '2008-02-14 20:30:45 -0400'
>  check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
>  check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
>  check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-22 17:05       ` Junio C Hamano
@ 2020-04-23  1:18         ` Danh Doan
  2020-04-23 19:28           ` Junio C Hamano
  0 siblings, 1 reply; 49+ messages in thread
From: Danh Doan @ 2020-04-23  1:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brian M . Carlson

On 2020-04-22 10:05:54-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> 
> >  	/* Time? Date? */
> >  	switch (c) {
> >  	case ':':
> > -		if (num3 < 0)
> > +		if (num3 < 0) {
> >  			num3 = 0;
> > +		} else if (*end == '.' && isdigit(end[1]) &&
> > +			   tm->tm_year != -1 && tm->tm_mon != -1 &&
> > +			   tm->tm_mday != -1) {
> > +			/* Attempt to guess meaning of <num> in HHMMSS.<num>
> > +			 * only interpret as fractional when %Y %m %d is known.
> > +			 */
> > +			strtol(end + 1, &end, 10);
> 
> OK, so we saw ':' and parsed <num2> after it, and then saw ':' and
> <num3>, which we know is a good positive number.  We haven't checked
> what <num2> is at this point, but it has to be 0 or more digits
> (because we wouldn't have parsed for <num3> if it weren't terminated
> with the same c, i.e. ':').
> 
> *end points at the byte that stopped <num3> and we make sure <num3>
> is followed by "." and a digit.
> 
> Regardless of what <num2> is, we just discard the "fractional part
> of seconds" (assuming that <num3> is the "seconds" part).
> 
> > +		}
> >  		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
> >  			tm->tm_hour = num;
> >  			tm->tm_min = num2;
> 
> And after all that is done, if <num2> (and others) are within a
> reasonable range, we use that as HH:MM:SS.  
> 
> OK.  If <num2> (or <num3>, or even <num> for that matter) weren't
> reasonable, is it still sensible to discard the fractional part?
> The answer is not immediately obvious to me.
> 
> To be safe, it might make sense to extract a helper function from
> the next conditional, i.e.
> 
> static int is_hms(long num1, long num2, long num3)

I'll make it `is_time` on par with is_date check.
I'll look into it and check if int or long is suitable for parameter's
type.

> {
> 	return (0 <= num1 && num1 < 25 &&
> 		0 <= num2 && num2 < 60 &&
> 		0 <= num3 && num3 <= 60);

Does it worth to add an explicit comment that we intentionally ignore
the old-and-defective 2nd leap seconds (i.e. second with value 61)?

I saw in one of your previous email doubting if we should check for
`num3 <= 61` or not.

> }
> 
> and use it in the new "else if" block like so?
> 
> 
> 	} else if (*end == '.' && isdigit(end[1]) &&
> 		   is_date(tm->tm_year, tm->tm_mon, tm->tm_mday, NULL, now, tm) &&

When running into this, the code patch for non-approxidate hasn't
initialised value for now, yet.

And for non-approxidate, when trying to parse date, we initialise now
when guessing which value is year, month, day.

Does it make sense to initialise now from parse_date_basic, passed it
to match_multi_number via match_digit?

To me, it's too much a pain.

I /think/ it isn't worth to warrant a time(NULL) and another is_date
call here. We've checked it one when we set it before, no?

> 		   is_hms(num, num2, num3)) {
> 		/* Discard ".<num4>" from "HH:MM:SS.<num4>" */

Yeah, <num4> is better comment, since num meant different thing in
this file.


-- 
Danh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 2/2] date.c: allow compact version of ISO-8601 datetime
  2020-04-22 17:17       ` Junio C Hamano
@ 2020-04-23  1:20         ` Danh Doan
  0 siblings, 0 replies; 49+ messages in thread
From: Danh Doan @ 2020-04-23  1:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 2020-04-22 10:17:35-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> 
> > Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> > ---
> >  date.c          | 22 ++++++++++++++++++++++
> >  t/t0006-date.sh |  3 +++
> >  2 files changed, 25 insertions(+)
> >
> > diff --git a/date.c b/date.c
> > index 62f23b4702..882242c2db 100644
> > --- a/date.c
> > +++ b/date.c
> > @@ -672,6 +672,28 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
> >  		n++;
> >  	} while (isdigit(date[n]));
> >  
> > +	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
> > +	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
> > +	if (n == 8 || n == 6) {
> > +		unsigned int num1 = num / 10000;
> > +		unsigned int num2 = (num % 10000) / 100;
> > +		unsigned int num3 = num % 100;
> > +		if (n == 8 && num1 > 1900 &&
> > +		    num2 > 0 && num2 <= 12 &&
> > +		    num3 > 0  && num3 <= 31) {
> > +			tm->tm_year = num1 - 1900;
> > +			tm->tm_mon  = num2 - 1;
> > +			tm->tm_mday = num3;
> > +		} else if (n == 6 && num1 < 60 && num2 < 60 && num3 <= 60) {
> > +			tm->tm_hour = num1;
> > +			tm->tm_min  = num2;
> > +			tm->tm_sec  = num3;
> > +			if (*end == '.' && isdigit(end[1]))
> > +				strtoul(end + 1, &end, 10);
> > +		}
> > +		return end - date;
> > +	}
> > +
> 
> Looks sensible except that on our planet, one day has only 24 hours
> ;-).

My bad, I admit that I wouldn't run into this error if we have the
helper is_hms (or is_time)

> 
> I think we should try to reuse existing helpers as much as possible
> in date.c to avoid such stupid errors.  During my review of [1/2] I
> found is_date() would be a good thing to try reusing and also

I'll look into this and see which value should be passed to is_date

> extracted is_hms() as another candidate we could reuse.

-- 
Danh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v4 0/4] More ISO-8601 support
  2020-04-14  9:31 ` [PATCH 0/2] More ISO-8601 support Đoàn Trần Công Danh
                     ` (4 preceding siblings ...)
  2020-04-22 13:15   ` [PATCH v3 0/2] More ISO-8601 support Đoàn Trần Công Danh
@ 2020-04-23 13:52   ` Đoàn Trần Công Danh
  2020-04-23 13:52     ` [PATCH v4 1/4] date.c: s/is_date/set_date/ Đoàn Trần Công Danh
                       ` (3 more replies)
  2020-04-24 15:07   ` [PATCH v5 0/4] More ISO-8601 support Đoàn Trần Công Danh
  6 siblings, 4 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-23 13:52 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, brian m. carlson,
	Jeff King, Torsten Bögershausen, Junio C Hamano

This series aims to extend support for ISO-8601 datetime format
to allow compact version, and fractional part of ISO-8601.

Changes in v4 from v3:
* s/is_date/set_date/ the function's name suggest it only does validation,
  but it does more than that. Junio suggested to me to use it for validation,
  When I looked more into it, I think it's better to not use it, and rename
  the function to reduce the confusion
* Extract the validate and set time to its own function
* Correct a check for time in compact ISO-8601

Changes in v3 from v2:
* Add example for fractional parts of second in documentation
* Add/Fix regression test on 12:34:56.7.days.ago

Đoàn Trần Công Danh (4):
  date.c: s/is_date/set_date/
  date.c: validate and set time in a helper function
  date.c: skip fractional second part of ISO-8601
  date.c: allow compact version of ISO-8601 datetime

 Documentation/date-formats.txt |  5 ++-
 date.c                         | 66 +++++++++++++++++++++++++---------
 t/t0006-date.sh                |  6 ++++
 3 files changed, 59 insertions(+), 18 deletions(-)

Range-diff against v3:
-:  ---------- > 1:  1fe69008fc date.c: s/is_date/set_date/
-:  ---------- > 2:  0d0e4d8edc date.c: validate and set time in a helper function
1:  c6d42785bb ! 3:  8b18d0ee5d date.c: skip fractional second part of ISO-8601
    @@ Commit message
     
                 2008-02-14 20:30:45.019-04:00
     
    +    While doing this, make sure that we only interpret the number after the
    +    second and the dot as fractional when and only when the date is known,
    +    since only ISO-8601 allows the fractional part, and we've taught our
    +    users to interpret "12:34:56.7.days.ago" as a way to specify a time
    +    relative to current time.
    +
         Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
         Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
     
    @@ date.c: static int match_multi_number(timestamp_t num, char c, const char *date,
     +		if (num3 < 0) {
      			num3 = 0;
     +		} else if (*end == '.' && isdigit(end[1]) &&
    -+			   tm->tm_year != -1 && tm->tm_mon != -1 &&
    -+			   tm->tm_mday != -1) {
    -+			/* Attempt to guess meaning of <num> in HHMMSS.<num>
    -+			 * only interpret as fractional when %Y %m %d is known.
    -+			 */
    ++			   tm->tm_year != -1 && tm->tm_mon != -1 && tm->tm_mday != -1 &&
    ++			   set_time(num, num2, num3, tm) == 0) {
    ++			/* %Y%m%d is known, ignore fractional <num4> in HHMMSS.<num4> */
     +			strtol(end + 1, &end, 10);
     +		}
    - 		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
    - 			tm->tm_hour = num;
    - 			tm->tm_min = num2;
    + 		if (set_time(num, num2, num3, tm) == 0)
    + 			break;
    + 		return 0;
     
      ## t/t0006-date.sh ##
     @@ t/t0006-date.sh: check_parse 2008-02 bad
2:  225b6401bd ! 4:  2812439a26 date.c: allow compact version of ISO-8601 datetime
    @@ date.c: static int match_digit(const char *date, struct tm *tm, int *offset, int
     +		unsigned int num1 = num / 10000;
     +		unsigned int num2 = (num % 10000) / 100;
     +		unsigned int num3 = num % 100;
    -+		if (n == 8 && num1 > 1900 &&
    -+		    num2 > 0 && num2 <= 12 &&
    -+		    num3 > 0  && num3 <= 31) {
    -+			tm->tm_year = num1 - 1900;
    -+			tm->tm_mon  = num2 - 1;
    -+			tm->tm_mday = num3;
    -+		} else if (n == 6 && num1 < 60 && num2 < 60 && num3 <= 60) {
    -+			tm->tm_hour = num1;
    -+			tm->tm_min  = num2;
    -+			tm->tm_sec  = num3;
    -+			if (*end == '.' && isdigit(end[1]))
    -+				strtoul(end + 1, &end, 10);
    -+		}
    ++		if (n == 8)
    ++			set_date(num1, num2, num3, NULL, time(NULL), tm);
    ++		else if (n == 6 && set_time(num1, num2, num3, tm) == 0 &&
    ++			 *end == '.' && isdigit(end[1]))
    ++			strtoul(end + 1, &end, 10);
     +		return end - date;
     +	}
     +
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v4 1/4] date.c: s/is_date/set_date/
  2020-04-23 13:52   ` [PATCH v4 0/4] More ISO-8601 support Đoàn Trần Công Danh
@ 2020-04-23 13:52     ` Đoàn Trần Công Danh
  2020-04-23 20:08       ` Junio C Hamano
  2020-04-23 13:52     ` [PATCH v4 2/4] date.c: validate and set time in a helper function Đoàn Trần Công Danh
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-23 13:52 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

The function is_date, confusingly also set tm_year. tm_mon, and tm_mday
after validating input.

Rename to set_date to reflect its real usage.

Also, change return value is 0 on success and -1 on failure following
our convention on function do some real work.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 date.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/date.c b/date.c
index b0d9a8421d..b67c5abe24 100644
--- a/date.c
+++ b/date.c
@@ -497,7 +497,7 @@ static int match_alpha(const char *date, struct tm *tm, int *offset)
 	return skip_alpha(date);
 }
 
-static int is_date(int year, int month, int day, struct tm *now_tm, time_t now, struct tm *tm)
+static int set_date(int year, int month, int day, struct tm *now_tm, time_t now, struct tm *tm)
 {
 	if (month > 0 && month < 13 && day > 0 && day < 32) {
 		struct tm check = *tm;
@@ -518,9 +518,9 @@ static int is_date(int year, int month, int day, struct tm *now_tm, time_t now,
 		else if (year < 38)
 			r->tm_year = year + 100;
 		else
-			return 0;
+			return -1;
 		if (!now_tm)
-			return 1;
+			return 0;
 
 		specified = tm_to_time_t(r);
 
@@ -529,14 +529,14 @@ static int is_date(int year, int month, int day, struct tm *now_tm, time_t now,
 		 * sure it is not later than ten days from now...
 		 */
 		if ((specified != -1) && (now + 10*24*3600 < specified))
-			return 0;
+			return -1;
 		tm->tm_mon = r->tm_mon;
 		tm->tm_mday = r->tm_mday;
 		if (year != -1)
 			tm->tm_year = r->tm_year;
-		return 1;
+		return 0;
 	}
-	return 0;
+	return -1;
 }
 
 static int match_multi_number(timestamp_t num, char c, const char *date,
@@ -575,10 +575,10 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 
 		if (num > 70) {
 			/* yyyy-mm-dd? */
-			if (is_date(num, num2, num3, NULL, now, tm))
+			if (set_date(num, num2, num3, NULL, now, tm) == 0)
 				break;
 			/* yyyy-dd-mm? */
-			if (is_date(num, num3, num2, NULL, now, tm))
+			if (set_date(num, num3, num2, NULL, now, tm) == 0)
 				break;
 		}
 		/* Our eastern European friends say dd.mm.yy[yy]
@@ -586,14 +586,14 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 		 * mm/dd/yy[yy] form only when separator is not '.'
 		 */
 		if (c != '.' &&
-		    is_date(num3, num, num2, refuse_future, now, tm))
+		    set_date(num3, num, num2, refuse_future, now, tm) == 0)
 			break;
 		/* European dd.mm.yy[yy] or funny US dd/mm/yy[yy] */
-		if (is_date(num3, num2, num, refuse_future, now, tm))
+		if (set_date(num3, num2, num, refuse_future, now, tm) == 0)
 			break;
 		/* Funny European mm.dd.yy */
 		if (c == '.' &&
-		    is_date(num3, num, num2, refuse_future, now, tm))
+		    set_date(num3, num, num2, refuse_future, now, tm) == 0)
 			break;
 		return 0;
 	}
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v4 2/4] date.c: validate and set time in a helper function
  2020-04-23 13:52   ` [PATCH v4 0/4] More ISO-8601 support Đoàn Trần Công Danh
  2020-04-23 13:52     ` [PATCH v4 1/4] date.c: s/is_date/set_date/ Đoàn Trần Công Danh
@ 2020-04-23 13:52     ` Đoàn Trần Công Danh
  2020-04-23 20:18       ` Junio C Hamano
  2020-04-23 13:52     ` [PATCH v4 3/4] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
  2020-04-23 13:52     ` [PATCH v4 4/4] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
  3 siblings, 1 reply; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-23 13:52 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

In a later patch, we will reuse this logic, move it to a helper, now.

While we're at it, explicit states that we intentionally ignore
old-and-defective 2nd leap second.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 date.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/date.c b/date.c
index b67c5abe24..f5d5a91208 100644
--- a/date.c
+++ b/date.c
@@ -539,6 +539,22 @@ static int set_date(int year, int month, int day, struct tm *now_tm, time_t now,
 	return -1;
 }
 
+static int set_time(long hour, long minute, long second, struct tm *tm)
+{
+	/* C90 and old POSIX accepts 2 leap seconds, it's a defect,
+	 * ignore second number 61
+	 */
+	if (0 <= hour && hour <= 24 &&
+	    0 <= minute && minute < 60 &&
+	    0 <= second && second <= 60) {
+		tm->tm_hour = hour;
+		tm->tm_min = minute;
+		tm->tm_sec = second;
+		return 0;
+	}
+	return -1;
+}
+
 static int match_multi_number(timestamp_t num, char c, const char *date,
 			      char *end, struct tm *tm, time_t now)
 {
@@ -556,12 +572,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 	case ':':
 		if (num3 < 0)
 			num3 = 0;
-		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
-			tm->tm_hour = num;
-			tm->tm_min = num2;
-			tm->tm_sec = num3;
+		if (set_time(num, num2, num3, tm) == 0)
 			break;
-		}
 		return 0;
 
 	case '-':
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v4 3/4] date.c: skip fractional second part of ISO-8601
  2020-04-23 13:52   ` [PATCH v4 0/4] More ISO-8601 support Đoàn Trần Công Danh
  2020-04-23 13:52     ` [PATCH v4 1/4] date.c: s/is_date/set_date/ Đoàn Trần Công Danh
  2020-04-23 13:52     ` [PATCH v4 2/4] date.c: validate and set time in a helper function Đoàn Trần Công Danh
@ 2020-04-23 13:52     ` Đoàn Trần Công Danh
  2020-04-23 20:29       ` Junio C Hamano
  2020-04-23 13:52     ` [PATCH v4 4/4] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
  3 siblings, 1 reply; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-23 13:52 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh, Brian M . Carlson

git-commit(1) says ISO-8601 is one of our supported date format.

ISO-8601 allows timestamps to have a fractional number of seconds.
We represent time only in terms of whole seconds, so we never bothered
parsing fractional seconds. However, it's better for us to parse and
throw away the fractional part than to refuse to parse the timestamp
at all.

And refusing parsing fractional second part may confuse the parse to
think fractional and timezone as day and month in this example:

	2008-02-14 20:30:45.019-04:00

While doing this, make sure that we only interpret the number after the
second and the dot as fractional when and only when the date is known,
since only ISO-8601 allows the fractional part, and we've taught our
users to interpret "12:34:56.7.days.ago" as a way to specify a time
relative to current time.

Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 Documentation/date-formats.txt | 5 ++++-
 date.c                         | 8 +++++++-
 t/t0006-date.sh                | 3 +++
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/Documentation/date-formats.txt b/Documentation/date-formats.txt
index 6926e0a4c8..7e7eaba643 100644
--- a/Documentation/date-formats.txt
+++ b/Documentation/date-formats.txt
@@ -20,7 +20,10 @@ RFC 2822::
 ISO 8601::
 	Time and date specified by the ISO 8601 standard, for example
 	`2005-04-07T22:13:13`. The parser accepts a space instead of the
-	`T` character as well.
+	`T` character as well. Fractional parts of a second will be ignored,
+	for example `2005-04-07T22:13:13.019` will be treated as
+	`2005-04-07T22:13:13`
+
 +
 NOTE: In addition, the date part is accepted in the following formats:
 `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
diff --git a/date.c b/date.c
index f5d5a91208..5d635031e0 100644
--- a/date.c
+++ b/date.c
@@ -570,8 +570,14 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 	/* Time? Date? */
 	switch (c) {
 	case ':':
-		if (num3 < 0)
+		if (num3 < 0) {
 			num3 = 0;
+		} else if (*end == '.' && isdigit(end[1]) &&
+			   tm->tm_year != -1 && tm->tm_mon != -1 && tm->tm_mday != -1 &&
+			   set_time(num, num2, num3, tm) == 0) {
+			/* %Y%m%d is known, ignore fractional <num4> in HHMMSS.<num4> */
+			strtol(end + 1, &end, 10);
+		}
 		if (set_time(num, num2, num3, tm) == 0)
 			break;
 		return 0;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index d9fcc829a9..80917c81c3 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -81,6 +81,8 @@ check_parse 2008-02 bad
 check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -5:' '2008-02-14 20:30:45 +0000'
@@ -103,6 +105,7 @@ check_approxidate 5.seconds.ago '2009-08-30 19:19:55'
 check_approxidate 10.minutes.ago '2009-08-30 19:10:00'
 check_approxidate yesterday '2009-08-29 19:20:00'
 check_approxidate 3.days.ago '2009-08-27 19:20:00'
+check_approxidate '12:34:56.3.days.ago' '2009-08-27 12:34:56'
 check_approxidate 3.weeks.ago '2009-08-09 19:20:00'
 check_approxidate 3.months.ago '2009-05-30 19:20:00'
 check_approxidate 2.years.3.months.ago '2007-05-30 19:20:00'
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v4 4/4] date.c: allow compact version of ISO-8601 datetime
  2020-04-23 13:52   ` [PATCH v4 0/4] More ISO-8601 support Đoàn Trần Công Danh
                       ` (2 preceding siblings ...)
  2020-04-23 13:52     ` [PATCH v4 3/4] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
@ 2020-04-23 13:52     ` Đoàn Trần Công Danh
  3 siblings, 0 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-23 13:52 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 date.c          | 14 ++++++++++++++
 t/t0006-date.sh |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/date.c b/date.c
index 5d635031e0..0c66e5a5ae 100644
--- a/date.c
+++ b/date.c
@@ -682,6 +682,20 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
 		n++;
 	} while (isdigit(date[n]));
 
+	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
+	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
+	if (n == 8 || n == 6) {
+		unsigned int num1 = num / 10000;
+		unsigned int num2 = (num % 10000) / 100;
+		unsigned int num3 = num % 100;
+		if (n == 8)
+			set_date(num1, num2, num3, NULL, time(NULL), tm);
+		else if (n == 6 && set_time(num1, num2, num3, tm) == 0 &&
+			 *end == '.' && isdigit(end[1]))
+			strtoul(end + 1, &end, 10);
+		return end - date;
+	}
+
 	/* Four-digit year or a timezone? */
 	if (n == 4) {
 		if (num <= 1400 && *offset == -1) {
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 80917c81c3..75ee9a96b8 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -82,6 +82,9 @@ check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
 check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '20080214T203045-04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045 -04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-23  1:18         ` Danh Doan
@ 2020-04-23 19:28           ` Junio C Hamano
  2020-04-23 20:41             ` Philip Oakley
  0 siblings, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2020-04-23 19:28 UTC (permalink / raw)
  To: Danh Doan; +Cc: git, Brian M . Carlson

Danh Doan <congdanhqx@gmail.com> writes:

>> >  		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
>> >  			tm->tm_hour = num;
>> >  			tm->tm_min = num2;
>> 
>> And after all that is done, if <num2> (and others) are within a
>> reasonable range, we use that as HH:MM:SS.  
>> 
>> OK.  If <num2> (or <num3>, or even <num> for that matter) weren't
>> reasonable, is it still sensible to discard the fractional part?
>> The answer is not immediately obvious to me.
>> 
>> To be safe, it might make sense to extract a helper function from
>> the next conditional, i.e.
>> 
>> static int is_hms(long num1, long num2, long num3)
>
> I'll make it `is_time` on par with is_date check.

That is probably a lot more readable name than is_hms().  

I do not worry too much if the name is not "on par with" it, though,
because is_date() does more than just "check", as you noticed below,
unlike the "is hour-minute-seconds are within reasonable range?"
check.

> I'll look into it and check if int or long is suitable for parameter's
> type.
>
>> {
>> 	return (0 <= num1 && num1 < 25 &&
>> 		0 <= num2 && num2 < 60 &&
>> 		0 <= num3 && num3 <= 60);
>
> Does it worth to add an explicit comment that we intentionally ignore
> the old-and-defective 2nd leap seconds (i.e. second with value 61)?
>
> I saw in one of your previous email doubting if we should check for
> `num3 <= 61` or not.

I wrote that without checking anything, even what our own code does.
As the existing code seems to want to work only with a second part
that is 60 or less, not allowing a minute with 62 seconds, I think
sticking to that and saying "0 <= num3 && num3 <= 60" is good.

>> }
>> 
>> and use it in the new "else if" block like so?
>> 
>> 
>> 	} else if (*end == '.' && isdigit(end[1]) &&
>> 		   is_date(tm->tm_year, tm->tm_mon, tm->tm_mday, NULL, now, tm) &&
>
> When running into this, the code patch for non-approxidate hasn't
> initialised value for now, yet.

We may want to separate the logic that relies on the value of 'now'
and 'now_tm->tm_year' out of is_date() to make it more easily
reusable.  In this generic codepath, for example, we do not
necessarily want to say "we refuse to parse timestamp that is 10
days or more into the future".

The longer I stare at is_date(), the more I am inclined to say it is
a bit of impedance mismatch, and we instead should have the is_hms()
helper as I suggested in the message you are responding to, plus
something like:

static int is_ymd(int year, int month, int day)
{
        /* like tm_to_time_t(), we only work between 1970-2099 */
	static const int days_in_month[] = {
		31, 28, 31, 30, ..., 31
	};

	return !(year < 1970 || 2100 <= year ||
		month <= 0 || 13 <= month ||
		day <= 0 || year / 4 + days_in_month[month-1] <= day);
}

and use it here.  I am not sure if we can reuse this inside
is_date(), but if we can do so that would be good (and if we cannot,
that is fine, too).

Thanks.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v4 1/4] date.c: s/is_date/set_date/
  2020-04-23 13:52     ` [PATCH v4 1/4] date.c: s/is_date/set_date/ Đoàn Trần Công Danh
@ 2020-04-23 20:08       ` Junio C Hamano
  0 siblings, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2020-04-23 20:08 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> The function is_date, confusingly also set tm_year. tm_mon, and tm_mday
> after validating input.
>
> Rename to set_date to reflect its real usage.

Yup, I think I can agree with the reasoning here.  It is a good
clean-up regardless of the remainder of the series.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v4 2/4] date.c: validate and set time in a helper function
  2020-04-23 13:52     ` [PATCH v4 2/4] date.c: validate and set time in a helper function Đoàn Trần Công Danh
@ 2020-04-23 20:18       ` Junio C Hamano
  2020-04-24 11:43         ` Danh Doan
  0 siblings, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2020-04-23 20:18 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> In a later patch, we will reuse this logic, move it to a helper, now.
>
> While we're at it, explicit states that we intentionally ignore

"explicitly state", perhaps.

> old-and-defective 2nd leap second.
>
> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> ---
>  date.c | 22 +++++++++++++++++-----
>  1 file changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/date.c b/date.c
> index b67c5abe24..f5d5a91208 100644
> --- a/date.c
> +++ b/date.c
> @@ -539,6 +539,22 @@ static int set_date(int year, int month, int day, struct tm *now_tm, time_t now,
>  	return -1;
>  }
>  
> +static int set_time(long hour, long minute, long second, struct tm *tm)
> +{
> +	/* C90 and old POSIX accepts 2 leap seconds, it's a defect,
> +	 * ignore second number 61
> +	 */

	/*
	 * Style: our multi-line comments ought to be
	 * formatted like this.  Slash-asterisk that opens,
	 * and asterisk-slash that closes, are both on their
	 * own lines.
	 */

But I am not sure we want to even have a new comment here.  After
all we are extracting/reinventing exactly the same logic as the
original.  Why we allow "60" might be worth commenting, but if a
minute that has 62 seconds is a mere historical curiosity, then is
it worth explaining why "61", which we never even wrote in the code,
is missing from here?

> +	if (0 <= hour && hour <= 24 &&
> +	    0 <= minute && minute < 60 &&
> +	    0 <= second && second <= 60) {
> +		tm->tm_hour = hour;
> +		tm->tm_min = minute;
> +		tm->tm_sec = second;
> +		return 0;
> +	}
> +	return -1;
> +}

I am a bit surprised to see that you chose to unify with the "check
and set" interface of is_date (now set_date).  I was expecting to
see that we'd have "check-only" helper functions.

This is not a complaint, at least not yet until we see the result of
using it in new code; it may very well be possible that the "check
and set" interface would make the new caller(s) clearer.

>  static int match_multi_number(timestamp_t num, char c, const char *date,
>  			      char *end, struct tm *tm, time_t now)
>  {
> @@ -556,12 +572,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
>  	case ':':
>  		if (num3 < 0)
>  			num3 = 0;
> -		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
> -			tm->tm_hour = num;
> -			tm->tm_min = num2;
> -			tm->tm_sec = num3;
> +		if (set_time(num, num2, num3, tm) == 0)
>  			break;
> -		}
>  		return 0;

This caller does become easier to follow, I would say.  Nicely done.

>  	case '-':

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v4 3/4] date.c: skip fractional second part of ISO-8601
  2020-04-23 13:52     ` [PATCH v4 3/4] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
@ 2020-04-23 20:29       ` Junio C Hamano
  0 siblings, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2020-04-23 20:29 UTC (permalink / raw)
  To: Đoàn Trần Công Danh; +Cc: git, Brian M . Carlson

Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:

> -		if (num3 < 0)
> +		if (num3 < 0) {
>  			num3 = 0;
> +		} else if (*end == '.' && isdigit(end[1]) &&
> +			   tm->tm_year != -1 && tm->tm_mon != -1 && tm->tm_mday != -1 &&
> +			   set_time(num, num2, num3, tm) == 0) {
> +			/* %Y%m%d is known, ignore fractional <num4> in HHMMSS.<num4> */
> +			strtol(end + 1, &end, 10);
> +		}
>  		if (set_time(num, num2, num3, tm) == 0)

Hmmm.  

While calling set_time() on the same 4-tuple might be idempotent
when it succeeds, the call with a potential new side effect in the
conditional part leaves a bad taste in my mouth.

I am wondering if the follwoing would be a more readable
alternative.

 date.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/date.c b/date.c
index f5d5a91208..3eb9b6032c 100644
--- a/date.c
+++ b/date.c
@@ -572,8 +572,16 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 	case ':':
 		if (num3 < 0)
 			num3 = 0;
-		if (set_time(num, num2, num3, tm) == 0)
+		if (set_time(num, num2, num3, tm) == 0) {
+			/* 
+			 * Is the "HH:MM:SS" we just parsed followed by
+			 *  ".<num4>" (fractinal second)?  Discard it
+			 * only when we already have YY/MM/DD.
+			 */
+			if (*end == '.' && isdigit(end[1]) && date_known(tm))
+				(void) strtol(end + 1, &end, 10);
 			break;
+		}
 		return 0;
 
 	case '-':

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-23 19:28           ` Junio C Hamano
@ 2020-04-23 20:41             ` Philip Oakley
  2020-04-24  0:07               ` Danh Doan
  0 siblings, 1 reply; 49+ messages in thread
From: Philip Oakley @ 2020-04-23 20:41 UTC (permalink / raw)
  To: Junio C Hamano, Danh Doan; +Cc: git, Brian M . Carlson

On 23/04/2020 20:28, Junio C Hamano wrote:
> Danh Doan <congdanhqx@gmail.com> writes:
>
>>>>  		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
>>>>  			tm->tm_hour = num;
>>>>  			tm->tm_min = num2;
>>> And after all that is done, if <num2> (and others) are within a
>>> reasonable range, we use that as HH:MM:SS.  
>>>
>>> OK.  If <num2> (or <num3>, or even <num> for that matter) weren't
>>> reasonable, is it still sensible to discard the fractional part?
>>> The answer is not immediately obvious to me.
>>>
>>> To be safe, it might make sense to extract a helper function from
>>> the next conditional, i.e.
>>>
>>> static int is_hms(long num1, long num2, long num3)
>> I'll make it `is_time` on par with is_date check.
> That is probably a lot more readable name than is_hms().  

I didn't immediately recognise hms and ymd (below) as abbreviations.
>
> I do not worry too much if the name is not "on par with" it, though,
> because is_date() does more than just "check", as you noticed below,
> unlike the "is hour-minute-seconds are within reasonable range?"
> check.
>
>> I'll look into it and check if int or long is suitable for parameter's
>> type.
>>
>>> {
>>> 	return (0 <= num1 && num1 < 25 &&
>>> 		0 <= num2 && num2 < 60 &&
>>> 		0 <= num3 && num3 <= 60);
>> Does it worth to add an explicit comment that we intentionally ignore
>> the old-and-defective 2nd leap seconds (i.e. second with value 61)?
>>
>> I saw in one of your previous email doubting if we should check for
>> `num3 <= 61` or not.
> I wrote that without checking anything, even what our own code does.
> As the existing code seems to want to work only with a second part
> that is 60 or less, not allowing a minute with 62 seconds, I think
> sticking to that and saying "0 <= num3 && num3 <= 60" is good.
>
>>> }
>>>
>>> and use it in the new "else if" block like so?
>>>
>>>
>>> 	} else if (*end == '.' && isdigit(end[1]) &&
>>> 		   is_date(tm->tm_year, tm->tm_mon, tm->tm_mday, NULL, now, tm) &&
>> When running into this, the code patch for non-approxidate hasn't
>> initialised value for now, yet.
> We may want to separate the logic that relies on the value of 'now'
> and 'now_tm->tm_year' out of is_date() to make it more easily
> reusable.  In this generic codepath, for example, we do not
> necessarily want to say "we refuse to parse timestamp that is 10
> days or more into the future".
>
> The longer I stare at is_date(), the more I am inclined to say it is
> a bit of impedance mismatch, and we instead should have the is_hms()
> helper as I suggested in the message you are responding to, plus
> something like:

Would is_hhmmss() and is_yyyymmdd() be more obvious abbreviations for
most readers?

Now that I type them, they do feel that bit too long... , as naming is
hard, maybe stick with the yms and hms, though I do keep wanting to type
ytd for the former..
>
> static int is_ymd(int year, int month, int day)
> {
>         /* like tm_to_time_t(), we only work between 1970-2099 */
> 	static const int days_in_month[] = {
> 		31, 28, 31, 30, ..., 31
> 	};
>
> 	return !(year < 1970 || 2100 <= year ||
> 		month <= 0 || 13 <= month ||
> 		day <= 0 || year / 4 + days_in_month[month-1] <= day);
> }
>
> and use it here.  I am not sure if we can reuse this inside
> is_date(), but if we can do so that would be good (and if we cannot,
> that is fine, too).
>
> Thanks.
--
Philip

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-23 20:41             ` Philip Oakley
@ 2020-04-24  0:07               ` Danh Doan
  2020-04-24  0:46                 ` Junio C Hamano
  2020-04-24 17:30                 ` Philip Oakley
  0 siblings, 2 replies; 49+ messages in thread
From: Danh Doan @ 2020-04-24  0:07 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Junio C Hamano, git, Brian M . Carlson

On 2020-04-23 21:41:49+0100, Philip Oakley <philipoakley@iee.email> wrote:
> On 23/04/2020 20:28, Junio C Hamano wrote:
> > Danh Doan <congdanhqx@gmail.com> writes:
> Would is_hhmmss() and is_yyyymmdd() be more obvious abbreviations for
> most readers?
> 
> Now that I type them, they do feel that bit too long... , as naming is
> hard, maybe stick with the yms and hms, though I do keep wanting to type
> ytd for the former..

Not sure if I interpret your opinion correctly,
Did you mean s/yms/ymd/ and s/ytd/ymd/?

Even that, I couldn't grasp the meaning of the last phase?

-- 
Danh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-24  0:07               ` Danh Doan
@ 2020-04-24  0:46                 ` Junio C Hamano
  2020-04-24 17:32                   ` Philip Oakley
  2020-04-24 17:30                 ` Philip Oakley
  1 sibling, 1 reply; 49+ messages in thread
From: Junio C Hamano @ 2020-04-24  0:46 UTC (permalink / raw)
  To: Danh Doan; +Cc: Philip Oakley, git, Brian M . Carlson

Danh Doan <congdanhqx@gmail.com> writes:

> On 2020-04-23 21:41:49+0100, Philip Oakley <philipoakley@iee.email> wrote:
>> On 23/04/2020 20:28, Junio C Hamano wrote:
>> > Danh Doan <congdanhqx@gmail.com> writes:
>> Would is_hhmmss() and is_yyyymmdd() be more obvious abbreviations for
>> most readers?
>> 
>> Now that I type them, they do feel that bit too long... , as naming is
>> hard, maybe stick with the yms and hms, though I do keep wanting to type
>> ytd for the former..
>
> Not sure if I interpret your opinion correctly,
> Did you mean s/yms/ymd/ and s/ytd/ymd/?
>
> Even that, I couldn't grasp the meaning of the last phase?

Here is how I understood it.

Philip thinks, and I admit I have to agree with, that HMS would not
be understood as hour-minute-seconds by most people, and YMD would
not be as yearh-month-day, either.

His "yms" in "stick with the yms and hms" is a typo of "ymd".  He is
saying that even though YYMMDD and HHMMSS would look a lot more
natural, it is too long to type so YMD and HMS may not be so
terrible a compromise.

With the "ytd" in the last one, he is saying that another downside
of saying "ymd" (other than that it is not how we usually spell
year-month-date), even though "ymd" might be an acceptable
compromise, is that it is too easy to get confused with year-to-date
that is commonly abbreviated as "YTD".

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v4 2/4] date.c: validate and set time in a helper function
  2020-04-23 20:18       ` Junio C Hamano
@ 2020-04-24 11:43         ` Danh Doan
  2020-04-24 20:29           ` Junio C Hamano
  0 siblings, 1 reply; 49+ messages in thread
From: Danh Doan @ 2020-04-24 11:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 2020-04-23 13:18:25-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Đoàn Trần Công Danh  <congdanhqx@gmail.com> writes:
> 
> > In a later patch, we will reuse this logic, move it to a helper, now.
> >
> > While we're at it, explicit states that we intentionally ignore
> 
> "explicitly state", perhaps.
> 
> > old-and-defective 2nd leap second.
> >
> > Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
> > ---
> >  date.c | 22 +++++++++++++++++-----
> >  1 file changed, 17 insertions(+), 5 deletions(-)
> >
> > diff --git a/date.c b/date.c
> > index b67c5abe24..f5d5a91208 100644
> > --- a/date.c
> > +++ b/date.c
> > @@ -539,6 +539,22 @@ static int set_date(int year, int month, int day, struct tm *now_tm, time_t now,
> >  	return -1;
> >  }
> >  
> > +static int set_time(long hour, long minute, long second, struct tm *tm)
> > +{
> > +	/* C90 and old POSIX accepts 2 leap seconds, it's a defect,
> > +	 * ignore second number 61
> > +	 */
> 
> 	/*
> 	 * Style: our multi-line comments ought to be
> 	 * formatted like this.  Slash-asterisk that opens,
> 	 * and asterisk-slash that closes, are both on their
> 	 * own lines.
> 	 */
> 
> But I am not sure we want to even have a new comment here.  After
> all we are extracting/reinventing exactly the same logic as the
> original.  Why we allow "60" might be worth commenting, but if a
> minute that has 62 seconds is a mere historical curiosity, then is
> it worth explaining why "61", which we never even wrote in the code,
> is missing from here?

I think single line like:

	/* We accept 61st second for the single? leap second */

Or something along the time, is good enough. Not sure if we want the
word "single" there, though.

I think majority of people don't even know about leap second.
Probability that know about 62nd second is rarer, I think.

> > +	if (0 <= hour && hour <= 24 &&
> > +	    0 <= minute && minute < 60 &&
> > +	    0 <= second && second <= 60) {
> > +		tm->tm_hour = hour;
> > +		tm->tm_min = minute;
> > +		tm->tm_sec = second;
> > +		return 0;
> > +	}
> > +	return -1;
> > +}
> 
> I am a bit surprised to see that you chose to unify with the "check
> and set" interface of is_date (now set_date).  I was expecting to
> see that we'd have "check-only" helper functions.
> 
> This is not a complaint, at least not yet until we see the result of
> using it in new code; it may very well be possible that the "check
> and set" interface would make the new caller(s) clearer.
> 
> >  static int match_multi_number(timestamp_t num, char c, const char *date,
> >  			      char *end, struct tm *tm, time_t now)
> >  {
> > @@ -556,12 +572,8 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
> >  	case ':':
> >  		if (num3 < 0)
> >  			num3 = 0;
> > -		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
> > -			tm->tm_hour = num;
> > -			tm->tm_min = num2;
> > -			tm->tm_sec = num3;
> > +		if (set_time(num, num2, num3, tm) == 0)
> >  			break;
> > -		}
> >  		return 0;
> 
> This caller does become easier to follow, I would say.  Nicely done.

Yes, when I looked around date.c

I saw that the only usecase for validate time is for setting it.
And the incoming patch also has that usage.

I chose to unify those code path to not buy me too much trouble.

I'll take that "Nicely done" means this unification is OK for this
series.

-- 
Danh

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v5 0/4] More ISO-8601 support
  2020-04-14  9:31 ` [PATCH 0/2] More ISO-8601 support Đoàn Trần Công Danh
                     ` (5 preceding siblings ...)
  2020-04-23 13:52   ` [PATCH v4 0/4] More ISO-8601 support Đoàn Trần Công Danh
@ 2020-04-24 15:07   ` Đoàn Trần Công Danh
  2020-04-24 15:07     ` [PATCH v5 1/4] date.c: s/is_date/set_date/ Đoàn Trần Công Danh
                       ` (3 more replies)
  6 siblings, 4 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-24 15:07 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, brian m. carlson,
	Jeff King, Torsten Bögershausen, Junio C Hamano

This series aims to extend support for ISO-8601 datetime format
to allow compact version, and fractional part of ISO-8601.

Change in v5 from v4:
* cleanup [3/4] following Junio's suggestion.

Changes in v4 from v3:
* s/is_date/set_date/ the function's name suggest it only does validation,
  but it does more than that. Junio suggested to me to use it for validation,
  When I looked more into it, I think it's better to not use it, and rename
  the function to reduce the confusion
* Extract the validate and set time to its own function
* Correct a check for time in compact ISO-8601

Changes in v3 from v2:
* Add example for fractional parts of second in documentation
* Add/Fix regression test on 12:34:56.7.days.ago

Đoàn Trần Công Danh (4):
  date.c: s/is_date/set_date/
  date.c: validate and set time in a helper function
  date.c: skip fractional second part of ISO-8601
  date.c: allow compact version of ISO-8601 datetime

Đoàn Trần Công Danh (4):
  date.c: s/is_date/set_date/
  date.c: validate and set time in a helper function
  date.c: skip fractional second part of ISO-8601
  date.c: allow compact version of ISO-8601 datetime

 Documentation/date-formats.txt |  5 ++-
 date.c                         | 67 ++++++++++++++++++++++++++--------
 t/t0006-date.sh                |  6 +++
 3 files changed, 62 insertions(+), 16 deletions(-)

Range-diff against v4:
1:  1fe69008fc = 1:  1fe69008fc date.c: s/is_date/set_date/
2:  0d0e4d8edc ! 2:  a6b97b19f2 date.c: validate and set time in a helper function
    @@ Commit message
     
         Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
     
    +
    + ## Notes ##
    +    I intentionally leave a pair of bracket around if (set_time(...))
    +    to reduce the noise in next patch
    +
      ## date.c ##
     @@ date.c: static int set_date(int year, int month, int day, struct tm *now_tm, time_t now,
      	return -1;
    @@ date.c: static int set_date(int year, int month, int day, struct tm *now_tm, tim
      
     +static int set_time(long hour, long minute, long second, struct tm *tm)
     +{
    -+	/* C90 and old POSIX accepts 2 leap seconds, it's a defect,
    -+	 * ignore second number 61
    -+	 */
    ++	/* We accept 61st second because of leap second */
     +	if (0 <= hour && hour <= 24 &&
     +	    0 <= minute && minute < 60 &&
     +	    0 <= second && second <= 60) {
    @@ date.c: static int match_multi_number(timestamp_t num, char c, const char *date,
     -			tm->tm_hour = num;
     -			tm->tm_min = num2;
     -			tm->tm_sec = num3;
    -+		if (set_time(num, num2, num3, tm) == 0)
    ++		if (set_time(num, num2, num3, tm) == 0) {
      			break;
    --		}
    + 		}
      		return 0;
    - 
    - 	case '-':
3:  8b18d0ee5d ! 3:  f21aa2dcf5 date.c: skip fractional second part of ISO-8601
    @@ Commit message
         relative to current time.
     
         Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
    +    Helped-by: Junio C Hamano <gitster@pobox.com>
         Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
     
      ## Documentation/date-formats.txt ##
    @@ Documentation/date-formats.txt: RFC 2822::
      `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
     
      ## date.c ##
    +@@ date.c: static int set_time(long hour, long minute, long second, struct tm *tm)
    + 	return -1;
    + }
    + 
    ++static int is_date_known(struct tm *tm)
    ++{
    ++	return tm->tm_year != -1 && tm->tm_mon != -1 && tm->tm_mday != -1;
    ++}
    ++
    + static int match_multi_number(timestamp_t num, char c, const char *date,
    + 			      char *end, struct tm *tm, time_t now)
    + {
     @@ date.c: static int match_multi_number(timestamp_t num, char c, const char *date,
    - 	/* Time? Date? */
    - 	switch (c) {
    - 	case ':':
    --		if (num3 < 0)
    -+		if (num3 < 0) {
    + 		if (num3 < 0)
      			num3 = 0;
    -+		} else if (*end == '.' && isdigit(end[1]) &&
    -+			   tm->tm_year != -1 && tm->tm_mon != -1 && tm->tm_mday != -1 &&
    -+			   set_time(num, num2, num3, tm) == 0) {
    -+			/* %Y%m%d is known, ignore fractional <num4> in HHMMSS.<num4> */
    -+			strtol(end + 1, &end, 10);
    -+		}
    - 		if (set_time(num, num2, num3, tm) == 0)
    + 		if (set_time(num, num2, num3, tm) == 0) {
    ++			/*
    ++			 * If %H:%M:%S was just parsed followed by: .<num4>
    ++			 * Consider (& discard) it as fractional second
    ++			 * if %Y%m%d is parsed before.
    ++			 */
    ++			if (*end == '.' && isdigit(end[1]) && is_date_known(tm))
    ++				strtol(end + 1, &end, 10);
      			break;
    + 		}
      		return 0;
     
      ## t/t0006-date.sh ##
4:  2812439a26 = 4:  51aa60c069 date.c: allow compact version of ISO-8601 datetime
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v5 1/4] date.c: s/is_date/set_date/
  2020-04-24 15:07   ` [PATCH v5 0/4] More ISO-8601 support Đoàn Trần Công Danh
@ 2020-04-24 15:07     ` Đoàn Trần Công Danh
  2020-04-24 15:07     ` [PATCH v5 2/4] date.c: validate and set time in a helper function Đoàn Trần Công Danh
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-24 15:07 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

The function is_date, confusingly also set tm_year. tm_mon, and tm_mday
after validating input.

Rename to set_date to reflect its real usage.

Also, change return value is 0 on success and -1 on failure following
our convention on function do some real work.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 date.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/date.c b/date.c
index b0d9a8421d..b67c5abe24 100644
--- a/date.c
+++ b/date.c
@@ -497,7 +497,7 @@ static int match_alpha(const char *date, struct tm *tm, int *offset)
 	return skip_alpha(date);
 }
 
-static int is_date(int year, int month, int day, struct tm *now_tm, time_t now, struct tm *tm)
+static int set_date(int year, int month, int day, struct tm *now_tm, time_t now, struct tm *tm)
 {
 	if (month > 0 && month < 13 && day > 0 && day < 32) {
 		struct tm check = *tm;
@@ -518,9 +518,9 @@ static int is_date(int year, int month, int day, struct tm *now_tm, time_t now,
 		else if (year < 38)
 			r->tm_year = year + 100;
 		else
-			return 0;
+			return -1;
 		if (!now_tm)
-			return 1;
+			return 0;
 
 		specified = tm_to_time_t(r);
 
@@ -529,14 +529,14 @@ static int is_date(int year, int month, int day, struct tm *now_tm, time_t now,
 		 * sure it is not later than ten days from now...
 		 */
 		if ((specified != -1) && (now + 10*24*3600 < specified))
-			return 0;
+			return -1;
 		tm->tm_mon = r->tm_mon;
 		tm->tm_mday = r->tm_mday;
 		if (year != -1)
 			tm->tm_year = r->tm_year;
-		return 1;
+		return 0;
 	}
-	return 0;
+	return -1;
 }
 
 static int match_multi_number(timestamp_t num, char c, const char *date,
@@ -575,10 +575,10 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 
 		if (num > 70) {
 			/* yyyy-mm-dd? */
-			if (is_date(num, num2, num3, NULL, now, tm))
+			if (set_date(num, num2, num3, NULL, now, tm) == 0)
 				break;
 			/* yyyy-dd-mm? */
-			if (is_date(num, num3, num2, NULL, now, tm))
+			if (set_date(num, num3, num2, NULL, now, tm) == 0)
 				break;
 		}
 		/* Our eastern European friends say dd.mm.yy[yy]
@@ -586,14 +586,14 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 		 * mm/dd/yy[yy] form only when separator is not '.'
 		 */
 		if (c != '.' &&
-		    is_date(num3, num, num2, refuse_future, now, tm))
+		    set_date(num3, num, num2, refuse_future, now, tm) == 0)
 			break;
 		/* European dd.mm.yy[yy] or funny US dd/mm/yy[yy] */
-		if (is_date(num3, num2, num, refuse_future, now, tm))
+		if (set_date(num3, num2, num, refuse_future, now, tm) == 0)
 			break;
 		/* Funny European mm.dd.yy */
 		if (c == '.' &&
-		    is_date(num3, num, num2, refuse_future, now, tm))
+		    set_date(num3, num, num2, refuse_future, now, tm) == 0)
 			break;
 		return 0;
 	}
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v5 2/4] date.c: validate and set time in a helper function
  2020-04-24 15:07   ` [PATCH v5 0/4] More ISO-8601 support Đoàn Trần Công Danh
  2020-04-24 15:07     ` [PATCH v5 1/4] date.c: s/is_date/set_date/ Đoàn Trần Công Danh
@ 2020-04-24 15:07     ` Đoàn Trần Công Danh
  2020-04-24 15:07     ` [PATCH v5 3/4] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
  2020-04-24 15:07     ` [PATCH v5 4/4] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
  3 siblings, 0 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-24 15:07 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

In a later patch, we will reuse this logic, move it to a helper, now.

While we're at it, explicit states that we intentionally ignore
old-and-defective 2nd leap second.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---

Notes:
    I intentionally leave a pair of bracket around if (set_time(...))
    to reduce the noise in next patch

 date.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/date.c b/date.c
index b67c5abe24..fa39e5e8a5 100644
--- a/date.c
+++ b/date.c
@@ -539,6 +539,20 @@ static int set_date(int year, int month, int day, struct tm *now_tm, time_t now,
 	return -1;
 }
 
+static int set_time(long hour, long minute, long second, struct tm *tm)
+{
+	/* We accept 61st second because of leap second */
+	if (0 <= hour && hour <= 24 &&
+	    0 <= minute && minute < 60 &&
+	    0 <= second && second <= 60) {
+		tm->tm_hour = hour;
+		tm->tm_min = minute;
+		tm->tm_sec = second;
+		return 0;
+	}
+	return -1;
+}
+
 static int match_multi_number(timestamp_t num, char c, const char *date,
 			      char *end, struct tm *tm, time_t now)
 {
@@ -556,10 +570,7 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 	case ':':
 		if (num3 < 0)
 			num3 = 0;
-		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
-			tm->tm_hour = num;
-			tm->tm_min = num2;
-			tm->tm_sec = num3;
+		if (set_time(num, num2, num3, tm) == 0) {
 			break;
 		}
 		return 0;
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v5 3/4] date.c: skip fractional second part of ISO-8601
  2020-04-24 15:07   ` [PATCH v5 0/4] More ISO-8601 support Đoàn Trần Công Danh
  2020-04-24 15:07     ` [PATCH v5 1/4] date.c: s/is_date/set_date/ Đoàn Trần Công Danh
  2020-04-24 15:07     ` [PATCH v5 2/4] date.c: validate and set time in a helper function Đoàn Trần Công Danh
@ 2020-04-24 15:07     ` Đoàn Trần Công Danh
  2020-04-24 15:07     ` [PATCH v5 4/4] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
  3 siblings, 0 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-24 15:07 UTC (permalink / raw)
  To: git
  Cc: Đoàn Trần Công Danh, Brian M . Carlson,
	Junio C Hamano

git-commit(1) says ISO-8601 is one of our supported date format.

ISO-8601 allows timestamps to have a fractional number of seconds.
We represent time only in terms of whole seconds, so we never bothered
parsing fractional seconds. However, it's better for us to parse and
throw away the fractional part than to refuse to parse the timestamp
at all.

And refusing parsing fractional second part may confuse the parse to
think fractional and timezone as day and month in this example:

	2008-02-14 20:30:45.019-04:00

While doing this, make sure that we only interpret the number after the
second and the dot as fractional when and only when the date is known,
since only ISO-8601 allows the fractional part, and we've taught our
users to interpret "12:34:56.7.days.ago" as a way to specify a time
relative to current time.

Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 Documentation/date-formats.txt |  5 ++++-
 date.c                         | 12 ++++++++++++
 t/t0006-date.sh                |  3 +++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/Documentation/date-formats.txt b/Documentation/date-formats.txt
index 6926e0a4c8..7e7eaba643 100644
--- a/Documentation/date-formats.txt
+++ b/Documentation/date-formats.txt
@@ -20,7 +20,10 @@ RFC 2822::
 ISO 8601::
 	Time and date specified by the ISO 8601 standard, for example
 	`2005-04-07T22:13:13`. The parser accepts a space instead of the
-	`T` character as well.
+	`T` character as well. Fractional parts of a second will be ignored,
+	for example `2005-04-07T22:13:13.019` will be treated as
+	`2005-04-07T22:13:13`
+
 +
 NOTE: In addition, the date part is accepted in the following formats:
 `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`.
diff --git a/date.c b/date.c
index fa39e5e8a5..2c9071d53f 100644
--- a/date.c
+++ b/date.c
@@ -553,6 +553,11 @@ static int set_time(long hour, long minute, long second, struct tm *tm)
 	return -1;
 }
 
+static int is_date_known(struct tm *tm)
+{
+	return tm->tm_year != -1 && tm->tm_mon != -1 && tm->tm_mday != -1;
+}
+
 static int match_multi_number(timestamp_t num, char c, const char *date,
 			      char *end, struct tm *tm, time_t now)
 {
@@ -571,6 +576,13 @@ static int match_multi_number(timestamp_t num, char c, const char *date,
 		if (num3 < 0)
 			num3 = 0;
 		if (set_time(num, num2, num3, tm) == 0) {
+			/*
+			 * If %H:%M:%S was just parsed followed by: .<num4>
+			 * Consider (& discard) it as fractional second
+			 * if %Y%m%d is parsed before.
+			 */
+			if (*end == '.' && isdigit(end[1]) && is_date_known(tm))
+				strtol(end + 1, &end, 10);
 			break;
 		}
 		return 0;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index d9fcc829a9..80917c81c3 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -81,6 +81,8 @@ check_parse 2008-02 bad
 check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -5:' '2008-02-14 20:30:45 +0000'
@@ -103,6 +105,7 @@ check_approxidate 5.seconds.ago '2009-08-30 19:19:55'
 check_approxidate 10.minutes.ago '2009-08-30 19:10:00'
 check_approxidate yesterday '2009-08-29 19:20:00'
 check_approxidate 3.days.ago '2009-08-27 19:20:00'
+check_approxidate '12:34:56.3.days.ago' '2009-08-27 12:34:56'
 check_approxidate 3.weeks.ago '2009-08-09 19:20:00'
 check_approxidate 3.months.ago '2009-05-30 19:20:00'
 check_approxidate 2.years.3.months.ago '2007-05-30 19:20:00'
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v5 4/4] date.c: allow compact version of ISO-8601 datetime
  2020-04-24 15:07   ` [PATCH v5 0/4] More ISO-8601 support Đoàn Trần Công Danh
                       ` (2 preceding siblings ...)
  2020-04-24 15:07     ` [PATCH v5 3/4] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
@ 2020-04-24 15:07     ` Đoàn Trần Công Danh
  3 siblings, 0 replies; 49+ messages in thread
From: Đoàn Trần Công Danh @ 2020-04-24 15:07 UTC (permalink / raw)
  To: git; +Cc: Đoàn Trần Công Danh

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 date.c          | 14 ++++++++++++++
 t/t0006-date.sh |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/date.c b/date.c
index 2c9071d53f..f9ea807b3a 100644
--- a/date.c
+++ b/date.c
@@ -687,6 +687,20 @@ static int match_digit(const char *date, struct tm *tm, int *offset, int *tm_gmt
 		n++;
 	} while (isdigit(date[n]));
 
+	/* 8 digits, compact style of ISO-8601's date: YYYYmmDD */
+	/* 6 digits, compact style of ISO-8601's time: HHMMSS */
+	if (n == 8 || n == 6) {
+		unsigned int num1 = num / 10000;
+		unsigned int num2 = (num % 10000) / 100;
+		unsigned int num3 = num % 100;
+		if (n == 8)
+			set_date(num1, num2, num3, NULL, time(NULL), tm);
+		else if (n == 6 && set_time(num1, num2, num3, tm) == 0 &&
+			 *end == '.' && isdigit(end[1]))
+			strtoul(end + 1, &end, 10);
+		return end - date;
+	}
+
 	/* Four-digit year or a timezone? */
 	if (n == 4) {
 		if (num <= 1400 && *offset == -1) {
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 80917c81c3..75ee9a96b8 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -82,6 +82,9 @@ check_parse 2008-02-14 bad
 check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 +0000'
 check_parse '2008-02-14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
 check_parse '2008.02.14 20:30:45 -0500' '2008-02-14 20:30:45 -0500'
+check_parse '20080214T203045-04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045 -04:00' '2008-02-14 20:30:45 -0400'
+check_parse '20080214T203045.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45.019-04:00' '2008-02-14 20:30:45 -0400'
 check_parse '2008-02-14 20:30:45 -0015' '2008-02-14 20:30:45 -0015'
 check_parse '2008-02-14 20:30:45 -5' '2008-02-14 20:30:45 +0000'
-- 
2.26.2.384.g435bf60bd5


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-24  0:07               ` Danh Doan
  2020-04-24  0:46                 ` Junio C Hamano
@ 2020-04-24 17:30                 ` Philip Oakley
  1 sibling, 0 replies; 49+ messages in thread
From: Philip Oakley @ 2020-04-24 17:30 UTC (permalink / raw)
  To: Danh Doan; +Cc: Junio C Hamano, git, Brian M . Carlson

Hi Danh,
On 24/04/2020 01:07, Danh Doan wrote:
> On 2020-04-23 21:41:49+0100, Philip Oakley <philipoakley@iee.email> wrote:
>> On 23/04/2020 20:28, Junio C Hamano wrote:
>>> Danh Doan <congdanhqx@gmail.com> writes:
>> Would is_hhmmss() and is_yyyymmdd() be more obvious abbreviations for
>> most readers?
>>
>> Now that I type them, they do feel that bit too long... , as naming is
>> hard, maybe stick with the yms and hms, though I do keep wanting to type
>> ytd for the former..
> Not sure if I interpret your opinion correctly,
> Did you mean s/yms/ymd/ and s/ytd/ymd/?
Yes, and then no.
>
> Even that, I couldn't grasp the meaning of the last phase?
To explain the last part. I don't immediately recognise 'ymd' as a
typical abbreviation. A common three letter abbreviation, for me, would
be 'ytd' (year to date).

Taken together (my miss-spelling and alternate abbreviation), it
indicates that I was not tuned-in to the chosen abbreviations.
--
Philip



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601
  2020-04-24  0:46                 ` Junio C Hamano
@ 2020-04-24 17:32                   ` Philip Oakley
  0 siblings, 0 replies; 49+ messages in thread
From: Philip Oakley @ 2020-04-24 17:32 UTC (permalink / raw)
  To: Junio C Hamano, Danh Doan; +Cc: git, Brian M . Carlson

Junio is correct.
On 24/04/2020 01:46, Junio C Hamano wrote:
> Danh Doan <congdanhqx@gmail.com> writes:
>
>> On 2020-04-23 21:41:49+0100, Philip Oakley <philipoakley@iee.email> wrote:
>>> On 23/04/2020 20:28, Junio C Hamano wrote:
>>>> Danh Doan <congdanhqx@gmail.com> writes:
>>> Would is_hhmmss() and is_yyyymmdd() be more obvious abbreviations for
>>> most readers?
>>>
>>> Now that I type them, they do feel that bit too long... , as naming is
>>> hard, maybe stick with the yms and hms, though I do keep wanting to type
>>> ytd for the former..
>> Not sure if I interpret your opinion correctly,
>> Did you mean s/yms/ymd/ and s/ytd/ymd/?
>>
>> Even that, I couldn't grasp the meaning of the last phase?
> Here is how I understood it.
>
> Philip thinks, and I admit I have to agree with, that HMS would not
> be understood as hour-minute-seconds by most people, and YMD would
> not be as yearh-month-day, either.
>
> His "yms" in "stick with the yms and hms" is a typo of "ymd".  He is
> saying that even though YYMMDD and HHMMSS would look a lot more
> natural, it is too long to type so YMD and HMS may not be so
> terrible a compromise.
>
> With the "ytd" in the last one, he is saying that another downside
> of saying "ymd" (other than that it is not how we usually spell
> year-month-date), even though "ymd" might be an acceptable
> compromise, is that it is too easy to get confused with year-to-date
> that is commonly abbreviated as "YTD".
True.
--
Philip

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v4 2/4] date.c: validate and set time in a helper function
  2020-04-24 11:43         ` Danh Doan
@ 2020-04-24 20:29           ` Junio C Hamano
  0 siblings, 0 replies; 49+ messages in thread
From: Junio C Hamano @ 2020-04-24 20:29 UTC (permalink / raw)
  To: Danh Doan; +Cc: git

Danh Doan <congdanhqx@gmail.com> writes:

> I think single line like:
>
> 	/* We accept 61st second for the single? leap second */
>
> Or something along the time, is good enough. Not sure if we want the
> word "single" there, though.

I do not particularly want to see the single but without it, the
single-one comment looks perfect.

>> > -		if (num < 25 && num2 >= 0 && num2 < 60 && num3 >= 0 && num3 <= 60) {
>> > -			tm->tm_hour = num;
>> > -			tm->tm_min = num2;
>> > -			tm->tm_sec = num3;
>> > +		if (set_time(num, num2, num3, tm) == 0)
>> >  			break;
>> > -		}
>> >  		return 0;
>> 
>> This caller does become easier to follow, I would say.  Nicely done.
>
> Yes, when I looked around date.c
>
> I saw that the only usecase for validate time is for setting it.
> And the incoming patch also has that usage.
>
> I chose to unify those code path to not buy me too much trouble.
>
> I'll take that "Nicely done" means this unification is OK for this
> series.

The OK was meant for this single place that was converted, not any
other place you'd use in the remainder of the series.

And I think it was not such a good idea to use it twice, but I think
with the suggested rewrite you took in v5, the other call site is
also OK.

Thanks.

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2020-04-24 20:33 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-14  0:03 Mishandling of fractional seconds in ISO 8601 format brian m. carlson
2020-04-14  9:31 ` [PATCH 0/2] More ISO-8601 support Đoàn Trần Công Danh
2020-04-14  9:31   ` [PATCH 1/2] date.c: allow fractional second part of ISO-8601 Đoàn Trần Công Danh
2020-04-14 20:16     ` Jeff King
2020-04-15  2:15       ` Danh Doan
2020-04-14 20:17     ` Jeff King
2020-04-14 23:49       ` brian m. carlson
2020-04-15  2:17         ` Danh Doan
2020-04-14  9:31   ` [PATCH 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
2020-04-14 20:24     ` Jeff King
2020-04-15  2:12       ` Danh Doan
2020-04-15 15:03       ` Junio C Hamano
2020-04-15 15:41         ` Jeff King
2020-04-15 15:58           ` Junio C Hamano
2020-04-16 11:16           ` Danh Doan
2020-04-14 23:45   ` [PATCH 0/2] More ISO-8601 support brian m. carlson
2020-04-15  3:31   ` [PATCH v2 " Đoàn Trần Công Danh
2020-04-15  3:31     ` [PATCH v2 1/2] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
2020-04-15 10:17       ` Torsten Bögershausen
2020-04-16 10:04         ` Danh Doan
2020-04-15  3:31     ` [PATCH v2 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
2020-04-22 13:15   ` [PATCH v3 0/2] More ISO-8601 support Đoàn Trần Công Danh
2020-04-22 13:15     ` [PATCH v3 1/2] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
2020-04-22 17:05       ` Junio C Hamano
2020-04-23  1:18         ` Danh Doan
2020-04-23 19:28           ` Junio C Hamano
2020-04-23 20:41             ` Philip Oakley
2020-04-24  0:07               ` Danh Doan
2020-04-24  0:46                 ` Junio C Hamano
2020-04-24 17:32                   ` Philip Oakley
2020-04-24 17:30                 ` Philip Oakley
2020-04-22 13:15     ` [PATCH v3 2/2] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
2020-04-22 17:17       ` Junio C Hamano
2020-04-23  1:20         ` Danh Doan
2020-04-23 13:52   ` [PATCH v4 0/4] More ISO-8601 support Đoàn Trần Công Danh
2020-04-23 13:52     ` [PATCH v4 1/4] date.c: s/is_date/set_date/ Đoàn Trần Công Danh
2020-04-23 20:08       ` Junio C Hamano
2020-04-23 13:52     ` [PATCH v4 2/4] date.c: validate and set time in a helper function Đoàn Trần Công Danh
2020-04-23 20:18       ` Junio C Hamano
2020-04-24 11:43         ` Danh Doan
2020-04-24 20:29           ` Junio C Hamano
2020-04-23 13:52     ` [PATCH v4 3/4] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
2020-04-23 20:29       ` Junio C Hamano
2020-04-23 13:52     ` [PATCH v4 4/4] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh
2020-04-24 15:07   ` [PATCH v5 0/4] More ISO-8601 support Đoàn Trần Công Danh
2020-04-24 15:07     ` [PATCH v5 1/4] date.c: s/is_date/set_date/ Đoàn Trần Công Danh
2020-04-24 15:07     ` [PATCH v5 2/4] date.c: validate and set time in a helper function Đoàn Trần Công Danh
2020-04-24 15:07     ` [PATCH v5 3/4] date.c: skip fractional second part of ISO-8601 Đoàn Trần Công Danh
2020-04-24 15:07     ` [PATCH v5 4/4] date.c: allow compact version of ISO-8601 datetime Đoàn Trần Công Danh

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).