git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* timezone related bug of git
@ 2021-10-31  3:23 Dongsheng Song
  2021-10-31  8:53 ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Dongsheng Song @ 2021-10-31  3:23 UTC (permalink / raw)
  To: Git Mailing List

Hi,

 I found a timezone related bug in the git:

1. git log 11990eba -1 --date=format:%s

commit 11990eba0be50d1ad0655ede4062b7130326c41f (HEAD -> trunk,
origin/trunk, origin/HEAD)
Author: rillig <rillig@NetBSD.org>
Date:   1635604878

    indent: move debugging functions to a separate section

2. git cat-file -p 11990eba

tree 5d62150f5e2bafd3db76641450ca5d902302a039
parent 892557a74bd49983fac28366b772b53c9216ca73
author rillig <rillig@NetBSD.org> 1635633678 +0000
committer rillig <rillig@NetBSD.org> 1635633678 +0000

indent: move debugging functions to a separate section

3. conclusion

The unix time stored in git repository not same as the git log output,
then there must be a timezone offset bug:

1635633678 - 1635604878 = 28800 = 8 hours (local timezone offset)

Best regards,
Cauchy Song

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: timezone related bug of git
  2021-10-31  3:23 timezone related bug of git Dongsheng Song
@ 2021-10-31  8:53 ` Jeff King
  2021-10-31 13:18   ` Dongsheng Song
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2021-10-31  8:53 UTC (permalink / raw)
  To: Dongsheng Song; +Cc: Git Mailing List

On Sun, Oct 31, 2021 at 11:23:24AM +0800, Dongsheng Song wrote:

>  I found a timezone related bug in the git:
> 
> 1. git log 11990eba -1 --date=format:%s
> 
> commit 11990eba0be50d1ad0655ede4062b7130326c41f (HEAD -> trunk,
> origin/trunk, origin/HEAD)
> Author: rillig <rillig@NetBSD.org>
> Date:   1635604878
> 
>     indent: move debugging functions to a separate section
> 
> 2. git cat-file -p 11990eba
> 
> tree 5d62150f5e2bafd3db76641450ca5d902302a039
> parent 892557a74bd49983fac28366b772b53c9216ca73
> author rillig <rillig@NetBSD.org> 1635633678 +0000
> committer rillig <rillig@NetBSD.org> 1635633678 +0000
> 
> indent: move debugging functions to a separate section
> 
> 3. conclusion
> 
> The unix time stored in git repository not same as the git log output,
> then there must be a timezone offset bug:
> 
> 1635633678 - 1635604878 = 28800 = 8 hours (local timezone offset)

The short answer is: don't do that. Use --date=unix instead.

The longer one is:

The problem is that the strftime() "%s" specifier is a bit broken.
That function (which is what is interpreting your format) takes a
broken-down "struct tm", which can only be converted back to an epoch
time if you know which time zone it's in.

But we have no way to tell the function that; the standard indicates
that it always assumes the local system timezone, and there's no
provision at all for formatting times in other zones (which is what we
usually try to do, showing the date in the author's zone). There's no
field in the "struct tm" to carry any zone information[1].

Even when you're in the same timezone, there's a similar problem with
the is_dst field. There's some discussion in [2], including the
possibility of intercepting "%s" and handling it ourselves, like we do
for "%z". I don't think anybody has cared enough to work on it.

-Peff

[1] Some implementations (like glibc) actually _do_ carry this
    information in private fields of "struct tm". But we can't rely on
    it, and even where it's available, it's confusing (e.g., mktime()
    ignores it!). If you're a real masochist, you can read all of:

      https://lore.kernel.org/git/22824.29946.305300.380299@a1i15.kph.uni-mainz.de/

[2] This is a similar bug report from 2020:

      https://lore.kernel.org/git/CAGqZTUu2U6FFXGTXihC64O0gB5Bz_Z3MbD750kMoJWMciAGH6w@mail.gmail.com/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: timezone related bug of git
  2021-10-31  8:53 ` Jeff King
@ 2021-10-31 13:18   ` Dongsheng Song
  2021-10-31 18:46     ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Dongsheng Song @ 2021-10-31 13:18 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

Thank you for the clarification, it's really a disappointing answer.

Perhaps the manual needs to be clearer about this limitation.

On Sun, Oct 31, 2021 at 4:53 PM Jeff King <peff@peff.net> wrote:
>
> On Sun, Oct 31, 2021 at 11:23:24AM +0800, Dongsheng Song wrote:
>
> >  I found a timezone related bug in the git:
> >
> > 1. git log 11990eba -1 --date=format:%s
> >
> > commit 11990eba0be50d1ad0655ede4062b7130326c41f (HEAD -> trunk,
> > origin/trunk, origin/HEAD)
> > Author: rillig <rillig@NetBSD.org>
> > Date:   1635604878
> >
> >     indent: move debugging functions to a separate section
> >
> > 2. git cat-file -p 11990eba
> >
> > tree 5d62150f5e2bafd3db76641450ca5d902302a039
> > parent 892557a74bd49983fac28366b772b53c9216ca73
> > author rillig <rillig@NetBSD.org> 1635633678 +0000
> > committer rillig <rillig@NetBSD.org> 1635633678 +0000
> >
> > indent: move debugging functions to a separate section
> >
> > 3. conclusion
> >
> > The unix time stored in git repository not same as the git log output,
> > then there must be a timezone offset bug:
> >
> > 1635633678 - 1635604878 = 28800 = 8 hours (local timezone offset)
>
> The short answer is: don't do that. Use --date=unix instead.
>
> The longer one is:
>
> The problem is that the strftime() "%s" specifier is a bit broken.
> That function (which is what is interpreting your format) takes a
> broken-down "struct tm", which can only be converted back to an epoch
> time if you know which time zone it's in.
>
> But we have no way to tell the function that; the standard indicates
> that it always assumes the local system timezone, and there's no
> provision at all for formatting times in other zones (which is what we
> usually try to do, showing the date in the author's zone). There's no
> field in the "struct tm" to carry any zone information[1].
>
> Even when you're in the same timezone, there's a similar problem with
> the is_dst field. There's some discussion in [2], including the
> possibility of intercepting "%s" and handling it ourselves, like we do
> for "%z". I don't think anybody has cared enough to work on it.
>
> -Peff
>
> [1] Some implementations (like glibc) actually _do_ carry this
>     information in private fields of "struct tm". But we can't rely on
>     it, and even where it's available, it's confusing (e.g., mktime()
>     ignores it!). If you're a real masochist, you can read all of:
>
>       https://lore.kernel.org/git/22824.29946.305300.380299@a1i15.kph.uni-mainz.de/
>
> [2] This is a similar bug report from 2020:
>
>       https://lore.kernel.org/git/CAGqZTUu2U6FFXGTXihC64O0gB5Bz_Z3MbD750kMoJWMciAGH6w@mail.gmail.com/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: timezone related bug of git
  2021-10-31 13:18   ` Dongsheng Song
@ 2021-10-31 18:46     ` Junio C Hamano
  2021-11-01  4:03       ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2021-10-31 18:46 UTC (permalink / raw)
  To: Dongsheng Song; +Cc: Jeff King, Git Mailing List

Dongsheng Song <dongsheng.song@gmail.com> writes:

> Thank you for the clarification, it's really a disappointing answer.

The situation may be disappointing, but I found the answer eminently
clear and helpful.

> Perhaps the manual needs to be clearer about this limitation.

Sounds like we have a volunteer ;-)?

Thanks for a report and discussion.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: timezone related bug of git
  2021-10-31 18:46     ` Junio C Hamano
@ 2021-11-01  4:03       ` Jeff King
  2021-11-01 14:31         ` Dongsheng Song
  2021-11-01 18:18         ` Junio C Hamano
  0 siblings, 2 replies; 12+ messages in thread
From: Jeff King @ 2021-11-01  4:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dongsheng Song, Git Mailing List

On Sun, Oct 31, 2021 at 11:46:40AM -0700, Junio C Hamano wrote:

> Dongsheng Song <dongsheng.song@gmail.com> writes:
> 
> > Thank you for the clarification, it's really a disappointing answer.
> 
> The situation may be disappointing, but I found the answer eminently
> clear and helpful.

The most disappointing thing IMHO is the lousy state of system-level
date routines. ;)

I have some patches working towards allowing timestamps before 1970, and
the system routines are quite unreliable (both in giving insufficient
portable interfaces, but also just doing weird things with negative
values).

> > Perhaps the manual needs to be clearer about this limitation.
> 
> Sounds like we have a volunteer ;-)?

Yeah, I'd be happy if somebody wanted to note this in the manual. But if
anybody wants to pursue manually intercepting %s, I think the patch
below might point them in the right direction.

I won't be at all surprised if it has funny corner cases. Our
tm_to_time_t() is pretty basic and hacky. We can't use mktime() because
it only handles the current system timezone. OTOH, I think the tz_offset
we're undoing here originally came from comparing mktime() versus
tm_to_time_t() via local_time_tzoffset(), so it could be cancelling out
any bugs exactly. :)

So maybe the code below is sufficient, but we'd probably at least want
some tests on top. Maybe something somebody interested would like to
pick up and run with?

---
diff --git a/cache.h b/cache.h
index eba12487b9..aa6f380d10 100644
--- a/cache.h
+++ b/cache.h
@@ -1588,6 +1588,7 @@ timestamp_t approxidate_careful(const char *, int *);
 timestamp_t approxidate_relative(const char *date);
 void parse_date_format(const char *format, struct date_mode *mode);
 int date_overflows(timestamp_t date);
+time_t tm_to_time_t(const struct tm *tm);
 
 #define IDENT_STRICT	       1
 #define IDENT_NO_DATE	       2
diff --git a/date.c b/date.c
index c55ea47e96..84bb4451c1 100644
--- a/date.c
+++ b/date.c
@@ -9,7 +9,7 @@
 /*
  * This is like mktime, but without normalization of tm_wday and tm_yday.
  */
-static time_t tm_to_time_t(const struct tm *tm)
+time_t tm_to_time_t(const struct tm *tm)
 {
 	static const int mdays[] = {
 	    0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
diff --git a/strbuf.c b/strbuf.c
index b22e981655..8b8b1900bc 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1019,6 +1019,13 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
 			strbuf_addstr(&munged_fmt, "%%");
 			fmt++;
 			break;
+		case 's':
+			strbuf_addf(&munged_fmt, "%"PRItime,
+				    tm_to_time_t(tm) -
+				    3600 * (tz_offset / 100) -
+				    60 * (tz_offset % 100));
+			fmt++;
+			break;
 		case 'z':
 			strbuf_addf(&munged_fmt, "%+05d", tz_offset);
 			fmt++;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: timezone related bug of git
  2021-11-01  4:03       ` Jeff King
@ 2021-11-01 14:31         ` Dongsheng Song
  2021-11-01 18:18         ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Dongsheng Song @ 2021-11-01 14:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List

Hi Jeff,

Your patch looks good to me.  Here is my test result:

$ patch -p1 < x.patch
patching file cache.h
patching file date.c
patching file strbuf.c

$ make

$ ~/git-testing/bin/git --version

$ ~/git-testing/bin/git log 11990eba -1 --date=format:%s
commit 11990eba0be50d1ad0655ede4062b7130326c41f (HEAD -> trunk,
origin/trunk, origin/HEAD)
Author: rillig <rillig@NetBSD.org>
Date:   1635633678

    indent: move debugging functions to a separate section


$ ~/git-testing/bin/git cat-file -p 11990eba
tree 5d62150f5e2bafd3db76641450ca5d902302a039
parent 892557a74bd49983fac28366b772b53c9216ca73
author rillig <rillig@NetBSD.org> 1635633678 +0000
committer rillig <rillig@NetBSD.org> 1635633678 +0000

indent: move debugging functions to a separate section


PS:
Thanks Junio for clarification, Jeff's explanation is very clear,
I'm just frustrated with the limitations of the system, I greatly
appreciate Jeff's work.

Best regards,
Dongsheng Song

On Mon, Nov 1, 2021 at 12:03 PM Jeff King <peff@peff.net> wrote:
>
> On Sun, Oct 31, 2021 at 11:46:40AM -0700, Junio C Hamano wrote:
>
> > Dongsheng Song <dongsheng.song@gmail.com> writes:
> >
> > > Thank you for the clarification, it's really a disappointing answer.
> >
> > The situation may be disappointing, but I found the answer eminently
> > clear and helpful.
>
> The most disappointing thing IMHO is the lousy state of system-level
> date routines. ;)
>
> I have some patches working towards allowing timestamps before 1970, and
> the system routines are quite unreliable (both in giving insufficient
> portable interfaces, but also just doing weird things with negative
> values).
>
> > > Perhaps the manual needs to be clearer about this limitation.
> >
> > Sounds like we have a volunteer ;-)?
>
> Yeah, I'd be happy if somebody wanted to note this in the manual. But if
> anybody wants to pursue manually intercepting %s, I think the patch
> below might point them in the right direction.
>
> I won't be at all surprised if it has funny corner cases. Our
> tm_to_time_t() is pretty basic and hacky. We can't use mktime() because
> it only handles the current system timezone. OTOH, I think the tz_offset
> we're undoing here originally came from comparing mktime() versus
> tm_to_time_t() via local_time_tzoffset(), so it could be cancelling out
> any bugs exactly. :)
>
> So maybe the code below is sufficient, but we'd probably at least want
> some tests on top. Maybe something somebody interested would like to
> pick up and run with?
>
> ---
> diff --git a/cache.h b/cache.h
> index eba12487b9..aa6f380d10 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1588,6 +1588,7 @@ timestamp_t approxidate_careful(const char *, int *);
>  timestamp_t approxidate_relative(const char *date);
>  void parse_date_format(const char *format, struct date_mode *mode);
>  int date_overflows(timestamp_t date);
> +time_t tm_to_time_t(const struct tm *tm);
>
>  #define IDENT_STRICT          1
>  #define IDENT_NO_DATE         2
> diff --git a/date.c b/date.c
> index c55ea47e96..84bb4451c1 100644
> --- a/date.c
> +++ b/date.c
> @@ -9,7 +9,7 @@
>  /*
>   * This is like mktime, but without normalization of tm_wday and tm_yday.
>   */
> -static time_t tm_to_time_t(const struct tm *tm)
> +time_t tm_to_time_t(const struct tm *tm)
>  {
>         static const int mdays[] = {
>             0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
> diff --git a/strbuf.c b/strbuf.c
> index b22e981655..8b8b1900bc 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -1019,6 +1019,13 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
>                         strbuf_addstr(&munged_fmt, "%%");
>                         fmt++;
>                         break;
> +               case 's':
> +                       strbuf_addf(&munged_fmt, "%"PRItime,
> +                                   tm_to_time_t(tm) -
> +                                   3600 * (tz_offset / 100) -
> +                                   60 * (tz_offset % 100));
> +                       fmt++;
> +                       break;
>                 case 'z':
>                         strbuf_addf(&munged_fmt, "%+05d", tz_offset);
>                         fmt++;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: timezone related bug of git
  2021-11-01  4:03       ` Jeff King
  2021-11-01 14:31         ` Dongsheng Song
@ 2021-11-01 18:18         ` Junio C Hamano
  2021-11-02  1:43           ` Jeff King
  2021-11-02 11:35           ` [PATCH] strbuf_addftime(): handle "%s" manually Jeff King
  1 sibling, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2021-11-01 18:18 UTC (permalink / raw)
  To: Jeff King; +Cc: Dongsheng Song, Git Mailing List

Jeff King <peff@peff.net> writes:

> I won't be at all surprised if it has funny corner cases. Our
> tm_to_time_t() is pretty basic and hacky. We can't use mktime() because
> it only handles the current system timezone. OTOH, I think the tz_offset
> we're undoing here originally came from comparing mktime() versus
> tm_to_time_t() via local_time_tzoffset(), so it could be cancelling out
> any bugs exactly. :)
>
> So maybe the code below is sufficient, but we'd probably at least want
> some tests on top. Maybe something somebody interested would like to
> pick up and run with?

It would be very hard to write a code that does not work correctly
on a timestamp created in the same zone in the same season. It is
easy to get the direction of the offset wrong and not notice with
such a test, but with another test to show a timestamp from a zone
in a different zone (or across season boundary in an area where
daylight saving time is s thing), such an error can easily be
caught.

> ---
> diff --git a/cache.h b/cache.h
> index eba12487b9..aa6f380d10 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1588,6 +1588,7 @@ timestamp_t approxidate_careful(const char *, int *);
>  timestamp_t approxidate_relative(const char *date);
>  void parse_date_format(const char *format, struct date_mode *mode);
>  int date_overflows(timestamp_t date);
> +time_t tm_to_time_t(const struct tm *tm);
>  
>  #define IDENT_STRICT	       1
>  #define IDENT_NO_DATE	       2
> diff --git a/date.c b/date.c
> index c55ea47e96..84bb4451c1 100644
> --- a/date.c
> +++ b/date.c
> @@ -9,7 +9,7 @@
>  /*
>   * This is like mktime, but without normalization of tm_wday and tm_yday.
>   */
> -static time_t tm_to_time_t(const struct tm *tm)
> +time_t tm_to_time_t(const struct tm *tm)
>  {
>  	static const int mdays[] = {
>  	    0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
> diff --git a/strbuf.c b/strbuf.c
> index b22e981655..8b8b1900bc 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -1019,6 +1019,13 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
>  			strbuf_addstr(&munged_fmt, "%%");
>  			fmt++;
>  			break;
> +		case 's':
> +			strbuf_addf(&munged_fmt, "%"PRItime,
> +				    tm_to_time_t(tm) -
> +				    3600 * (tz_offset / 100) -
> +				    60 * (tz_offset % 100));
> +			fmt++;
> +			break;

In show_date(), we start from UNIX time and go to "struct tm" using
either the system gmtime_r() (after adjusting the value with the tz
offset of the original timestamp) or localtime_r() (when we are
trying to show the value in our local timestamp), but this codepath
needs to undo that.  Our tm_to_time_t() indeed is basic but should
work correctly on a broken down UTC.  So the caller needs to further
compensate for the tz offset.

I have to wonder why gm_time_t() needs to use two separate codepaths
for positive and negative tz_offset, while the new code here can get
away without.  Does it have something to do with the direction of
truncation during division and modulo operation?

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: timezone related bug of git
  2021-11-01 18:18         ` Junio C Hamano
@ 2021-11-02  1:43           ` Jeff King
  2021-11-02 11:35           ` [PATCH] strbuf_addftime(): handle "%s" manually Jeff King
  1 sibling, 0 replies; 12+ messages in thread
From: Jeff King @ 2021-11-02  1:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dongsheng Song, Git Mailing List

On Mon, Nov 01, 2021 at 11:18:02AM -0700, Junio C Hamano wrote:

> I have to wonder why gm_time_t() needs to use two separate codepaths
> for positive and negative tz_offset, while the new code here can get
> away without.  Does it have something to do with the direction of
> truncation during division and modulo operation?

Hmm. Unless I am missing something, this part of gm_time_t() is simply
over-complicating things:

  minutes = tz < 0 ? -tz : tz;
  minutes = (minutes / 100)*60 + (minutes % 100);
  minutes = tz < 0 ? -minutes : minutes;

We switch to doing the computation in absolute-value units, but then
restore the sign. But just:

  minutes = (tz / 100) * 60 + (tz % 100);

is equivalent and shorter. If tz is negative, then both terms will be
negative, which is what you want (they sum to a larger absolute-value
negative number). This comes from f80cd783c6 (date.c: add "show_date()"
function., 2005-05-06), so I don't see any sign that there was specific
thought given to some obscure handling. And indeed later fixes like
fbab835c03 ([PATCH] fix show_date() for positive timezones, 2005-05-18)
imply to me that the original was just confused.

Later we do:

  if (minutes > 0) {
          if (unsigned_add_overflows(time, minutes * 60))
                  die("Timestamp+tz too large: %"PRItime" +%04d",
                      time, tz);
  } else if (time < -minutes * 60)
          die("Timestamp before Unix epoch: %"PRItime" %04d", time, tz);

And that does need separate paths for the overflow check, since we're
checking different boundaries. I suspect for the strftime() code that we
wouldn't need similar checks, because the earlier ones would have caught
any problems (i.e., we would not get as far as having a "struct tm" that
represented something outside the range of our time_t).

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] strbuf_addftime(): handle "%s" manually
  2021-11-01 18:18         ` Junio C Hamano
  2021-11-02  1:43           ` Jeff King
@ 2021-11-02 11:35           ` Jeff King
  2021-11-02 15:43             ` Jeff King
  2021-11-03 20:28             ` Junio C Hamano
  1 sibling, 2 replies; 12+ messages in thread
From: Jeff King @ 2021-11-02 11:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dongsheng Song, Git Mailing List

On Mon, Nov 01, 2021 at 11:18:02AM -0700, Junio C Hamano wrote:

> > diff --git a/strbuf.c b/strbuf.c
> > index b22e981655..8b8b1900bc 100644
> > --- a/strbuf.c
> > +++ b/strbuf.c
> > @@ -1019,6 +1019,13 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
> >  			strbuf_addstr(&munged_fmt, "%%");
> >  			fmt++;
> >  			break;
> > +		case 's':
> > +			strbuf_addf(&munged_fmt, "%"PRItime,
> > +				    tm_to_time_t(tm) -
> > +				    3600 * (tz_offset / 100) -
> > +				    60 * (tz_offset % 100));
> > +			fmt++;
> > +			break;
> 
> In show_date(), we start from UNIX time and go to "struct tm" using
> either the system gmtime_r() (after adjusting the value with the tz
> offset of the original timestamp) or localtime_r() (when we are
> trying to show the value in our local timestamp), but this codepath
> needs to undo that.  Our tm_to_time_t() indeed is basic but should
> work correctly on a broken down UTC.  So the caller needs to further
> compensate for the tz offset.

This made me wonder how things interact with format-local. There's some
subtlety, but I think all is well.

I hadn't planned to work further on this, but now having thought about
it some more, it seemed worth capturing that and putting the finishing
touches on a real commit.

-- >8 --
Subject: [PATCH] strbuf_addftime(): handle "%s" manually

The strftime() function has a non-standard "%s" extension, which prints
the number of seconds since the epoch. But the "struct tm" we get has
already been adjusted for a particular time zone; going back to an epoch
time requires knowing that zone offset. Since strftime() doesn't take
such an argument, round-tripping to a "struct tm" and back to the "%s"
format may produce the wrong value (off by tz_offset seconds).

Since we're already passing in the zone offset courtesy of c3fbf81a85
(strbuf: let strbuf_addftime handle %z and %Z itself, 2017-06-15), we
can use that same value to adjust our epoch seconds accordingly.

Note that the description above makes it sound like strftime()'s "%s" is
useless (and really, the issue is shared by mktime(), which is what
strftime() would use under the hood). But it gets the two cases for
which it's designed correct:

  - the result of gmtime() will have a zero offset, so no adjustment is
    necessary

  - the result of localtime() will be offset by the local zone offset,
    and mktime() and strftime() are defined to assume this offset when
    converting back (there's actually some magic here; some
    implementations record this in the "struct tm", but we can't
    portably access or manipulate it. But they somehow "know" whether a
    "struct tm" is from gmtime() or localtime()).

This latter point means that "format-local:%s" actually works correctly
already, because in that case we rely on the system routines due to
6eced3ec5e (date: use localtime() for "-local" time formats,
2017-06-15). Our problem comes when trying to show times in the author's
zone, as the system routines provide no mechanism for converting in
non-local zones. So in those cases we have a "struct tm" that came from
gmtime(), but has been manipulated according to our offset.

The tests cover the broken round-trip by formatting "%s" for a time in a
non-system timezone. We use the made-up "+1234" here, which has two
advantages. One, we know it won't ever be the real system zone (and so
we're actually testing a case that would break). And two, since it has a
minute component, we're testing the full decoding of the +HHMM zone into
a number of seconds. Likewise, we test the "-1234" variant to make sure
there aren't any sign mistakes.

There's one final test, which covers "format-local:%s". As noted, this
already passes, but it's important to check that we didn't regress this
case. In particular, the caller in show_date() is relying on localtime()
to have done the zone adjustment, independent of any tz_offset we
compute ourselves. These should match up, since our local_tzoffset() is
likewise built around localtime(). But it would be easy for a caller to
forget to pass in a correct tz_offset to strbuf_addftime(). Fortunately
show_date() does this correctly (it has to because of the existing
handling of %z), and the test continues to pass. So this one is just
future-proofing against a change in our assumptions.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h         |  1 +
 date.c          |  2 +-
 strbuf.c        | 14 +++++++++++++-
 t/t0006-date.sh |  4 ++++
 4 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index eba12487b9..aa6f380d10 100644
--- a/cache.h
+++ b/cache.h
@@ -1588,6 +1588,7 @@ timestamp_t approxidate_careful(const char *, int *);
 timestamp_t approxidate_relative(const char *date);
 void parse_date_format(const char *format, struct date_mode *mode);
 int date_overflows(timestamp_t date);
+time_t tm_to_time_t(const struct tm *tm);
 
 #define IDENT_STRICT	       1
 #define IDENT_NO_DATE	       2
diff --git a/date.c b/date.c
index c55ea47e96..84bb4451c1 100644
--- a/date.c
+++ b/date.c
@@ -9,7 +9,7 @@
 /*
  * This is like mktime, but without normalization of tm_wday and tm_yday.
  */
-static time_t tm_to_time_t(const struct tm *tm)
+time_t tm_to_time_t(const struct tm *tm)
 {
 	static const int mdays[] = {
 	    0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
diff --git a/strbuf.c b/strbuf.c
index b22e981655..a569b99ab9 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1006,7 +1006,12 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
 
 	/*
 	 * There is no portable way to pass timezone information to
-	 * strftime, so we handle %z and %Z here.
+	 * strftime, so we handle %z and %Z here. Likewise '%s', because
+	 * going back to an epoch time requires knowing the zone.
+	 *
+	 * Note that tz_offset is in the "[-+]HHMM" decimal form; this is what
+	 * we want for %z, but the computation for %s has to convert to number
+	 * of seconds.
 	 */
 	for (;;) {
 		const char *percent = strchrnul(fmt, '%');
@@ -1019,6 +1024,13 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
 			strbuf_addstr(&munged_fmt, "%%");
 			fmt++;
 			break;
+		case 's':
+			strbuf_addf(&munged_fmt, "%"PRItime,
+				    tm_to_time_t(tm) -
+				    3600 * (tz_offset / 100) -
+				    60 * (tz_offset % 100));
+			fmt++;
+			break;
 		case 'z':
 			strbuf_addf(&munged_fmt, "%+05d", tz_offset);
 			fmt++;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 6b757d7169..794186961e 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -63,6 +63,10 @@ check_show 'format-local:%%z' "$TIME" '%z'
 check_show 'format:%Y-%m-%d %H:%M:%S' "$TIME" '2016-06-15 16:13:20'
 check_show 'format-local:%Y-%m-%d %H:%M:%S' "$TIME" '2016-06-15 09:13:20' '' EST5
 
+check_show 'format:%s' '123456789 +1234' 123456789
+check_show 'format:%s' '123456789 -1234' 123456789
+check_show 'format-local:%s' '123456789 -1234' 123456789
+
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
 check_show iso       "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT,TIME_T_IS_64BIT
-- 
2.34.0.rc0.612.g98bfd90890


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] strbuf_addftime(): handle "%s" manually
  2021-11-02 11:35           ` [PATCH] strbuf_addftime(): handle "%s" manually Jeff King
@ 2021-11-02 15:43             ` Jeff King
  2021-11-03 20:28             ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Jeff King @ 2021-11-02 15:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dongsheng Song, Git Mailing List

On Tue, Nov 02, 2021 at 07:35:35AM -0400, Jeff King wrote:

> @@ -1019,6 +1024,13 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
>  			strbuf_addstr(&munged_fmt, "%%");
>  			fmt++;
>  			break;
> +		case 's':
> +			strbuf_addf(&munged_fmt, "%"PRItime,
> +				    tm_to_time_t(tm) -
> +				    3600 * (tz_offset / 100) -
> +				    60 * (tz_offset % 100));
> +			fmt++;
> +			break;

Looks like we may need something like this squashed in:

diff --git a/strbuf.c b/strbuf.c
index 33015b33df..995394f38e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1026,7 +1026,7 @@ void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
 			break;
 		case 's':
 			strbuf_addf(&munged_fmt, "%"PRItime,
-				    tm_to_time_t(tm) -
+				    (timestamp_t)tm_to_time_t(tm) -
 				    3600 * (tz_offset / 100) -
 				    60 * (tz_offset % 100));
 			fmt++;

because tm_to_time_t() returns an actual time_t, which will vary in
size. The 32-bit CI job complains:

  strbuf.c:1028:29: error: format '%lld' expects argument of type 'long long int', but argument 3 has type 'long int' [-Werror=format=]

-Peff

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] strbuf_addftime(): handle "%s" manually
  2021-11-02 11:35           ` [PATCH] strbuf_addftime(): handle "%s" manually Jeff King
  2021-11-02 15:43             ` Jeff King
@ 2021-11-03 20:28             ` Junio C Hamano
  2021-11-04  2:11               ` Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2021-11-03 20:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Dongsheng Song, Git Mailing List

I think this also needs squashing in?

 Documentation/rev-list-options.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git c/Documentation/rev-list-options.txt w/Documentation/rev-list-options.txt
index 24569b06d1..43a86fa562 100644
--- c/Documentation/rev-list-options.txt
+++ w/Documentation/rev-list-options.txt
@@ -1047,7 +1047,7 @@ omitted.
 has no effect.
 
 `--date=format:...` feeds the format `...` to your system `strftime`,
-except for %z and %Z, which are handled internally.
+except for %s, %z, and %Z, which are handled internally.
 Use `--date=format:%c` to show the date in your system locale's
 preferred format.  See the `strftime` manual for a complete list of
 format placeholders. When using `-local`, the correct syntax is


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] strbuf_addftime(): handle "%s" manually
  2021-11-03 20:28             ` Junio C Hamano
@ 2021-11-04  2:11               ` Jeff King
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2021-11-04  2:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dongsheng Song, Git Mailing List

On Wed, Nov 03, 2021 at 01:28:00PM -0700, Junio C Hamano wrote:

> I think this also needs squashing in?
> 
>  Documentation/rev-list-options.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git c/Documentation/rev-list-options.txt w/Documentation/rev-list-options.txt
> index 24569b06d1..43a86fa562 100644
> --- c/Documentation/rev-list-options.txt
> +++ w/Documentation/rev-list-options.txt
> @@ -1047,7 +1047,7 @@ omitted.
>  has no effect.
>  
>  `--date=format:...` feeds the format `...` to your system `strftime`,
> -except for %z and %Z, which are handled internally.
> +except for %s, %z, and %Z, which are handled internally.
>  Use `--date=format:%c` to show the date in your system locale's
>  preferred format.  See the `strftime` manual for a complete list of
>  format placeholders. When using `-local`, the correct syntax is

Ah, thanks. I didn't even think to look in the documentation, because I
didn't imagine that we would expose these implementation details. But
since we do mention %z there, I think adding %s makes sense.

BTW, I also noticed that stftime supports some locale modifiers. So
"%Es" ends up printing the epoch seconds, but eludes our manual
intervention (and so does the old, wrong thing). I'm fine with stopping
here, though. There's no reason to use %Es over %s (from what I gather,
the %E is about handling year eras for locales that support them, but
that's meaningless for an epoch time), and I'm not sure it is even a
portable thing (glibc does not mention it in the manpage along with
other %E values, but it does work; POSIX does not even define %s, so of
course does not mention %Es).

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-11-04  2:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-31  3:23 timezone related bug of git Dongsheng Song
2021-10-31  8:53 ` Jeff King
2021-10-31 13:18   ` Dongsheng Song
2021-10-31 18:46     ` Junio C Hamano
2021-11-01  4:03       ` Jeff King
2021-11-01 14:31         ` Dongsheng Song
2021-11-01 18:18         ` Junio C Hamano
2021-11-02  1:43           ` Jeff King
2021-11-02 11:35           ` [PATCH] strbuf_addftime(): handle "%s" manually Jeff King
2021-11-02 15:43             ` Jeff King
2021-11-03 20:28             ` Junio C Hamano
2021-11-04  2:11               ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).