git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Johannes Schindelin <johannes.schindelin@gmx.de>, git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Torsten Bögershausen" <tboegi@web.de>
Subject: Re: [PATCH v5 0/8] Introduce timestamp_t for timestamps
Date: Tue, 25 Apr 2017 23:54:46 +0200	[thread overview]
Message-ID: <7a2271ff-1386-18a6-5f6d-7eb13dc92509@web.de> (raw)
In-Reply-To: <cover.1493042239.git.johannes.schindelin@gmx.de>

Am 24.04.2017 um 15:57 schrieb Johannes Schindelin:
> Git v2.9.2 was released in a hurry to accomodate for platforms like
> Windows, where the `unsigned long` data type is 32-bit even for 64-bit
> setups.
> 
> The quick fix was to simply disable all the testing with "absurd" future
> dates.
> 
> However, we can do much better than that, as we already make use of
> 64-bit data types internally. There is no good reason why we should not
> use the same for timestamps. Hence, let's use uintmax_t for timestamps.
> 
> Note: while the `time_t` data type exists and is meant to be used for
> timestamps, on 32-bit Linux it is *still* 32-bit. An earlier iteration
> used `time_t` for that reason, but it came with a few serious downsides:
> as `time_t` can be signed (and indeed, on Windows it is an int64_t),
> Git's expectation that 0 is the minimal value does no longer hold true,
> introducing its own set of interesting challenges. Besides, if we *can*
> handle far in the future timestamps (except for formatting them using
> the system libraries), it is more consistent to do so.

time_t is signed on Linux and BSDs as well.

Using an unsigned type gives us the ability to represent times beyond
the 292 billion years in the future that int64_t would give us, but
prevents recording events that occurred before the Epoch.  That doesn't
sound like a good deal to me -- storing historical works (e.g. law
texts) with real time stamps is probably more interesting than fixing
the year 292277026596 problem within this decade.

> The upside of using `uintmax_t` for timestamps is that we do a much
> better job to support far in the future timestamps across all platforms,
> including 32-bit ones. The downside is that those platforms that use a
> 32-bit `time_t` will barf when parsing or formatting those timestamps.

IIUC this series has two aims: solving the year 2038 problem on 32-bit
Linux by replacing time_t (int32_t), and solving the year 2106 problem
on Windows by replacing unsigned long (uint32_t), right?  The latter one
sounds more interesting, because 32-bit platforms would still be unable
to fully use bigger time values as you wrote above.

Can we leave time_t alone and just do the part where you replace
unsigned long with timestamp_t defined as uint64_t?  That should already
help on Windows, correct?  When/if timestamp_t is later changed to a
signed type then we could easily convert the time_t cases to timestamp_t
as well, or the other way around.

> This iteration makes the date_overflows() check more stringent again.
> 
> It is arguably a bug to paper over too-large author/committer dates and
> to replace them with Jan 1 1970 without even telling the user that we do
> that, but this is the behavior that t4212 verifies, so I reinstated that
> behavior. The change in behavior was missed because of the missing
> unsigned_add_overflows() test.

I can't think of many ways to get future time stamps (broken clock,
broken CMOS battery, bit rot, time travel), so I wouldn't expect a
change towards better error reporting to affect a lot of users.  (Not
necessarily as part of this series, of course.)

> Changes since v4:
> 
> - in gm_time_t(), we now test specifically that the timezone adjustment
>    neither underflows nor overflows.
> 
> - the patch introduced in v4 that tried to defer the date_overflows()
>    check to gm_time_t() rather than replacing the ident timestamp by a 0
>    without any warning was dropped again: it broke t4212.
> 
> 
> Johannes Schindelin (8):
>    ref-filter: avoid using `unsigned long` for catch-all data type
>    t0006 & t5000: prepare for 64-bit timestamps
>    t0006 & t5000: skip "far in the future" test when time_t is too
>      limited
>    Specify explicitly where we parse timestamps
>    Introduce a new "printf format" for timestamps
>    Introduce a new data type for timestamps
>    Abort if the system time cannot handle one of our timestamps
>    Use uintmax_t for timestamps
> 
>   Documentation/technical/api-parse-options.txt |   8 +-
>   archive-tar.c                                 |   5 +-
>   archive-zip.c                                 |  12 ++-
>   archive.h                                     |   2 +-
>   builtin/am.c                                  |   4 +-
>   builtin/blame.c                               |  14 ++--
>   builtin/fsck.c                                |   6 +-
>   builtin/gc.c                                  |   2 +-
>   builtin/log.c                                 |   4 +-
>   builtin/merge-base.c                          |   2 +-
>   builtin/name-rev.c                            |   6 +-
>   builtin/pack-objects.c                        |   4 +-
>   builtin/prune.c                               |   4 +-
>   builtin/receive-pack.c                        |  14 ++--
>   builtin/reflog.c                              |  24 +++---
>   builtin/rev-list.c                            |   2 +-
>   builtin/rev-parse.c                           |   2 +-
>   builtin/show-branch.c                         |   4 +-
>   builtin/worktree.c                            |   4 +-
>   bundle.c                                      |   4 +-
>   cache.h                                       |  14 ++--
>   commit.c                                      |  18 ++--
>   commit.h                                      |   2 +-
>   config.c                                      |   2 +-
>   credential-cache--daemon.c                    |  12 +--
>   date.c                                        | 113 ++++++++++++++------------
>   fetch-pack.c                                  |   8 +-
>   fsck.c                                        |   2 +-
>   git-compat-util.h                             |   9 ++
>   http-backend.c                                |   4 +-
>   parse-options-cb.c                            |   4 +-
>   pretty.c                                      |   4 +-
>   reachable.c                                   |   9 +-
>   reachable.h                                   |   4 +-
>   ref-filter.c                                  |  22 ++---
>   reflog-walk.c                                 |   8 +-
>   refs.c                                        |  14 ++--
>   refs.h                                        |   8 +-
>   refs/files-backend.c                          |   8 +-
>   revision.c                                    |   6 +-
>   revision.h                                    |   4 +-
>   sha1_name.c                                   |   6 +-
>   t/helper/test-date.c                          |  18 ++--
>   t/helper/test-parse-options.c                 |   4 +-
>   t/helper/test-ref-store.c                     |   4 +-
>   t/t0006-date.sh                               |   4 +-
>   t/t5000-tar-tree.sh                           |   6 +-
>   t/test-lib.sh                                 |   3 +
>   tag.c                                         |   6 +-
>   tag.h                                         |   2 +-
>   upload-pack.c                                 |   8 +-
>   vcs-svn/fast_export.c                         |   8 +-
>   vcs-svn/fast_export.h                         |   4 +-
>   vcs-svn/svndump.c                             |   2 +-
>   wt-status.c                                   |   2 +-
>   55 files changed, 260 insertions(+), 219 deletions(-)

How did you find all the pieces of code that need to be touched?  Is
there a regex or something that can be used to spot new such places
that sneak in, e.g. through in-flight merges?

> base-commit: e2cb6ab84c94f147f1259260961513b40c36108a
> Published-As: https://github.com/dscho/git/releases/tag/time_t-may-be-int64-v5
> Fetch-It-Via: git fetch https://github.com/dscho/git time_t-may-be-int64-v5
> 
> Interdiff vs v4:
> 
>   diff --git a/date.c b/date.c
>   index 75f6335cd09..63fa99685e2 100644
>   --- a/date.c
>   +++ b/date.c
>   @@ -47,9 +47,16 @@ static time_t gm_time_t(timestamp_t time, int tz)
>    	minutes = (minutes / 100)*60 + (minutes % 100);
>    	minutes = tz < 0 ? -minutes : minutes;
>    
>   -	if (date_overflows(time + minutes * 60))
>   +	if (minutes > 0) {
>   +		if (unsigned_add_overflows(time, minutes * 60))
>   +			die("Timestamp+tz too large: %"PRItime" +%04d",
>   +			    time, tz);
>   +	} else if (time < -minutes * 60)
>   +		die("Timestamp before Unix epoch: %"PRItime" %04d", time, tz);
>   +	time += minutes * 60;
>   +	if (date_overflows(time))
>    		die("Timestamp too large for this system: %"PRItime, time);
>   -	return (time_t)time + minutes * 60;
>   +	return (time_t)time;
>    }
>    
>    /*
>   diff --git a/pretty.c b/pretty.c
>   index 35fd290096a..587d48371b0 100644
>   --- a/pretty.c
>   +++ b/pretty.c
>   @@ -410,10 +410,14 @@ const char *show_ident_date(const struct ident_split *ident,
>    
>    	if (ident->date_begin && ident->date_end)
>    		date = parse_timestamp(ident->date_begin, NULL, 10);
>   -	if (ident->tz_begin && ident->tz_end)
>   -		tz = strtol(ident->tz_begin, NULL, 10);
>   -	if (tz >= INT_MAX || tz <= INT_MIN)
>   -		tz = 0;
>   +	if (date_overflows(date))
>   +		date = 0;
>   +	else {
>   +		if (ident->tz_begin && ident->tz_end)
>   +			tz = strtol(ident->tz_begin, NULL, 10);
>   +		if (tz >= INT_MAX || tz <= INT_MIN)
>   +			tz = 0;
>   +	}
>    	return show_date(date, tz, mode);
>    }
>    
> 

  parent reply	other threads:[~2017-04-25 21:54 UTC|newest]

Thread overview: 113+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-27 21:30 [PATCH 0/6] Use time_t Johannes Schindelin
2017-02-27 21:30 ` [PATCH 1/6] t0006 & t5000: prepare for 64-bit time_t Johannes Schindelin
2017-02-27 22:55   ` Junio C Hamano
2017-02-27 21:30 ` [PATCH 2/6] Specify explicitly where we parse timestamps Johannes Schindelin
2017-02-27 22:37   ` Junio C Hamano
2017-02-27 22:51     ` Junio C Hamano
2017-02-28 10:49       ` Johannes Schindelin
2017-02-27 21:31 ` [PATCH 3/6] Introduce a new "printf format" for timestamps Johannes Schindelin
2017-03-01 18:20   ` Junio C Hamano
2017-03-01 19:53     ` Junio C Hamano
2017-02-27 21:31 ` [PATCH 4/6] Prepare for timestamps to use 64-bit signed types Johannes Schindelin
2017-02-27 21:31 ` [PATCH 5/6] ref-filter: avoid using `unsigned long` for catch-all data type Johannes Schindelin
2017-02-27 21:31 ` [PATCH 6/6] Use time_t where appropriate Johannes Schindelin
2017-02-27 22:48 ` [PATCH 0/6] Use time_t Junio C Hamano
2017-02-28 11:32   ` Johannes Schindelin
2017-02-28 14:28 ` Jeff King
2017-02-28 15:01   ` Johannes Schindelin
2017-02-28 16:38   ` René Scharfe
2017-02-28 18:55     ` Junio C Hamano
2017-02-28 20:04       ` Jeff King
2017-02-28 20:54       ` Johannes Schindelin
2017-02-28 21:31         ` Jeff King
2017-02-28 21:31         ` René Scharfe
2017-02-28 23:10           ` Johannes Schindelin
2017-03-01  0:59             ` René Scharfe
2017-02-28 17:26   ` Junio C Hamano
2017-02-28 20:01     ` Jeff King
2017-02-28 22:27       ` Junio C Hamano
2017-02-28 22:33         ` Jeff King
2017-03-01 17:23           ` Junio C Hamano
2017-04-02 19:06 ` [PATCH v2 0/8] Introduce timestamp_t for timestamps Johannes Schindelin
2017-04-02 19:06   ` [PATCH v2 1/8] ref-filter: avoid using `unsigned long` for catch-all data type Johannes Schindelin
2017-04-03  4:22     ` Torsten Bögershausen
2017-04-03 22:47       ` Johannes Schindelin
2017-04-02 19:06   ` [PATCH v2 2/8] t0006 & t5000: prepare for 64-bit timestamps Johannes Schindelin
2017-04-02 19:06   ` [PATCH v2 3/8] t0006 & t5000: skip "far in the future" test when time_t is too limited Johannes Schindelin
2017-04-02 19:06   ` [PATCH v2 4/8] Specify explicitly where we parse timestamps Johannes Schindelin
2017-04-03  4:26     ` Torsten Bögershausen
2017-04-03 22:50       ` Johannes Schindelin
2017-04-02 19:06   ` [PATCH v2 5/8] Introduce a new "printf format" for timestamps Johannes Schindelin
2017-04-02 19:06   ` [PATCH v2 6/8] Introduce a new data type " Johannes Schindelin
2017-04-02 19:07   ` [PATCH v2 7/8] Abort if the system time cannot handle one of our timestamps Johannes Schindelin
2017-04-02 19:07   ` [PATCH v2 8/8] Use uintmax_t for timestamps Johannes Schindelin
2017-04-20 20:52   ` [PATCH v3 0/8] Introduce timestamp_t " Johannes Schindelin
2017-04-20 20:52     ` [PATCH v3 1/8] ref-filter: avoid using `unsigned long` for catch-all data type Johannes Schindelin
2017-04-20 20:52     ` [PATCH v3 2/8] t0006 & t5000: prepare for 64-bit timestamps Johannes Schindelin
2017-04-20 20:58     ` [PATCH v3 3/8] t0006 & t5000: skip "far in the future" test when time_t is too limited Johannes Schindelin
2017-04-20 20:58     ` [PATCH v3 4/8] Specify explicitly where we parse timestamps Johannes Schindelin
2017-04-20 20:58     ` [PATCH v3 5/8] Introduce a new "printf format" for timestamps Johannes Schindelin
2017-04-20 20:58     ` [PATCH v3 6/8] Introduce a new data type " Johannes Schindelin
2017-04-20 20:58     ` [PATCH v3 7/8] Abort if the system time cannot handle one of our timestamps Johannes Schindelin
2017-04-20 20:59     ` [PATCH v3 8/8] Use uintmax_t for timestamps Johannes Schindelin
2017-04-21  6:05     ` [PATCH v3 0/8] Introduce timestamp_t " Junio C Hamano
2017-04-21 10:44       ` Johannes Schindelin
2017-04-21 10:45     ` [PATCH v4 0/9] " Johannes Schindelin
2017-04-21 10:45       ` [PATCH v4 1/9] ref-filter: avoid using `unsigned long` for catch-all data type Johannes Schindelin
2017-04-21 10:45       ` [PATCH v4 2/9] t0006 & t5000: prepare for 64-bit timestamps Johannes Schindelin
2017-04-21 10:45       ` [PATCH v4 3/9] t0006 & t5000: skip "far in the future" test when time_t is too limited Johannes Schindelin
2017-04-21 10:45       ` [PATCH v4 4/9] Specify explicitly where we parse timestamps Johannes Schindelin
2017-04-24  3:19         ` Junio C Hamano
2017-04-21 10:45       ` [PATCH v4 5/9] Introduce a new "printf format" for timestamps Johannes Schindelin
2017-04-21 10:45       ` [PATCH v4 6/9] Introduce a new data type " Johannes Schindelin
2017-04-21 10:45       ` [PATCH v4 7/9] Abort if the system time cannot handle one of our timestamps Johannes Schindelin
2017-04-24  3:16         ` Junio C Hamano
2017-04-24 13:57           ` Johannes Schindelin
2017-04-25  2:37             ` Junio C Hamano
2017-04-25  3:56             ` Junio C Hamano
2017-04-21 10:46       ` [PATCH v4 8/9] Use uintmax_t for timestamps Johannes Schindelin
2017-04-24  3:24         ` Junio C Hamano
2017-04-24 10:28           ` Johannes Schindelin
2017-04-25  3:59             ` Junio C Hamano
2017-04-25 20:10               ` Johannes Schindelin
2017-04-26  1:52                 ` Junio C Hamano
2017-04-26  3:45                   ` Junio C Hamano
2017-04-26  9:32                   ` Johannes Schindelin
2017-04-26 13:18                     ` Junio C Hamano
2017-04-21 10:46       ` [PATCH v4 9/9] show_date_ident(): defer date overflow check Johannes Schindelin
2017-04-24  3:29       ` [PATCH v4 0/9] Introduce timestamp_t for timestamps Junio C Hamano
2017-04-24  6:15         ` Jacob Keller
2017-04-24 14:02           ` Johannes Schindelin
2017-04-24 11:37         ` Jeff King
2017-04-25 20:13           ` Johannes Schindelin
2017-04-24 14:00         ` Johannes Schindelin
2017-04-24 13:57       ` [PATCH v5 0/8] " Johannes Schindelin
2017-04-24 13:57         ` [PATCH v5 1/8] ref-filter: avoid using `unsigned long` for catch-all data type Johannes Schindelin
2017-04-24 13:57         ` [PATCH v5 2/8] t0006 & t5000: prepare for 64-bit timestamps Johannes Schindelin
2017-04-24 13:58         ` [PATCH v5 3/8] t0006 & t5000: skip "far in the future" test when time_t is too limited Johannes Schindelin
2017-04-24 13:58         ` [PATCH v5 4/8] Specify explicitly where we parse timestamps Johannes Schindelin
2017-04-25  5:59           ` Junio C Hamano
2017-04-24 13:58         ` [PATCH v5 5/8] Introduce a new "printf format" for timestamps Johannes Schindelin
2017-04-24 13:58         ` [PATCH v5 6/8] Introduce a new data type " Johannes Schindelin
2017-04-26 16:43           ` Johannes Sixt
2017-04-26 19:18             ` Johannes Schindelin
2017-04-26 22:32           ` René Scharfe
2017-04-24 13:58         ` [PATCH v5 7/8] Abort if the system time cannot handle one of our timestamps Johannes Schindelin
2017-04-24 13:58         ` [PATCH v5 8/8] Use uintmax_t for timestamps Johannes Schindelin
2017-04-26 16:36           ` Johannes Sixt
2017-04-26 19:09             ` Johannes Schindelin
2017-04-25 21:54         ` René Scharfe [this message]
2017-04-25 22:22           ` [PATCH v5 0/8] Introduce timestamp_t " Johannes Schindelin
2017-04-26 22:09             ` René Scharfe
2017-04-26  1:56           ` Junio C Hamano
2017-04-26 19:20         ` [PATCH v6 " Johannes Schindelin
2017-04-26 19:26           ` [PATCH v6 1/8] ref-filter: avoid using `unsigned long` for catch-all data type Johannes Schindelin
2017-04-26 19:26           ` [PATCH v6 2/8] t0006 & t5000: prepare for 64-bit timestamps Johannes Schindelin
2017-04-26 19:26           ` [PATCH v6 3/8] t0006 & t5000: skip "far in the future" test when time_t is too limited Johannes Schindelin
2017-04-26 19:26           ` [PATCH v6 4/8] Specify explicitly where we parse timestamps Johannes Schindelin
2017-04-26 19:29           ` [PATCH v6 5/8] Introduce a new "printf format" for timestamps Johannes Schindelin
2017-04-26 19:29           ` [PATCH v6 6/8] Introduce a new data type " Johannes Schindelin
2017-05-20  5:47             ` [PATCH] name-rev: change a "long" variable to timestamp_t Junio C Hamano
2017-05-22 13:39               ` Johannes Schindelin
2017-04-26 19:29           ` [PATCH v6 7/8] Abort if the system time cannot handle one of our timestamps Johannes Schindelin
2017-04-26 19:29           ` [PATCH v6 8/8] Use uintmax_t for timestamps Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7a2271ff-1386-18a6-5f6d-7eb13dc92509@web.de \
    --to=l.s.r@web.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).