git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] name-rev: avoid cutoff timestamp underflow
@ 2019-09-22 18:01 SZEDER Gábor
  2019-09-22 18:57 ` Phillip Wood
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: SZEDER Gábor @ 2019-09-22 18:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, SZEDER Gábor

When 'git name-rev' is invoked with commit-ish parameters, it tries to
save some work, and doesn't visit commits older than the committer
date of the oldest given commit minus a one day worth of slop.  Since
our 'timestamp_t' is an unsigned type, this leads to a timestamp
underflow when the committer date of the oldest given commit is within
a day of the UNIX epoch.  As a result the cutoff timestamp ends up
far-far in the future, and 'git name-rev' doesn't visit any commits,
and names each given commit as 'undefined'.

Check whether substacting the slop from the oldest committer date
would lead to an underflow, and use a 0 as cutoff in that case.  This
way it will handle commits shortly after the epoch even if we were to
switch to a signed 'timestamp_t' (but then we'll have to worry about
signed underflow for very old commits).

Note that the type of the cutoff timestamp variable used to be signed
before 5589e87fd8 (name-rev: change a "long" variable to timestamp_t,
2017-05-20).  The behavior was still the same even back then, but the
underflow didn't happen when substracting the slop from the oldest
committer date, but when comparing the signed cutoff timestamp with
unsigned committer dates in name_rev().  IOW, this underflow bug is as
old as 'git name-rev' itself.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---

This patch adds a test at the end of 't6120-describe.sh', so it will
conflict with my non-recursive name-rev patch series, which adds a
test there as well, but the conflict should be wasy to resolve.

  https://public-inbox.org/git/20190919214712.7348-7-szeder.dev@gmail.com/

 builtin/name-rev.c  | 15 ++++++++++++---
 t/t6120-describe.sh | 15 +++++++++++++++
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index c785fe16ba..a4d8d312ab 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -9,7 +9,11 @@
 #include "sha1-lookup.h"
 #include "commit-slab.h"
 
-#define CUTOFF_DATE_SLOP 86400 /* one day */
+/*
+ * One day.  See the 'name a rev close to epoch' test in t6120 when
+ * changing this value
+ */
+#define CUTOFF_DATE_SLOP 86400
 
 typedef struct rev_name {
 	const char *tip_name;
@@ -481,8 +485,13 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
 		add_object_array(object, *argv, &revs);
 	}
 
-	if (cutoff)
-		cutoff = cutoff - CUTOFF_DATE_SLOP;
+	if (cutoff) {
+		/* check for undeflow */
+		if (cutoff - CUTOFF_DATE_SLOP < cutoff)
+			cutoff = cutoff - CUTOFF_DATE_SLOP;
+		else
+			cutoff = 0;
+	}
 	for_each_ref(name_ref, &data);
 
 	if (transform_stdin) {
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 2b883d8174..965e633c32 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -424,4 +424,19 @@ test_expect_success 'describe complains about missing object' '
 	test_must_fail git describe $ZERO_OID
 '
 
+test_expect_success 'name-rev a rev shortly after epoch' '
+	test_when_finished "git checkout master" &&
+
+	git checkout --orphan no-timestamp-underflow &&
+	# Any date closer to epoch than the CUTOFF_DATE_SLOP constant
+	# in builtin/name-rev.c.
+	GIT_COMMITTER_DATE="@1234 +0000" \
+	git commit -m "committer date shortly after epoch" &&
+	near_commit_oid=$(git rev-parse HEAD) &&
+
+	echo "$near_commit_oid no-timestamp-underflow" >expect &&
+	git name-rev $near_commit_oid >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.23.0.597.gb58f4577a1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-22 18:01 [PATCH] name-rev: avoid cutoff timestamp underflow SZEDER Gábor
@ 2019-09-22 18:57 ` Phillip Wood
  2019-09-22 19:53   ` SZEDER Gábor
  2019-09-23  1:42 ` brian m. carlson
  2019-09-24  7:32 ` SZEDER Gábor
  2 siblings, 1 reply; 11+ messages in thread
From: Phillip Wood @ 2019-09-22 18:57 UTC (permalink / raw)
  To: SZEDER Gábor, Junio C Hamano; +Cc: git

Hi Gábor

On 22/09/2019 19:01, SZEDER Gábor wrote:
> When 'git name-rev' is invoked with commit-ish parameters, it tries to
> save some work, and doesn't visit commits older than the committer
> date of the oldest given commit minus a one day worth of slop.  Since
> our 'timestamp_t' is an unsigned type, this leads to a timestamp
> underflow when the committer date of the oldest given commit is within
> a day of the UNIX epoch.  As a result the cutoff timestamp ends up
> far-far in the future, and 'git name-rev' doesn't visit any commits,
> and names each given commit as 'undefined'.
> 
> Check whether substacting the slop from the oldest committer date
> would lead to an underflow, and use a 0 as cutoff in that case.  This
> way it will handle commits shortly after the epoch even if we were to
> switch to a signed 'timestamp_t' (but then we'll have to worry about
> signed underflow for very old commits).
> 
> Note that the type of the cutoff timestamp variable used to be signed
> before 5589e87fd8 (name-rev: change a "long" variable to timestamp_t,
> 2017-05-20).  The behavior was still the same even back then, but the
> underflow didn't happen when substracting the slop from the oldest
> committer date, but when comparing the signed cutoff timestamp with
> unsigned committer dates in name_rev().  IOW, this underflow bug is as
> old as 'git name-rev' itself.
> 
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> ---
> 
> This patch adds a test at the end of 't6120-describe.sh', so it will
> conflict with my non-recursive name-rev patch series, which adds a
> test there as well, but the conflict should be wasy to resolve.
> 
>    https://public-inbox.org/git/20190919214712.7348-7-szeder.dev@gmail.com/
> 
>   builtin/name-rev.c  | 15 ++++++++++++---
>   t/t6120-describe.sh | 15 +++++++++++++++
>   2 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index c785fe16ba..a4d8d312ab 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -9,7 +9,11 @@
>   #include "sha1-lookup.h"
>   #include "commit-slab.h"
>   
> -#define CUTOFF_DATE_SLOP 86400 /* one day */
> +/*
> + * One day.  See the 'name a rev close to epoch' test in t6120 when
> + * changing this value
> + */
> +#define CUTOFF_DATE_SLOP 86400
>   
>   typedef struct rev_name {
>   	const char *tip_name;
> @@ -481,8 +485,13 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
>   		add_object_array(object, *argv, &revs);
>   	}
>   
> -	if (cutoff)
> -		cutoff = cutoff - CUTOFF_DATE_SLOP;
> +	if (cutoff) {
> +		/* check for undeflow */
> +		if (cutoff - CUTOFF_DATE_SLOP < cutoff)

Nice catch but wouldn't this be clearer as
   if (cutoff > CUTOFF_DATE_SLOP) ?

Best Wishes

Phillip
> +			cutoff = cutoff - CUTOFF_DATE_SLOP;
> +		else
> +			cutoff = 0;
> +	}
>   	for_each_ref(name_ref, &data);
>   
>   	if (transform_stdin) {
> diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
> index 2b883d8174..965e633c32 100755
> --- a/t/t6120-describe.sh
> +++ b/t/t6120-describe.sh
> @@ -424,4 +424,19 @@ test_expect_success 'describe complains about missing object' '
>   	test_must_fail git describe $ZERO_OID
>   '
>   
> +test_expect_success 'name-rev a rev shortly after epoch' '
> +	test_when_finished "git checkout master" &&
> +
> +	git checkout --orphan no-timestamp-underflow &&
> +	# Any date closer to epoch than the CUTOFF_DATE_SLOP constant
> +	# in builtin/name-rev.c.
> +	GIT_COMMITTER_DATE="@1234 +0000" \
> +	git commit -m "committer date shortly after epoch" &&
> +	near_commit_oid=$(git rev-parse HEAD) &&
> +
> +	echo "$near_commit_oid no-timestamp-underflow" >expect &&
> +	git name-rev $near_commit_oid >actual &&
> +	test_cmp expect actual
> +'
> +
>   test_done
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-22 18:57 ` Phillip Wood
@ 2019-09-22 19:53   ` SZEDER Gábor
  2019-09-22 21:01     ` Johannes Sixt
  0 siblings, 1 reply; 11+ messages in thread
From: SZEDER Gábor @ 2019-09-22 19:53 UTC (permalink / raw)
  To: phillip.wood; +Cc: Junio C Hamano, git

On Sun, Sep 22, 2019 at 07:57:36PM +0100, Phillip Wood wrote:
> On 22/09/2019 19:01, SZEDER Gábor wrote:
> >When 'git name-rev' is invoked with commit-ish parameters, it tries to
> >save some work, and doesn't visit commits older than the committer
> >date of the oldest given commit minus a one day worth of slop.  Since
> >our 'timestamp_t' is an unsigned type, this leads to a timestamp
> >underflow when the committer date of the oldest given commit is within
> >a day of the UNIX epoch.  As a result the cutoff timestamp ends up
> >far-far in the future, and 'git name-rev' doesn't visit any commits,
> >and names each given commit as 'undefined'.
> >
> >Check whether substacting the slop from the oldest committer date
> >would lead to an underflow, and use a 0 as cutoff in that case.  This
> >way it will handle commits shortly after the epoch even if we were to
> >switch to a signed 'timestamp_t' (but then we'll have to worry about
> >signed underflow for very old commits).
> >
> >Note that the type of the cutoff timestamp variable used to be signed
> >before 5589e87fd8 (name-rev: change a "long" variable to timestamp_t,
> >2017-05-20).  The behavior was still the same even back then, but the
> >underflow didn't happen when substracting the slop from the oldest
> >committer date, but when comparing the signed cutoff timestamp with
> >unsigned committer dates in name_rev().  IOW, this underflow bug is as
> >old as 'git name-rev' itself.
> >
> >Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> >---
> >
> >This patch adds a test at the end of 't6120-describe.sh', so it will
> >conflict with my non-recursive name-rev patch series, which adds a
> >test there as well, but the conflict should be wasy to resolve.
> >
> >   https://public-inbox.org/git/20190919214712.7348-7-szeder.dev@gmail.com/
> >
> >  builtin/name-rev.c  | 15 ++++++++++++---
> >  t/t6120-describe.sh | 15 +++++++++++++++
> >  2 files changed, 27 insertions(+), 3 deletions(-)
> >
> >diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> >index c785fe16ba..a4d8d312ab 100644
> >--- a/builtin/name-rev.c
> >+++ b/builtin/name-rev.c
> >@@ -9,7 +9,11 @@
> >  #include "sha1-lookup.h"
> >  #include "commit-slab.h"
> >-#define CUTOFF_DATE_SLOP 86400 /* one day */
> >+/*
> >+ * One day.  See the 'name a rev close to epoch' test in t6120 when
> >+ * changing this value
> >+ */
> >+#define CUTOFF_DATE_SLOP 86400
> >  typedef struct rev_name {
> >  	const char *tip_name;
> >@@ -481,8 +485,13 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
> >  		add_object_array(object, *argv, &revs);
> >  	}
> >-	if (cutoff)
> >-		cutoff = cutoff - CUTOFF_DATE_SLOP;
> >+	if (cutoff) {
> >+		/* check for undeflow */
> >+		if (cutoff - CUTOFF_DATE_SLOP < cutoff)
> 
> Nice catch but wouldn't this be clearer as
>   if (cutoff > CUTOFF_DATE_SLOP) ?

It would only be clearer now, with an unsigned 'timestamp_t'.  I
tried to future-proof for a signed 'timestamp_t' and a cutoff date
before the UNIX epoch.

> >+			cutoff = cutoff - CUTOFF_DATE_SLOP;
> >+		else
> >+			cutoff = 0;
> >+	}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-22 19:53   ` SZEDER Gábor
@ 2019-09-22 21:01     ` Johannes Sixt
  2019-09-23  8:37       ` SZEDER Gábor
  0 siblings, 1 reply; 11+ messages in thread
From: Johannes Sixt @ 2019-09-22 21:01 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: phillip.wood, Junio C Hamano, git

Am 22.09.19 um 21:53 schrieb SZEDER Gábor:
> On Sun, Sep 22, 2019 at 07:57:36PM +0100, Phillip Wood wrote:
>> On 22/09/2019 19:01, SZEDER Gábor wrote:
>>> +/*
>>> + * One day.  See the 'name a rev close to epoch' test in t6120 when
>>> + * changing this value
>>> + */
>>> +#define CUTOFF_DATE_SLOP 86400
>>>  typedef struct rev_name {
>>>  	const char *tip_name;
>>> @@ -481,8 +485,13 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
>>>  		add_object_array(object, *argv, &revs);
>>>  	}
>>> -	if (cutoff)
>>> -		cutoff = cutoff - CUTOFF_DATE_SLOP;
>>> +	if (cutoff) {
>>> +		/* check for undeflow */
>>> +		if (cutoff - CUTOFF_DATE_SLOP < cutoff)
>>
>> Nice catch but wouldn't this be clearer as
>>   if (cutoff > CUTOFF_DATE_SLOP) ?
> 
> It would only be clearer now, with an unsigned 'timestamp_t'.  I
> tried to future-proof for a signed 'timestamp_t' and a cutoff date
> before the UNIX epoch.

Huh? For signed cutoff and positive CUTOFF_DATE_SLOP,
cutoff - CUTOFF_DATE_SLOP < cutoff is ALWAYS true. Signed interger
underflow is undefined behavior and signed integer arithmetic does not
wrap around!

IOW, the new condition makes only sense today, because cutoff is an
unsigned type, but breaks down should we switch to a signed type.

You need this on top:

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index a4d8d312ab..2d83c2b172 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -487,10 +487,10 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
 
 	if (cutoff) {
 		/* check for undeflow */
-		if (cutoff - CUTOFF_DATE_SLOP < cutoff)
+		if (cutoff > TIME_MIN + CUTOFF_DATE_SLOP)
 			cutoff = cutoff - CUTOFF_DATE_SLOP;
 		else
-			cutoff = 0;
+			cutoff = TIME_MIN;
 	}
 	for_each_ref(name_ref, &data);
 
diff --git a/git-compat-util.h b/git-compat-util.h
index c68c61d07c..1bdc21a069 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -344,6 +344,7 @@ typedef uintmax_t timestamp_t;
 #define PRItime PRIuMAX
 #define parse_timestamp strtoumax
 #define TIME_MAX UINTMAX_MAX
+#define TIME_MIN 0
 
 #ifndef PATH_SEP
 #define PATH_SEP ':'

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-22 18:01 [PATCH] name-rev: avoid cutoff timestamp underflow SZEDER Gábor
  2019-09-22 18:57 ` Phillip Wood
@ 2019-09-23  1:42 ` brian m. carlson
  2019-09-23  8:39   ` SZEDER Gábor
  2019-09-24  7:32 ` SZEDER Gábor
  2 siblings, 1 reply; 11+ messages in thread
From: brian m. carlson @ 2019-09-23  1:42 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 690 bytes --]

On 2019-09-22 at 18:01:43, SZEDER Gábor wrote:
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index c785fe16ba..a4d8d312ab 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -9,7 +9,11 @@
>  #include "sha1-lookup.h"
>  #include "commit-slab.h"
>  
> -#define CUTOFF_DATE_SLOP 86400 /* one day */
> +/*
> + * One day.  See the 'name a rev close to epoch' test in t6120 when

This piece of code says "close to"…

> +test_expect_success 'name-rev a rev shortly after epoch' '

…but this says "shortly after."

Overall, I think the idea is definitely sane, though.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-22 21:01     ` Johannes Sixt
@ 2019-09-23  8:37       ` SZEDER Gábor
  2019-09-23  9:30         ` Phillip Wood
  2019-09-23 19:16         ` Johannes Sixt
  0 siblings, 2 replies; 11+ messages in thread
From: SZEDER Gábor @ 2019-09-23  8:37 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: phillip.wood, Junio C Hamano, git

On Sun, Sep 22, 2019 at 11:01:26PM +0200, Johannes Sixt wrote:
> Am 22.09.19 um 21:53 schrieb SZEDER Gábor:
> > On Sun, Sep 22, 2019 at 07:57:36PM +0100, Phillip Wood wrote:
> >> On 22/09/2019 19:01, SZEDER Gábor wrote:
> >>> +/*
> >>> + * One day.  See the 'name a rev close to epoch' test in t6120 when
> >>> + * changing this value
> >>> + */
> >>> +#define CUTOFF_DATE_SLOP 86400
> >>>  typedef struct rev_name {
> >>>  	const char *tip_name;
> >>> @@ -481,8 +485,13 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
> >>>  		add_object_array(object, *argv, &revs);
> >>>  	}
> >>> -	if (cutoff)
> >>> -		cutoff = cutoff - CUTOFF_DATE_SLOP;
> >>> +	if (cutoff) {
> >>> +		/* check for undeflow */
> >>> +		if (cutoff - CUTOFF_DATE_SLOP < cutoff)
> >>
> >> Nice catch but wouldn't this be clearer as
> >>   if (cutoff > CUTOFF_DATE_SLOP) ?
> > 
> > It would only be clearer now, with an unsigned 'timestamp_t'.  I
> > tried to future-proof for a signed 'timestamp_t' and a cutoff date
> > before the UNIX epoch.
> 
> Huh? For signed cutoff and positive CUTOFF_DATE_SLOP,
> cutoff - CUTOFF_DATE_SLOP < cutoff is ALWAYS true. Signed interger
> underflow is undefined behavior and signed integer arithmetic does not
> wrap around!
> 
> IOW, the new condition makes only sense today, because cutoff is an
> unsigned type, but breaks down should we switch to a signed type.

Yeah, that's what I meant with worrying about signed underflow in the
commit message.  As long as the cutoff is at least a day later than
the minimum value of our future signed 'timestamp_t', the condition
does the right thing.  And considering that oldest time a signed 64
bit timestamp can represent far exceeds the age of the universe, and
the oldest value of even a signed 32 bit timestamp is almost half the
age of the Earth, I wasn't too worried.

> You need this on top:
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index a4d8d312ab..2d83c2b172 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -487,10 +487,10 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
>  
>  	if (cutoff) {
>  		/* check for undeflow */
> -		if (cutoff - CUTOFF_DATE_SLOP < cutoff)
> +		if (cutoff > TIME_MIN + CUTOFF_DATE_SLOP)
>  			cutoff = cutoff - CUTOFF_DATE_SLOP;
>  		else
> -			cutoff = 0;
> +			cutoff = TIME_MIN;
>  	}
>  	for_each_ref(name_ref, &data);
>  
> diff --git a/git-compat-util.h b/git-compat-util.h
> index c68c61d07c..1bdc21a069 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -344,6 +344,7 @@ typedef uintmax_t timestamp_t;
>  #define PRItime PRIuMAX
>  #define parse_timestamp strtoumax
>  #define TIME_MAX UINTMAX_MAX
> +#define TIME_MIN 0
>  
>  #ifndef PATH_SEP
>  #define PATH_SEP ':'

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-23  1:42 ` brian m. carlson
@ 2019-09-23  8:39   ` SZEDER Gábor
  0 siblings, 0 replies; 11+ messages in thread
From: SZEDER Gábor @ 2019-09-23  8:39 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano, git

On Mon, Sep 23, 2019 at 01:42:30AM +0000, brian m. carlson wrote:
> On 2019-09-22 at 18:01:43, SZEDER Gábor wrote:
> > diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> > index c785fe16ba..a4d8d312ab 100644
> > --- a/builtin/name-rev.c
> > +++ b/builtin/name-rev.c
> > @@ -9,7 +9,11 @@
> >  #include "sha1-lookup.h"
> >  #include "commit-slab.h"
> >  
> > -#define CUTOFF_DATE_SLOP 86400 /* one day */
> > +/*
> > + * One day.  See the 'name a rev close to epoch' test in t6120 when
> 
> This piece of code says "close to"…
> 
> > +test_expect_success 'name-rev a rev shortly after epoch' '
> 
> …but this says "shortly after."

Thanks.  I did a last minute 'close to' -> 'shortly after' edit, but
apparently not everywhere.

> Overall, I think the idea is definitely sane, though.
> -- 
> brian m. carlson: Houston, Texas, US
> OpenPGP: https://keybase.io/bk2204



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-23  8:37       ` SZEDER Gábor
@ 2019-09-23  9:30         ` Phillip Wood
  2019-09-23 19:16         ` Johannes Sixt
  1 sibling, 0 replies; 11+ messages in thread
From: Phillip Wood @ 2019-09-23  9:30 UTC (permalink / raw)
  To: SZEDER Gábor, Johannes Sixt; +Cc: phillip.wood, Junio C Hamano, git

On 23/09/2019 09:37, SZEDER Gábor wrote:
> On Sun, Sep 22, 2019 at 11:01:26PM +0200, Johannes Sixt wrote:
>> Am 22.09.19 um 21:53 schrieb SZEDER Gábor:
>>> On Sun, Sep 22, 2019 at 07:57:36PM +0100, Phillip Wood wrote:
>>>> On 22/09/2019 19:01, SZEDER Gábor wrote:
>>>>> +/*
>>>>> + * One day.  See the 'name a rev close to epoch' test in t6120 when
>>>>> + * changing this value
>>>>> + */
>>>>> +#define CUTOFF_DATE_SLOP 86400
>>>>>   typedef struct rev_name {
>>>>>   	const char *tip_name;
>>>>> @@ -481,8 +485,13 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
>>>>>   		add_object_array(object, *argv, &revs);
>>>>>   	}
>>>>> -	if (cutoff)
>>>>> -		cutoff = cutoff - CUTOFF_DATE_SLOP;
>>>>> +	if (cutoff) {
>>>>> +		/* check for undeflow */
>>>>> +		if (cutoff - CUTOFF_DATE_SLOP < cutoff)
>>>>
>>>> Nice catch but wouldn't this be clearer as
>>>>    if (cutoff > CUTOFF_DATE_SLOP) ?
>>>
>>> It would only be clearer now, with an unsigned 'timestamp_t'.  I
>>> tried to future-proof for a signed 'timestamp_t' and a cutoff date
>>> before the UNIX epoch.
>>
>> Huh? For signed cutoff and positive CUTOFF_DATE_SLOP,
>> cutoff - CUTOFF_DATE_SLOP < cutoff is ALWAYS true. Signed interger
>> underflow is undefined behavior and signed integer arithmetic does not
>> wrap around!
>>
>> IOW, the new condition makes only sense today, because cutoff is an
>> unsigned type, but breaks down should we switch to a signed type.
> 
> Yeah, that's what I meant with worrying about signed underflow in the
> commit message.  As long as the cutoff is at least a day later than
> the minimum value of our future signed 'timestamp_t', the condition
> does the right thing.  And considering that oldest time a signed 64
> bit timestamp can represent far exceeds the age of the universe, and
> the oldest value of even a signed 32 bit timestamp is almost half the
> age of the Earth, I wasn't too worried.


It's still going to trip up static analysers though isn't it? Also it 
will confuse readers who have to reason that it does not overflow in 
practice as we (probably?) never use very negative values

Best Wishes

Phillip

>> You need this on top:
>>
>> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
>> index a4d8d312ab..2d83c2b172 100644
>> --- a/builtin/name-rev.c
>> +++ b/builtin/name-rev.c
>> @@ -487,10 +487,10 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
>>   
>>   	if (cutoff) {
>>   		/* check for undeflow */
>> -		if (cutoff - CUTOFF_DATE_SLOP < cutoff)
>> +		if (cutoff > TIME_MIN + CUTOFF_DATE_SLOP)
>>   			cutoff = cutoff - CUTOFF_DATE_SLOP;
>>   		else
>> -			cutoff = 0;
>> +			cutoff = TIME_MIN;
>>   	}
>>   	for_each_ref(name_ref, &data);
>>   
>> diff --git a/git-compat-util.h b/git-compat-util.h
>> index c68c61d07c..1bdc21a069 100644
>> --- a/git-compat-util.h
>> +++ b/git-compat-util.h
>> @@ -344,6 +344,7 @@ typedef uintmax_t timestamp_t;
>>   #define PRItime PRIuMAX
>>   #define parse_timestamp strtoumax
>>   #define TIME_MAX UINTMAX_MAX
>> +#define TIME_MIN 0
>>   
>>   #ifndef PATH_SEP
>>   #define PATH_SEP ':'

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-23  8:37       ` SZEDER Gábor
  2019-09-23  9:30         ` Phillip Wood
@ 2019-09-23 19:16         ` Johannes Sixt
  2019-09-24  7:21           ` SZEDER Gábor
  1 sibling, 1 reply; 11+ messages in thread
From: Johannes Sixt @ 2019-09-23 19:16 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: phillip.wood, Junio C Hamano, git

Am 23.09.19 um 10:37 schrieb SZEDER Gábor:
> On Sun, Sep 22, 2019 at 11:01:26PM +0200, Johannes Sixt wrote:
>> Huh? For signed cutoff and positive CUTOFF_DATE_SLOP,
>> cutoff - CUTOFF_DATE_SLOP < cutoff is ALWAYS true. Signed interger
>> underflow is undefined behavior and signed integer arithmetic does not
>> wrap around!
>>
>> IOW, the new condition makes only sense today, because cutoff is an
>> unsigned type, but breaks down should we switch to a signed type.
> 
> Yeah, that's what I meant with worrying about signed underflow in the
> commit message.  As long as the cutoff is at least a day later than
> the minimum value of our future signed 'timestamp_t', the condition
> does the right thing.  And considering that oldest time a signed 64
> bit timestamp can represent far exceeds the age of the universe, and
> the oldest value of even a signed 32 bit timestamp is almost half the
> age of the Earth, I wasn't too worried.

Note that commits and timestamps can be forged easily. I'm not worried
that Git does not work "correctly" with forged timestamps (as long as it
is not exploitable); but when it is simple to make foolproof, we should
do it.

BTW, 32-bit timestamps cover just ~135 years (not half the age of
Earth). That's too little for people who want to store historic
documents in a Git repository.

-- Hannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-23 19:16         ` Johannes Sixt
@ 2019-09-24  7:21           ` SZEDER Gábor
  0 siblings, 0 replies; 11+ messages in thread
From: SZEDER Gábor @ 2019-09-24  7:21 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: phillip.wood, Junio C Hamano, git

On Mon, Sep 23, 2019 at 09:16:26PM +0200, Johannes Sixt wrote:
> Am 23.09.19 um 10:37 schrieb SZEDER Gábor:
> > On Sun, Sep 22, 2019 at 11:01:26PM +0200, Johannes Sixt wrote:
> >> Huh? For signed cutoff and positive CUTOFF_DATE_SLOP,
> >> cutoff - CUTOFF_DATE_SLOP < cutoff is ALWAYS true. Signed interger
> >> underflow is undefined behavior and signed integer arithmetic does not
> >> wrap around!
> >>
> >> IOW, the new condition makes only sense today, because cutoff is an
> >> unsigned type, but breaks down should we switch to a signed type.
> > 
> > Yeah, that's what I meant with worrying about signed underflow in the
> > commit message.  As long as the cutoff is at least a day later than
> > the minimum value of our future signed 'timestamp_t', the condition
> > does the right thing.  And considering that oldest time a signed 64
> > bit timestamp can represent far exceeds the age of the universe, and
> > the oldest value of even a signed 32 bit timestamp is almost half the
> > age of the Earth, I wasn't too worried.
> 
> Note that commits and timestamps can be forged easily. I'm not worried
> that Git does not work "correctly" with forged timestamps (as long as it
> is not exploitable); but when it is simple to make foolproof, we should
> do it.
> 
> BTW, 32-bit timestamps cover just ~135 years (not half the age of
> Earth).

Gah, forgot the division with seconds/year when calculating the range
of the 32bit timestamp.

> That's too little for people who want to store historic
> documents in a Git repository.

Indeed, but 'timestamp_t' is defined as 'uintmax_t', and we have the
Makefile knob 'NO_UINTMAX_T', in which case 'uintmax_t' is defined as
'uint32_t'...


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] name-rev: avoid cutoff timestamp underflow
  2019-09-22 18:01 [PATCH] name-rev: avoid cutoff timestamp underflow SZEDER Gábor
  2019-09-22 18:57 ` Phillip Wood
  2019-09-23  1:42 ` brian m. carlson
@ 2019-09-24  7:32 ` SZEDER Gábor
  2 siblings, 0 replies; 11+ messages in thread
From: SZEDER Gábor @ 2019-09-24  7:32 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Johannes Sixt, phillip.wood, brian m. carlson,
	SZEDER Gábor

When 'git name-rev' is invoked with commit-ish parameters, it tries to
save some work, and doesn't visit commits older than the committer
date of the oldest given commit minus a one day worth of slop.  Since
our 'timestamp_t' is an unsigned type, this leads to a timestamp
underflow when the committer date of the oldest given commit is within
a day of the UNIX epoch.  As a result the cutoff timestamp ends up
far-far in the future, and 'git name-rev' doesn't visit any commits,
and names each given commit as 'undefined'.

Check whether subtracting the slop from the oldest committer date
would lead to an underflow, and use no cutoff in that case.  We don't
have a TIME_MIN constant, dddbad728c (timestamp_t: a new data type for
timestamps, 2017-04-26) didn't add one, so do it now.

Note that the type of the cutoff timestamp variable used to be signed
before 5589e87fd8 (name-rev: change a "long" variable to timestamp_t,
2017-05-20).  The behavior was still the same even back then, but the
underflow didn't happen when substracting the slop from the oldest
committer date, but when comparing the signed cutoff timestamp with
unsigned committer dates in name_rev().  IOW, this underflow bug is as
old as 'git name-rev' itself.

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
Range-diff:
1:  3bef6bdf7b ! 1:  b053181af9 name-rev: avoid cutoff timestamp underflow
    @@ Commit message
         far-far in the future, and 'git name-rev' doesn't visit any commits,
         and names each given commit as 'undefined'.
     
    -    Check whether substacting the slop from the oldest committer date
    -    would lead to an underflow, and use a 0 as cutoff in that case.  This
    -    way it will handle commits shortly after the epoch even if we were to
    -    switch to a signed 'timestamp_t' (but then we'll have to worry about
    -    signed underflow for very old commits).
    +    Check whether subtracting the slop from the oldest committer date
    +    would lead to an underflow, and use no cutoff in that case.  We don't
    +    have a TIME_MIN constant, dddbad728c (timestamp_t: a new data type for
    +    timestamps, 2017-04-26) didn't add one, so do it now.
     
         Note that the type of the cutoff timestamp variable used to be signed
         before 5589e87fd8 (name-rev: change a "long" variable to timestamp_t,
    @@ Commit message
         unsigned committer dates in name_rev().  IOW, this underflow bug is as
         old as 'git name-rev' itself.
     
    +    Helped-by: Johannes Sixt <j6t@kdbg.org>
         Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
     
      ## builtin/name-rev.c ##
    @@ builtin/name-rev.c
      
     -#define CUTOFF_DATE_SLOP 86400 /* one day */
     +/*
    -+ * One day.  See the 'name a rev close to epoch' test in t6120 when
    ++ * One day.  See the 'name a rev shortly after epoch' test in t6120 when
     + * changing this value
     + */
     +#define CUTOFF_DATE_SLOP 86400
    @@ builtin/name-rev.c: int cmd_name_rev(int argc, const char **argv, const char *pr
     -		cutoff = cutoff - CUTOFF_DATE_SLOP;
     +	if (cutoff) {
     +		/* check for undeflow */
    -+		if (cutoff - CUTOFF_DATE_SLOP < cutoff)
    ++		if (cutoff > TIME_MIN + CUTOFF_DATE_SLOP)
     +			cutoff = cutoff - CUTOFF_DATE_SLOP;
     +		else
    -+			cutoff = 0;
    ++			cutoff = TIME_MIN;
     +	}
      	for_each_ref(name_ref, &data);
      
      	if (transform_stdin) {
     
    + ## git-compat-util.h ##
    +@@ git-compat-util.h: typedef uintmax_t timestamp_t;
    + #define PRItime PRIuMAX
    + #define parse_timestamp strtoumax
    + #define TIME_MAX UINTMAX_MAX
    ++#define TIME_MIN 0
    + 
    + #ifndef PATH_SEP
    + #define PATH_SEP ':'
    +
      ## t/t6120-describe.sh ##
     @@ t/t6120-describe.sh: test_expect_success 'describe complains about missing object' '
      	test_must_fail git describe $ZERO_OID
    @@ t/t6120-describe.sh: test_expect_success 'describe complains about missing objec
     +	# in builtin/name-rev.c.
     +	GIT_COMMITTER_DATE="@1234 +0000" \
     +	git commit -m "committer date shortly after epoch" &&
    -+	near_commit_oid=$(git rev-parse HEAD) &&
    ++	old_commit_oid=$(git rev-parse HEAD) &&
     +
    -+	echo "$near_commit_oid no-timestamp-underflow" >expect &&
    -+	git name-rev $near_commit_oid >actual &&
    ++	echo "$old_commit_oid no-timestamp-underflow" >expect &&
    ++	git name-rev $old_commit_oid >actual &&
     +	test_cmp expect actual
     +'
     +

 builtin/name-rev.c  | 15 ++++++++++++---
 git-compat-util.h   |  1 +
 t/t6120-describe.sh | 15 +++++++++++++++
 3 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index c785fe16ba..b0f0776947 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -9,7 +9,11 @@
 #include "sha1-lookup.h"
 #include "commit-slab.h"
 
-#define CUTOFF_DATE_SLOP 86400 /* one day */
+/*
+ * One day.  See the 'name a rev shortly after epoch' test in t6120 when
+ * changing this value
+ */
+#define CUTOFF_DATE_SLOP 86400
 
 typedef struct rev_name {
 	const char *tip_name;
@@ -481,8 +485,13 @@ int cmd_name_rev(int argc, const char **argv, const char *prefix)
 		add_object_array(object, *argv, &revs);
 	}
 
-	if (cutoff)
-		cutoff = cutoff - CUTOFF_DATE_SLOP;
+	if (cutoff) {
+		/* check for undeflow */
+		if (cutoff > TIME_MIN + CUTOFF_DATE_SLOP)
+			cutoff = cutoff - CUTOFF_DATE_SLOP;
+		else
+			cutoff = TIME_MIN;
+	}
 	for_each_ref(name_ref, &data);
 
 	if (transform_stdin) {
diff --git a/git-compat-util.h b/git-compat-util.h
index f0d13e4e28..d2e884e46a 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -344,6 +344,7 @@ typedef uintmax_t timestamp_t;
 #define PRItime PRIuMAX
 #define parse_timestamp strtoumax
 #define TIME_MAX UINTMAX_MAX
+#define TIME_MIN 0
 
 #ifndef PATH_SEP
 #define PATH_SEP ':'
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 2b883d8174..45047d0a72 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -424,4 +424,19 @@ test_expect_success 'describe complains about missing object' '
 	test_must_fail git describe $ZERO_OID
 '
 
+test_expect_success 'name-rev a rev shortly after epoch' '
+	test_when_finished "git checkout master" &&
+
+	git checkout --orphan no-timestamp-underflow &&
+	# Any date closer to epoch than the CUTOFF_DATE_SLOP constant
+	# in builtin/name-rev.c.
+	GIT_COMMITTER_DATE="@1234 +0000" \
+	git commit -m "committer date shortly after epoch" &&
+	old_commit_oid=$(git rev-parse HEAD) &&
+
+	echo "$old_commit_oid no-timestamp-underflow" >expect &&
+	git name-rev $old_commit_oid >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.23.0.597.gb58f4577a1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-09-24  7:32 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-22 18:01 [PATCH] name-rev: avoid cutoff timestamp underflow SZEDER Gábor
2019-09-22 18:57 ` Phillip Wood
2019-09-22 19:53   ` SZEDER Gábor
2019-09-22 21:01     ` Johannes Sixt
2019-09-23  8:37       ` SZEDER Gábor
2019-09-23  9:30         ` Phillip Wood
2019-09-23 19:16         ` Johannes Sixt
2019-09-24  7:21           ` SZEDER Gábor
2019-09-23  1:42 ` brian m. carlson
2019-09-23  8:39   ` SZEDER Gábor
2019-09-24  7:32 ` SZEDER Gábor

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).