git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] getting rid of sha1_to_hex()
@ 2019-11-11  9:03 Jeff King
  2019-11-11  9:04 ` [PATCH 1/2] hex: drop sha1_to_hex_r() Jeff King
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Jeff King @ 2019-11-11  9:03 UTC (permalink / raw)
  To: git; +Cc: brian m. carlson

I happened to be looking at the oid_to_hex() family of functions today,
and noticed that we can drop some of the sha1-specific ones.

It's a minor cleanup in terms of text, but it feels like a nice
milestone. :)

  [1/2]: hex: drop sha1_to_hex_r()
  [2/2]: hex: drop sha1_to_hex()

 cache.h                            |  4 +---
 contrib/coccinelle/object_id.cocci | 32 ------------------------------
 hex.c                              | 10 ----------
 sha1dc_git.c                       |  2 +-
 4 files changed, 2 insertions(+), 46 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/2] hex: drop sha1_to_hex_r()
  2019-11-11  9:03 [PATCH 0/2] getting rid of sha1_to_hex() Jeff King
@ 2019-11-11  9:04 ` Jeff King
  2019-11-11 18:30   ` Johannes Schindelin
  2019-11-11  9:04 ` [PATCH 2/2] hex: drop sha1_to_hex() Jeff King
  2019-11-11  9:09 ` [PATCH 0/2] getting rid of sha1_to_hex() Junio C Hamano
  2 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2019-11-11  9:04 UTC (permalink / raw)
  To: git; +Cc: brian m. carlson

There are no callers left; everybody uses oid_to_hex_r() or
hash_to_hex_algop_r(). This used to actually be the underlying
implementation for oid_to_hex_r(), but that's no longer the case since
47edb64997 (hex: introduce functions to print arbitrary hashes,
2018-11-14).

Let's get rid of it to de-clutter and to make sure nobody uses it.
Likewise we can drop the coccinelle rules that mention it, since the
compiler will make it quite clear that the code does not work.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h                            |  1 -
 contrib/coccinelle/object_id.cocci | 17 -----------------
 hex.c                              |  5 -----
 3 files changed, 23 deletions(-)

diff --git a/cache.h b/cache.h
index 04cabaac11..6a4eb221b3 100644
--- a/cache.h
+++ b/cache.h
@@ -1462,7 +1462,6 @@ int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
  *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
  */
 char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);
-char *sha1_to_hex_r(char *out, const unsigned char *sha1);
 char *oid_to_hex_r(char *out, const struct object_id *oid);
 char *hash_to_hex_algop(const unsigned char *hash, const struct git_hash_algo *);	/* static buffer result! */
 char *sha1_to_hex(const unsigned char *sha1);						/* same static buffer */
diff --git a/contrib/coccinelle/object_id.cocci b/contrib/coccinelle/object_id.cocci
index 3e536a9834..6c0d21d8e2 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -25,23 +25,6 @@ struct object_id *OIDPTR;
 + oid_to_hex(OIDPTR)
   ...>}
 
-@@
-expression E;
-struct object_id OID;
-@@
-- sha1_to_hex_r(E, OID.hash)
-+ oid_to_hex_r(E, &OID)
-
-@@
-identifier f != oid_to_hex_r;
-expression E;
-struct object_id *OIDPTR;
-@@
-   f(...) {<...
-- sha1_to_hex_r(E, OIDPTR->hash)
-+ oid_to_hex_r(E, OIDPTR)
-  ...>}
-
 @@
 struct object_id OID;
 @@
diff --git a/hex.c b/hex.c
index 7850a8879d..8c3f06a192 100644
--- a/hex.c
+++ b/hex.c
@@ -90,11 +90,6 @@ char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash,
 	return buffer;
 }
 
-char *sha1_to_hex_r(char *buffer, const unsigned char *sha1)
-{
-	return hash_to_hex_algop_r(buffer, sha1, &hash_algos[GIT_HASH_SHA1]);
-}
-
 char *oid_to_hex_r(char *buffer, const struct object_id *oid)
 {
 	return hash_to_hex_algop_r(buffer, oid->hash, the_hash_algo);
-- 
2.24.0.739.gb5632e4929


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-11  9:03 [PATCH 0/2] getting rid of sha1_to_hex() Jeff King
  2019-11-11  9:04 ` [PATCH 1/2] hex: drop sha1_to_hex_r() Jeff King
@ 2019-11-11  9:04 ` Jeff King
  2019-11-11 14:18   ` SZEDER Gábor
  2019-11-11  9:09 ` [PATCH 0/2] getting rid of sha1_to_hex() Junio C Hamano
  2 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2019-11-11  9:04 UTC (permalink / raw)
  To: git; +Cc: brian m. carlson

There's only a single caller left of sha1_to_hex(), since everybody now
uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
print a hex sha1 when we find a collision. This one will always be sha1,
regardless of the current hash algorithm, so we can't use oid_to_hex()
here. In practice we'd probably not be running sha1 at all if it isn't
the current algorithm, but it's possible we might still occasionally
need to compute a sha1 in a post-sha256 world.

Since sha1_to_hex() is just a wrapper for hash_to_hex_algop(), let's
call that ourselves. There's value in getting rid of the sha1-specific
wrapper to de-clutter the global namespace, and to make sure nobody uses
it (and as with sha1_to_hex_r() in the previous patch, we'll drop the
coccinelle transformations, too).

The sha1_to_hex() function is mentioned in a comment; we can easily swap
that out for oid_to_hex() to give a better example. It's also mentioned
in some test vectors in t4100, but that's not runnable code, so there's
no point in trying to clean it up.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h                            |  3 +--
 contrib/coccinelle/object_id.cocci | 15 ---------------
 hex.c                              |  5 -----
 sha1dc_git.c                       |  2 +-
 4 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/cache.h b/cache.h
index 6a4eb221b3..a2ab10503f 100644
--- a/cache.h
+++ b/cache.h
@@ -1459,12 +1459,11 @@ int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
  * The non-`_r` variant returns a static buffer, but uses a ring of 4
  * buffers, making it safe to make multiple calls for a single statement, like:
  *
- *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
+ *   printf("%s -> %s", oid_to_hex(one), oid_to_hex(two));
  */
 char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);
 char *oid_to_hex_r(char *out, const struct object_id *oid);
 char *hash_to_hex_algop(const unsigned char *hash, const struct git_hash_algo *);	/* static buffer result! */
-char *sha1_to_hex(const unsigned char *sha1);						/* same static buffer */
 char *hash_to_hex(const unsigned char *hash);						/* same static buffer */
 char *oid_to_hex(const struct object_id *oid);						/* same static buffer */
 
diff --git a/contrib/coccinelle/object_id.cocci b/contrib/coccinelle/object_id.cocci
index 6c0d21d8e2..ddf4f22bd7 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -10,21 +10,6 @@ struct object_id *OIDPTR;
 - is_null_sha1(OIDPTR->hash)
 + is_null_oid(OIDPTR)
 
-@@
-struct object_id OID;
-@@
-- sha1_to_hex(OID.hash)
-+ oid_to_hex(&OID)
-
-@@
-identifier f != oid_to_hex;
-struct object_id *OIDPTR;
-@@
-  f(...) {<...
-- sha1_to_hex(OIDPTR->hash)
-+ oid_to_hex(OIDPTR)
-  ...>}
-
 @@
 struct object_id OID;
 @@
diff --git a/hex.c b/hex.c
index 8c3f06a192..fd7f00c43f 100644
--- a/hex.c
+++ b/hex.c
@@ -103,11 +103,6 @@ char *hash_to_hex_algop(const unsigned char *hash, const struct git_hash_algo *a
 	return hash_to_hex_algop_r(hexbuffer[bufno], hash, algop);
 }
 
-char *sha1_to_hex(const unsigned char *sha1)
-{
-	return hash_to_hex_algop(sha1, &hash_algos[GIT_HASH_SHA1]);
-}
-
 char *hash_to_hex(const unsigned char *hash)
 {
 	return hash_to_hex_algop(hash, the_hash_algo);
diff --git a/sha1dc_git.c b/sha1dc_git.c
index e0cc9d988c..5c300e812e 100644
--- a/sha1dc_git.c
+++ b/sha1dc_git.c
@@ -19,7 +19,7 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
 	if (!SHA1DCFinal(hash, ctx))
 		return;
 	die("SHA-1 appears to be part of a collision attack: %s",
-	    sha1_to_hex(hash));
+	    hash_to_hex_algop(hash, &hash_algos[GIT_HASH_SHA1]));
 }
 
 /*
-- 
2.24.0.739.gb5632e4929

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] getting rid of sha1_to_hex()
  2019-11-11  9:03 [PATCH 0/2] getting rid of sha1_to_hex() Jeff King
  2019-11-11  9:04 ` [PATCH 1/2] hex: drop sha1_to_hex_r() Jeff King
  2019-11-11  9:04 ` [PATCH 2/2] hex: drop sha1_to_hex() Jeff King
@ 2019-11-11  9:09 ` Junio C Hamano
  2019-11-11  9:21   ` Jeff King
  2 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2019-11-11  9:09 UTC (permalink / raw)
  To: Jeff King; +Cc: git, brian m. carlson

Jeff King <peff@peff.net> writes:

> I happened to be looking at the oid_to_hex() family of functions today,
> and noticed that we can drop some of the sha1-specific ones.
>
> It's a minor cleanup in terms of text, but it feels like a nice
> milestone. :)
>
>   [1/2]: hex: drop sha1_to_hex_r()
>   [2/2]: hex: drop sha1_to_hex()

Nice indeed.

Thanks, Brian! ;-)  oh, and Peff, of courese ;-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] getting rid of sha1_to_hex()
  2019-11-11  9:09 ` [PATCH 0/2] getting rid of sha1_to_hex() Junio C Hamano
@ 2019-11-11  9:21   ` Jeff King
  2019-11-11 23:53     ` brian m. carlson
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2019-11-11  9:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, brian m. carlson

On Mon, Nov 11, 2019 at 06:09:10PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I happened to be looking at the oid_to_hex() family of functions today,
> > and noticed that we can drop some of the sha1-specific ones.
> >
> > It's a minor cleanup in terms of text, but it feels like a nice
> > milestone. :)
> >
> >   [1/2]: hex: drop sha1_to_hex_r()
> >   [2/2]: hex: drop sha1_to_hex()
> 
> Nice indeed.
> 
> Thanks, Brian! ;-)  oh, and Peff, of courese ;-)

I actually worried that I might be stealing brian's thunder. But AFAICT
this cleanup has been waiting for almost a year, so I figured it was
fair game. :)

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-11  9:04 ` [PATCH 2/2] hex: drop sha1_to_hex() Jeff King
@ 2019-11-11 14:18   ` SZEDER Gábor
  2019-11-11 14:29     ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: SZEDER Gábor @ 2019-11-11 14:18 UTC (permalink / raw)
  To: Jeff King; +Cc: git, brian m. carlson

On Mon, Nov 11, 2019 at 04:04:18AM -0500, Jeff King wrote:
> There's only a single caller left of sha1_to_hex(), since everybody now
> uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
> print a hex sha1 when we find a collision. This one will always be sha1,
> regardless of the current hash algorithm, so we can't use oid_to_hex()

Nit: s/oid_to_hex/hash_to_hex/

We can't use oid_to_hex() because we don't have a 'struct object_id'
in the first place, as sha1dc only ever deals with 20 unsigned chars.

> here. In practice we'd probably not be running sha1 at all if it isn't
> the current algorithm, but it's possible we might still occasionally
> need to compute a sha1 in a post-sha256 world.
> 
> Since sha1_to_hex() is just a wrapper for hash_to_hex_algop(), let's
> call that ourselves. There's value in getting rid of the sha1-specific
> wrapper to de-clutter the global namespace, and to make sure nobody uses
> it (and as with sha1_to_hex_r() in the previous patch, we'll drop the
> coccinelle transformations, too).


> diff --git a/sha1dc_git.c b/sha1dc_git.c
> index e0cc9d988c..5c300e812e 100644
> --- a/sha1dc_git.c
> +++ b/sha1dc_git.c
> @@ -19,7 +19,7 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
>  	if (!SHA1DCFinal(hash, ctx))
>  		return;
>  	die("SHA-1 appears to be part of a collision attack: %s",
> -	    sha1_to_hex(hash));
> +	    hash_to_hex_algop(hash, &hash_algos[GIT_HASH_SHA1]));
>  }
>  
>  /*
> -- 
> 2.24.0.739.gb5632e4929

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-11 14:18   ` SZEDER Gábor
@ 2019-11-11 14:29     ` Jeff King
  2019-11-12  4:13       ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2019-11-11 14:29 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, brian m. carlson

On Mon, Nov 11, 2019 at 03:18:05PM +0100, SZEDER Gábor wrote:

> On Mon, Nov 11, 2019 at 04:04:18AM -0500, Jeff King wrote:
> > There's only a single caller left of sha1_to_hex(), since everybody now
> > uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
> > print a hex sha1 when we find a collision. This one will always be sha1,
> > regardless of the current hash algorithm, so we can't use oid_to_hex()
> 
> Nit: s/oid_to_hex/hash_to_hex/
> 
> We can't use oid_to_hex() because we don't have a 'struct object_id'
> in the first place, as sha1dc only ever deals with 20 unsigned chars.

Ah, you're right. I admit I am still getting up to speed on all of the
new hash-agnostic versions of the various functions.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] hex: drop sha1_to_hex_r()
  2019-11-11  9:04 ` [PATCH 1/2] hex: drop sha1_to_hex_r() Jeff King
@ 2019-11-11 18:30   ` Johannes Schindelin
  0 siblings, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2019-11-11 18:30 UTC (permalink / raw)
  To: Jeff King; +Cc: git, brian m. carlson

Hi Peff,

On Mon, 11 Nov 2019, Jeff King wrote:

> There are no callers left; everybody uses oid_to_hex_r() or
> hash_to_hex_algop_r(). This used to actually be the underlying
> implementation for oid_to_hex_r(), but that's no longer the case since
> 47edb64997 (hex: introduce functions to print arbitrary hashes,
> 2018-11-14).
>
> Let's get rid of it to de-clutter and to make sure nobody uses it.
> Likewise we can drop the coccinelle rules that mention it, since the
> compiler will make it quite clear that the code does not work.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  cache.h                            |  1 -
>  contrib/coccinelle/object_id.cocci | 17 -----------------
>  hex.c                              |  5 -----
>  3 files changed, 23 deletions(-)

I really like that diffstat.

Thanks,
Dscho

>
> diff --git a/cache.h b/cache.h
> index 04cabaac11..6a4eb221b3 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1462,7 +1462,6 @@ int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
>   *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
>   */
>  char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);
> -char *sha1_to_hex_r(char *out, const unsigned char *sha1);
>  char *oid_to_hex_r(char *out, const struct object_id *oid);
>  char *hash_to_hex_algop(const unsigned char *hash, const struct git_hash_algo *);	/* static buffer result! */
>  char *sha1_to_hex(const unsigned char *sha1);						/* same static buffer */
> diff --git a/contrib/coccinelle/object_id.cocci b/contrib/coccinelle/object_id.cocci
> index 3e536a9834..6c0d21d8e2 100644
> --- a/contrib/coccinelle/object_id.cocci
> +++ b/contrib/coccinelle/object_id.cocci
> @@ -25,23 +25,6 @@ struct object_id *OIDPTR;
>  + oid_to_hex(OIDPTR)
>    ...>}
>
> -@@
> -expression E;
> -struct object_id OID;
> -@@
> -- sha1_to_hex_r(E, OID.hash)
> -+ oid_to_hex_r(E, &OID)
> -
> -@@
> -identifier f != oid_to_hex_r;
> -expression E;
> -struct object_id *OIDPTR;
> -@@
> -   f(...) {<...
> -- sha1_to_hex_r(E, OIDPTR->hash)
> -+ oid_to_hex_r(E, OIDPTR)
> -  ...>}
> -
>  @@
>  struct object_id OID;
>  @@
> diff --git a/hex.c b/hex.c
> index 7850a8879d..8c3f06a192 100644
> --- a/hex.c
> +++ b/hex.c
> @@ -90,11 +90,6 @@ char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash,
>  	return buffer;
>  }
>
> -char *sha1_to_hex_r(char *buffer, const unsigned char *sha1)
> -{
> -	return hash_to_hex_algop_r(buffer, sha1, &hash_algos[GIT_HASH_SHA1]);
> -}
> -
>  char *oid_to_hex_r(char *buffer, const struct object_id *oid)
>  {
>  	return hash_to_hex_algop_r(buffer, oid->hash, the_hash_algo);
> --
> 2.24.0.739.gb5632e4929
>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/2] getting rid of sha1_to_hex()
  2019-11-11  9:21   ` Jeff King
@ 2019-11-11 23:53     ` brian m. carlson
  0 siblings, 0 replies; 17+ messages in thread
From: brian m. carlson @ 2019-11-11 23:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 521 bytes --]

On 2019-11-11 at 09:21:44, Jeff King wrote:
> I actually worried that I might be stealing brian's thunder. But AFAICT
> this cleanup has been waiting for almost a year, so I figured it was
> fair game. :)

Oh, no, not at all.  I'm delighted that it can go, and thanks for
sending the patches.

I think I had intended to remove it but hadn't because I was worried
about affecting topics in flight, and then promptly forgot about it.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-11 14:29     ` Jeff King
@ 2019-11-12  4:13       ` Junio C Hamano
  2019-11-12 10:57         ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2019-11-12  4:13 UTC (permalink / raw)
  To: Jeff King; +Cc: SZEDER Gábor, git, brian m. carlson

Jeff King <peff@peff.net> writes:

> On Mon, Nov 11, 2019 at 03:18:05PM +0100, SZEDER Gábor wrote:
>
>> On Mon, Nov 11, 2019 at 04:04:18AM -0500, Jeff King wrote:
>> > There's only a single caller left of sha1_to_hex(), since everybody now
>> > uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
>> > print a hex sha1 when we find a collision. This one will always be sha1,
>> > regardless of the current hash algorithm, so we can't use oid_to_hex()
>> 
>> Nit: s/oid_to_hex/hash_to_hex/
>> 
>> We can't use oid_to_hex() because we don't have a 'struct object_id'
>> in the first place, as sha1dc only ever deals with 20 unsigned chars.
>
> Ah, you're right. I admit I am still getting up to speed on all of the
> new hash-agnostic versions of the various functions.

Thanks.  I've amended this one and the range diff since the pushout
yesterday looks like this.

1:  8a030f1796 ! 1:  02d21d4117 hex: drop sha1_to_hex()
    @@ Commit message
         hex: drop sha1_to_hex()
     
         There's only a single caller left of sha1_to_hex(), since everybody now
    -    uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
    +    uses hash_to_hex() instead. This case is in the sha1dc wrapper, where we
         print a hex sha1 when we find a collision. This one will always be sha1,
    -    regardless of the current hash algorithm, so we can't use oid_to_hex()
    +    regardless of the current hash algorithm, so we can't use hash_to_hex()
         here. In practice we'd probably not be running sha1 at all if it isn't
         the current algorithm, but it's possible we might still occasionally
         need to compute a sha1 in a post-sha256 world.
    @@ cache.h: int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
       * buffers, making it safe to make multiple calls for a single statement, like:
       *
     - *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
    -+ *   printf("%s -> %s", oid_to_hex(one), oid_to_hex(two));
    ++ *   printf("%s -> %s", hash_to_hex(one), hash_to_hex(two));
       */
      char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);
      char *oid_to_hex_r(char *out, const struct object_id *oid);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-12  4:13       ` Junio C Hamano
@ 2019-11-12 10:57         ` Jeff King
  2019-11-12 11:44           ` SZEDER Gábor
  2019-11-12 11:49           ` Junio C Hamano
  0 siblings, 2 replies; 17+ messages in thread
From: Jeff King @ 2019-11-12 10:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git, brian m. carlson

On Tue, Nov 12, 2019 at 01:13:58PM +0900, Junio C Hamano wrote:

> >> We can't use oid_to_hex() because we don't have a 'struct object_id'
> >> in the first place, as sha1dc only ever deals with 20 unsigned chars.
> >
> > Ah, you're right. I admit I am still getting up to speed on all of the
> > new hash-agnostic versions of the various functions.
> 
> Thanks.  I've amended this one and the range diff since the pushout
> yesterday looks like this.

Thanks. This first hunk is what I would have done:

> 1:  8a030f1796 ! 1:  02d21d4117 hex: drop sha1_to_hex()
>     @@ Commit message
>          hex: drop sha1_to_hex()
>      
>          There's only a single caller left of sha1_to_hex(), since everybody now
>     -    uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
>     +    uses hash_to_hex() instead. This case is in the sha1dc wrapper, where we
>          print a hex sha1 when we find a collision. This one will always be sha1,
>     -    regardless of the current hash algorithm, so we can't use oid_to_hex()
>     +    regardless of the current hash algorithm, so we can't use hash_to_hex()
>          here. In practice we'd probably not be running sha1 at all if it isn't
>          the current algorithm, but it's possible we might still occasionally
>          need to compute a sha1 in a post-sha256 world.

This second one is OK, but not entirely necessary:

>     @@ cache.h: int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
>        * buffers, making it safe to make multiple calls for a single statement, like:
>        *
>      - *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
>     -+ *   printf("%s -> %s", oid_to_hex(one), oid_to_hex(two));
>     ++ *   printf("%s -> %s", hash_to_hex(one), hash_to_hex(two));
>        */
>       char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);
>       char *oid_to_hex_r(char *out, const struct object_id *oid);

This one-liner leaves the types of "one" and "two" unspecified. :) So
it's not wrong to use hash_to_hex(), but maybe it's better to be pushing
people towards oid_to_hex() as their first choice? It probably doesn't
matter too much either way.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-12 10:57         ` Jeff King
@ 2019-11-12 11:44           ` SZEDER Gábor
  2019-11-12 12:12             ` Jeff King
  2019-11-12 11:49           ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: SZEDER Gábor @ 2019-11-12 11:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, brian m. carlson

On Tue, Nov 12, 2019 at 05:57:59AM -0500, Jeff King wrote:
> On Tue, Nov 12, 2019 at 01:13:58PM +0900, Junio C Hamano wrote:
> 
> > >> We can't use oid_to_hex() because we don't have a 'struct object_id'
> > >> in the first place, as sha1dc only ever deals with 20 unsigned chars.
> > >
> > > Ah, you're right. I admit I am still getting up to speed on all of the
> > > new hash-agnostic versions of the various functions.
> > 
> > Thanks.  I've amended this one and the range diff since the pushout
> > yesterday looks like this.
> 
> Thanks. This first hunk is what I would have done:
> 
> > 1:  8a030f1796 ! 1:  02d21d4117 hex: drop sha1_to_hex()
> >     @@ Commit message
> >          hex: drop sha1_to_hex()
> >      
> >          There's only a single caller left of sha1_to_hex(), since everybody now
> >     -    uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
> >     +    uses hash_to_hex() instead. This case is in the sha1dc wrapper, where we
> >          print a hex sha1 when we find a collision. This one will always be sha1,
> >     -    regardless of the current hash algorithm, so we can't use oid_to_hex()
> >     +    regardless of the current hash algorithm, so we can't use hash_to_hex()
> >          here. In practice we'd probably not be running sha1 at all if it isn't
> >          the current algorithm, but it's possible we might still occasionally
> >          need to compute a sha1 in a post-sha256 world.
> 
> This second one is OK, but not entirely necessary:
> 
> >     @@ cache.h: int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
> >        * buffers, making it safe to make multiple calls for a single statement, like:
> >        *
> >      - *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
> >     -+ *   printf("%s -> %s", oid_to_hex(one), oid_to_hex(two));
> >     ++ *   printf("%s -> %s", hash_to_hex(one), hash_to_hex(two));
> >        */
> >       char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);
> >       char *oid_to_hex_r(char *out, const struct object_id *oid);
> 
> This one-liner leaves the types of "one" and "two" unspecified. :) So
> it's not wrong to use hash_to_hex(), but maybe it's better to be pushing
> people towards oid_to_hex() as their first choice? It probably doesn't
> matter too much either way.

Yeah, most (over 96%) of the hashes that we print are actually object
ids, so oid_to_hex() is the right function to use most of the time.

And because of this the updated "since everybody uses hash_to_hex()"
in the first hunk sounds a bit wrong, because barely anybody actually
uses hash_to_hex().


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-12 10:57         ` Jeff King
  2019-11-12 11:44           ` SZEDER Gábor
@ 2019-11-12 11:49           ` Junio C Hamano
  2019-11-12 12:15             ` Jeff King
  1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2019-11-12 11:49 UTC (permalink / raw)
  To: Jeff King; +Cc: SZEDER Gábor, git, brian m. carlson

Jeff King <peff@peff.net> writes:

>>     @@ cache.h: int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
>>        * buffers, making it safe to make multiple calls for a single statement, like:
>>        *
>>      - *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
>>     -+ *   printf("%s -> %s", oid_to_hex(one), oid_to_hex(two));
>>     ++ *   printf("%s -> %s", hash_to_hex(one), hash_to_hex(two));
>>        */
>>       char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);
>>       char *oid_to_hex_r(char *out, const struct object_id *oid);
>
> This one-liner leaves the types of "one" and "two" unspecified. :) So
> it's not wrong to use hash_to_hex(), but maybe it's better to be pushing
> people towards oid_to_hex() as their first choice? It probably doesn't
> matter too much either way.

The pre-context of that comment reads:

 * Convert a binary hash to its hex equivalent. The `_r` variant is reentrant,
 * and writes the NUL-terminated output to the buffer `out`, which must be at
 * least `GIT_MAX_HEXSZ + 1` bytes, and returns a pointer to out for
 * convenience.

so I think the intent of the example that used to use sha1_to_hex()
has been the raw bytestring that is the object name, not its form
that is encapsulated in the "struct object_id()".


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-12 11:44           ` SZEDER Gábor
@ 2019-11-12 12:12             ` Jeff King
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff King @ 2019-11-12 12:12 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, git, brian m. carlson

On Tue, Nov 12, 2019 at 12:44:58PM +0100, SZEDER Gábor wrote:

> > > 1:  8a030f1796 ! 1:  02d21d4117 hex: drop sha1_to_hex()
> > >     @@ Commit message
> > >          hex: drop sha1_to_hex()
> > >      
> > >          There's only a single caller left of sha1_to_hex(), since everybody now
> > >     -    uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
> > >     +    uses hash_to_hex() instead. This case is in the sha1dc wrapper, where we
> > >          print a hex sha1 when we find a collision. This one will always be sha1,
> > >     -    regardless of the current hash algorithm, so we can't use oid_to_hex()
> > >     +    regardless of the current hash algorithm, so we can't use hash_to_hex()
> > >          here. In practice we'd probably not be running sha1 at all if it isn't
> > >          the current algorithm, but it's possible we might still occasionally
> > >          need to compute a sha1 in a post-sha256 world.
> [...]
> And because of this the updated "since everybody uses hash_to_hex()"
> in the first hunk sounds a bit wrong, because barely anybody actually
> uses hash_to_hex().

You're right, I should have read more carefully.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-12 11:49           ` Junio C Hamano
@ 2019-11-12 12:15             ` Jeff King
  2019-11-13  1:09               ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2019-11-12 12:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git, brian m. carlson

On Tue, Nov 12, 2019 at 08:49:44PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >>     @@ cache.h: int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
> >>        * buffers, making it safe to make multiple calls for a single statement, like:
> >>        *
> >>      - *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
> >>     -+ *   printf("%s -> %s", oid_to_hex(one), oid_to_hex(two));
> >>     ++ *   printf("%s -> %s", hash_to_hex(one), hash_to_hex(two));
> >>        */
> >>       char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);
> >>       char *oid_to_hex_r(char *out, const struct object_id *oid);
> >
> > This one-liner leaves the types of "one" and "two" unspecified. :) So
> > it's not wrong to use hash_to_hex(), but maybe it's better to be pushing
> > people towards oid_to_hex() as their first choice? It probably doesn't
> > matter too much either way.
> 
> The pre-context of that comment reads:
> 
>  * Convert a binary hash to its hex equivalent. The `_r` variant is reentrant,
>  * and writes the NUL-terminated output to the buffer `out`, which must be at
>  * least `GIT_MAX_HEXSZ + 1` bytes, and returns a pointer to out for
>  * convenience.
> 
> so I think the intent of the example that used to use sha1_to_hex()
> has been the raw bytestring that is the object name, not its form
> that is encapsulated in the "struct object_id()".

I guess you're keying on the phrase "binary hash" there (the
"GIT_MAX_HEXZ" bits only apply to the "_r" variants anyway). I'd read it
as encompassing all of the functions below, including oid_to_hex(). But
I'm OK with it either way.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-12 12:15             ` Jeff King
@ 2019-11-13  1:09               ` Junio C Hamano
  2019-11-13  1:15                 ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2019-11-13  1:09 UTC (permalink / raw)
  To: Jeff King; +Cc: SZEDER Gábor, git, brian m. carlson

Jeff King <peff@peff.net> writes:

> On Tue, Nov 12, 2019 at 08:49:44PM +0900, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> >>     @@ cache.h: int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
>> >>        * buffers, making it safe to make multiple calls for a single statement, like:
>> >>        *
>> >>      - *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
>> >>     -+ *   printf("%s -> %s", oid_to_hex(one), oid_to_hex(two));
>> >>     ++ *   printf("%s -> %s", hash_to_hex(one), hash_to_hex(two));
>> >>        */
>> >>       char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);
>> >>       char *oid_to_hex_r(char *out, const struct object_id *oid);
>> >
>> > This one-liner leaves the types of "one" and "two" unspecified. :) So
>> > it's not wrong to use hash_to_hex(), but maybe it's better to be pushing
>> > people towards oid_to_hex() as their first choice? It probably doesn't
>> > matter too much either way.
>> 
>> The pre-context of that comment reads:
>> 
>>  * Convert a binary hash to its hex equivalent. The `_r` variant is reentrant,
>>  * and writes the NUL-terminated output to the buffer `out`, which must be at
>>  * least `GIT_MAX_HEXSZ + 1` bytes, and returns a pointer to out for
>>  * convenience.
>> 
>> so I think the intent of the example that used to use sha1_to_hex()
>> has been the raw bytestring that is the object name, not its form
>> that is encapsulated in the "struct object_id()".
>
> I guess you're keying on the phrase "binary hash" there (the
> "GIT_MAX_HEXZ" bits only apply to the "_r" variants anyway). I'd read it
> as encompassing all of the functions below, including oid_to_hex(). But
> I'm OK with it either way.

Yes.  "binary hash" is about "unsigned char[]".  I think that's
historical accident---we added "struct object_id *" variants without
updating the comment.

Here is another try.  The "everybody uses oid_to_hex" in the log
message has also been updated.

1:  8a030f1796 ! 1:  b19f3fe9dd hex: drop sha1_to_hex()
    @@ Metadata
      ## Commit message ##
         hex: drop sha1_to_hex()
     
    -    There's only a single caller left of sha1_to_hex(), since everybody now
    -    uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
    -    print a hex sha1 when we find a collision. This one will always be sha1,
    -    regardless of the current hash algorithm, so we can't use oid_to_hex()
    -    here. In practice we'd probably not be running sha1 at all if it isn't
    -    the current algorithm, but it's possible we might still occasionally
    +    There's only a single caller left of sha1_to_hex(), since everybody
    +    that has an object name in "unsigned char[]" now uses hash_to_hex()
    +    instead.
    +
    +    This case is in the sha1dc wrapper, where we print a hex sha1 when
    +    we find a collision. This one will always be sha1, regardless of the
    +    current hash algorithm, so we can't use hash_to_hex() here. In
    +    practice we'd probably not be running sha1 at all if it isn't the
    +    current algorithm, but it's possible we might still occasionally
         need to compute a sha1 in a post-sha256 world.
     
         Since sha1_to_hex() is just a wrapper for hash_to_hex_algop(), let's
    @@ Commit message
         it (and as with sha1_to_hex_r() in the previous patch, we'll drop the
         coccinelle transformations, too).
     
    -    The sha1_to_hex() function is mentioned in a comment; we can easily swap
    -    that out for oid_to_hex() to give a better example. It's also mentioned
    -    in some test vectors in t4100, but that's not runnable code, so there's
    -    no point in trying to clean it up.
    +    The sha1_to_hex() function is mentioned in a comment; we can easily
    +    swap that out for oid_to_hex() to give a better example.  Also
    +    update the comment that was left stale when we added "struct
    +    object_id *" as a way to name an object and added functions to
    +    convert it to hex.
    +
    +    The function is also mentioned in some test vectors in t4100, but
    +    that's not runnable code, so there's no point in trying to clean it
    +    up.
     
         Signed-off-by: Jeff King <peff@peff.net>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## cache.h ##
    +@@ cache.h: int get_oid_hex(const char *hex, struct object_id *sha1);
    + int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
    + 
    + /*
    +- * Convert a binary hash to its hex equivalent. The `_r` variant is reentrant,
    ++ * Convert a binary hash in "unsigned char []" or an object name in
    ++ * "struct object_id *" to its hex equivalent. The `_r` variant is reentrant,
    +  * and writes the NUL-terminated output to the buffer `out`, which must be at
    +  * least `GIT_MAX_HEXSZ + 1` bytes, and returns a pointer to out for
    +  * convenience.
     @@ cache.h: int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
       * The non-`_r` variant returns a static buffer, but uses a ring of 4
       * buffers, making it safe to make multiple calls for a single statement, like:
       *
     - *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
    ++ *   printf("%s -> %s", hash_to_hex(one), hash_to_hex(two));
     + *   printf("%s -> %s", oid_to_hex(one), oid_to_hex(two));
       */
      char *hash_to_hex_algop_r(char *buffer, const unsigned char *hash, const struct git_hash_algo *);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] hex: drop sha1_to_hex()
  2019-11-13  1:09               ` Junio C Hamano
@ 2019-11-13  1:15                 ` Jeff King
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff King @ 2019-11-13  1:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git, brian m. carlson

On Wed, Nov 13, 2019 at 10:09:23AM +0900, Junio C Hamano wrote:

> Yes.  "binary hash" is about "unsigned char[]".  I think that's
> historical accident---we added "struct object_id *" variants without
> updating the comment.
> 
> Here is another try.  The "everybody uses oid_to_hex" in the log
> message has also been updated.

This looks good to me (and thank you for cleaning up my messy commit
message).

> 1:  8a030f1796 ! 1:  b19f3fe9dd hex: drop sha1_to_hex()
>     @@ Metadata
>       ## Commit message ##
>          hex: drop sha1_to_hex()
>      
>     -    There's only a single caller left of sha1_to_hex(), since everybody now
>     -    uses oid_to_hex() instead. This case is in the sha1dc wrapper, where we
>     -    print a hex sha1 when we find a collision. This one will always be sha1,
>     -    regardless of the current hash algorithm, so we can't use oid_to_hex()
>     -    here. In practice we'd probably not be running sha1 at all if it isn't
>     -    the current algorithm, but it's possible we might still occasionally
>     +    There's only a single caller left of sha1_to_hex(), since everybody
>     +    that has an object name in "unsigned char[]" now uses hash_to_hex()
>     +    instead.

That makes sense. I mentioned oid_to_hex() originally because most of
the callers _did_ switch from sha1_to_hex(oid->hash) to oid_to_hex(oid).
But that happened far enough in the past that the more interesting
change is using the generic hash_to_hex().

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-11-13  1:15 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-11  9:03 [PATCH 0/2] getting rid of sha1_to_hex() Jeff King
2019-11-11  9:04 ` [PATCH 1/2] hex: drop sha1_to_hex_r() Jeff King
2019-11-11 18:30   ` Johannes Schindelin
2019-11-11  9:04 ` [PATCH 2/2] hex: drop sha1_to_hex() Jeff King
2019-11-11 14:18   ` SZEDER Gábor
2019-11-11 14:29     ` Jeff King
2019-11-12  4:13       ` Junio C Hamano
2019-11-12 10:57         ` Jeff King
2019-11-12 11:44           ` SZEDER Gábor
2019-11-12 12:12             ` Jeff King
2019-11-12 11:49           ` Junio C Hamano
2019-11-12 12:15             ` Jeff King
2019-11-13  1:09               ` Junio C Hamano
2019-11-13  1:15                 ` Jeff King
2019-11-11  9:09 ` [PATCH 0/2] getting rid of sha1_to_hex() Junio C Hamano
2019-11-11  9:21   ` Jeff King
2019-11-11 23:53     ` brian m. carlson

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).