git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 00/14] Hash function transition part 15
@ 2018-10-08 21:56 brian m. carlson
  2018-10-08 21:56 ` [PATCH 01/14] pack-bitmap-write: use GIT_MAX_RAWSZ for allocation brian m. carlson
                   ` (13 more replies)
  0 siblings, 14 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

This is the fifteenth series in the ongoing hash function transition.

This series includes several conversions to use the_hash_algo, combined
with some use of parse_oid_hex and GIT_MAX_RAWSZ.  None of the
transformations here are especially surprising.

brian m. carlson (14):
  pack-bitmap-write: use GIT_MAX_RAWSZ for allocation
  builtin/repack: replace hard-coded constant
  builtin/mktree: remove hard-coded constant
  builtin/fetch-pack: remove constants with parse_oid_hex
  pack-revindex: express constants in terms of the_hash_algo
  packfile: express constants in terms of the_hash_algo
  refs/packed-backend: express constants using the_hash_algo
  upload-pack: express constants in terms of the_hash_algo
  transport: use parse_oid_hex instead of a constant
  tag: express constant in terms of the_hash_algo
  apply: replace hard-coded constants
  apply: rename new_sha1_prefix and old_sha1_prefix
  submodule: make zero-oid comparison hash function agnostic
  rerere: convert to use the_hash_algo

 apply.c               | 50 +++++++++++++-------------
 builtin/fetch-pack.c  | 13 +++----
 builtin/mktree.c      |  2 +-
 builtin/repack.c      |  4 +--
 git-submodule.sh      |  7 +++-
 pack-bitmap-write.c   |  2 +-
 pack-revindex.c       |  9 ++---
 packfile.c            |  5 +--
 refs/packed-backend.c | 14 ++++----
 rerere.c              | 81 ++++++++++++++++++++++---------------------
 tag.c                 |  2 +-
 transport.c           |  7 ++--
 upload-pack.c         | 13 +++----
 13 files changed, 112 insertions(+), 97 deletions(-)


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 01/14] pack-bitmap-write: use GIT_MAX_RAWSZ for allocation
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 21:56 ` [PATCH 02/14] builtin/repack: replace hard-coded constant brian m. carlson
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 pack-bitmap-write.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index fc82f37a02..6f0c78d6aa 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -37,7 +37,7 @@ struct bitmap_writer {
 
 	struct progress *progress;
 	int show_progress;
-	unsigned char pack_checksum[20];
+	unsigned char pack_checksum[GIT_MAX_RAWSZ];
 };
 
 static struct bitmap_writer writer;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 02/14] builtin/repack: replace hard-coded constant
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
  2018-10-08 21:56 ` [PATCH 01/14] pack-bitmap-write: use GIT_MAX_RAWSZ for allocation brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 22:27   ` Stefan Beller
  2018-10-08 21:56 ` [PATCH 03/14] builtin/mktree: remove " brian m. carlson
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 builtin/repack.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index c6a7943d5c..e77859062d 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -407,8 +407,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 
 	out = xfdopen(cmd.out, "r");
 	while (strbuf_getline_lf(&line, out) != EOF) {
-		if (line.len != 40)
-			die("repack: Expecting 40 character sha1 lines only from pack-objects.");
+		if (line.len != the_hash_algo->hexsz)
+			die("repack: Expecting full hex object ID lines only from pack-objects.");
 		string_list_append(&names, line.buf);
 	}
 	fclose(out);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 03/14] builtin/mktree: remove hard-coded constant
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
  2018-10-08 21:56 ` [PATCH 01/14] pack-bitmap-write: use GIT_MAX_RAWSZ for allocation brian m. carlson
  2018-10-08 21:56 ` [PATCH 02/14] builtin/repack: replace hard-coded constant brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 22:32   ` Stefan Beller
  2018-10-08 21:56 ` [PATCH 04/14] builtin/fetch-pack: remove constants with parse_oid_hex brian m. carlson
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Instead of using a hard-coded constant for the size of a hex object ID,
switch to use the computed pointer from parse_oid_hex that points after
the parsed object ID.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 builtin/mktree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/mktree.c b/builtin/mktree.c
index 2dc4ad6ba8..94e82b8504 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -98,7 +98,7 @@ static void mktree_line(char *buf, size_t len, int nul_term_line, int allow_miss
 
 	*ntr++ = 0; /* now at the beginning of SHA1 */
 
-	path = ntr + 41;  /* at the beginning of name */
+	path = (char *)p + 1;  /* at the beginning of name */
 	if (!nul_term_line && path[0] == '"') {
 		struct strbuf p_uq = STRBUF_INIT;
 		if (unquote_c_style(&p_uq, path, NULL))

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 04/14] builtin/fetch-pack: remove constants with parse_oid_hex
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (2 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 03/14] builtin/mktree: remove " brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 21:56 ` [PATCH 05/14] pack-revindex: express constants in terms of the_hash_algo brian m. carlson
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Instead of using GIT_SHA1_HEXSZ, use parse_oid_hex to compute a pointer
and use that in comparisons.  This is both simpler to read and works
independent of the hash length.  Update references to SHA-1 in the same
function to refer to object IDs instead.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 builtin/fetch-pack.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 1a1bc63566..63e69a5801 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -16,13 +16,14 @@ static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
 {
 	struct ref *ref;
 	struct object_id oid;
+	const char *p;
 
-	if (!get_oid_hex(name, &oid)) {
-		if (name[GIT_SHA1_HEXSZ] == ' ') {
-			/* <sha1> <ref>, find refname */
-			name += GIT_SHA1_HEXSZ + 1;
-		} else if (name[GIT_SHA1_HEXSZ] == '\0') {
-			; /* <sha1>, leave sha1 as name */
+	if (!parse_oid_hex(name, &oid, &p)) {
+		if (*p == ' ') {
+			/* <oid> <ref>, find refname */
+			name = p + 1;
+		} else if (*p == '\0') {
+			; /* <oid>, leave oid as name */
 		} else {
 			/* <ref>, clear cruft from oid */
 			oidclr(&oid);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 05/14] pack-revindex: express constants in terms of the_hash_algo
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (3 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 04/14] builtin/fetch-pack: remove constants with parse_oid_hex brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 22:44   ` Stefan Beller
  2018-10-08 21:56 ` [PATCH 06/14] packfile: " brian m. carlson
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Express the various constants used in terms of the_hash_algo.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 pack-revindex.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/pack-revindex.c b/pack-revindex.c
index bb521cf7fb..3756ec71a8 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -122,13 +122,14 @@ static void create_pack_revindex(struct packed_git *p)
 	unsigned num_ent = p->num_objects;
 	unsigned i;
 	const char *index = p->index_data;
+	const unsigned hashsz = the_hash_algo->rawsz;
 
 	ALLOC_ARRAY(p->revindex, num_ent + 1);
 	index += 4 * 256;
 
 	if (p->index_version > 1) {
 		const uint32_t *off_32 =
-			(uint32_t *)(index + 8 + p->num_objects * (20 + 4));
+			(uint32_t *)(index + 8 + p->num_objects * (hashsz + 4));
 		const uint32_t *off_64 = off_32 + p->num_objects;
 		for (i = 0; i < num_ent; i++) {
 			uint32_t off = ntohl(*off_32++);
@@ -142,16 +143,16 @@ static void create_pack_revindex(struct packed_git *p)
 		}
 	} else {
 		for (i = 0; i < num_ent; i++) {
-			uint32_t hl = *((uint32_t *)(index + 24 * i));
+			uint32_t hl = *((uint32_t *)(index + (hashsz + 4) * i));
 			p->revindex[i].offset = ntohl(hl);
 			p->revindex[i].nr = i;
 		}
 	}
 
-	/* This knows the pack format -- the 20-byte trailer
+	/* This knows the pack format -- the hash trailer
 	 * follows immediately after the last object data.
 	 */
-	p->revindex[num_ent].offset = p->pack_size - 20;
+	p->revindex[num_ent].offset = p->pack_size - hashsz;
 	p->revindex[num_ent].nr = -1;
 	sort_revindex(p->revindex, num_ent, p->pack_size);
 }

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 06/14] packfile: express constants in terms of the_hash_algo
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (4 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 05/14] pack-revindex: express constants in terms of the_hash_algo brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 22:59   ` Stefan Beller
  2018-10-08 21:56 ` [PATCH 07/14] refs/packed-backend: express constants using the_hash_algo brian m. carlson
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Replace uses of GIT_SHA1_RAWSZ with references to the_hash_algo to avoid
dependence on a particular hash length.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 packfile.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/packfile.c b/packfile.c
index 841b36182f..17f993b5bf 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1121,13 +1121,14 @@ int unpack_object_header(struct packed_git *p,
 void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1)
 {
 	unsigned i;
+	const unsigned hashsz = the_hash_algo->rawsz;
 	for (i = 0; i < p->num_bad_objects; i++)
-		if (hasheq(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
+		if (hasheq(sha1, p->bad_object_sha1 + hashsz * i))
 			return;
 	p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
 				      st_mult(GIT_MAX_RAWSZ,
 					      st_add(p->num_bad_objects, 1)));
-	hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
+	hashcpy(p->bad_object_sha1 + hashsz * p->num_bad_objects, sha1);
 	p->num_bad_objects++;
 }
 

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 07/14] refs/packed-backend: express constants using the_hash_algo
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (5 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 06/14] packfile: " brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 21:56 ` [PATCH 08/14] upload-pack: express constants in terms of the_hash_algo brian m. carlson
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Switch uses of GIT_SHA1_HEXSZ to use the_hash_algo so that they are
appropriate for the any given hash length.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 refs/packed-backend.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 74e2996e93..c01c7f5901 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -274,8 +274,8 @@ struct snapshot_record {
 static int cmp_packed_ref_records(const void *v1, const void *v2)
 {
 	const struct snapshot_record *e1 = v1, *e2 = v2;
-	const char *r1 = e1->start + GIT_SHA1_HEXSZ + 1;
-	const char *r2 = e2->start + GIT_SHA1_HEXSZ + 1;
+	const char *r1 = e1->start + the_hash_algo->hexsz + 1;
+	const char *r2 = e2->start + the_hash_algo->hexsz + 1;
 
 	while (1) {
 		if (*r1 == '\n')
@@ -297,7 +297,7 @@ static int cmp_packed_ref_records(const void *v1, const void *v2)
  */
 static int cmp_record_to_refname(const char *rec, const char *refname)
 {
-	const char *r1 = rec + GIT_SHA1_HEXSZ + 1;
+	const char *r1 = rec + the_hash_algo->hexsz + 1;
 	const char *r2 = refname;
 
 	while (1) {
@@ -344,7 +344,7 @@ static void sort_snapshot(struct snapshot *snapshot)
 		if (!eol)
 			/* The safety check should prevent this. */
 			BUG("unterminated line found in packed-refs");
-		if (eol - pos < GIT_SHA1_HEXSZ + 2)
+		if (eol - pos < the_hash_algo->hexsz + 2)
 			die_invalid_line(snapshot->refs->path,
 					 pos, eof - pos);
 		eol++;
@@ -456,7 +456,7 @@ static void verify_buffer_safe(struct snapshot *snapshot)
 		return;
 
 	last_line = find_start_of_record(start, eof - 1);
-	if (*(eof - 1) != '\n' || eof - last_line < GIT_SHA1_HEXSZ + 2)
+	if (*(eof - 1) != '\n' || eof - last_line < the_hash_algo->hexsz + 2)
 		die_invalid_line(snapshot->refs->path,
 				 last_line, eof - last_line);
 }
@@ -796,7 +796,7 @@ static int next_record(struct packed_ref_iterator *iter)
 
 	iter->base.flags = REF_ISPACKED;
 
-	if (iter->eof - p < GIT_SHA1_HEXSZ + 2 ||
+	if (iter->eof - p < the_hash_algo->hexsz + 2 ||
 	    parse_oid_hex(p, &iter->oid, &p) ||
 	    !isspace(*p++))
 		die_invalid_line(iter->snapshot->refs->path,
@@ -826,7 +826,7 @@ static int next_record(struct packed_ref_iterator *iter)
 
 	if (iter->pos < iter->eof && *iter->pos == '^') {
 		p = iter->pos + 1;
-		if (iter->eof - p < GIT_SHA1_HEXSZ + 1 ||
+		if (iter->eof - p < the_hash_algo->hexsz + 1 ||
 		    parse_oid_hex(p, &iter->peeled, &p) ||
 		    *p++ != '\n')
 			die_invalid_line(iter->snapshot->refs->path,

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 08/14] upload-pack: express constants in terms of the_hash_algo
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (6 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 07/14] refs/packed-backend: express constants using the_hash_algo brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 21:56 ` [PATCH 09/14] transport: use parse_oid_hex instead of a constant brian m. carlson
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Convert all uses of the GIT_SHA1_HEXSZ to use the_hash_algo so that they
are appropriate for any given hash length.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 upload-pack.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 62a1000f44..1aae5dd828 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -443,6 +443,7 @@ static int do_reachable_revlist(struct child_process *cmd,
 	struct object *o;
 	char namebuf[GIT_MAX_HEXSZ + 2]; /* ^ + hash + LF */
 	int i;
+	const unsigned hexsz = the_hash_algo->hexsz;
 
 	cmd->argv = argv;
 	cmd->git_cmd = 1;
@@ -461,7 +462,7 @@ static int do_reachable_revlist(struct child_process *cmd,
 		goto error;
 
 	namebuf[0] = '^';
-	namebuf[GIT_SHA1_HEXSZ + 1] = '\n';
+	namebuf[hexsz + 1] = '\n';
 	for (i = get_max_object_index(); 0 < i; ) {
 		o = get_indexed_object(--i);
 		if (!o)
@@ -470,11 +471,11 @@ static int do_reachable_revlist(struct child_process *cmd,
 			o->flags &= ~TMP_MARK;
 		if (!is_our_ref(o))
 			continue;
-		memcpy(namebuf + 1, oid_to_hex(&o->oid), GIT_SHA1_HEXSZ);
-		if (write_in_full(cmd->in, namebuf, GIT_SHA1_HEXSZ + 2) < 0)
+		memcpy(namebuf + 1, oid_to_hex(&o->oid), hexsz);
+		if (write_in_full(cmd->in, namebuf, hexsz + 2) < 0)
 			goto error;
 	}
-	namebuf[GIT_SHA1_HEXSZ] = '\n';
+	namebuf[hexsz] = '\n';
 	for (i = 0; i < src->nr; i++) {
 		o = src->objects[i].item;
 		if (is_our_ref(o)) {
@@ -484,8 +485,8 @@ static int do_reachable_revlist(struct child_process *cmd,
 		}
 		if (reachable && o->type == OBJ_COMMIT)
 			o->flags |= TMP_MARK;
-		memcpy(namebuf, oid_to_hex(&o->oid), GIT_SHA1_HEXSZ);
-		if (write_in_full(cmd->in, namebuf, GIT_SHA1_HEXSZ + 1) < 0)
+		memcpy(namebuf, oid_to_hex(&o->oid), hexsz);
+		if (write_in_full(cmd->in, namebuf, hexsz + 1) < 0)
 			goto error;
 	}
 	close(cmd->in);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 09/14] transport: use parse_oid_hex instead of a constant
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (7 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 08/14] upload-pack: express constants in terms of the_hash_algo brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 21:56 ` [PATCH 10/14] tag: express constant in terms of the_hash_algo brian m. carlson
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Use parse_oid_hex to compute a pointer instead of using GIT_SHA1_HEXSZ.
This simplifies the code and makes it independent of the hash length.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 transport.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/transport.c b/transport.c
index 1c76d64aba..44b9ddf670 100644
--- a/transport.c
+++ b/transport.c
@@ -1346,15 +1346,16 @@ static void read_alternate_refs(const char *path,
 	fh = xfdopen(cmd.out, "r");
 	while (strbuf_getline_lf(&line, fh) != EOF) {
 		struct object_id oid;
+		const char *p;
 
-		if (get_oid_hex(line.buf, &oid) ||
-		    line.buf[GIT_SHA1_HEXSZ] != ' ') {
+		if (parse_oid_hex(line.buf, &oid, &p) ||
+		    *p != ' ') {
 			warning(_("invalid line while parsing alternate refs: %s"),
 				line.buf);
 			break;
 		}
 
-		cb(line.buf + GIT_SHA1_HEXSZ + 1, &oid, data);
+		cb(p + 1, &oid, data);
 	}
 
 	fclose(fh);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 10/14] tag: express constant in terms of the_hash_algo
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (8 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 09/14] transport: use parse_oid_hex instead of a constant brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 21:56 ` [PATCH 11/14] apply: replace hard-coded constants brian m. carlson
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 tag.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tag.c b/tag.c
index 1db663d716..7445b8f6ea 100644
--- a/tag.c
+++ b/tag.c
@@ -144,7 +144,7 @@ int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, u
 		return 0;
 	item->object.parsed = 1;
 
-	if (size < GIT_SHA1_HEXSZ + 24)
+	if (size < the_hash_algo->hexsz + 24)
 		return -1;
 	if (memcmp("object ", bufptr, 7) || parse_oid_hex(bufptr + 7, &oid, &bufptr) || *bufptr++ != '\n')
 		return -1;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 11/14] apply: replace hard-coded constants
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (9 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 10/14] tag: express constant in terms of the_hash_algo brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 21:56 ` [PATCH 12/14] apply: rename new_sha1_prefix and old_sha1_prefix brian m. carlson
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Replace several 40-based constants with references to GIT_MAX_HEXSZ or
the_hash_algo, as appropriate.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 apply.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/apply.c b/apply.c
index e485fbc6bc..792ecea36a 100644
--- a/apply.c
+++ b/apply.c
@@ -223,8 +223,8 @@ struct patch {
 	struct fragment *fragments;
 	char *result;
 	size_t resultsize;
-	char old_sha1_prefix[41];
-	char new_sha1_prefix[41];
+	char old_sha1_prefix[GIT_MAX_HEXSZ + 1];
+	char new_sha1_prefix[GIT_MAX_HEXSZ + 1];
 	struct patch *next;
 
 	/* three-way fallback result */
@@ -1093,9 +1093,10 @@ static int gitdiff_index(struct apply_state *state,
 	 */
 	const char *ptr, *eol;
 	int len;
+	const unsigned hexsz = the_hash_algo->hexsz;
 
 	ptr = strchr(line, '.');
-	if (!ptr || ptr[1] != '.' || 40 < ptr - line)
+	if (!ptr || ptr[1] != '.' || hexsz < ptr - line)
 		return 0;
 	len = ptr - line;
 	memcpy(patch->old_sha1_prefix, line, len);
@@ -1109,7 +1110,7 @@ static int gitdiff_index(struct apply_state *state,
 		ptr = eol;
 	len = ptr - line;
 
-	if (40 < len)
+	if (hexsz < len)
 		return 0;
 	memcpy(patch->new_sha1_prefix, line, len);
 	patch->new_sha1_prefix[len] = 0;
@@ -3142,13 +3143,14 @@ static int apply_binary(struct apply_state *state,
 {
 	const char *name = patch->old_name ? patch->old_name : patch->new_name;
 	struct object_id oid;
+	const unsigned hexsz = the_hash_algo->hexsz;
 
 	/*
 	 * For safety, we require patch index line to contain
-	 * full 40-byte textual SHA1 for old and new, at least for now.
+	 * full hex textual object ID for old and new, at least for now.
 	 */
-	if (strlen(patch->old_sha1_prefix) != 40 ||
-	    strlen(patch->new_sha1_prefix) != 40 ||
+	if (strlen(patch->old_sha1_prefix) != hexsz ||
+	    strlen(patch->new_sha1_prefix) != hexsz ||
 	    get_oid_hex(patch->old_sha1_prefix, &oid) ||
 	    get_oid_hex(patch->new_sha1_prefix, &oid))
 		return error(_("cannot apply binary patch to '%s' "
@@ -4055,7 +4057,7 @@ static int preimage_oid_in_gitlink_patch(struct patch *p, struct object_id *oid)
 	    starts_with(++preimage, heading) &&
 	    /* does it record full SHA-1? */
 	    !get_oid_hex(preimage + sizeof(heading) - 1, oid) &&
-	    preimage[sizeof(heading) + GIT_SHA1_HEXSZ - 1] == '\n' &&
+	    preimage[sizeof(heading) + the_hash_algo->hexsz - 1] == '\n' &&
 	    /* does the abbreviated name on the index line agree with it? */
 	    starts_with(preimage + sizeof(heading) - 1, p->old_sha1_prefix))
 		return 0; /* it all looks fine */

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 12/14] apply: rename new_sha1_prefix and old_sha1_prefix
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (10 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 11/14] apply: replace hard-coded constants brian m. carlson
@ 2018-10-08 21:56 ` brian m. carlson
  2018-10-08 21:57 ` [PATCH 13/14] submodule: make zero-oid comparison hash function agnostic brian m. carlson
  2018-10-08 21:57 ` [PATCH 14/14] rerere: convert to use the_hash_algo brian m. carlson
  13 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:56 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Rename these structure members to "new_oid_prefix" and "old_oid_prefix".

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 apply.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/apply.c b/apply.c
index 792ecea36a..b9eb02ec12 100644
--- a/apply.c
+++ b/apply.c
@@ -223,8 +223,8 @@ struct patch {
 	struct fragment *fragments;
 	char *result;
 	size_t resultsize;
-	char old_sha1_prefix[GIT_MAX_HEXSZ + 1];
-	char new_sha1_prefix[GIT_MAX_HEXSZ + 1];
+	char old_oid_prefix[GIT_MAX_HEXSZ + 1];
+	char new_oid_prefix[GIT_MAX_HEXSZ + 1];
 	struct patch *next;
 
 	/* three-way fallback result */
@@ -1099,8 +1099,8 @@ static int gitdiff_index(struct apply_state *state,
 	if (!ptr || ptr[1] != '.' || hexsz < ptr - line)
 		return 0;
 	len = ptr - line;
-	memcpy(patch->old_sha1_prefix, line, len);
-	patch->old_sha1_prefix[len] = 0;
+	memcpy(patch->old_oid_prefix, line, len);
+	patch->old_oid_prefix[len] = 0;
 
 	line = ptr + 2;
 	ptr = strchr(line, ' ');
@@ -1112,8 +1112,8 @@ static int gitdiff_index(struct apply_state *state,
 
 	if (hexsz < len)
 		return 0;
-	memcpy(patch->new_sha1_prefix, line, len);
-	patch->new_sha1_prefix[len] = 0;
+	memcpy(patch->new_oid_prefix, line, len);
+	patch->new_oid_prefix[len] = 0;
 	if (*ptr == ' ')
 		return gitdiff_oldmode(state, ptr + 1, patch);
 	return 0;
@@ -2205,7 +2205,7 @@ static void reverse_patches(struct patch *p)
 		SWAP(p->new_mode, p->old_mode);
 		SWAP(p->is_new, p->is_delete);
 		SWAP(p->lines_added, p->lines_deleted);
-		SWAP(p->old_sha1_prefix, p->new_sha1_prefix);
+		SWAP(p->old_oid_prefix, p->new_oid_prefix);
 
 		for (; frag; frag = frag->next) {
 			SWAP(frag->newpos, frag->oldpos);
@@ -3149,10 +3149,10 @@ static int apply_binary(struct apply_state *state,
 	 * For safety, we require patch index line to contain
 	 * full hex textual object ID for old and new, at least for now.
 	 */
-	if (strlen(patch->old_sha1_prefix) != hexsz ||
-	    strlen(patch->new_sha1_prefix) != hexsz ||
-	    get_oid_hex(patch->old_sha1_prefix, &oid) ||
-	    get_oid_hex(patch->new_sha1_prefix, &oid))
+	if (strlen(patch->old_oid_prefix) != hexsz ||
+	    strlen(patch->new_oid_prefix) != hexsz ||
+	    get_oid_hex(patch->old_oid_prefix, &oid) ||
+	    get_oid_hex(patch->new_oid_prefix, &oid))
 		return error(_("cannot apply binary patch to '%s' "
 			       "without full index line"), name);
 
@@ -3162,7 +3162,7 @@ static int apply_binary(struct apply_state *state,
 		 * applies to.
 		 */
 		hash_object_file(img->buf, img->len, blob_type, &oid);
-		if (strcmp(oid_to_hex(&oid), patch->old_sha1_prefix))
+		if (strcmp(oid_to_hex(&oid), patch->old_oid_prefix))
 			return error(_("the patch applies to '%s' (%s), "
 				       "which does not match the "
 				       "current contents."),
@@ -3175,7 +3175,7 @@ static int apply_binary(struct apply_state *state,
 				       "'%s' but it is not empty"), name);
 	}
 
-	get_oid_hex(patch->new_sha1_prefix, &oid);
+	get_oid_hex(patch->new_oid_prefix, &oid);
 	if (is_null_oid(&oid)) {
 		clear_image(img);
 		return 0; /* deletion patch */
@@ -3191,7 +3191,7 @@ static int apply_binary(struct apply_state *state,
 		if (!result)
 			return error(_("the necessary postimage %s for "
 				       "'%s' cannot be read"),
-				     patch->new_sha1_prefix, name);
+				     patch->new_oid_prefix, name);
 		clear_image(img);
 		img->buf = result;
 		img->len = size;
@@ -3207,9 +3207,9 @@ static int apply_binary(struct apply_state *state,
 
 		/* verify that the result matches */
 		hash_object_file(img->buf, img->len, blob_type, &oid);
-		if (strcmp(oid_to_hex(&oid), patch->new_sha1_prefix))
+		if (strcmp(oid_to_hex(&oid), patch->new_oid_prefix))
 			return error(_("binary patch to '%s' creates incorrect result (expecting %s, got %s)"),
-				name, patch->new_sha1_prefix, oid_to_hex(&oid));
+				name, patch->new_oid_prefix, oid_to_hex(&oid));
 	}
 
 	return 0;
@@ -3565,7 +3565,7 @@ static int try_threeway(struct apply_state *state,
 	/* Preimage the patch was prepared for */
 	if (patch->is_new)
 		write_object_file("", 0, blob_type, &pre_oid);
-	else if (get_oid(patch->old_sha1_prefix, &pre_oid) ||
+	else if (get_oid(patch->old_oid_prefix, &pre_oid) ||
 		 read_blob_object(&buf, &pre_oid, patch->old_mode))
 		return error(_("repository lacks the necessary blob to fall back on 3-way merge."));
 
@@ -4059,11 +4059,11 @@ static int preimage_oid_in_gitlink_patch(struct patch *p, struct object_id *oid)
 	    !get_oid_hex(preimage + sizeof(heading) - 1, oid) &&
 	    preimage[sizeof(heading) + the_hash_algo->hexsz - 1] == '\n' &&
 	    /* does the abbreviated name on the index line agree with it? */
-	    starts_with(preimage + sizeof(heading) - 1, p->old_sha1_prefix))
+	    starts_with(preimage + sizeof(heading) - 1, p->old_oid_prefix))
 		return 0; /* it all looks fine */
 
 	/* we may have full object name on the index line */
-	return get_oid_hex(p->old_sha1_prefix, oid);
+	return get_oid_hex(p->old_oid_prefix, oid);
 }
 
 /* Build an index that contains just the files needed for a 3way merge */
@@ -4092,7 +4092,7 @@ static int build_fake_ancestor(struct apply_state *state, struct patch *list)
 			else
 				return error(_("sha1 information is lacking or "
 					       "useless for submodule %s"), name);
-		} else if (!get_oid_blob(patch->old_sha1_prefix, &oid)) {
+		} else if (!get_oid_blob(patch->old_oid_prefix, &oid)) {
 			; /* ok */
 		} else if (!patch->lines_added && !patch->lines_deleted) {
 			/* mode-only change: update the current */

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 13/14] submodule: make zero-oid comparison hash function agnostic
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (11 preceding siblings ...)
  2018-10-08 21:56 ` [PATCH 12/14] apply: rename new_sha1_prefix and old_sha1_prefix brian m. carlson
@ 2018-10-08 21:57 ` brian m. carlson
  2018-10-08 23:10   ` Stefan Beller
  2018-10-08 21:57 ` [PATCH 14/14] rerere: convert to use the_hash_algo brian m. carlson
  13 siblings, 1 reply; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

With SHA-256, the length of the all-zeros object ID is longer.  Add a
function to git-submodule.sh to check if a full hex object ID is the
all-zeros value, and use it to check the output we're parsing from git
diff-files or diff-index.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 git-submodule.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 1b568e29b9..c09eb3e03d 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -82,6 +82,11 @@ isnumber()
 	n=$(($1 + 0)) 2>/dev/null && test "$n" = "$1"
 }
 
+# Given a full hex object ID, is this the zero OID?
+is_zero_oid () {
+	echo "$1" | sane_egrep '^0+$' >/dev/null 2>&1
+}
+
 # Sanitize the local git environment for use within a submodule. We
 # can't simply use clear_local_git_env since we want to preserve some
 # of the settings from GIT_CONFIG_PARAMETERS.
@@ -780,7 +785,7 @@ cmd_summary() {
 	while read -r mod_src mod_dst sha1_src sha1_dst status name
 	do
 		if test -z "$cached" &&
-			test $sha1_dst = 0000000000000000000000000000000000000000
+			is_zero_oid $sha1_dst
 		then
 			case "$mod_dst" in
 			160000)

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 14/14] rerere: convert to use the_hash_algo
  2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
                   ` (12 preceding siblings ...)
  2018-10-08 21:57 ` [PATCH 13/14] submodule: make zero-oid comparison hash function agnostic brian m. carlson
@ 2018-10-08 21:57 ` brian m. carlson
  2018-10-08 23:18   ` Stefan Beller
  2018-10-12 13:11   ` [PATCH] object_id.cocci: match only expressions of type 'struct object_id' SZEDER Gábor
  13 siblings, 2 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-08 21:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Eric Sunshine, Nguyễn Thái Ngọc Duy

Since this data is stored in the .git directory, it makes sense for us
to use the same hash algorithm for it as for everything else.  Convert
the remaining uses of SHA-1 to use the_hash_algo.  Use GIT_MAX_RAWSZ for
allocations.  Rename various struct members, local variables, and a
function to be named "hash" instead of "sha1".

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 rerere.c | 81 +++++++++++++++++++++++++++++---------------------------
 1 file changed, 42 insertions(+), 39 deletions(-)

diff --git a/rerere.c b/rerere.c
index 7aa149e849..ceb98015ff 100644
--- a/rerere.c
+++ b/rerere.c
@@ -29,7 +29,7 @@ static int rerere_dir_alloc;
 #define RR_HAS_POSTIMAGE 1
 #define RR_HAS_PREIMAGE 2
 static struct rerere_dir {
-	unsigned char sha1[20];
+	unsigned char hash[GIT_MAX_HEXSZ];
 	int status_alloc, status_nr;
 	unsigned char *status;
 } **rerere_dir;
@@ -52,7 +52,7 @@ static void free_rerere_id(struct string_list_item *item)
 
 static const char *rerere_id_hex(const struct rerere_id *id)
 {
-	return sha1_to_hex(id->collection->sha1);
+	return sha1_to_hex(id->collection->hash);
 }
 
 static void fit_variant(struct rerere_dir *rr_dir, int variant)
@@ -115,7 +115,7 @@ static int is_rr_file(const char *name, const char *filename, int *variant)
 static void scan_rerere_dir(struct rerere_dir *rr_dir)
 {
 	struct dirent *de;
-	DIR *dir = opendir(git_path("rr-cache/%s", sha1_to_hex(rr_dir->sha1)));
+	DIR *dir = opendir(git_path("rr-cache/%s", sha1_to_hex(rr_dir->hash)));
 
 	if (!dir)
 		return;
@@ -133,24 +133,24 @@ static void scan_rerere_dir(struct rerere_dir *rr_dir)
 	closedir(dir);
 }
 
-static const unsigned char *rerere_dir_sha1(size_t i, void *table)
+static const unsigned char *rerere_dir_hash(size_t i, void *table)
 {
 	struct rerere_dir **rr_dir = table;
-	return rr_dir[i]->sha1;
+	return rr_dir[i]->hash;
 }
 
 static struct rerere_dir *find_rerere_dir(const char *hex)
 {
-	unsigned char sha1[20];
+	unsigned char hash[GIT_MAX_RAWSZ];
 	struct rerere_dir *rr_dir;
 	int pos;
 
-	if (get_sha1_hex(hex, sha1))
+	if (get_sha1_hex(hex, hash))
 		return NULL; /* BUG */
-	pos = sha1_pos(sha1, rerere_dir, rerere_dir_nr, rerere_dir_sha1);
+	pos = sha1_pos(hash, rerere_dir, rerere_dir_nr, rerere_dir_hash);
 	if (pos < 0) {
 		rr_dir = xmalloc(sizeof(*rr_dir));
-		hashcpy(rr_dir->sha1, sha1);
+		hashcpy(rr_dir->hash, hash);
 		rr_dir->status = NULL;
 		rr_dir->status_nr = 0;
 		rr_dir->status_alloc = 0;
@@ -207,26 +207,27 @@ static void read_rr(struct string_list *rr)
 		return;
 	while (!strbuf_getwholeline(&buf, in, '\0')) {
 		char *path;
-		unsigned char sha1[20];
+		unsigned char hash[GIT_MAX_RAWSZ];
 		struct rerere_id *id;
 		int variant;
+		const unsigned hexsz = the_hash_algo->hexsz;
 
 		/* There has to be the hash, tab, path and then NUL */
-		if (buf.len < 42 || get_sha1_hex(buf.buf, sha1))
+		if (buf.len < hexsz + 2 || get_sha1_hex(buf.buf, hash))
 			die(_("corrupt MERGE_RR"));
 
-		if (buf.buf[40] != '.') {
+		if (buf.buf[hexsz] != '.') {
 			variant = 0;
-			path = buf.buf + 40;
+			path = buf.buf + hexsz;
 		} else {
 			errno = 0;
-			variant = strtol(buf.buf + 41, &path, 10);
+			variant = strtol(buf.buf + hexsz + 1, &path, 10);
 			if (errno)
 				die(_("corrupt MERGE_RR"));
 		}
 		if (*(path++) != '\t')
 			die(_("corrupt MERGE_RR"));
-		buf.buf[40] = '\0';
+		buf.buf[hexsz] = '\0';
 		id = new_rerere_id_hex(buf.buf);
 		id->variant = variant;
 		string_list_insert(rr, path)->util = id;
@@ -360,7 +361,7 @@ static void rerere_strbuf_putconflict(struct strbuf *buf, int ch, size_t size)
 }
 
 static int handle_conflict(struct strbuf *out, struct rerere_io *io,
-			   int marker_size, git_SHA_CTX *ctx)
+			   int marker_size, git_hash_ctx *ctx)
 {
 	enum {
 		RR_SIDE_1 = 0, RR_SIDE_2, RR_ORIGINAL
@@ -398,10 +399,12 @@ static int handle_conflict(struct strbuf *out, struct rerere_io *io,
 			strbuf_addbuf(out, &two);
 			rerere_strbuf_putconflict(out, '>', marker_size);
 			if (ctx) {
-				git_SHA1_Update(ctx, one.buf ? one.buf : "",
-					    one.len + 1);
-				git_SHA1_Update(ctx, two.buf ? two.buf : "",
-					    two.len + 1);
+				the_hash_algo->update_fn(ctx, one.buf ?
+							 one.buf : "",
+							 one.len + 1);
+				the_hash_algo->update_fn(ctx, two.buf ?
+							 two.buf : "",
+							 two.len + 1);
 			}
 			break;
 		} else if (hunk == RR_SIDE_1)
@@ -430,18 +433,18 @@ static int handle_conflict(struct strbuf *out, struct rerere_io *io,
  * Return 1 if conflict hunks are found, 0 if there are no conflict
  * hunks and -1 if an error occured.
  */
-static int handle_path(unsigned char *sha1, struct rerere_io *io, int marker_size)
+static int handle_path(unsigned char *hash, struct rerere_io *io, int marker_size)
 {
-	git_SHA_CTX ctx;
+	git_hash_ctx ctx;
 	struct strbuf buf = STRBUF_INIT, out = STRBUF_INIT;
 	int has_conflicts = 0;
-	if (sha1)
-		git_SHA1_Init(&ctx);
+	if (hash)
+		the_hash_algo->init_fn(&ctx);
 
 	while (!io->getline(&buf, io)) {
 		if (is_cmarker(buf.buf, '<', marker_size)) {
 			has_conflicts = handle_conflict(&out, io, marker_size,
-							sha1 ? &ctx : NULL);
+							hash ? &ctx : NULL);
 			if (has_conflicts < 0)
 				break;
 			rerere_io_putmem(out.buf, out.len, io);
@@ -452,8 +455,8 @@ static int handle_path(unsigned char *sha1, struct rerere_io *io, int marker_siz
 	strbuf_release(&buf);
 	strbuf_release(&out);
 
-	if (sha1)
-		git_SHA1_Final(sha1, &ctx);
+	if (hash)
+		the_hash_algo->final_fn(hash, &ctx);
 
 	return has_conflicts;
 }
@@ -462,7 +465,7 @@ static int handle_path(unsigned char *sha1, struct rerere_io *io, int marker_siz
  * Scan the path for conflicts, do the "handle_path()" thing above, and
  * return the number of conflict hunks found.
  */
-static int handle_file(const char *path, unsigned char *sha1, const char *output)
+static int handle_file(const char *path, unsigned char *hash, const char *output)
 {
 	int has_conflicts = 0;
 	struct rerere_io_file io;
@@ -484,7 +487,7 @@ static int handle_file(const char *path, unsigned char *sha1, const char *output
 		}
 	}
 
-	has_conflicts = handle_path(sha1, (struct rerere_io *)&io, marker_size);
+	has_conflicts = handle_path(hash, (struct rerere_io *)&io, marker_size);
 
 	fclose(io.input);
 	if (io.io.wrerror)
@@ -814,7 +817,7 @@ static int do_plain_rerere(struct string_list *rr, int fd)
 	 */
 	for (i = 0; i < conflict.nr; i++) {
 		struct rerere_id *id;
-		unsigned char sha1[20];
+		unsigned char hash[GIT_MAX_RAWSZ];
 		const char *path = conflict.items[i].string;
 		int ret;
 
@@ -823,7 +826,7 @@ static int do_plain_rerere(struct string_list *rr, int fd)
 		 * conflict ID.  No need to write anything out
 		 * yet.
 		 */
-		ret = handle_file(path, sha1, NULL);
+		ret = handle_file(path, hash, NULL);
 		if (ret != 0 && string_list_has_string(rr, path)) {
 			remove_variant(string_list_lookup(rr, path)->util);
 			string_list_remove(rr, path, 1);
@@ -831,7 +834,7 @@ static int do_plain_rerere(struct string_list *rr, int fd)
 		if (ret < 1)
 			continue;
 
-		id = new_rerere_id(sha1);
+		id = new_rerere_id(hash);
 		string_list_insert(rr, path)->util = id;
 
 		/* Ensure that the directory exists. */
@@ -942,7 +945,7 @@ static int rerere_mem_getline(struct strbuf *sb, struct rerere_io *io_)
 	return 0;
 }
 
-static int handle_cache(const char *path, unsigned char *sha1, const char *output)
+static int handle_cache(const char *path, unsigned char *hash, const char *output)
 {
 	mmfile_t mmfile[3] = {{NULL}};
 	mmbuffer_t result = {NULL, 0};
@@ -1001,7 +1004,7 @@ static int handle_cache(const char *path, unsigned char *sha1, const char *outpu
 	 * Grab the conflict ID and optionally write the original
 	 * contents with conflict markers out.
 	 */
-	has_conflicts = handle_path(sha1, (struct rerere_io *)&io, marker_size);
+	has_conflicts = handle_path(hash, (struct rerere_io *)&io, marker_size);
 	strbuf_release(&io.input);
 	if (io.io.output)
 		fclose(io.io.output);
@@ -1012,7 +1015,7 @@ static int rerere_forget_one_path(const char *path, struct string_list *rr)
 {
 	const char *filename;
 	struct rerere_id *id;
-	unsigned char sha1[20];
+	unsigned char hash[GIT_MAX_RAWSZ];
 	int ret;
 	struct string_list_item *item;
 
@@ -1020,12 +1023,12 @@ static int rerere_forget_one_path(const char *path, struct string_list *rr)
 	 * Recreate the original conflict from the stages in the
 	 * index and compute the conflict ID
 	 */
-	ret = handle_cache(path, sha1, NULL);
+	ret = handle_cache(path, hash, NULL);
 	if (ret < 1)
 		return error(_("could not parse conflict hunks in '%s'"), path);
 
 	/* Nuke the recorded resolution for the conflict */
-	id = new_rerere_id(sha1);
+	id = new_rerere_id(hash);
 
 	for (id->variant = 0;
 	     id->variant < id->collection->status_nr;
@@ -1037,7 +1040,7 @@ static int rerere_forget_one_path(const char *path, struct string_list *rr)
 		if (!has_rerere_resolution(id))
 			continue;
 
-		handle_cache(path, sha1, rerere_path(id, "thisimage"));
+		handle_cache(path, hash, rerere_path(id, "thisimage"));
 		if (read_mmfile(&cur, rerere_path(id, "thisimage"))) {
 			free(cur.ptr);
 			error(_("failed to update conflicted state in '%s'"), path);
@@ -1069,7 +1072,7 @@ static int rerere_forget_one_path(const char *path, struct string_list *rr)
 	 * conflict in the working tree, run us again to record
 	 * the postimage.
 	 */
-	handle_cache(path, sha1, rerere_path(id, "preimage"));
+	handle_cache(path, hash, rerere_path(id, "preimage"));
 	fprintf_ln(stderr, _("Updated preimage for '%s'"), path);
 
 	/*

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 02/14] builtin/repack: replace hard-coded constant
  2018-10-08 21:56 ` [PATCH 02/14] builtin/repack: replace hard-coded constant brian m. carlson
@ 2018-10-08 22:27   ` Stefan Beller
  2018-10-08 23:01     ` Eric Sunshine
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Beller @ 2018-10-08 22:27 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Jeff King, Eric Sunshine, Duy Nguyen

On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  builtin/repack.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index c6a7943d5c..e77859062d 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -407,8 +407,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>
>         out = xfdopen(cmd.out, "r");
>         while (strbuf_getline_lf(&line, out) != EOF) {
> -               if (line.len != 40)
> -                       die("repack: Expecting 40 character sha1 lines only from pack-objects.");
> +               if (line.len != the_hash_algo->hexsz)
> +                       die("repack: Expecting full hex object ID lines only from pack-objects.");

This is untranslated as it is plumbing?
If so, maybe

    if (is_sha1(the_hash_algo)
        die("repack: Expecting 40 character sh...
    else
        die(repack: Expecting full hex object ID in %s, of length %d",
            the_hash_algo->name,
            the_hash_algo->hexsz);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 03/14] builtin/mktree: remove hard-coded constant
  2018-10-08 21:56 ` [PATCH 03/14] builtin/mktree: remove " brian m. carlson
@ 2018-10-08 22:32   ` Stefan Beller
  0 siblings, 0 replies; 30+ messages in thread
From: Stefan Beller @ 2018-10-08 22:32 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Jeff King, Eric Sunshine, Duy Nguyen

On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> Instead of using a hard-coded constant for the size of a hex object ID,
> switch to use the computed pointer from parse_oid_hex that points after
> the parsed object ID.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  builtin/mktree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/mktree.c b/builtin/mktree.c
> index 2dc4ad6ba8..94e82b8504 100644
> --- a/builtin/mktree.c
> +++ b/builtin/mktree.c
> @@ -98,7 +98,7 @@ static void mktree_line(char *buf, size_t len, int nul_term_line, int allow_miss
>
>         *ntr++ = 0; /* now at the beginning of SHA1 */
>
> -       path = ntr + 41;  /* at the beginning of name */
> +       path = (char *)p + 1;  /* at the beginning of name */

... and we need the cast to un-const p such that path takes it.
Makes sense.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 05/14] pack-revindex: express constants in terms of the_hash_algo
  2018-10-08 21:56 ` [PATCH 05/14] pack-revindex: express constants in terms of the_hash_algo brian m. carlson
@ 2018-10-08 22:44   ` Stefan Beller
  2018-10-09 22:26     ` brian m. carlson
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Beller @ 2018-10-08 22:44 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Jeff King, Eric Sunshine, Duy Nguyen

> -       /* This knows the pack format -- the 20-byte trailer
> +       /* This knows the pack format -- the hash trailer
>          * follows immediately after the last object data.

While at it, fix the comment style?

With or without the nit addressed, this patch (and patches
1 and 4) are

  Reviewed-by: Stefan Beller <sbeller@google.com>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 06/14] packfile: express constants in terms of the_hash_algo
  2018-10-08 21:56 ` [PATCH 06/14] packfile: " brian m. carlson
@ 2018-10-08 22:59   ` Stefan Beller
  2018-10-09 22:25     ` brian m. carlson
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Beller @ 2018-10-08 22:59 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Jeff King, Eric Sunshine, Duy Nguyen

On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> Replace uses of GIT_SHA1_RAWSZ with references to the_hash_algo to avoid
> dependence on a particular hash length.

Unlike the previous patches, this is dealing directly with packfiles,
which (I would think) carry their own hash function selector?
(i.e. packfiles up to version 4 are sha1 hardcoded and version
5 and onwards will have a hash type field. Usually that hash type would
match what is in the_repository, but you could obtain packfiles
out of band, or the translation table that we plan to have might
be part of the packfile/idx file?)


>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  packfile.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/packfile.c b/packfile.c
> index 841b36182f..17f993b5bf 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1121,13 +1121,14 @@ int unpack_object_header(struct packed_git *p,
>  void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1)
>  {
>         unsigned i;
> +       const unsigned hashsz = the_hash_algo->rawsz;
>         for (i = 0; i < p->num_bad_objects; i++)
> -               if (hasheq(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
> +               if (hasheq(sha1, p->bad_object_sha1 + hashsz * i))
>                         return;
>         p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
>                                       st_mult(GIT_MAX_RAWSZ,
>                                               st_add(p->num_bad_objects, 1)));
> -       hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
> +       hashcpy(p->bad_object_sha1 + hashsz * p->num_bad_objects, sha1);
>         p->num_bad_objects++;
>  }
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 02/14] builtin/repack: replace hard-coded constant
  2018-10-08 22:27   ` Stefan Beller
@ 2018-10-08 23:01     ` Eric Sunshine
  2018-10-09 23:00       ` brian m. carlson
  0 siblings, 1 reply; 30+ messages in thread
From: Eric Sunshine @ 2018-10-08 23:01 UTC (permalink / raw)
  To: Stefan Beller
  Cc: brian m. carlson, Git List, Jeff King,
	Nguyễn Thái Ngọc Duy

On Mon, Oct 8, 2018 at 6:27 PM Stefan Beller <sbeller@google.com> wrote:
> On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> > -               if (line.len != 40)
> > -                       die("repack: Expecting 40 character sha1 lines only from pack-objects.");
> > +               if (line.len != the_hash_algo->hexsz)
> > +                       die("repack: Expecting full hex object ID lines only from pack-objects.");
>
> This is untranslated as it is plumbing? If so, maybe
>
>     if (is_sha1(the_hash_algo)
>         die("repack: Expecting 40 character sh...
>     else
>         die(repack: Expecting full hex object ID in %s, of length %d",
>             the_hash_algo->name,
>             the_hash_algo->hexsz);

Special-casing for SHA-1 seems overkill for an error message. A script
expecting this particular error condition and this particular error
message would be fragile indeed.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 13/14] submodule: make zero-oid comparison hash function agnostic
  2018-10-08 21:57 ` [PATCH 13/14] submodule: make zero-oid comparison hash function agnostic brian m. carlson
@ 2018-10-08 23:10   ` Stefan Beller
  0 siblings, 0 replies; 30+ messages in thread
From: Stefan Beller @ 2018-10-08 23:10 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Jeff King, Eric Sunshine, Duy Nguyen

On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> With SHA-256, the length of the all-zeros object ID is longer.  Add a
> function to git-submodule.sh to check if a full hex object ID is the
> all-zeros value, and use it to check the output we're parsing from git
> diff-files or diff-index.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---

Nice!

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 14/14] rerere: convert to use the_hash_algo
  2018-10-08 21:57 ` [PATCH 14/14] rerere: convert to use the_hash_algo brian m. carlson
@ 2018-10-08 23:18   ` Stefan Beller
  2018-10-12 13:11   ` [PATCH] object_id.cocci: match only expressions of type 'struct object_id' SZEDER Gábor
  1 sibling, 0 replies; 30+ messages in thread
From: Stefan Beller @ 2018-10-08 23:18 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Jeff King, Eric Sunshine, Duy Nguyen

On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> Since this data is stored in the .git directory, it makes sense for us
> to use the same hash algorithm for it as for everything else.  Convert
> the remaining uses of SHA-1 to use the_hash_algo.  Use GIT_MAX_RAWSZ for
> allocations.  Rename various struct members, local variables, and a
> function to be named "hash" instead of "sha1".
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  rerere.c | 81 +++++++++++++++++++++++++++++---------------------------
>  1 file changed, 42 insertions(+), 39 deletions(-)

I have reviewed all patches and had minor questions on some of them,
all others look good to me.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 06/14] packfile: express constants in terms of the_hash_algo
  2018-10-08 22:59   ` Stefan Beller
@ 2018-10-09 22:25     ` brian m. carlson
  2018-10-09 22:34       ` Stefan Beller
  0 siblings, 1 reply; 30+ messages in thread
From: brian m. carlson @ 2018-10-09 22:25 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jeff King, Eric Sunshine, Duy Nguyen

[-- Attachment #1: Type: text/plain, Size: 1573 bytes --]

On Mon, Oct 08, 2018 at 03:59:36PM -0700, Stefan Beller wrote:
> On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> >
> > Replace uses of GIT_SHA1_RAWSZ with references to the_hash_algo to avoid
> > dependence on a particular hash length.
> 
> Unlike the previous patches, this is dealing directly with packfiles,
> which (I would think) carry their own hash function selector?
> (i.e. packfiles up to version 4 are sha1 hardcoded and version
> 5 and onwards will have a hash type field. Usually that hash type would
> match what is in the_repository, but you could obtain packfiles
> out of band, or the translation table that we plan to have might
> be part of the packfile/idx file?)

Yeah, the transition plan doesn't specify a format for pack files, but
we may end up needing one.  We definitely have a specified format for
index files already, and that's where the translation table will be.
Anything other than the pack index and the loose object index in the
.git directory will have the same algorithm as the rest of the
repository, so technically we could use any pack format as long as it
lives in the .git directory.

This code is mostly here on an interim basis to let us compile with a
fully SHA-256 (no SHA-1) Git.  Once that piece is done, we can move on
to a stage 4 Git, which can do either only SHA-256, or only SHA-1, where
we'll learn about various pack file formats and detecting the algorithm
from them.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 05/14] pack-revindex: express constants in terms of the_hash_algo
  2018-10-08 22:44   ` Stefan Beller
@ 2018-10-09 22:26     ` brian m. carlson
  0 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-09 22:26 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jeff King, Eric Sunshine, Duy Nguyen

[-- Attachment #1: Type: text/plain, Size: 403 bytes --]

On Mon, Oct 08, 2018 at 03:44:36PM -0700, Stefan Beller wrote:
> > -       /* This knows the pack format -- the 20-byte trailer
> > +       /* This knows the pack format -- the hash trailer
> >          * follows immediately after the last object data.
> 
> While at it, fix the comment style?

Sure, I can do that.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 06/14] packfile: express constants in terms of the_hash_algo
  2018-10-09 22:25     ` brian m. carlson
@ 2018-10-09 22:34       ` Stefan Beller
  2018-10-09 22:54         ` brian m. carlson
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Beller @ 2018-10-09 22:34 UTC (permalink / raw)
  To: brian m. carlson, git, Jeff King, Eric Sunshine, Duy Nguyen

On Tue, Oct 9, 2018 at 3:25 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On Mon, Oct 08, 2018 at 03:59:36PM -0700, Stefan Beller wrote:
> > On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson
> > <sandals@crustytoothpaste.net> wrote:
> > >
> > > Replace uses of GIT_SHA1_RAWSZ with references to the_hash_algo to avoid
> > > dependence on a particular hash length.
> >
> > Unlike the previous patches, this is dealing directly with packfiles,
> > which (I would think) carry their own hash function selector?
> > (i.e. packfiles up to version 4 are sha1 hardcoded and version
> > 5 and onwards will have a hash type field. Usually that hash type would
> > match what is in the_repository, but you could obtain packfiles
> > out of band, or the translation table that we plan to have might
> > be part of the packfile/idx file?)
>
> Yeah, the transition plan doesn't specify a format for pack files, but
> we may end up needing one.  We definitely have a specified format for
> index files already, and that's where the translation table will be.
> Anything other than the pack index and the loose object index in the
> .git directory will have the same algorithm as the rest of the
> repository, so technically we could use any pack format as long as it
> lives in the .git directory.
>
> This code is mostly here on an interim basis to let us compile with a
> fully SHA-256 (no SHA-1) Git.  Once that piece is done, we can move on
> to a stage 4 Git, which can do either only SHA-256, or only SHA-1, where
> we'll learn about various pack file formats and detecting the algorithm
> from them.

This second paragraph really helps to put things into perspective, thanks!
I assume this interim base of code only applies to this patch?
(In that case maybe put it into the commit message?)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 06/14] packfile: express constants in terms of the_hash_algo
  2018-10-09 22:34       ` Stefan Beller
@ 2018-10-09 22:54         ` brian m. carlson
  0 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-09 22:54 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jeff King, Eric Sunshine, Duy Nguyen

[-- Attachment #1: Type: text/plain, Size: 979 bytes --]

On Tue, Oct 09, 2018 at 03:34:17PM -0700, Stefan Beller wrote:
> On Tue, Oct 9, 2018 at 3:25 PM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> > This code is mostly here on an interim basis to let us compile with a
> > fully SHA-256 (no SHA-1) Git.  Once that piece is done, we can move on
> > to a stage 4 Git, which can do either only SHA-256, or only SHA-1, where
> > we'll learn about various pack file formats and detecting the algorithm
> > from them.
> 
> This second paragraph really helps to put things into perspective, thanks!
> I assume this interim base of code only applies to this patch?
> (In that case maybe put it into the commit message?)

That comment will apply to most of the changes to the packfile code,
whether in this series or in future series.  However, after your
question, I was indeed going to put it into the commit message when I
reroll.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 02/14] builtin/repack: replace hard-coded constant
  2018-10-08 23:01     ` Eric Sunshine
@ 2018-10-09 23:00       ` brian m. carlson
  0 siblings, 0 replies; 30+ messages in thread
From: brian m. carlson @ 2018-10-09 23:00 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Stefan Beller, Git List, Jeff King,
	Nguyễn Thái Ngọc Duy

[-- Attachment #1: Type: text/plain, Size: 1420 bytes --]

On Mon, Oct 08, 2018 at 07:01:27PM -0400, Eric Sunshine wrote:
> On Mon, Oct 8, 2018 at 6:27 PM Stefan Beller <sbeller@google.com> wrote:
> > On Mon, Oct 8, 2018 at 2:57 PM brian m. carlson
> > <sandals@crustytoothpaste.net> wrote:
> > > -               if (line.len != 40)
> > > -                       die("repack: Expecting 40 character sha1 lines only from pack-objects.");
> > > +               if (line.len != the_hash_algo->hexsz)
> > > +                       die("repack: Expecting full hex object ID lines only from pack-objects.");
> >
> > This is untranslated as it is plumbing? If so, maybe
> >
> >     if (is_sha1(the_hash_algo)
> >         die("repack: Expecting 40 character sh...
> >     else
> >         die(repack: Expecting full hex object ID in %s, of length %d",
> >             the_hash_algo->name,
> >             the_hash_algo->hexsz);
> 
> Special-casing for SHA-1 seems overkill for an error message. A script
> expecting this particular error condition and this particular error
> message would be fragile indeed.

Yeah, I don't think a special case is needed here.  Moreover, since we
just invoked pack-objects ourselves and are now reading the output of
it, seeing this error message means that someone has broken Git in a
significant way.  The end user should never see this message.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] object_id.cocci: match only expressions of type 'struct object_id'
  2018-10-08 21:57 ` [PATCH 14/14] rerere: convert to use the_hash_algo brian m. carlson
  2018-10-08 23:18   ` Stefan Beller
@ 2018-10-12 13:11   ` SZEDER Gábor
  2018-10-15  2:34     ` Junio C Hamano
  1 sibling, 1 reply; 30+ messages in thread
From: SZEDER Gábor @ 2018-10-12 13:11 UTC (permalink / raw)
  To: brian m. carlson
  Cc: git, Jeff King, Eric Sunshine,
	Nguyễn Thái Ngọc Duy, SZEDER Gábor

Most of our semantic patches in 'contrib/coccinelle/object_id.cocci'
turn calls of SHA1-specific functions into calls of their
corresponding object_id counterparts, e.g. sha1_to_hex() to
oid_to_hex().  These semantic patches look something like this:

  @@
  expression E1;
  @@
  - sha1_to_hex(E1.hash)
  + oid_to_hex(&E1)

and match the access to the 'hash' field in any data type, not only in
'struct object_id', and, consquently, can produce wrong
transformations.

Case in point is the recent hash function transition patch "rerere:
convert to use the_hash_algo" [1], which, among other things, renamed
'struct rerere_dir's 'sha1' field to 'hash', and then 'make
coccicheck' started to suggest the following wrong transformations for
'rerere.c' [2]:

  -    return sha1_to_hex(id->collection->hash);
  +    return oid_to_hex(id->collection);

and

  -    DIR *dir = opendir(git_path("rr-cache/%s", sha1_to_hex(rr_dir->hash)));
  +    DIR *dir = opendir(git_path("rr-cache/%s", oid_to_hex(rr_dir)));

Avoid such wrong transformations by tightening semantic patches in
'object_id.cocci' to match only type of or pointers to 'struct
object_id'.

[1] https://public-inbox.org/git/20181008215701.779099-15-sandals@crustytoothpaste.net/
[2] https://travis-ci.org/git/git/jobs/440463476#L580

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---

I think this patch should come before that patch mentioned in the commit
message, or perhaps even before the whole patch series.

 contrib/coccinelle/object_id.cocci | 117 ++++++++++++++++-------------
 1 file changed, 63 insertions(+), 54 deletions(-)

diff --git a/contrib/coccinelle/object_id.cocci b/contrib/coccinelle/object_id.cocci
index d8bdb48712..6a7cf3e02d 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -1,119 +1,127 @@
 @@
-expression E1;
+struct object_id OID;
 @@
-- is_null_sha1(E1.hash)
-+ is_null_oid(&E1)
+- is_null_sha1(OID.hash)
++ is_null_oid(&OID)
 
 @@
-expression E1;
+struct object_id *OIDPTR;
 @@
-- is_null_sha1(E1->hash)
-+ is_null_oid(E1)
+- is_null_sha1(OIDPTR->hash)
++ is_null_oid(OIDPTR)
 
 @@
-expression E1;
+struct object_id OID;
 @@
-- sha1_to_hex(E1.hash)
-+ oid_to_hex(&E1)
+- sha1_to_hex(OID.hash)
++ oid_to_hex(&OID)
 
 @@
 identifier f != oid_to_hex;
-expression E1;
+struct object_id *OIDPTR;
 @@
   f(...) {<...
-- sha1_to_hex(E1->hash)
-+ oid_to_hex(E1)
+- sha1_to_hex(OIDPTR->hash)
++ oid_to_hex(OIDPTR)
   ...>}
 
 @@
-expression E1, E2;
+expression E;
+struct object_id OID;
 @@
-- sha1_to_hex_r(E1, E2.hash)
-+ oid_to_hex_r(E1, &E2)
+- sha1_to_hex_r(E, OID.hash)
++ oid_to_hex_r(E, &OID)
 
 @@
 identifier f != oid_to_hex_r;
-expression E1, E2;
+expression E;
+struct object_id *OIDPTR;
 @@
    f(...) {<...
-- sha1_to_hex_r(E1, E2->hash)
-+ oid_to_hex_r(E1, E2)
+- sha1_to_hex_r(E, OIDPTR->hash)
++ oid_to_hex_r(E, OIDPTR)
   ...>}
 
 @@
-expression E1;
+struct object_id OID;
 @@
-- hashclr(E1.hash)
-+ oidclr(&E1)
+- hashclr(OID.hash)
++ oidclr(&OID)
 
 @@
 identifier f != oidclr;
-expression E1;
+struct object_id *OIDPTR;
 @@
   f(...) {<...
-- hashclr(E1->hash)
-+ oidclr(E1)
+- hashclr(OIDPTR->hash)
++ oidclr(OIDPTR)
   ...>}
 
 @@
-expression E1, E2;
+struct object_id OID1, OID2;
 @@
-- hashcmp(E1.hash, E2.hash)
-+ oidcmp(&E1, &E2)
+- hashcmp(OID1.hash, OID2.hash)
++ oidcmp(&OID1, &OID2)
 
 @@
 identifier f != oidcmp;
-expression E1, E2;
+struct object_id *OIDPTR1, OIDPTR2;
 @@
   f(...) {<...
-- hashcmp(E1->hash, E2->hash)
-+ oidcmp(E1, E2)
+- hashcmp(OIDPTR1->hash, OIDPTR2->hash)
++ oidcmp(OIDPTR1, OIDPTR2)
   ...>}
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR;
+struct object_id OID;
 @@
-- hashcmp(E1->hash, E2.hash)
-+ oidcmp(E1, &E2)
+- hashcmp(OIDPTR->hash, OID.hash)
++ oidcmp(OIDPTR, &OID)
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR;
+struct object_id OID;
 @@
-- hashcmp(E1.hash, E2->hash)
-+ oidcmp(&E1, E2)
+- hashcmp(OID.hash, OIDPTR->hash)
++ oidcmp(&OID, OIDPTR)
 
 @@
-expression E1, E2;
+struct object_id OID1, OID2;
 @@
-- hashcpy(E1.hash, E2.hash)
-+ oidcpy(&E1, &E2)
+- hashcpy(OID1.hash, OID2.hash)
++ oidcpy(&OID1, &OID2)
 
 @@
 identifier f != oidcpy;
-expression E1, E2;
+struct object_id *OIDPTR1;
+struct object_id *OIDPTR2;
 @@
   f(...) {<...
-- hashcpy(E1->hash, E2->hash)
-+ oidcpy(E1, E2)
+- hashcpy(OIDPTR1->hash, OIDPTR2->hash)
++ oidcpy(OIDPTR1, OIDPTR2)
   ...>}
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR;
+struct object_id OID;
 @@
-- hashcpy(E1->hash, E2.hash)
-+ oidcpy(E1, &E2)
+- hashcpy(OIDPTR->hash, OID.hash)
++ oidcpy(OIDPTR, &OID)
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR;
+struct object_id OID;
 @@
-- hashcpy(E1.hash, E2->hash)
-+ oidcpy(&E1, E2)
+- hashcpy(OID.hash, OIDPTR->hash)
++ oidcpy(&OID, OIDPTR)
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR1;
+struct object_id *OIDPTR2;
 @@
-- oidcmp(E1, E2) == 0
-+ oideq(E1, E2)
+- oidcmp(OIDPTR1, OIDPTR2) == 0
++ oideq(OIDPTR1, OIDPTR2)
 
 @@
 identifier f != hasheq;
@@ -125,10 +133,11 @@ expression E1, E2;
   ...>}
 
 @@
-expression E1, E2;
+struct object_id *OIDPTR1;
+struct object_id *OIDPTR2;
 @@
-- oidcmp(E1, E2) != 0
-+ !oideq(E1, E2)
+- oidcmp(OIDPTR1, OIDPTR2) != 0
++ !oideq(OIDPTR1, OIDPTR2)
 
 @@
 identifier f != hasheq;
-- 
2.19.1.465.gaff195083f


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] object_id.cocci: match only expressions of type 'struct object_id'
  2018-10-12 13:11   ` [PATCH] object_id.cocci: match only expressions of type 'struct object_id' SZEDER Gábor
@ 2018-10-15  2:34     ` Junio C Hamano
  2018-10-15  4:24       ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2018-10-15  2:34 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: brian m. carlson, git, Jeff King, Eric Sunshine,
	Nguyễn Thái Ngọc Duy

SZEDER Gábor <szeder.dev@gmail.com> writes:

> Most of our semantic patches in 'contrib/coccinelle/object_id.cocci'
> turn calls of SHA1-specific functions into calls of their
> corresponding object_id counterparts, e.g. sha1_to_hex() to
> oid_to_hex().  These semantic patches look something like this:
>
>   @@
>   expression E1;
>   @@
>   - sha1_to_hex(E1.hash)
>   + oid_to_hex(&E1)
>
> and match the access to the 'hash' field in any data type, not only in
> 'struct object_id', and, consquently, can produce wrong
> transformations.

Thanks, will queue.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] object_id.cocci: match only expressions of type 'struct object_id'
  2018-10-15  2:34     ` Junio C Hamano
@ 2018-10-15  4:24       ` Junio C Hamano
  0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2018-10-15  4:24 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: brian m. carlson, git, Jeff King, Eric Sunshine,
	Nguyễn Thái Ngọc Duy

Junio C Hamano <gitster@pobox.com> writes:

> SZEDER Gábor <szeder.dev@gmail.com> writes:
>
>> Most of our semantic patches in 'contrib/coccinelle/object_id.cocci'
>> turn calls of SHA1-specific functions into calls of their
>> corresponding object_id counterparts, e.g. sha1_to_hex() to
>> oid_to_hex().  These semantic patches look something like this:
>>
>>   @@
>>   expression E1;
>>   @@
>>   - sha1_to_hex(E1.hash)
>>   + oid_to_hex(&E1)
>>
>> and match the access to the 'hash' field in any data type, not only in
>> 'struct object_id', and, consquently, can produce wrong
>> transformations.
>
> Thanks, will queue.

I ended up taking this as part of Brian's "the-hash-algo" series.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2018-10-15  4:24 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-08 21:56 [PATCH 00/14] Hash function transition part 15 brian m. carlson
2018-10-08 21:56 ` [PATCH 01/14] pack-bitmap-write: use GIT_MAX_RAWSZ for allocation brian m. carlson
2018-10-08 21:56 ` [PATCH 02/14] builtin/repack: replace hard-coded constant brian m. carlson
2018-10-08 22:27   ` Stefan Beller
2018-10-08 23:01     ` Eric Sunshine
2018-10-09 23:00       ` brian m. carlson
2018-10-08 21:56 ` [PATCH 03/14] builtin/mktree: remove " brian m. carlson
2018-10-08 22:32   ` Stefan Beller
2018-10-08 21:56 ` [PATCH 04/14] builtin/fetch-pack: remove constants with parse_oid_hex brian m. carlson
2018-10-08 21:56 ` [PATCH 05/14] pack-revindex: express constants in terms of the_hash_algo brian m. carlson
2018-10-08 22:44   ` Stefan Beller
2018-10-09 22:26     ` brian m. carlson
2018-10-08 21:56 ` [PATCH 06/14] packfile: " brian m. carlson
2018-10-08 22:59   ` Stefan Beller
2018-10-09 22:25     ` brian m. carlson
2018-10-09 22:34       ` Stefan Beller
2018-10-09 22:54         ` brian m. carlson
2018-10-08 21:56 ` [PATCH 07/14] refs/packed-backend: express constants using the_hash_algo brian m. carlson
2018-10-08 21:56 ` [PATCH 08/14] upload-pack: express constants in terms of the_hash_algo brian m. carlson
2018-10-08 21:56 ` [PATCH 09/14] transport: use parse_oid_hex instead of a constant brian m. carlson
2018-10-08 21:56 ` [PATCH 10/14] tag: express constant in terms of the_hash_algo brian m. carlson
2018-10-08 21:56 ` [PATCH 11/14] apply: replace hard-coded constants brian m. carlson
2018-10-08 21:56 ` [PATCH 12/14] apply: rename new_sha1_prefix and old_sha1_prefix brian m. carlson
2018-10-08 21:57 ` [PATCH 13/14] submodule: make zero-oid comparison hash function agnostic brian m. carlson
2018-10-08 23:10   ` Stefan Beller
2018-10-08 21:57 ` [PATCH 14/14] rerere: convert to use the_hash_algo brian m. carlson
2018-10-08 23:18   ` Stefan Beller
2018-10-12 13:11   ` [PATCH] object_id.cocci: match only expressions of type 'struct object_id' SZEDER Gábor
2018-10-15  2:34     ` Junio C Hamano
2018-10-15  4:24       ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).