git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v4 0/4] Test oidmap
@ 2019-06-15 10:06 Christian Couder
  2019-06-15 10:06 ` [PATCH v4 1/4] t/helper: add test-oidmap.c Christian Couder
                   ` (4 more replies)
  0 siblings, 5 replies; 37+ messages in thread
From: Christian Couder @ 2019-06-15 10:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Jeff King, Christian Couder

Unlike hashmap that has t/helper/test-hashmap.c and t/t0011-hashmap.sh
oidmap has no specific test. The goal of this small patch series is to
change that and also improve oidmap a bit while at it.

Changes compared to V3 are the following:

  - removed "hash" command in test-oidmap.c and "hash" test in
    t0016-oidmap.sh as suggested by Peff,

  - added patch 4/4 which does the same as above in test-hashmap.c and
    t0011-hashmap.sh as suggested by Peff.

Previous versions on the mailing list:

V3: https://public-inbox.org/git/20190612232425.12149-1-chriscool@tuxfamily.org/
V2: https://public-inbox.org/git/20190611082325.28878-1-chriscool@tuxfamily.org/
V1: https://public-inbox.org/git/20190609044907.32477-1-chriscool@tuxfamily.org/

This patch series on GitHub:

https://github.com/chriscool/git/commits/oidmap

Christian Couder (4):
  t/helper: add test-oidmap.c
  t: add t0016-oidmap.sh
  oidmap: use sha1hash() instead of static hash() function
  test-hashmap: remove 'hash' command

 Makefile                |   1 +
 oidmap.c                |  13 +----
 t/helper/test-hashmap.c |   9 +--
 t/helper/test-oidmap.c  | 126 ++++++++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c    |   1 +
 t/helper/test-tool.h    |   1 +
 t/t0011-hashmap.sh      |   9 ---
 t/t0016-oidmap.sh       |  84 +++++++++++++++++++++++++++
 8 files changed, 217 insertions(+), 27 deletions(-)
 create mode 100644 t/helper/test-oidmap.c
 create mode 100755 t/t0016-oidmap.sh

-- 
2.22.0.3.g82edbe9b01.dirty


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v4 1/4] t/helper: add test-oidmap.c
  2019-06-15 10:06 [PATCH v4 0/4] Test oidmap Christian Couder
@ 2019-06-15 10:06 ` Christian Couder
  2019-06-15 10:07 ` [PATCH v4 2/4] t: add t0016-oidmap.sh Christian Couder
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 37+ messages in thread
From: Christian Couder @ 2019-06-15 10:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Jeff King, Christian Couder

This new helper is very similar to "test-hashmap.c" and will help
test how `struct oidmap` from oidmap.{c,h} can be used.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Makefile               |   1 +
 t/helper/test-oidmap.c | 126 +++++++++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c   |   1 +
 t/helper/test-tool.h   |   1 +
 4 files changed, 129 insertions(+)
 create mode 100644 t/helper/test-oidmap.c

diff --git a/Makefile b/Makefile
index 8a7e235352..5efc7700ed 100644
--- a/Makefile
+++ b/Makefile
@@ -727,6 +727,7 @@ TEST_BUILTINS_OBJS += test-lazy-init-name-hash.o
 TEST_BUILTINS_OBJS += test-match-trees.o
 TEST_BUILTINS_OBJS += test-mergesort.o
 TEST_BUILTINS_OBJS += test-mktemp.o
+TEST_BUILTINS_OBJS += test-oidmap.o
 TEST_BUILTINS_OBJS += test-online-cpus.o
 TEST_BUILTINS_OBJS += test-parse-options.o
 TEST_BUILTINS_OBJS += test-path-utils.o
diff --git a/t/helper/test-oidmap.c b/t/helper/test-oidmap.c
new file mode 100644
index 0000000000..7036588175
--- /dev/null
+++ b/t/helper/test-oidmap.c
@@ -0,0 +1,126 @@
+#include "test-tool.h"
+#include "cache.h"
+#include "oidmap.h"
+#include "strbuf.h"
+
+/* key is an oid and value is a name (could be a refname for example) */
+struct test_entry {
+	struct oidmap_entry entry;
+	char name[FLEX_ARRAY];
+};
+
+#define DELIM " \t\r\n"
+
+/*
+ * Read stdin line by line and print result of commands to stdout:
+ *
+ * hash oidkey -> sha1hash(oidkey)
+ * put oidkey namevalue -> NULL / old namevalue
+ * get oidkey -> NULL / namevalue
+ * remove oidkey -> NULL / old namevalue
+ * iterate -> oidkey1 namevalue1\noidkey2 namevalue2\n...
+ *
+ */
+int cmd__oidmap(int argc, const char **argv)
+{
+	struct strbuf line = STRBUF_INIT;
+	struct oidmap map = OIDMAP_INIT;
+
+	setup_git_directory();
+
+	/* init oidmap */
+	oidmap_init(&map, 0);
+
+	/* process commands from stdin */
+	while (strbuf_getline(&line, stdin) != EOF) {
+		char *cmd, *p1 = NULL, *p2 = NULL;
+		struct test_entry *entry;
+		struct object_id oid;
+
+		/* break line into command and up to two parameters */
+		cmd = strtok(line.buf, DELIM);
+		/* ignore empty lines */
+		if (!cmd || *cmd == '#')
+			continue;
+
+		p1 = strtok(NULL, DELIM);
+		if (p1)
+			p2 = strtok(NULL, DELIM);
+
+		if (!strcmp("add", cmd) && p1 && p2) {
+
+			if (get_oid(p1, &oid)) {
+				printf("Unknown oid: %s\n", p1);
+				continue;
+			}
+
+			/* create entry with oidkey from p1, value = p2 */
+			FLEX_ALLOC_STR(entry, name, p2);
+			oidcpy(&entry->entry.oid, &oid);
+
+			/* add to oidmap */
+			oidmap_put(&map, entry);
+
+		} else if (!strcmp("put", cmd) && p1 && p2) {
+
+			if (get_oid(p1, &oid)) {
+				printf("Unknown oid: %s\n", p1);
+				continue;
+			}
+
+			/* create entry with oid_key = p1, name_value = p2 */
+			FLEX_ALLOC_STR(entry, name, p2);
+			oidcpy(&entry->entry.oid, &oid);
+
+			/* add / replace entry */
+			entry = oidmap_put(&map, entry);
+
+			/* print and free replaced entry, if any */
+			puts(entry ? entry->name : "NULL");
+			free(entry);
+
+		} else if (!strcmp("get", cmd) && p1) {
+
+			if (get_oid(p1, &oid)) {
+				printf("Unknown oid: %s\n", p1);
+				continue;
+			}
+
+			/* lookup entry in oidmap */
+			entry = oidmap_get(&map, &oid);
+
+			/* print result */
+			puts(entry ? entry->name : "NULL");
+
+		} else if (!strcmp("remove", cmd) && p1) {
+
+			if (get_oid(p1, &oid)) {
+				printf("Unknown oid: %s\n", p1);
+				continue;
+			}
+
+			/* remove entry from oidmap */
+			entry = oidmap_remove(&map, &oid);
+
+			/* print result and free entry*/
+			puts(entry ? entry->name : "NULL");
+			free(entry);
+
+		} else if (!strcmp("iterate", cmd)) {
+
+			struct oidmap_iter iter;
+			oidmap_iter_init(&map, &iter);
+			while ((entry = oidmap_iter_next(&iter)))
+				printf("%s %s\n", oid_to_hex(&entry->entry.oid), entry->name);
+
+		} else {
+
+			printf("Unknown command %s\n", cmd);
+
+		}
+	}
+
+	strbuf_release(&line);
+	oidmap_free(&map, 1);
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 087a8c0cc9..1eac25233f 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -35,6 +35,7 @@ static struct test_cmd cmds[] = {
 	{ "match-trees", cmd__match_trees },
 	{ "mergesort", cmd__mergesort },
 	{ "mktemp", cmd__mktemp },
+	{ "oidmap", cmd__oidmap },
 	{ "online-cpus", cmd__online_cpus },
 	{ "parse-options", cmd__parse_options },
 	{ "path-utils", cmd__path_utils },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 7e703f3038..c7a46dc320 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -25,6 +25,7 @@ int cmd__lazy_init_name_hash(int argc, const char **argv);
 int cmd__match_trees(int argc, const char **argv);
 int cmd__mergesort(int argc, const char **argv);
 int cmd__mktemp(int argc, const char **argv);
+int cmd__oidmap(int argc, const char **argv);
 int cmd__online_cpus(int argc, const char **argv);
 int cmd__parse_options(int argc, const char **argv);
 int cmd__path_utils(int argc, const char **argv);
-- 
2.22.0.3.g82edbe9b01.dirty


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v4 2/4] t: add t0016-oidmap.sh
  2019-06-15 10:06 [PATCH v4 0/4] Test oidmap Christian Couder
  2019-06-15 10:06 ` [PATCH v4 1/4] t/helper: add test-oidmap.c Christian Couder
@ 2019-06-15 10:07 ` Christian Couder
  2019-06-15 10:07 ` [PATCH v4 3/4] oidmap: use sha1hash() instead of static hash() function Christian Couder
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 37+ messages in thread
From: Christian Couder @ 2019-06-15 10:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Jeff King, Christian Couder,
	Christian Couder

From: Christian Couder <christian.couder@gmail.com>

Add actual tests for operations using `struct oidmap` from oidmap.{c,h}.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t0016-oidmap.sh | 84 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)
 create mode 100755 t/t0016-oidmap.sh

diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
new file mode 100755
index 0000000000..af17264ce3
--- /dev/null
+++ b/t/t0016-oidmap.sh
@@ -0,0 +1,84 @@
+#!/bin/sh
+
+test_description='test oidmap'
+. ./test-lib.sh
+
+# This purposefully is very similar to t0011-hashmap.sh
+
+test_oidmap () {
+	echo "$1" | test-tool oidmap $3 >actual &&
+	echo "$2" >expect &&
+	test_cmp expect actual
+}
+
+
+test_expect_success 'setup' '
+
+	test_commit one &&
+	test_commit two &&
+	test_commit three &&
+	test_commit four
+
+'
+
+test_expect_success 'put' '
+
+test_oidmap "put one 1
+put two 2
+put invalidOid 4
+put three 3" "NULL
+NULL
+Unknown oid: invalidOid
+NULL"
+
+'
+
+test_expect_success 'replace' '
+
+test_oidmap "put one 1
+put two 2
+put three 3
+put invalidOid 4
+put two deux
+put one un" "NULL
+NULL
+NULL
+Unknown oid: invalidOid
+2
+1"
+
+'
+
+test_expect_success 'get' '
+
+test_oidmap "put one 1
+put two 2
+put three 3
+get two
+get four
+get invalidOid
+get one" "NULL
+NULL
+NULL
+2
+NULL
+Unknown oid: invalidOid
+1"
+
+'
+
+test_expect_success 'iterate' '
+
+test_oidmap "put one 1
+put two 2
+put three 3
+iterate" "NULL
+NULL
+NULL
+$(git rev-parse two) 2
+$(git rev-parse one) 1
+$(git rev-parse three) 3"
+
+'
+
+test_done
-- 
2.22.0.3.g82edbe9b01.dirty


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v4 3/4] oidmap: use sha1hash() instead of static hash() function
  2019-06-15 10:06 [PATCH v4 0/4] Test oidmap Christian Couder
  2019-06-15 10:06 ` [PATCH v4 1/4] t/helper: add test-oidmap.c Christian Couder
  2019-06-15 10:07 ` [PATCH v4 2/4] t: add t0016-oidmap.sh Christian Couder
@ 2019-06-15 10:07 ` Christian Couder
  2019-06-15 10:07 ` [PATCH v4 4/4] test-hashmap: remove 'hash' command Christian Couder
  2019-06-19 21:42 ` [PATCH v4 0/4] Test oidmap Jeff King
  4 siblings, 0 replies; 37+ messages in thread
From: Christian Couder @ 2019-06-15 10:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Jeff King, Christian Couder,
	Christian Couder

From: Christian Couder <christian.couder@gmail.com>

Get rid of the static hash() function in oidmap.c which is redundant
with sha1hash(). Use sha1hash() directly instead.

Let's be more consistent and not use several hash functions doing
nearly exactly the same thing.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 oidmap.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/oidmap.c b/oidmap.c
index b0841a0f58..01c206aaef 100644
--- a/oidmap.c
+++ b/oidmap.c
@@ -12,13 +12,6 @@ static int oidmap_neq(const void *hashmap_cmp_fn_data,
 		      &((const struct oidmap_entry *) entry_or_key)->oid);
 }
 
-static int hash(const struct object_id *oid)
-{
-	int hash;
-	memcpy(&hash, oid->hash, sizeof(hash));
-	return hash;
-}
-
 void oidmap_init(struct oidmap *map, size_t initial_size)
 {
 	hashmap_init(&map->map, oidmap_neq, NULL, initial_size);
@@ -36,7 +29,7 @@ void *oidmap_get(const struct oidmap *map, const struct object_id *key)
 	if (!map->map.cmpfn)
 		return NULL;
 
-	return hashmap_get_from_hash(&map->map, hash(key), key);
+	return hashmap_get_from_hash(&map->map, sha1hash(key->hash), key);
 }
 
 void *oidmap_remove(struct oidmap *map, const struct object_id *key)
@@ -46,7 +39,7 @@ void *oidmap_remove(struct oidmap *map, const struct object_id *key)
 	if (!map->map.cmpfn)
 		oidmap_init(map, 0);
 
-	hashmap_entry_init(&entry, hash(key));
+	hashmap_entry_init(&entry, sha1hash(key->hash));
 	return hashmap_remove(&map->map, &entry, key);
 }
 
@@ -57,6 +50,6 @@ void *oidmap_put(struct oidmap *map, void *entry)
 	if (!map->map.cmpfn)
 		oidmap_init(map, 0);
 
-	hashmap_entry_init(&to_put->internal_entry, hash(&to_put->oid));
+	hashmap_entry_init(&to_put->internal_entry, sha1hash(to_put->oid.hash));
 	return hashmap_put(&map->map, to_put);
 }
-- 
2.22.0.3.g82edbe9b01.dirty


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v4 4/4] test-hashmap: remove 'hash' command
  2019-06-15 10:06 [PATCH v4 0/4] Test oidmap Christian Couder
                   ` (2 preceding siblings ...)
  2019-06-15 10:07 ` [PATCH v4 3/4] oidmap: use sha1hash() instead of static hash() function Christian Couder
@ 2019-06-15 10:07 ` Christian Couder
  2019-06-19 21:42 ` [PATCH v4 0/4] Test oidmap Jeff King
  4 siblings, 0 replies; 37+ messages in thread
From: Christian Couder @ 2019-06-15 10:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Jeff King, Christian Couder

If hashes like strhash() are updated, for example to use a different
hash algorithm, we should not have to be updating t0011 to change out
the hashes.

As long as hashmap can store and retrieve values, and that it performs
well, we should not care what are the values of the hashes. Let's just
focus on the externally visible behavior instead.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/helper/test-hashmap.c | 9 +--------
 t/t0011-hashmap.sh      | 9 ---------
 2 files changed, 1 insertion(+), 17 deletions(-)

diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c
index 23d2b172fe..aaf17b0ddf 100644
--- a/t/helper/test-hashmap.c
+++ b/t/helper/test-hashmap.c
@@ -173,14 +173,7 @@ int cmd__hashmap(int argc, const char **argv)
 			p2 = strtok(NULL, DELIM);
 		}
 
-		if (!strcmp("hash", cmd) && p1) {
-
-			/* print results of different hash functions */
-			printf("%u %u %u %u\n",
-			       strhash(p1), memhash(p1, strlen(p1)),
-			       strihash(p1), memihash(p1, strlen(p1)));
-
-		} else if (!strcmp("add", cmd) && p1 && p2) {
+		if (!strcmp("add", cmd) && p1 && p2) {
 
 			/* create entry with key = p1, value = p2 */
 			entry = alloc_test_entry(hash, p1, p2);
diff --git a/t/t0011-hashmap.sh b/t/t0011-hashmap.sh
index 3f1f505e89..9c96b3e3b1 100755
--- a/t/t0011-hashmap.sh
+++ b/t/t0011-hashmap.sh
@@ -9,15 +9,6 @@ test_hashmap() {
 	test_cmp expect actual
 }
 
-test_expect_success 'hash functions' '
-
-test_hashmap "hash key1" "2215982743 2215982743 116372151 116372151" &&
-test_hashmap "hash key2" "2215982740 2215982740 116372148 116372148" &&
-test_hashmap "hash fooBarFrotz" "1383912807 1383912807 3189766727 3189766727" &&
-test_hashmap "hash foobarfrotz" "2862305959 2862305959 3189766727 3189766727"
-
-'
-
 test_expect_success 'put' '
 
 test_hashmap "put key1 value1
-- 
2.22.0.3.g82edbe9b01.dirty


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/4] Test oidmap
  2019-06-15 10:06 [PATCH v4 0/4] Test oidmap Christian Couder
                   ` (3 preceding siblings ...)
  2019-06-15 10:07 ` [PATCH v4 4/4] test-hashmap: remove 'hash' command Christian Couder
@ 2019-06-19 21:42 ` Jeff King
  2019-06-19 22:09   ` Jeff King
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
  4 siblings, 2 replies; 37+ messages in thread
From: Jeff King @ 2019-06-19 21:42 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

On Sat, Jun 15, 2019 at 12:06:58PM +0200, Christian Couder wrote:

> Unlike hashmap that has t/helper/test-hashmap.c and t/t0011-hashmap.sh
> oidmap has no specific test. The goal of this small patch series is to
> change that and also improve oidmap a bit while at it.
> 
> Changes compared to V3 are the following:
> 
>   - removed "hash" command in test-oidmap.c and "hash" test in
>     t0016-oidmap.sh as suggested by Peff,
> 
>   - added patch 4/4 which does the same as above in test-hashmap.c and
>     t0011-hashmap.sh as suggested by Peff.

This version looks good to me.

I do think that sha1hash() will eventually go away in favor of
oidhash(), but we can approach that separately, and convert oidmap along
with everyone else.

It looks like we are close to being able to do that now. Grepping for
sha1hash shows just about everybody dereferencing an oid object, except
for the call in pack-objects.c. And skimming the callers there,
they all appear to have an oid object, too.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/4] Test oidmap
  2019-06-19 21:42 ` [PATCH v4 0/4] Test oidmap Jeff King
@ 2019-06-19 22:09   ` Jeff King
  2019-06-19 22:25     ` Christian Couder
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
  1 sibling, 1 reply; 37+ messages in thread
From: Jeff King @ 2019-06-19 22:09 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

On Wed, Jun 19, 2019 at 05:42:13PM -0400, Jeff King wrote:

> I do think that sha1hash() will eventually go away in favor of
> oidhash(), but we can approach that separately, and convert oidmap along
> with everyone else.
> 
> It looks like we are close to being able to do that now. Grepping for
> sha1hash shows just about everybody dereferencing an oid object, except
> for the call in pack-objects.c. And skimming the callers there,
> they all appear to have an oid object, too.

Actually, it does get a little tangled, as our khash implementation also
uses sha1hash (though IMHO that should become oidhash, too). There's
also some preparatory work needed in pack-objects, which I've pushed up
to:

  https://github.com/peff/git jk/drop-sha1hash

I got interrupted and may try to resume this cleanup later; mostly I
wanted to post something now so you did not duplicate what I'd already
done.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/4] Test oidmap
  2019-06-19 22:09   ` Jeff King
@ 2019-06-19 22:25     ` Christian Couder
  0 siblings, 0 replies; 37+ messages in thread
From: Christian Couder @ 2019-06-19 22:25 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

On Thu, Jun 20, 2019 at 12:09 AM Jeff King <peff@peff.net> wrote:
>
> On Wed, Jun 19, 2019 at 05:42:13PM -0400, Jeff King wrote:
>
> > I do think that sha1hash() will eventually go away in favor of
> > oidhash(), but we can approach that separately, and convert oidmap along
> > with everyone else.

Yeah, deprecating it is a different topic.

> > It looks like we are close to being able to do that now. Grepping for
> > sha1hash shows just about everybody dereferencing an oid object, except
> > for the call in pack-objects.c. And skimming the callers there,
> > they all appear to have an oid object, too.
>
> Actually, it does get a little tangled, as our khash implementation also
> uses sha1hash (though IMHO that should become oidhash, too). There's
> also some preparatory work needed in pack-objects, which I've pushed up
> to:
>
>   https://github.com/peff/git jk/drop-sha1hash
>
> I got interrupted and may try to resume this cleanup later; mostly I
> wanted to post something now so you did not duplicate what I'd already
> done.

Thanks for your work on that,
Christian.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 0/17] drop non-object_id hashing
  2019-06-19 21:42 ` [PATCH v4 0/4] Test oidmap Jeff King
  2019-06-19 22:09   ` Jeff King
@ 2019-06-20  7:39   ` Jeff King
  2019-06-20  7:40     ` [PATCH 01/17] describe: fix accidental oid/hash type-punning Jeff King
                       ` (16 more replies)
  1 sibling, 17 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:39 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

On Wed, Jun 19, 2019 at 05:42:12PM -0400, Jeff King wrote:

> I do think that sha1hash() will eventually go away in favor of
> oidhash(), but we can approach that separately, and convert oidmap along
> with everyone else.
> 
> It looks like we are close to being able to do that now. Grepping for
> sha1hash shows just about everybody dereferencing an oid object, except
> for the call in pack-objects.c. And skimming the callers there,
> they all appear to have an oid object, too.

Here are patches to do that. I decided not to build them on top of
yours, just to keep things simple. When merged with yours, there's no
textual conflict, but the new calls to sha1hash(key->hash) need to
become oidhash(key) instead (the compiler does catch this). It would be
easy to rebase, though.

I ended up with quite a few more patches than I thought I would need,
but I'm actually pretty pleased with the result. A lot of these are
cleanups in the packfile code that I'd been meaning to do for a while.

  [01/17]: describe: fix accidental oid/hash type-punning
  [02/17]: upload-pack: rename a "sha1" variable to "oid"

These first two are small cleanups I noticed in sites I touched later in
the series.

  [03/17]: pack-bitmap-write: convert some helpers to use object_id
  [04/17]: pack-objects: convert packlist_find() to use object_id
  [05/17]: pack-objects: convert locate_object_entry_hash() to object_id
  [06/17]: object: convert lookup_unknown_object() to use object_id
  [07/17]: object: convert lookup_object() to use object_id
  [08/17]: object: convert internal hash_obj() to object_id
  [09/17]: object: convert create_object() to use object_id

These ones are all just about passing object_id structs around more
consistently. In most cases, all of the callers of a particular function
were already passing "oid->hash", so we could just have them pass "oid"
instead.

The big one here is lookup_object(), which shockingly does not seem to
create any conflicts with pu.

  [10/17]: khash: drop broken oid_map typedef
  [11/17]: khash: rename kh_oid_t to kh_oid_set
  [12/17]: delta-islands: convert island_marks khash to use oids
  [13/17]: pack-bitmap: convert khash_sha1 maps into kh_oid_map
  [14/17]: khash: drop sha1-specific map types
  [15/17]: khash: rename oid helper functions

Some khash cleanups and prep to stop using sha1hash() there.

  [16/17]: hash.h: move object_id definition from cache.h
  [17/17]: hashmap: convert sha1hash() to oidhash()

And then finally we can rename the offending function. Ta-da. :)

 blob.c                           |  5 ++---
 builtin/describe.c               |  4 ++--
 builtin/fast-export.c            |  4 ++--
 builtin/fsck.c                   |  8 ++++----
 builtin/name-rev.c               |  3 +--
 builtin/pack-objects.c           | 21 +++++++++++----------
 builtin/prune.c                  |  2 +-
 builtin/unpack-objects.c         |  2 +-
 cache.h                          | 24 ------------------------
 commit-graph.c                   |  2 +-
 commit.c                         |  5 ++---
 decorate.c                       |  2 +-
 delta-islands.c                  | 24 ++++++++++++------------
 diffcore-rename.c                |  2 +-
 fetch-pack.c                     | 12 ++++++------
 fsck.c                           |  2 +-
 hash.h                           | 24 ++++++++++++++++++++++++
 hashmap.h                        |  8 +++++---
 http-push.c                      |  4 ++--
 khash.h                          | 22 ++++++----------------
 object.c                         | 26 +++++++++++++-------------
 object.h                         |  6 +++---
 oidset.c                         | 12 ++++++------
 oidset.h                         |  4 ++--
 pack-bitmap-write.c              | 30 +++++++++++++++---------------
 pack-bitmap.c                    | 16 ++++++++--------
 pack-bitmap.h                    |  2 +-
 pack-objects.c                   | 12 ++++++------
 pack-objects.h                   |  2 +-
 patch-ids.c                      |  2 +-
 reachable.c                      |  4 ++--
 refs.c                           |  2 +-
 t/helper/test-example-decorate.c |  6 +++---
 tag.c                            |  5 ++---
 tree.c                           |  5 ++---
 upload-pack.c                    |  8 ++++----
 walker.c                         |  2 +-
 37 files changed, 156 insertions(+), 168 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 01/17] describe: fix accidental oid/hash type-punning
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
@ 2019-06-20  7:40     ` Jeff King
  2019-06-20 16:32       ` Junio C Hamano
  2019-06-20  7:40     ` [PATCH 02/17] upload-pack: rename a "sha1" variable to "oid" Jeff King
                       ` (15 subsequent siblings)
  16 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:40 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

The find_commit_name() function passes an object_id.hash as the key of a
hashmap. That ends up in commit_name_neq(), which then feeds it to
oideq(). Which means we should actually be the whole "struct object_id".

It works anyway because pointers to the two are interchangeable. And
because we're going through a layer of void pointers, the compiler
doesn't notice the type mismatch.

But it's worth cleaning up (especially since once we switch away from
sha1hash() on the same line, accessing the hash member will look doubly
out of place).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/describe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 1409cedce2..0a5cde00a2 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -76,7 +76,7 @@ static int commit_name_neq(const void *unused_cmp_data,
 
 static inline struct commit_name *find_commit_name(const struct object_id *peeled)
 {
-	return hashmap_get_from_hash(&names, sha1hash(peeled->hash), peeled->hash);
+	return hashmap_get_from_hash(&names, sha1hash(peeled->hash), peeled);
 }
 
 static int replace_name(struct commit_name *e,
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 02/17] upload-pack: rename a "sha1" variable to "oid"
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
  2019-06-20  7:40     ` [PATCH 01/17] describe: fix accidental oid/hash type-punning Jeff King
@ 2019-06-20  7:40     ` Jeff King
  2019-06-20  7:40     ` [PATCH 03/17] pack-bitmap-write: convert some helpers to use object_id Jeff King
                       ` (14 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:40 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

This variable is a "struct object_id", but uses the old-style name
"sha1". Let's call it oid to match more modern code (and make it clear
that it can handle any algorithm, since it uses parse_oid_hex()
properly).

Signed-off-by: Jeff King <peff@peff.net>
---
 upload-pack.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 4d2129e7fc..d9a62adef0 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -528,13 +528,13 @@ static int get_reachable_list(struct object_array *src,
 		return -1;
 
 	while ((i = read_in_full(cmd.out, namebuf, hexsz + 1)) == hexsz + 1) {
-		struct object_id sha1;
+		struct object_id oid;
 		const char *p;
 
-		if (parse_oid_hex(namebuf, &sha1, &p) || *p != '\n')
+		if (parse_oid_hex(namebuf, &oid, &p) || *p != '\n')
 			break;
 
-		o = lookup_object(the_repository, sha1.hash);
+		o = lookup_object(the_repository, oid.hash);
 		if (o && o->type == OBJ_COMMIT) {
 			o->flags &= ~TMP_MARK;
 		}
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 03/17] pack-bitmap-write: convert some helpers to use object_id
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
  2019-06-20  7:40     ` [PATCH 01/17] describe: fix accidental oid/hash type-punning Jeff King
  2019-06-20  7:40     ` [PATCH 02/17] upload-pack: rename a "sha1" variable to "oid" Jeff King
@ 2019-06-20  7:40     ` Jeff King
  2019-06-20  7:41     ` [PATCH 04/17] pack-objects: convert packlist_find() " Jeff King
                       ` (13 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:40 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

A few functions take raw hash pointers, but all of their callers
actually have a "struct object_id". Let's retain that extra type as long
as possible (which will let future patches extend that further, and so
on).

Signed-off-by: Jeff King <peff@peff.net>
---
 pack-bitmap-write.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 802ed62677..59aa201043 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -142,13 +142,13 @@ static inline void reset_all_seen(void)
 	seen_objects_nr = 0;
 }
 
-static uint32_t find_object_pos(const unsigned char *hash)
+static uint32_t find_object_pos(const struct object_id *oid)
 {
-	struct object_entry *entry = packlist_find(writer.to_pack, hash, NULL);
+	struct object_entry *entry = packlist_find(writer.to_pack, oid->hash, NULL);
 
 	if (!entry) {
 		die("Failed to write bitmap index. Packfile doesn't have full closure "
-			"(object %s is missing)", hash_to_hex(hash));
+			"(object %s is missing)", oid_to_hex(oid));
 	}
 
 	return oe_in_pack_pos(writer.to_pack, entry);
@@ -157,7 +157,7 @@ static uint32_t find_object_pos(const unsigned char *hash)
 static void show_object(struct object *object, const char *name, void *data)
 {
 	struct bitmap *base = data;
-	bitmap_set(base, find_object_pos(object->oid.hash));
+	bitmap_set(base, find_object_pos(&object->oid));
 	mark_as_seen(object);
 }
 
@@ -170,7 +170,7 @@ static int
 add_to_include_set(struct bitmap *base, struct commit *commit)
 {
 	khiter_t hash_pos;
-	uint32_t bitmap_pos = find_object_pos(commit->object.oid.hash);
+	uint32_t bitmap_pos = find_object_pos(&commit->object.oid);
 
 	if (bitmap_get(base, bitmap_pos))
 		return 0;
@@ -375,14 +375,14 @@ void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack)
 	 */
 }
 
-static struct ewah_bitmap *find_reused_bitmap(const unsigned char *sha1)
+static struct ewah_bitmap *find_reused_bitmap(const struct object_id *oid)
 {
 	khiter_t hash_pos;
 
 	if (!writer.reused)
 		return NULL;
 
-	hash_pos = kh_get_sha1(writer.reused, sha1);
+	hash_pos = kh_get_sha1(writer.reused, oid->hash);
 	if (hash_pos >= kh_end(writer.reused))
 		return NULL;
 
@@ -422,14 +422,14 @@ void bitmap_writer_select_commits(struct commit **indexed_commits,
 
 		if (next == 0) {
 			chosen = indexed_commits[i];
-			reused_bitmap = find_reused_bitmap(chosen->object.oid.hash);
+			reused_bitmap = find_reused_bitmap(&chosen->object.oid);
 		} else {
 			chosen = indexed_commits[i + next];
 
 			for (j = 0; j <= next; ++j) {
 				struct commit *cm = indexed_commits[i + j];
 
-				reused_bitmap = find_reused_bitmap(cm->object.oid.hash);
+				reused_bitmap = find_reused_bitmap(&cm->object.oid);
 				if (reused_bitmap || (cm->object.flags & NEEDS_BITMAP) != 0) {
 					chosen = cm;
 					break;
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 04/17] pack-objects: convert packlist_find() to use object_id
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (2 preceding siblings ...)
  2019-06-20  7:40     ` [PATCH 03/17] pack-bitmap-write: convert some helpers to use object_id Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20  7:41     ` [PATCH 05/17] pack-objects: convert locate_object_entry_hash() to object_id Jeff King
                       ` (12 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

We take a raw hash pointer, but most of our callers have a "struct
object_id" already. Let's switch to taking the full struct, which will
let us continue removing uses of raw sha1 buffers.

There are two callers that do need special attention:

  - in rebuild_existing_bitmaps(), we need to switch to
    nth_packed_object_oid(). This incurs an extra hash copy over
    pointing straight to the mmap'd sha1, but it shouldn't be measurable
    compared to the rest of the operation.

  - in can_reuse_delta() we already spent the effort to copy the sha1
    into a "struct object_id", but now we just have to do so a little
    earlier in the function (we can't easily convert that function's
    callers because they may be pointing at mmap'd REF_DELTA blocks).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/pack-objects.c | 19 ++++++++++---------
 pack-bitmap-write.c    |  2 +-
 pack-bitmap.c          |  6 +++---
 pack-objects.c         |  4 ++--
 pack-objects.h         |  2 +-
 5 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b2be8869c2..c95693fd4b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -606,12 +606,12 @@ static int mark_tagged(const char *path, const struct object_id *oid, int flag,
 		       void *cb_data)
 {
 	struct object_id peeled;
-	struct object_entry *entry = packlist_find(&to_pack, oid->hash, NULL);
+	struct object_entry *entry = packlist_find(&to_pack, oid, NULL);
 
 	if (entry)
 		entry->tagged = 1;
 	if (!peel_ref(path, &peeled)) {
-		entry = packlist_find(&to_pack, peeled.hash, NULL);
+		entry = packlist_find(&to_pack, &peeled, NULL);
 		if (entry)
 			entry->tagged = 1;
 	}
@@ -996,7 +996,7 @@ static int have_duplicate_entry(const struct object_id *oid,
 {
 	struct object_entry *entry;
 
-	entry = packlist_find(&to_pack, oid->hash, index_pos);
+	entry = packlist_find(&to_pack, oid, index_pos);
 	if (!entry)
 		return 0;
 
@@ -1494,11 +1494,13 @@ static int can_reuse_delta(const unsigned char *base_sha1,
 	if (!base_sha1)
 		return 0;
 
+	oidread(&base_oid, base_sha1);
+
 	/*
 	 * First see if we're already sending the base (or it's explicitly in
 	 * our "excluded" list).
 	 */
-	base = packlist_find(&to_pack, base_sha1, NULL);
+	base = packlist_find(&to_pack, &base_oid, NULL);
 	if (base) {
 		if (!in_same_island(&delta->idx.oid, &base->idx.oid))
 			return 0;
@@ -1511,7 +1513,6 @@ static int can_reuse_delta(const unsigned char *base_sha1,
 	 * even if it was buried too deep in history to make it into the
 	 * packing list.
 	 */
-	oidread(&base_oid, base_sha1);
 	if (thin && bitmap_has_oid_in_uninteresting(bitmap_git, &base_oid)) {
 		if (use_delta_islands) {
 			if (!in_same_island(&delta->idx.oid, &base_oid))
@@ -2571,7 +2572,7 @@ static void add_tag_chain(const struct object_id *oid)
 	 * it was included via bitmaps, we would not have parsed it
 	 * previously).
 	 */
-	if (packlist_find(&to_pack, oid->hash, NULL))
+	if (packlist_find(&to_pack, oid, NULL))
 		return;
 
 	tag = lookup_tag(the_repository, oid);
@@ -2595,7 +2596,7 @@ static int add_ref_tag(const char *path, const struct object_id *oid, int flag,
 
 	if (starts_with(path, "refs/tags/") && /* is a tag? */
 	    !peel_ref(path, &peeled)    && /* peelable? */
-	    packlist_find(&to_pack, peeled.hash, NULL))      /* object packed? */
+	    packlist_find(&to_pack, &peeled, NULL))      /* object packed? */
 		add_tag_chain(oid);
 	return 0;
 }
@@ -2795,7 +2796,7 @@ static void show_object(struct object *obj, const char *name, void *data)
 		for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
 			depth++;
 
-		ent = packlist_find(&to_pack, obj->oid.hash, NULL);
+		ent = packlist_find(&to_pack, &obj->oid, NULL);
 		if (ent && depth > oe_tree_depth(&to_pack, ent))
 			oe_set_tree_depth(&to_pack, ent, depth);
 	}
@@ -3026,7 +3027,7 @@ static void loosen_unused_packed_objects(void)
 
 		for (i = 0; i < p->num_objects; i++) {
 			nth_packed_object_oid(&oid, p, i);
-			if (!packlist_find(&to_pack, oid.hash, NULL) &&
+			if (!packlist_find(&to_pack, &oid, NULL) &&
 			    !has_sha1_pack_kept_or_nonlocal(&oid) &&
 			    !loosened_object_can_be_discarded(&oid, p->mtime))
 				if (force_object_loose(&oid, p->mtime))
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 59aa201043..0637378533 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -144,7 +144,7 @@ static inline void reset_all_seen(void)
 
 static uint32_t find_object_pos(const struct object_id *oid)
 {
-	struct object_entry *entry = packlist_find(writer.to_pack, oid->hash, NULL);
+	struct object_entry *entry = packlist_find(writer.to_pack, oid, NULL);
 
 	if (!entry) {
 		die("Failed to write bitmap index. Packfile doesn't have full closure "
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 6069b2fe55..ff1f07e249 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1057,13 +1057,13 @@ int rebuild_existing_bitmaps(struct bitmap_index *bitmap_git,
 	reposition = xcalloc(num_objects, sizeof(uint32_t));
 
 	for (i = 0; i < num_objects; ++i) {
-		const unsigned char *sha1;
+		struct object_id oid;
 		struct revindex_entry *entry;
 		struct object_entry *oe;
 
 		entry = &bitmap_git->pack->revindex[i];
-		sha1 = nth_packed_object_sha1(bitmap_git->pack, entry->nr);
-		oe = packlist_find(mapping, sha1, NULL);
+		nth_packed_object_oid(&oid, bitmap_git->pack, entry->nr);
+		oe = packlist_find(mapping, &oid, NULL);
 
 		if (oe)
 			reposition[i] = oe_in_pack_pos(mapping, oe) + 1;
diff --git a/pack-objects.c b/pack-objects.c
index ce33b8906e..49fdf52ea6 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -68,7 +68,7 @@ static void rehash_objects(struct packing_data *pdata)
 }
 
 struct object_entry *packlist_find(struct packing_data *pdata,
-				   const unsigned char *sha1,
+				   const struct object_id *oid,
 				   uint32_t *index_pos)
 {
 	uint32_t i;
@@ -77,7 +77,7 @@ struct object_entry *packlist_find(struct packing_data *pdata,
 	if (!pdata->index_size)
 		return NULL;
 
-	i = locate_object_entry_hash(pdata, sha1, &found);
+	i = locate_object_entry_hash(pdata, oid->hash, &found);
 
 	if (index_pos)
 		*index_pos = i;
diff --git a/pack-objects.h b/pack-objects.h
index 6fde7ce27c..857d43850b 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -187,7 +187,7 @@ struct object_entry *packlist_alloc(struct packing_data *pdata,
 				    uint32_t index_pos);
 
 struct object_entry *packlist_find(struct packing_data *pdata,
-				   const unsigned char *sha1,
+				   const struct object_id *oid,
 				   uint32_t *index_pos);
 
 static inline uint32_t pack_name_hash(const char *name)
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 05/17] pack-objects: convert locate_object_entry_hash() to object_id
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (3 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 04/17] pack-objects: convert packlist_find() " Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20  7:41     ` [PATCH 06/17] object: convert lookup_unknown_object() to use object_id Jeff King
                       ` (11 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

There are no callers of locate_object_entry_hash() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.

Signed-off-by: Jeff King <peff@peff.net>
---
 pack-objects.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/pack-objects.c b/pack-objects.c
index 49fdf52ea6..00a5f6e0ec 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -6,17 +6,17 @@
 #include "config.h"
 
 static uint32_t locate_object_entry_hash(struct packing_data *pdata,
-					 const unsigned char *sha1,
+					 const struct object_id *oid,
 					 int *found)
 {
 	uint32_t i, mask = (pdata->index_size - 1);
 
-	i = sha1hash(sha1) & mask;
+	i = sha1hash(oid->hash) & mask;
 
 	while (pdata->index[i] > 0) {
 		uint32_t pos = pdata->index[i] - 1;
 
-		if (hasheq(sha1, pdata->objects[pos].idx.oid.hash)) {
+		if (oideq(oid, &pdata->objects[pos].idx.oid)) {
 			*found = 1;
 			return i;
 		}
@@ -56,7 +56,7 @@ static void rehash_objects(struct packing_data *pdata)
 	for (i = 0; i < pdata->nr_objects; i++) {
 		int found;
 		uint32_t ix = locate_object_entry_hash(pdata,
-						       entry->idx.oid.hash,
+						       &entry->idx.oid,
 						       &found);
 
 		if (found)
@@ -77,7 +77,7 @@ struct object_entry *packlist_find(struct packing_data *pdata,
 	if (!pdata->index_size)
 		return NULL;
 
-	i = locate_object_entry_hash(pdata, oid->hash, &found);
+	i = locate_object_entry_hash(pdata, oid, &found);
 
 	if (index_pos)
 		*index_pos = i;
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 06/17] object: convert lookup_unknown_object() to use object_id
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (4 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 05/17] pack-objects: convert locate_object_entry_hash() to object_id Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20  7:41     ` [PATCH 07/17] object: convert lookup_object() " Jeff King
                       ` (10 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

There are no callers left of lookup_unknown_object() that aren't just
passing us the "hash" member of a "struct object_id". Let's take the
whole struct, which gets us closer to removing all raw sha1 variables.
It also matches the existing conversions of lookup_blob(), etc.

The conversions of callers were done by hand, but they're all mechanical
one-liners.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/fsck.c                   | 2 +-
 builtin/pack-objects.c           | 2 +-
 fsck.c                           | 2 +-
 http-push.c                      | 2 +-
 object.c                         | 6 +++---
 object.h                         | 2 +-
 refs.c                           | 2 +-
 t/helper/test-example-decorate.c | 6 +++---
 upload-pack.c                    | 2 +-
 walker.c                         | 2 +-
 10 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index d26fb0a044..e422c82465 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -756,7 +756,7 @@ static int fsck_cache_tree(struct cache_tree *it)
 
 static void mark_object_for_connectivity(const struct object_id *oid)
 {
-	struct object *obj = lookup_unknown_object(oid->hash);
+	struct object *obj = lookup_unknown_object(oid);
 	obj->flags |= HAS_OBJ;
 }
 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index c95693fd4b..3e8467aa23 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2923,7 +2923,7 @@ static void add_objects_in_unpacked_packs(void)
 
 		for (i = 0; i < p->num_objects; i++) {
 			nth_packed_object_oid(&oid, p, i);
-			o = lookup_unknown_object(oid.hash);
+			o = lookup_unknown_object(&oid);
 			if (!(o->flags & OBJECT_ADDED))
 				mark_in_pack_object(o, p, &in_pack);
 			o->flags |= OBJECT_ADDED;
diff --git a/fsck.c b/fsck.c
index 4703f55561..117c4a978f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -1092,7 +1092,7 @@ int fsck_finish(struct fsck_options *options)
 
 		blob = lookup_blob(the_repository, oid);
 		if (!blob) {
-			struct object *obj = lookup_unknown_object(oid->hash);
+			struct object *obj = lookup_unknown_object(oid);
 			ret |= report(options, obj,
 				      FSCK_MSG_GITMODULES_BLOB,
 				      "non-blob found at .gitmodules");
diff --git a/http-push.c b/http-push.c
index e36561a6db..96a98e1e61 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1432,7 +1432,7 @@ static void one_remote_ref(const char *refname)
 	 * may be required for updating server info later.
 	 */
 	if (repo->can_update_info_refs && !has_object_file(&ref->old_oid)) {
-		obj = lookup_unknown_object(ref->old_oid.hash);
+		obj = lookup_unknown_object(&ref->old_oid);
 		fprintf(stderr,	"  fetch %s for %s\n",
 			oid_to_hex(&ref->old_oid), refname);
 		add_fetch_request(obj);
diff --git a/object.c b/object.c
index e81d47a79c..d5b1d8daaf 100644
--- a/object.c
+++ b/object.c
@@ -178,11 +178,11 @@ void *object_as_type(struct repository *r, struct object *obj, enum object_type
 	}
 }
 
-struct object *lookup_unknown_object(const unsigned char *sha1)
+struct object *lookup_unknown_object(const struct object_id *oid)
 {
-	struct object *obj = lookup_object(the_repository, sha1);
+	struct object *obj = lookup_object(the_repository, oid->hash);
 	if (!obj)
-		obj = create_object(the_repository, sha1,
+		obj = create_object(the_repository, oid->hash,
 				    alloc_object_node(the_repository));
 	return obj;
 }
diff --git a/object.h b/object.h
index 4526979ccf..5e0ccfe0e4 100644
--- a/object.h
+++ b/object.h
@@ -143,7 +143,7 @@ struct object *parse_object_or_die(const struct object_id *oid, const char *name
 struct object *parse_object_buffer(struct repository *r, const struct object_id *oid, enum object_type type, unsigned long size, void *buffer, int *eaten_p);
 
 /** Returns the object, with potentially excess memory allocated. **/
-struct object *lookup_unknown_object(const unsigned  char *sha1);
+struct object *lookup_unknown_object(const struct object_id *oid);
 
 struct object_list *object_list_insert(struct object *item,
 				       struct object_list **list_p);
diff --git a/refs.c b/refs.c
index b8a8430c96..cd297ee4bd 100644
--- a/refs.c
+++ b/refs.c
@@ -379,7 +379,7 @@ static int filter_refs(const char *refname, const struct object_id *oid,
 
 enum peel_status peel_object(const struct object_id *name, struct object_id *oid)
 {
-	struct object *o = lookup_unknown_object(name->hash);
+	struct object *o = lookup_unknown_object(name);
 
 	if (o->type == OBJ_NONE) {
 		int type = oid_object_info(the_repository, name, NULL);
diff --git a/t/helper/test-example-decorate.c b/t/helper/test-example-decorate.c
index a20a6161e4..c8a1cde7d2 100644
--- a/t/helper/test-example-decorate.c
+++ b/t/helper/test-example-decorate.c
@@ -26,8 +26,8 @@ int cmd__example_decorate(int argc, const char **argv)
 	 * Add 2 objects, one with a non-NULL decoration and one with a NULL
 	 * decoration.
 	 */
-	one = lookup_unknown_object(one_oid.hash);
-	two = lookup_unknown_object(two_oid.hash);
+	one = lookup_unknown_object(&one_oid);
+	two = lookup_unknown_object(&two_oid);
 	ret = add_decoration(&n, one, &decoration_a);
 	if (ret)
 		BUG("when adding a brand-new object, NULL should be returned");
@@ -56,7 +56,7 @@ int cmd__example_decorate(int argc, const char **argv)
 	ret = lookup_decoration(&n, two);
 	if (ret != &decoration_b)
 		BUG("lookup should return added declaration");
-	three = lookup_unknown_object(three_oid.hash);
+	three = lookup_unknown_object(&three_oid);
 	ret = lookup_decoration(&n, three);
 	if (ret)
 		BUG("lookup for unknown object should return NULL");
diff --git a/upload-pack.c b/upload-pack.c
index d9a62adef0..ecc19641fe 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -960,7 +960,7 @@ static void receive_needs(struct packet_reader *reader, struct object_array *wan
 static int mark_our_ref(const char *refname, const char *refname_full,
 			const struct object_id *oid)
 {
-	struct object *o = lookup_unknown_object(oid->hash);
+	struct object *o = lookup_unknown_object(oid);
 
 	if (ref_is_hidden(refname, refname_full)) {
 		o->flags |= HIDDEN_REF;
diff --git a/walker.c b/walker.c
index d74ae59c77..06cd2bd569 100644
--- a/walker.c
+++ b/walker.c
@@ -285,7 +285,7 @@ int walker_fetch(struct walker *walker, int targets, char **target,
 			error("Could not interpret response from server '%s' as something to pull", target[i]);
 			goto done;
 		}
-		if (process(walker, lookup_unknown_object(oids[i].hash)))
+		if (process(walker, lookup_unknown_object(&oids[i])))
 			goto done;
 	}
 
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 07/17] object: convert lookup_object() to use object_id
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (5 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 06/17] object: convert lookup_unknown_object() to use object_id Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20  7:41     ` [PATCH 08/17] object: convert internal hash_obj() to object_id Jeff King
                       ` (9 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

There are no callers left of lookup_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables.  It also
matches the existing conversions of lookup_blob(), etc.

The conversions of callers were done by hand, but they're all mechanical
one-liners.

Signed-off-by: Jeff King <peff@peff.net>
---
 blob.c                   |  2 +-
 builtin/fast-export.c    |  4 ++--
 builtin/fsck.c           |  6 +++---
 builtin/name-rev.c       |  3 +--
 builtin/prune.c          |  2 +-
 builtin/unpack-objects.c |  2 +-
 commit.c                 |  2 +-
 delta-islands.c          |  2 +-
 fetch-pack.c             | 12 ++++++------
 http-push.c              |  2 +-
 object.c                 | 12 ++++++------
 object.h                 |  2 +-
 reachable.c              |  4 ++--
 tag.c                    |  2 +-
 tree.c                   |  2 +-
 upload-pack.c            |  2 +-
 16 files changed, 30 insertions(+), 31 deletions(-)

diff --git a/blob.c b/blob.c
index 342bdbb1bb..b9c7180b7c 100644
--- a/blob.c
+++ b/blob.c
@@ -7,7 +7,7 @@ const char *blob_type = "blob";
 
 struct blob *lookup_blob(struct repository *r, const struct object_id *oid)
 {
-	struct object *obj = lookup_object(r, oid->hash);
+	struct object *obj = lookup_object(r, oid);
 	if (!obj)
 		return create_object(r, oid->hash,
 				     alloc_blob_node(r));
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index c22cef3b2f..f541f55d33 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -275,7 +275,7 @@ static void export_blob(const struct object_id *oid)
 	if (is_null_oid(oid))
 		return;
 
-	object = lookup_object(the_repository, oid->hash);
+	object = lookup_object(the_repository, oid);
 	if (object && object->flags & SHOWN)
 		return;
 
@@ -453,7 +453,7 @@ static void show_filemodify(struct diff_queue_struct *q,
 						  &spec->oid));
 			else {
 				struct object *object = lookup_object(the_repository,
-								      spec->oid.hash);
+								      &spec->oid);
 				printf("M %06o :%d ", spec->mode,
 				       get_object_mark(object));
 			}
diff --git a/builtin/fsck.c b/builtin/fsck.c
index e422c82465..18403a94fa 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -238,7 +238,7 @@ static int mark_used(struct object *obj, int type, void *data, struct fsck_optio
 static void mark_unreachable_referents(const struct object_id *oid)
 {
 	struct fsck_options options = FSCK_OPTIONS_DEFAULT;
-	struct object *obj = lookup_object(the_repository, oid->hash);
+	struct object *obj = lookup_object(the_repository, oid);
 
 	if (!obj || !(obj->flags & HAS_OBJ))
 		return; /* not part of our original set */
@@ -497,7 +497,7 @@ static void fsck_handle_reflog_oid(const char *refname, struct object_id *oid,
 	struct object *obj;
 
 	if (!is_null_oid(oid)) {
-		obj = lookup_object(the_repository, oid->hash);
+		obj = lookup_object(the_repository, oid);
 		if (obj && (obj->flags & HAS_OBJ)) {
 			if (timestamp && name_objects)
 				add_decoration(fsck_walk_options.object_names,
@@ -879,7 +879,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 		struct object_id oid;
 		if (!get_oid(arg, &oid)) {
 			struct object *obj = lookup_object(the_repository,
-							   oid.hash);
+							   &oid);
 
 			if (!obj || !(obj->flags & HAS_OBJ)) {
 				if (is_promisor_object(&oid))
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 16df43473a..c785fe16ba 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -378,8 +378,7 @@ static void name_rev_line(char *p, struct name_ref_data *data)
 			*(p+1) = 0;
 			if (!get_oid(p - (hexsz - 1), &oid)) {
 				struct object *o =
-					lookup_object(the_repository,
-						      oid.hash);
+					lookup_object(the_repository, &oid);
 				if (o)
 					name = get_rev_name(o, &buf);
 			}
diff --git a/builtin/prune.c b/builtin/prune.c
index 97613eccb5..2b76872ad2 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -53,7 +53,7 @@ static int is_object_reachable(const struct object_id *oid,
 
 	perform_reachability_traversal(revs);
 
-	obj = lookup_object(the_repository, oid->hash);
+	obj = lookup_object(the_repository, oid);
 	return obj && (obj->flags & SEEN);
 }
 
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 80478808b3..a87a4bfd2c 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -332,7 +332,7 @@ static int resolve_against_held(unsigned nr, const struct object_id *base,
 {
 	struct object *obj;
 	struct obj_buffer *obj_buffer;
-	obj = lookup_object(the_repository, base->hash);
+	obj = lookup_object(the_repository, base);
 	if (!obj)
 		return 0;
 	obj_buffer = lookup_object_buffer(obj);
diff --git a/commit.c b/commit.c
index 8fa1883c61..f47c75afae 100644
--- a/commit.c
+++ b/commit.c
@@ -57,7 +57,7 @@ struct commit *lookup_commit_or_die(const struct object_id *oid, const char *ref
 
 struct commit *lookup_commit(struct repository *r, const struct object_id *oid)
 {
-	struct object *obj = lookup_object(r, oid->hash);
+	struct object *obj = lookup_object(r, oid);
 	if (!obj)
 		return create_object(r, oid->hash,
 				     alloc_commit_node(r));
diff --git a/delta-islands.c b/delta-islands.c
index 2186bd0738..5f3ab914f5 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -296,7 +296,7 @@ void resolve_tree_islands(struct repository *r,
 			if (S_ISGITLINK(entry.mode))
 				continue;
 
-			obj = lookup_object(r, entry.oid.hash);
+			obj = lookup_object(r, &entry.oid);
 			if (!obj)
 				continue;
 
diff --git a/fetch-pack.c b/fetch-pack.c
index 1c10f54e78..07bc48a1a5 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -286,7 +286,7 @@ static int find_common(struct fetch_negotiator *negotiator,
 		 * we cannot trust the object flags).
 		 */
 		if (!args->no_dependents &&
-		    ((o = lookup_object(the_repository, remote->hash)) != NULL) &&
+		    ((o = lookup_object(the_repository, remote)) != NULL) &&
 				(o->flags & COMPLETE)) {
 			continue;
 		}
@@ -364,7 +364,7 @@ static int find_common(struct fetch_negotiator *negotiator,
 			if (skip_prefix(reader.line, "unshallow ", &arg)) {
 				if (get_oid_hex(arg, &oid))
 					die(_("invalid unshallow line: %s"), reader.line);
-				if (!lookup_object(the_repository, oid.hash))
+				if (!lookup_object(the_repository, &oid))
 					die(_("object not found: %s"), reader.line);
 				/* make sure that it is parsed as shallow */
 				if (!parse_object(the_repository, &oid))
@@ -707,7 +707,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	for (ref = *refs; ref; ref = ref->next) {
 		struct object *o = deref_tag(the_repository,
 					     lookup_object(the_repository,
-					     ref->old_oid.hash),
+					     &ref->old_oid),
 					     NULL, 0);
 
 		if (!o || o->type != OBJ_COMMIT || !(o->flags & COMPLETE))
@@ -734,7 +734,7 @@ static int everything_local(struct fetch_pack_args *args,
 		const struct object_id *remote = &ref->old_oid;
 		struct object *o;
 
-		o = lookup_object(the_repository, remote->hash);
+		o = lookup_object(the_repository, remote);
 		if (!o || !(o->flags & COMPLETE)) {
 			retval = 0;
 			print_verbose(args, "want %s (%s)", oid_to_hex(remote),
@@ -1048,7 +1048,7 @@ static void add_wants(int no_dependents, const struct ref *wants, struct strbuf
 		 * we cannot trust the object flags).
 		 */
 		if (!no_dependents &&
-		    ((o = lookup_object(the_repository, remote->hash)) != NULL) &&
+		    ((o = lookup_object(the_repository, remote)) != NULL) &&
 		    (o->flags & COMPLETE)) {
 			continue;
 		}
@@ -1275,7 +1275,7 @@ static void receive_shallow_info(struct fetch_pack_args *args,
 		if (skip_prefix(reader->line, "unshallow ", &arg)) {
 			if (get_oid_hex(arg, &oid))
 				die(_("invalid unshallow line: %s"), reader->line);
-			if (!lookup_object(the_repository, oid.hash))
+			if (!lookup_object(the_repository, &oid))
 				die(_("object not found: %s"), reader->line);
 			/* make sure that it is parsed as shallow */
 			if (!parse_object(the_repository, &oid))
diff --git a/http-push.c b/http-push.c
index 96a98e1e61..0353f9f514 100644
--- a/http-push.c
+++ b/http-push.c
@@ -723,7 +723,7 @@ static void one_remote_object(const struct object_id *oid)
 {
 	struct object *obj;
 
-	obj = lookup_object(the_repository, oid->hash);
+	obj = lookup_object(the_repository, oid);
 	if (!obj)
 		obj = parse_object(the_repository, oid);
 
diff --git a/object.c b/object.c
index d5b1d8daaf..34c1d0dc8f 100644
--- a/object.c
+++ b/object.c
@@ -85,7 +85,7 @@ static void insert_obj_hash(struct object *obj, struct object **hash, unsigned i
  * Look up the record for the given sha1 in the hash map stored in
  * obj_hash.  Return NULL if it was not found.
  */
-struct object *lookup_object(struct repository *r, const unsigned char *sha1)
+struct object *lookup_object(struct repository *r, const struct object_id *oid)
 {
 	unsigned int i, first;
 	struct object *obj;
@@ -93,9 +93,9 @@ struct object *lookup_object(struct repository *r, const unsigned char *sha1)
 	if (!r->parsed_objects->obj_hash)
 		return NULL;
 
-	first = i = hash_obj(sha1, r->parsed_objects->obj_hash_size);
+	first = i = hash_obj(oid->hash, r->parsed_objects->obj_hash_size);
 	while ((obj = r->parsed_objects->obj_hash[i]) != NULL) {
-		if (hasheq(sha1, obj->oid.hash))
+		if (oideq(oid, &obj->oid))
 			break;
 		i++;
 		if (i == r->parsed_objects->obj_hash_size)
@@ -180,7 +180,7 @@ void *object_as_type(struct repository *r, struct object *obj, enum object_type
 
 struct object *lookup_unknown_object(const struct object_id *oid)
 {
-	struct object *obj = lookup_object(the_repository, oid->hash);
+	struct object *obj = lookup_object(the_repository, oid);
 	if (!obj)
 		obj = create_object(the_repository, oid->hash,
 				    alloc_object_node(the_repository));
@@ -256,7 +256,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 	void *buffer;
 	struct object *obj;
 
-	obj = lookup_object(r, oid->hash);
+	obj = lookup_object(r, oid);
 	if (obj && obj->parsed)
 		return obj;
 
@@ -268,7 +268,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 			return NULL;
 		}
 		parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
-		return lookup_object(r, oid->hash);
+		return lookup_object(r, oid);
 	}
 
 	buffer = repo_read_object_file(r, oid, &type, &size);
diff --git a/object.h b/object.h
index 5e0ccfe0e4..47301186a4 100644
--- a/object.h
+++ b/object.h
@@ -116,7 +116,7 @@ struct object *get_indexed_object(unsigned int);
  * half-initialised objects, the caller is expected to initialize them
  * by calling parse_object() on them.
  */
-struct object *lookup_object(struct repository *r, const unsigned char *sha1);
+struct object *lookup_object(struct repository *r, const struct object_id *oid);
 
 void *create_object(struct repository *r, const unsigned char *sha1, void *obj);
 
diff --git a/reachable.c b/reachable.c
index 0d00a91de4..8f50235b28 100644
--- a/reachable.c
+++ b/reachable.c
@@ -109,7 +109,7 @@ static int add_recent_loose(const struct object_id *oid,
 			    const char *path, void *data)
 {
 	struct stat st;
-	struct object *obj = lookup_object(the_repository, oid->hash);
+	struct object *obj = lookup_object(the_repository, oid);
 
 	if (obj && obj->flags & SEEN)
 		return 0;
@@ -134,7 +134,7 @@ static int add_recent_packed(const struct object_id *oid,
 			     struct packed_git *p, uint32_t pos,
 			     void *data)
 {
-	struct object *obj = lookup_object(the_repository, oid->hash);
+	struct object *obj = lookup_object(the_repository, oid);
 
 	if (obj && obj->flags & SEEN)
 		return 0;
diff --git a/tag.c b/tag.c
index 7445b8f6ea..3ae00ba1ab 100644
--- a/tag.c
+++ b/tag.c
@@ -100,7 +100,7 @@ struct object *deref_tag_noverify(struct object *o)
 
 struct tag *lookup_tag(struct repository *r, const struct object_id *oid)
 {
-	struct object *obj = lookup_object(r, oid->hash);
+	struct object *obj = lookup_object(r, oid);
 	if (!obj)
 		return create_object(r, oid->hash,
 				     alloc_tag_node(r));
diff --git a/tree.c b/tree.c
index f416afc57d..0ebb8c5b02 100644
--- a/tree.c
+++ b/tree.c
@@ -197,7 +197,7 @@ int read_tree(struct repository *r, struct tree *tree, int stage,
 
 struct tree *lookup_tree(struct repository *r, const struct object_id *oid)
 {
-	struct object *obj = lookup_object(r, oid->hash);
+	struct object *obj = lookup_object(r, oid);
 	if (!obj)
 		return create_object(r, oid->hash,
 				     alloc_tree_node(r));
diff --git a/upload-pack.c b/upload-pack.c
index ecc19641fe..a0f170b5b5 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -534,7 +534,7 @@ static int get_reachable_list(struct object_array *src,
 		if (parse_oid_hex(namebuf, &oid, &p) || *p != '\n')
 			break;
 
-		o = lookup_object(the_repository, oid.hash);
+		o = lookup_object(the_repository, &oid);
 		if (o && o->type == OBJ_COMMIT) {
 			o->flags &= ~TMP_MARK;
 		}
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 08/17] object: convert internal hash_obj() to object_id
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (6 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 07/17] object: convert lookup_object() " Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20  7:41     ` [PATCH 09/17] object: convert create_object() to use object_id Jeff King
                       ` (8 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

Now that lookup_object() has an object_id, we can consistently pass that
around instead of a raw sha1. We still convert to a hash to pass to
sha1hash(), but the goal is for that to go away shortly.

Signed-off-by: Jeff King <peff@peff.net>
---
 object.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/object.c b/object.c
index 34c1d0dc8f..dbfdbe504d 100644
--- a/object.c
+++ b/object.c
@@ -59,9 +59,9 @@ int type_from_string_gently(const char *str, ssize_t len, int gentle)
  * the specified sha1.  n must be a power of 2.  Please note that the
  * return value is *not* consistent across computer architectures.
  */
-static unsigned int hash_obj(const unsigned char *sha1, unsigned int n)
+static unsigned int hash_obj(const struct object_id *oid, unsigned int n)
 {
-	return sha1hash(sha1) & (n - 1);
+	return sha1hash(oid->hash) & (n - 1);
 }
 
 /*
@@ -71,7 +71,7 @@ static unsigned int hash_obj(const unsigned char *sha1, unsigned int n)
  */
 static void insert_obj_hash(struct object *obj, struct object **hash, unsigned int size)
 {
-	unsigned int j = hash_obj(obj->oid.hash, size);
+	unsigned int j = hash_obj(&obj->oid, size);
 
 	while (hash[j]) {
 		j++;
@@ -93,7 +93,7 @@ struct object *lookup_object(struct repository *r, const struct object_id *oid)
 	if (!r->parsed_objects->obj_hash)
 		return NULL;
 
-	first = i = hash_obj(oid->hash, r->parsed_objects->obj_hash_size);
+	first = i = hash_obj(oid, r->parsed_objects->obj_hash_size);
 	while ((obj = r->parsed_objects->obj_hash[i]) != NULL) {
 		if (oideq(oid, &obj->oid))
 			break;
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 09/17] object: convert create_object() to use object_id
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (7 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 08/17] object: convert internal hash_obj() to object_id Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20 14:21       ` Ramsay Jones
  2019-06-20  7:41     ` [PATCH 10/17] khash: drop broken oid_map typedef Jeff King
                       ` (7 subsequent siblings)
  16 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

There are no callers left of lookup_object() that aren't just passing us
the "hash" member of a "struct object_id". Let's take the whole struct,
which gets us closer to removing all raw sha1 variables.

Signed-off-by: Jeff King <peff@peff.net>
---
 blob.c         | 3 +--
 commit-graph.c | 2 +-
 commit.c       | 3 +--
 object.c       | 6 +++---
 object.h       | 2 +-
 tag.c          | 3 +--
 tree.c         | 3 +--
 7 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/blob.c b/blob.c
index b9c7180b7c..36f9abda19 100644
--- a/blob.c
+++ b/blob.c
@@ -9,8 +9,7 @@ struct blob *lookup_blob(struct repository *r, const struct object_id *oid)
 {
 	struct object *obj = lookup_object(r, oid);
 	if (!obj)
-		return create_object(r, oid->hash,
-				     alloc_blob_node(r));
+		return create_object(r, oid, alloc_blob_node(r));
 	return object_as_type(r, obj, OBJ_BLOB, 0);
 }
 
diff --git a/commit-graph.c b/commit-graph.c
index 7c5e54875f..5a62131d68 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1214,7 +1214,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 		hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
 
 		graph_commit = lookup_commit(r, &cur_oid);
-		odb_commit = (struct commit *)create_object(r, cur_oid.hash, alloc_commit_node(r));
+		odb_commit = (struct commit *)create_object(r, &cur_oid, alloc_commit_node(r));
 		if (parse_commit_internal(odb_commit, 0, 0)) {
 			graph_report(_("failed to parse commit %s from object database for commit-graph"),
 				     oid_to_hex(&cur_oid));
diff --git a/commit.c b/commit.c
index f47c75afae..b71ac195d4 100644
--- a/commit.c
+++ b/commit.c
@@ -59,8 +59,7 @@ struct commit *lookup_commit(struct repository *r, const struct object_id *oid)
 {
 	struct object *obj = lookup_object(r, oid);
 	if (!obj)
-		return create_object(r, oid->hash,
-				     alloc_commit_node(r));
+		return create_object(r, oid, alloc_commit_node(r));
 	return object_as_type(r, obj, OBJ_COMMIT, 0);
 }
 
diff --git a/object.c b/object.c
index dbfdbe504d..317647da3e 100644
--- a/object.c
+++ b/object.c
@@ -141,13 +141,13 @@ static void grow_object_hash(struct repository *r)
 	r->parsed_objects->obj_hash_size = new_hash_size;
 }
 
-void *create_object(struct repository *r, const unsigned char *sha1, void *o)
+void *create_object(struct repository *r, const struct object_id *oid, void *o)
 {
 	struct object *obj = o;
 
 	obj->parsed = 0;
 	obj->flags = 0;
-	hashcpy(obj->oid.hash, sha1);
+	oidcpy(&obj->oid, oid);
 
 	if (r->parsed_objects->obj_hash_size - 1 <= r->parsed_objects->nr_objs * 2)
 		grow_object_hash(r);
@@ -182,7 +182,7 @@ struct object *lookup_unknown_object(const struct object_id *oid)
 {
 	struct object *obj = lookup_object(the_repository, oid);
 	if (!obj)
-		obj = create_object(the_repository, oid->hash,
+		obj = create_object(the_repository, oid,
 				    alloc_object_node(the_repository));
 	return obj;
 }
diff --git a/object.h b/object.h
index 47301186a4..0120892bbd 100644
--- a/object.h
+++ b/object.h
@@ -118,7 +118,7 @@ struct object *get_indexed_object(unsigned int);
  */
 struct object *lookup_object(struct repository *r, const struct object_id *oid);
 
-void *create_object(struct repository *r, const unsigned char *sha1, void *obj);
+void *create_object(struct repository *r, const struct object_id *oid, void *obj);
 
 void *object_as_type(struct repository *r, struct object *obj, enum object_type type, int quiet);
 
diff --git a/tag.c b/tag.c
index 3ae00ba1ab..5db870edb9 100644
--- a/tag.c
+++ b/tag.c
@@ -102,8 +102,7 @@ struct tag *lookup_tag(struct repository *r, const struct object_id *oid)
 {
 	struct object *obj = lookup_object(r, oid);
 	if (!obj)
-		return create_object(r, oid->hash,
-				     alloc_tag_node(r));
+		return create_object(r, oid, alloc_tag_node(r));
 	return object_as_type(r, obj, OBJ_TAG, 0);
 }
 
diff --git a/tree.c b/tree.c
index 0ebb8c5b02..4720945e6a 100644
--- a/tree.c
+++ b/tree.c
@@ -199,8 +199,7 @@ struct tree *lookup_tree(struct repository *r, const struct object_id *oid)
 {
 	struct object *obj = lookup_object(r, oid);
 	if (!obj)
-		return create_object(r, oid->hash,
-				     alloc_tree_node(r));
+		return create_object(r, oid, alloc_tree_node(r));
 	return object_as_type(r, obj, OBJ_TREE, 0);
 }
 
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 10/17] khash: drop broken oid_map typedef
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (8 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 09/17] object: convert create_object() to use object_id Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20  7:41     ` [PATCH 11/17] khash: rename kh_oid_t to kh_oid_set Jeff King
                       ` (6 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

Commit 5a8643eff1 (khash: move oid hash table definition, 2019-02-19)
added a khash "oid_map" type to match the existing "oid" type, which is
a simple set (i.e., just keys, no values). But in setting up the
khash_oid_map typedef, it accidentally referred to "kh_oid_t", which is
the set type.

Nobody noticed the breakage because there are not yet any callers; the
type was added just as a match to the existing sha1 types (whose map
type confusingly _is_ called khash_sha1, and it has no matching set
type).

We could easily fix this with s/oid/oid_map/ in the typedef. But let's
take this a step further, and just drop the typedef entirely.  These
typedefs were added by 5a8643eff1 to match the khash_sha1 typedefs. But
the actual khash-derived type names are descriptive enough; this is just
adding an extra layer of indirection. The khash names do not quite
follow our usual style (e.g., they end in "_t"), but since we end up
using other khash names (e.g., khiter_t, kh_get_oid()) anyway, just
typedef-ing the struct name is not really helping much.

And there are already many cases where we use the raw khash type names
anyway (e.g., the "set" variant defined just above us does not have such
a typedef!).

So let's drop this typedef, and the matching oid_pos one (which actually
_does_ have a user, but we can easily convert it).

We'll leave the khash_sha1 typedef around. The ultimate fate of its
callers should be conversion to kh_oid_map_t, so there's no point in
going through the noise of changing the names now.

Signed-off-by: Jeff King <peff@peff.net>
---
 khash.h       | 2 --
 pack-bitmap.c | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/khash.h b/khash.h
index af747a683f..64d4eeb2bd 100644
--- a/khash.h
+++ b/khash.h
@@ -345,9 +345,7 @@ static inline int oid_equal(struct object_id a, struct object_id b)
 KHASH_INIT(oid, struct object_id, int, 0, oid_hash, oid_equal)
 
 KHASH_INIT(oid_map, struct object_id, void *, 1, oid_hash, oid_equal)
-typedef kh_oid_t khash_oid_map;
 
 KHASH_INIT(oid_pos, struct object_id, int, 1, oid_hash, oid_equal)
-typedef kh_oid_pos_t khash_oid_pos;
 
 #endif /* __AC_KHASH_H */
diff --git a/pack-bitmap.c b/pack-bitmap.c
index ff1f07e249..998133588f 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -365,7 +365,7 @@ struct include_data {
 static inline int bitmap_position_extended(struct bitmap_index *bitmap_git,
 					   const struct object_id *oid)
 {
-	khash_oid_pos *positions = bitmap_git->ext_index.positions;
+	kh_oid_pos_t *positions = bitmap_git->ext_index.positions;
 	khiter_t pos = kh_get_oid_pos(positions, *oid);
 
 	if (pos < kh_end(positions)) {
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 11/17] khash: rename kh_oid_t to kh_oid_set
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (9 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 10/17] khash: drop broken oid_map typedef Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20  7:41     ` [PATCH 12/17] delta-islands: convert island_marks khash to use oids Jeff King
                       ` (5 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

khash lets us define a hash as either a map or a set (i.e., with no
"value" type). For the oid maps we define, "oid" is the set and
"oid_map" is the map. As the bug in the previous commit shows, it's easy
to pick the wrong one.

So let's make the names more distinct: "oid_set" and "oid_map".

An alternative naming scheme would be to actually name the type after
the key/value types. So e.g., "oid" _would_ be the set, since it has no
value type. And "oid_map" would become "oid_void" or similar (and
"oid_pos" becomes "oid_int"). That's better in some ways: it's more
regular, and a given map type can be more reasily reused in multiple
contexts (e.g., something storing an "int" that isn't a "pos"). But it's
also slightly less descriptive.

Signed-off-by: Jeff King <peff@peff.net>
---
 khash.h  |  2 +-
 oidset.c | 12 ++++++------
 oidset.h |  4 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/khash.h b/khash.h
index 64d4eeb2bd..ae9f78347f 100644
--- a/khash.h
+++ b/khash.h
@@ -342,7 +342,7 @@ static inline int oid_equal(struct object_id a, struct object_id b)
 	return oideq(&a, &b);
 }
 
-KHASH_INIT(oid, struct object_id, int, 0, oid_hash, oid_equal)
+KHASH_INIT(oid_set, struct object_id, int, 0, oid_hash, oid_equal)
 
 KHASH_INIT(oid_map, struct object_id, void *, 1, oid_hash, oid_equal)
 
diff --git a/oidset.c b/oidset.c
index fe4eb921df..8bdecb13de 100644
--- a/oidset.c
+++ b/oidset.c
@@ -5,33 +5,33 @@ void oidset_init(struct oidset *set, size_t initial_size)
 {
 	memset(&set->set, 0, sizeof(set->set));
 	if (initial_size)
-		kh_resize_oid(&set->set, initial_size);
+		kh_resize_oid_set(&set->set, initial_size);
 }
 
 int oidset_contains(const struct oidset *set, const struct object_id *oid)
 {
-	khiter_t pos = kh_get_oid(&set->set, *oid);
+	khiter_t pos = kh_get_oid_set(&set->set, *oid);
 	return pos != kh_end(&set->set);
 }
 
 int oidset_insert(struct oidset *set, const struct object_id *oid)
 {
 	int added;
-	kh_put_oid(&set->set, *oid, &added);
+	kh_put_oid_set(&set->set, *oid, &added);
 	return !added;
 }
 
 int oidset_remove(struct oidset *set, const struct object_id *oid)
 {
-	khiter_t pos = kh_get_oid(&set->set, *oid);
+	khiter_t pos = kh_get_oid_set(&set->set, *oid);
 	if (pos == kh_end(&set->set))
 		return 0;
-	kh_del_oid(&set->set, pos);
+	kh_del_oid_set(&set->set, pos);
 	return 1;
 }
 
 void oidset_clear(struct oidset *set)
 {
-	kh_release_oid(&set->set);
+	kh_release_oid_set(&set->set);
 	oidset_init(set, 0);
 }
diff --git a/oidset.h b/oidset.h
index 14f18f791f..505fad578b 100644
--- a/oidset.h
+++ b/oidset.h
@@ -20,7 +20,7 @@
  * A single oidset; should be zero-initialized (or use OIDSET_INIT).
  */
 struct oidset {
-	kh_oid_t set;
+	kh_oid_set_t set;
 };
 
 #define OIDSET_INIT { { 0 } }
@@ -62,7 +62,7 @@ int oidset_remove(struct oidset *set, const struct object_id *oid);
 void oidset_clear(struct oidset *set);
 
 struct oidset_iter {
-	kh_oid_t *set;
+	kh_oid_set_t *set;
 	khiter_t iter;
 };
 
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 12/17] delta-islands: convert island_marks khash to use oids
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (10 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 11/17] khash: rename kh_oid_t to kh_oid_set Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20 17:38       ` Jonathan Tan
  2019-06-20  7:41     ` [PATCH 13/17] pack-bitmap: convert khash_sha1 maps into kh_oid_map Jeff King
                       ` (4 subsequent siblings)
  16 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

All of the users of this map have an actual "struct object_id" rather
than a bare sha1. Let's use the more descriptive type (and get one step
closer to dropping khash_sha1 entirely).

Signed-off-by: Jeff King <peff@peff.net>
---
 delta-islands.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/delta-islands.c b/delta-islands.c
index 5f3ab914f5..88d102298c 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -22,7 +22,7 @@
 
 KHASH_INIT(str, const char *, void *, 1, kh_str_hash_func, kh_str_hash_equal)
 
-static khash_sha1 *island_marks;
+static kh_oid_map_t *island_marks;
 static unsigned island_counter;
 static unsigned island_counter_core;
 
@@ -105,7 +105,7 @@ int in_same_island(const struct object_id *trg_oid, const struct object_id *src_
 	 * If we don't have a bitmap for the target, we can delta it
 	 * against anything -- it's not an important object
 	 */
-	trg_pos = kh_get_sha1(island_marks, trg_oid->hash);
+	trg_pos = kh_get_oid_map(island_marks, *trg_oid);
 	if (trg_pos >= kh_end(island_marks))
 		return 1;
 
@@ -113,7 +113,7 @@ int in_same_island(const struct object_id *trg_oid, const struct object_id *src_
 	 * if the source (our delta base) doesn't have a bitmap,
 	 * we don't want to base any deltas on it!
 	 */
-	src_pos = kh_get_sha1(island_marks, src_oid->hash);
+	src_pos = kh_get_oid_map(island_marks, *src_oid);
 	if (src_pos >= kh_end(island_marks))
 		return 0;
 
@@ -129,11 +129,11 @@ int island_delta_cmp(const struct object_id *a, const struct object_id *b)
 	if (!island_marks)
 		return 0;
 
-	a_pos = kh_get_sha1(island_marks, a->hash);
+	a_pos = kh_get_oid_map(island_marks, *a);
 	if (a_pos < kh_end(island_marks))
 		a_bitmap = kh_value(island_marks, a_pos);
 
-	b_pos = kh_get_sha1(island_marks, b->hash);
+	b_pos = kh_get_oid_map(island_marks, *b);
 	if (b_pos < kh_end(island_marks))
 		b_bitmap = kh_value(island_marks, b_pos);
 
@@ -154,7 +154,7 @@ static struct island_bitmap *create_or_get_island_marks(struct object *obj)
 	khiter_t pos;
 	int hash_ret;
 
-	pos = kh_put_sha1(island_marks, obj->oid.hash, &hash_ret);
+	pos = kh_put_oid_map(island_marks, obj->oid, &hash_ret);
 	if (hash_ret)
 		kh_value(island_marks, pos) = island_bitmap_new(NULL);
 
@@ -167,7 +167,7 @@ static void set_island_marks(struct object *obj, struct island_bitmap *marks)
 	khiter_t pos;
 	int hash_ret;
 
-	pos = kh_put_sha1(island_marks, obj->oid.hash, &hash_ret);
+	pos = kh_put_oid_map(island_marks, obj->oid, &hash_ret);
 	if (hash_ret) {
 		/*
 		 * We don't have one yet; make a copy-on-write of the
@@ -279,7 +279,7 @@ void resolve_tree_islands(struct repository *r,
 		struct name_entry entry;
 		khiter_t pos;
 
-		pos = kh_get_sha1(island_marks, ent->idx.oid.hash);
+		pos = kh_get_oid_map(island_marks, ent->idx.oid);
 		if (pos >= kh_end(island_marks))
 			continue;
 
@@ -456,7 +456,7 @@ static void deduplicate_islands(struct repository *r)
 
 void load_delta_islands(struct repository *r)
 {
-	island_marks = kh_init_sha1();
+	island_marks = kh_init_oid_map();
 	remote_islands = kh_init_str();
 
 	git_config(island_config_callback, NULL);
@@ -468,7 +468,7 @@ void load_delta_islands(struct repository *r)
 
 void propagate_island_marks(struct commit *commit)
 {
-	khiter_t pos = kh_get_sha1(island_marks, commit->object.oid.hash);
+	khiter_t pos = kh_get_oid_map(island_marks, commit->object.oid);
 
 	if (pos < kh_end(island_marks)) {
 		struct commit_list *p;
@@ -490,7 +490,7 @@ int compute_pack_layers(struct packing_data *to_pack)
 
 	for (i = 0; i < to_pack->nr_objects; ++i) {
 		struct object_entry *entry = &to_pack->objects[i];
-		khiter_t pos = kh_get_sha1(island_marks, entry->idx.oid.hash);
+		khiter_t pos = kh_get_oid_map(island_marks, entry->idx.oid);
 
 		oe_set_layer(to_pack, entry, 1);
 
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 13/17] pack-bitmap: convert khash_sha1 maps into kh_oid_map
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (11 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 12/17] delta-islands: convert island_marks khash to use oids Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20  7:41     ` [PATCH 14/17] khash: drop sha1-specific map types Jeff King
                       ` (3 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

All of the users of our khash_sha1 maps actually have a "struct
object_id". Let's use the more descriptive type.

Signed-off-by: Jeff King <peff@peff.net>
---
 pack-bitmap-write.c | 14 +++++++-------
 pack-bitmap.c       |  8 ++++----
 pack-bitmap.h       |  2 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 0637378533..fa78a460c9 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -28,8 +28,8 @@ struct bitmap_writer {
 	struct ewah_bitmap *blobs;
 	struct ewah_bitmap *tags;
 
-	khash_sha1 *bitmaps;
-	khash_sha1 *reused;
+	kh_oid_map_t *bitmaps;
+	kh_oid_map_t *reused;
 	struct packing_data *to_pack;
 
 	struct bitmapped_commit *selected;
@@ -175,7 +175,7 @@ add_to_include_set(struct bitmap *base, struct commit *commit)
 	if (bitmap_get(base, bitmap_pos))
 		return 0;
 
-	hash_pos = kh_get_sha1(writer.bitmaps, commit->object.oid.hash);
+	hash_pos = kh_get_oid_map(writer.bitmaps, commit->object.oid);
 	if (hash_pos < kh_end(writer.bitmaps)) {
 		struct bitmapped_commit *bc = kh_value(writer.bitmaps, hash_pos);
 		bitmap_or_ewah(base, bc->bitmap);
@@ -256,7 +256,7 @@ void bitmap_writer_build(struct packing_data *to_pack)
 	struct bitmap *base = bitmap_new();
 	struct rev_info revs;
 
-	writer.bitmaps = kh_init_sha1();
+	writer.bitmaps = kh_init_oid_map();
 	writer.to_pack = to_pack;
 
 	if (writer.show_progress)
@@ -311,7 +311,7 @@ void bitmap_writer_build(struct packing_data *to_pack)
 		if (i >= reuse_after)
 			stored->flags |= BITMAP_FLAG_REUSE;
 
-		hash_pos = kh_put_sha1(writer.bitmaps, object->oid.hash, &hash_ret);
+		hash_pos = kh_put_oid_map(writer.bitmaps, object->oid, &hash_ret);
 		if (hash_ret == 0)
 			die("Duplicate entry when writing index: %s",
 			    oid_to_hex(&object->oid));
@@ -366,7 +366,7 @@ void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack)
 	if (!(bitmap_git = prepare_bitmap_git(to_pack->repo)))
 		return;
 
-	writer.reused = kh_init_sha1();
+	writer.reused = kh_init_oid_map();
 	rebuild_existing_bitmaps(bitmap_git, to_pack, writer.reused,
 				 writer.show_progress);
 	/*
@@ -382,7 +382,7 @@ static struct ewah_bitmap *find_reused_bitmap(const struct object_id *oid)
 	if (!writer.reused)
 		return NULL;
 
-	hash_pos = kh_get_sha1(writer.reused, oid->hash);
+	hash_pos = kh_get_oid_map(writer.reused, *oid);
 	if (hash_pos >= kh_end(writer.reused))
 		return NULL;
 
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 998133588f..ed2befaac6 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1041,7 +1041,7 @@ static int rebuild_bitmap(uint32_t *reposition,
 
 int rebuild_existing_bitmaps(struct bitmap_index *bitmap_git,
 			     struct packing_data *mapping,
-			     khash_sha1 *reused_bitmaps,
+			     kh_oid_map_t *reused_bitmaps,
 			     int show_progress)
 {
 	uint32_t i, num_objects;
@@ -1080,9 +1080,9 @@ int rebuild_existing_bitmaps(struct bitmap_index *bitmap_git,
 			if (!rebuild_bitmap(reposition,
 					    lookup_stored_bitmap(stored),
 					    rebuild)) {
-				hash_pos = kh_put_sha1(reused_bitmaps,
-						       stored->oid.hash,
-						       &hash_ret);
+				hash_pos = kh_put_oid_map(reused_bitmaps,
+							  stored->oid,
+							  &hash_ret);
 				kh_value(reused_bitmaps, hash_pos) =
 					bitmap_to_ewah(rebuild);
 			}
diff --git a/pack-bitmap.h b/pack-bitmap.h
index ee9792264c..00de3ec8e4 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -51,7 +51,7 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *,
 				       struct packed_git **packfile,
 				       uint32_t *entries, off_t *up_to);
 int rebuild_existing_bitmaps(struct bitmap_index *, struct packing_data *mapping,
-			     khash_sha1 *reused_bitmaps, int show_progress);
+			     kh_oid_map_t *reused_bitmaps, int show_progress);
 void free_bitmap_index(struct bitmap_index *);
 
 /*
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 14/17] khash: drop sha1-specific map types
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (12 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 13/17] pack-bitmap: convert khash_sha1 maps into kh_oid_map Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20  7:41     ` [PATCH 15/17] khash: rename oid helper functions Jeff King
                       ` (2 subsequent siblings)
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

All of the callers of khash_sha1 and khash_sha1_pos have been removed,
in favor of using maps that use "struct object_id" as their keys. Let's
drop these now-obsolete types.

Signed-off-by: Jeff King <peff@peff.net>
---
 khash.h | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/khash.h b/khash.h
index ae9f78347f..cb2cd3d7e4 100644
--- a/khash.h
+++ b/khash.h
@@ -324,14 +324,6 @@ static const double __ac_HASH_UPPER = 0.77;
 		code;												\
 	} }
 
-#define __kh_oid_cmp(a, b) (hashcmp(a, b) == 0)
-
-KHASH_INIT(sha1, const unsigned char *, void *, 1, sha1hash, __kh_oid_cmp)
-typedef kh_sha1_t khash_sha1;
-
-KHASH_INIT(sha1_pos, const unsigned char *, int, 1, sha1hash, __kh_oid_cmp)
-typedef kh_sha1_pos_t khash_sha1_pos;
-
 static inline unsigned int oid_hash(struct object_id oid)
 {
 	return sha1hash(oid.hash);
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 15/17] khash: rename oid helper functions
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (13 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 14/17] khash: drop sha1-specific map types Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20 17:44       ` Junio C Hamano
  2019-06-20  7:41     ` [PATCH 16/17] hash.h: move object_id definition from cache.h Jeff King
  2019-06-20  7:41     ` [PATCH 17/17] hashmap: convert sha1hash() to oidhash() Jeff King
  16 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

For use in object_id hash tables, we have oid_hash() and oid_equal().
But these are confusingly similar to the existing oideq() and the
oidhash() we plan to add to replace sha1hash().

The big difference from those functions is that rather than accepting a
const pointer to the "struct object_id", we take the arguments by value
(which is a khash internal convention). So let's make that obvious by
calling them oidhash_by_value() and oideq_by_value().

Those names are fairly horrendous to type, but we rarely need to do so;
they are passed to the khash implementation macro and then only used
internally. Callers get to use the nice kh_put_oid_map(), etc.

Signed-off-by: Jeff King <peff@peff.net>
---
 khash.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/khash.h b/khash.h
index cb2cd3d7e4..f911d2b005 100644
--- a/khash.h
+++ b/khash.h
@@ -324,20 +324,20 @@ static const double __ac_HASH_UPPER = 0.77;
 		code;												\
 	} }
 
-static inline unsigned int oid_hash(struct object_id oid)
+static inline unsigned int oidhash_by_value(struct object_id oid)
 {
 	return sha1hash(oid.hash);
 }
 
-static inline int oid_equal(struct object_id a, struct object_id b)
+static inline int oideq_by_value(struct object_id a, struct object_id b)
 {
 	return oideq(&a, &b);
 }
 
-KHASH_INIT(oid_set, struct object_id, int, 0, oid_hash, oid_equal)
+KHASH_INIT(oid_set, struct object_id, int, 0, oidhash_by_value, oideq_by_value)
 
-KHASH_INIT(oid_map, struct object_id, void *, 1, oid_hash, oid_equal)
+KHASH_INIT(oid_map, struct object_id, void *, 1, oidhash_by_value, oideq_by_value)
 
-KHASH_INIT(oid_pos, struct object_id, int, 1, oid_hash, oid_equal)
+KHASH_INIT(oid_pos, struct object_id, int, 1, oidhash_by_value, oideq_by_value)
 
 #endif /* __AC_KHASH_H */
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 16/17] hash.h: move object_id definition from cache.h
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (14 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 15/17] khash: rename oid helper functions Jeff King
@ 2019-06-20  7:41     ` Jeff King
  2019-06-20 17:41       ` Junio C Hamano
  2019-06-20  7:41     ` [PATCH 17/17] hashmap: convert sha1hash() to oidhash() Jeff King
  16 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

Our hashmap.h helpfully defines a sha1hash() function. But it cannot
define a similar oidhash() without including all of cache.h, which
itself wants to include hashmap.h! Let's break this circular dependency
by moving the definition to hash.h, along with the remaining RAWSZ
macros, etc. That will put them with the existing git_hash_algo
definition.

One alternative would be to move oidhash() into cache.h, but it's
already quite bloated. We're better off moving things out than in.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h | 24 ------------------------
 hash.h  | 24 ++++++++++++++++++++++++
 2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/cache.h b/cache.h
index bf20337ef4..37e0b82064 100644
--- a/cache.h
+++ b/cache.h
@@ -43,30 +43,6 @@ int git_deflate_end_gently(git_zstream *);
 int git_deflate(git_zstream *, int flush);
 unsigned long git_deflate_bound(git_zstream *, unsigned long);
 
-/* The length in bytes and in hex digits of an object name (SHA-1 value). */
-#define GIT_SHA1_RAWSZ 20
-#define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)
-/* The block size of SHA-1. */
-#define GIT_SHA1_BLKSZ 64
-
-/* The length in bytes and in hex digits of an object name (SHA-256 value). */
-#define GIT_SHA256_RAWSZ 32
-#define GIT_SHA256_HEXSZ (2 * GIT_SHA256_RAWSZ)
-/* The block size of SHA-256. */
-#define GIT_SHA256_BLKSZ 64
-
-/* The length in byte and in hex digits of the largest possible hash value. */
-#define GIT_MAX_RAWSZ GIT_SHA256_RAWSZ
-#define GIT_MAX_HEXSZ GIT_SHA256_HEXSZ
-/* The largest possible block size for any supported hash. */
-#define GIT_MAX_BLKSZ GIT_SHA256_BLKSZ
-
-struct object_id {
-	unsigned char hash[GIT_MAX_RAWSZ];
-};
-
-#define the_hash_algo the_repository->hash_algo
-
 #if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT)
 #define DTYPE(de)	((de)->d_type)
 #else
diff --git a/hash.h b/hash.h
index 661c9f2281..52a4f1a3f4 100644
--- a/hash.h
+++ b/hash.h
@@ -139,4 +139,28 @@ static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
 	return p - hash_algos;
 }
 
+/* The length in bytes and in hex digits of an object name (SHA-1 value). */
+#define GIT_SHA1_RAWSZ 20
+#define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)
+/* The block size of SHA-1. */
+#define GIT_SHA1_BLKSZ 64
+
+/* The length in bytes and in hex digits of an object name (SHA-256 value). */
+#define GIT_SHA256_RAWSZ 32
+#define GIT_SHA256_HEXSZ (2 * GIT_SHA256_RAWSZ)
+/* The block size of SHA-256. */
+#define GIT_SHA256_BLKSZ 64
+
+/* The length in byte and in hex digits of the largest possible hash value. */
+#define GIT_MAX_RAWSZ GIT_SHA256_RAWSZ
+#define GIT_MAX_HEXSZ GIT_SHA256_HEXSZ
+/* The largest possible block size for any supported hash. */
+#define GIT_MAX_BLKSZ GIT_SHA256_BLKSZ
+
+struct object_id {
+	unsigned char hash[GIT_MAX_RAWSZ];
+};
+
+#define the_hash_algo the_repository->hash_algo
+
 #endif
-- 
2.22.0.732.g5402924b4b


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 17/17] hashmap: convert sha1hash() to oidhash()
  2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
                       ` (15 preceding siblings ...)
  2019-06-20  7:41     ` [PATCH 16/17] hash.h: move object_id definition from cache.h Jeff King
@ 2019-06-20  7:41     ` Jeff King
  16 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20  7:41 UTC (permalink / raw)
  To: Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

There are no callers left of sha1hash() that do not simply pass the
"hash" member of a "struct object_id". Let's get rid of the outdated
sha1-specific function and provide one that operates on the whole struct
(even though the technique, taking the first few bytes of the hash, will
remain the same).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/describe.c | 4 ++--
 decorate.c         | 2 +-
 diffcore-rename.c  | 2 +-
 hashmap.h          | 8 +++++---
 khash.h            | 2 +-
 object.c           | 2 +-
 pack-objects.c     | 2 +-
 patch-ids.c        | 2 +-
 8 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 0a5cde00a2..200154297d 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -76,7 +76,7 @@ static int commit_name_neq(const void *unused_cmp_data,
 
 static inline struct commit_name *find_commit_name(const struct object_id *peeled)
 {
-	return hashmap_get_from_hash(&names, sha1hash(peeled->hash), peeled);
+	return hashmap_get_from_hash(&names, oidhash(peeled), peeled);
 }
 
 static int replace_name(struct commit_name *e,
@@ -123,7 +123,7 @@ static void add_to_known_names(const char *path,
 		if (!e) {
 			e = xmalloc(sizeof(struct commit_name));
 			oidcpy(&e->peeled, peeled);
-			hashmap_entry_init(e, sha1hash(peeled->hash));
+			hashmap_entry_init(e, oidhash(peeled));
 			hashmap_add(&names, e);
 			e->path = NULL;
 		}
diff --git a/decorate.c b/decorate.c
index de31331fa4..a605b1b5f4 100644
--- a/decorate.c
+++ b/decorate.c
@@ -8,7 +8,7 @@
 
 static unsigned int hash_obj(const struct object *obj, unsigned int n)
 {
-	return sha1hash(obj->oid.hash) % n;
+	return oidhash(&obj->oid) % n;
 }
 
 static void *insert_decoration(struct decoration *n, const struct object *base, void *decoration)
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 07bd34b631..1e50d491c1 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -266,7 +266,7 @@ static unsigned int hash_filespec(struct repository *r,
 		hash_object_file(filespec->data, filespec->size, "blob",
 				 &filespec->oid);
 	}
-	return sha1hash(filespec->oid.hash);
+	return oidhash(&filespec->oid);
 }
 
 static int find_identical_files(struct hashmap *srcs,
diff --git a/hashmap.h b/hashmap.h
index f95593b6cf..8424911566 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -1,6 +1,8 @@
 #ifndef HASHMAP_H
 #define HASHMAP_H
 
+#include "hash.h"
+
 /*
  * Generic implementation of hash-based key-value mappings.
  *
@@ -118,14 +120,14 @@ unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len);
  * the results will be different on big-endian and little-endian
  * platforms, so they should not be stored or transferred over the net.
  */
-static inline unsigned int sha1hash(const unsigned char *sha1)
+static inline unsigned int oidhash(const struct object_id *oid)
 {
 	/*
-	 * Equivalent to 'return *(unsigned int *)sha1;', but safe on
+	 * Equivalent to 'return *(unsigned int *)oid->hash;', but safe on
 	 * platforms that don't support unaligned reads.
 	 */
 	unsigned int hash;
-	memcpy(&hash, sha1, sizeof(hash));
+	memcpy(&hash, oid->hash, sizeof(hash));
 	return hash;
 }
 
diff --git a/khash.h b/khash.h
index f911d2b005..21c2095216 100644
--- a/khash.h
+++ b/khash.h
@@ -326,7 +326,7 @@ static const double __ac_HASH_UPPER = 0.77;
 
 static inline unsigned int oidhash_by_value(struct object_id oid)
 {
-	return sha1hash(oid.hash);
+	return oidhash(&oid);
 }
 
 static inline int oideq_by_value(struct object_id a, struct object_id b)
diff --git a/object.c b/object.c
index 317647da3e..94db02214a 100644
--- a/object.c
+++ b/object.c
@@ -61,7 +61,7 @@ int type_from_string_gently(const char *str, ssize_t len, int gentle)
  */
 static unsigned int hash_obj(const struct object_id *oid, unsigned int n)
 {
-	return sha1hash(oid->hash) & (n - 1);
+	return oidhash(oid) & (n - 1);
 }
 
 /*
diff --git a/pack-objects.c b/pack-objects.c
index 00a5f6e0ec..52560293b6 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -11,7 +11,7 @@ static uint32_t locate_object_entry_hash(struct packing_data *pdata,
 {
 	uint32_t i, mask = (pdata->index_size - 1);
 
-	i = sha1hash(oid->hash) & mask;
+	i = oidhash(oid) & mask;
 
 	while (pdata->index[i] > 0) {
 		uint32_t pos = pdata->index[i] - 1;
diff --git a/patch-ids.c b/patch-ids.c
index f70d396654..e8c150d0c9 100644
--- a/patch-ids.c
+++ b/patch-ids.c
@@ -83,7 +83,7 @@ static int init_patch_id_entry(struct patch_id *patch,
 	if (commit_patch_id(commit, &ids->diffopts, &header_only_patch_id, 1, 0))
 		return -1;
 
-	hashmap_entry_init(patch, sha1hash(header_only_patch_id.hash));
+	hashmap_entry_init(patch, oidhash(&header_only_patch_id));
 	return 0;
 }
 
-- 
2.22.0.732.g5402924b4b

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 09/17] object: convert create_object() to use object_id
  2019-06-20  7:41     ` [PATCH 09/17] object: convert create_object() to use object_id Jeff King
@ 2019-06-20 14:21       ` Ramsay Jones
  2019-06-20 18:23         ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Ramsay Jones @ 2019-06-20 14:21 UTC (permalink / raw)
  To: Jeff King, Christian Couder
  Cc: git, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder



On 20/06/2019 08:41, Jeff King wrote:
> There are no callers left of lookup_object() that aren't just passing us

s/lookup_object/create_object/

ATB,
Ramsay Jones

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/17] describe: fix accidental oid/hash type-punning
  2019-06-20  7:40     ` [PATCH 01/17] describe: fix accidental oid/hash type-punning Jeff King
@ 2019-06-20 16:32       ` Junio C Hamano
  2019-06-20 18:25         ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2019-06-20 16:32 UTC (permalink / raw)
  To: Jeff King
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

Jeff King <peff@peff.net> writes:

> The find_commit_name() function passes an object_id.hash as the key of a
> hashmap. That ends up in commit_name_neq(), which then feeds it to
> oideq(). Which means we should actually be the whole "struct object_id".
>
> It works anyway because pointers to the two are interchangeable. And
> because we're going through a layer of void pointers, the compiler
> doesn't notice the type mismatch.

Wow.  Good eyes.  I wouldn't have noticed this (and for the reasons
you stated, it is very tricky for any clever compiler to notice it).

Impressed.

> But it's worth cleaning up (especially since once we switch away from
> sha1hash() on the same line, accessing the hash member will look doubly
> out of place).

Yup.  Thanks.

>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  builtin/describe.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/describe.c b/builtin/describe.c
> index 1409cedce2..0a5cde00a2 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -76,7 +76,7 @@ static int commit_name_neq(const void *unused_cmp_data,
>  
>  static inline struct commit_name *find_commit_name(const struct object_id *peeled)
>  {
> -	return hashmap_get_from_hash(&names, sha1hash(peeled->hash), peeled->hash);
> +	return hashmap_get_from_hash(&names, sha1hash(peeled->hash), peeled);
>  }
>  
>  static int replace_name(struct commit_name *e,

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/17] delta-islands: convert island_marks khash to use oids
  2019-06-20  7:41     ` [PATCH 12/17] delta-islands: convert island_marks khash to use oids Jeff King
@ 2019-06-20 17:38       ` Jonathan Tan
  2019-06-20 18:29         ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Jonathan Tan @ 2019-06-20 17:38 UTC (permalink / raw)
  To: peff; +Cc: git, Jonathan Tan

> @@ -105,7 +105,7 @@ int in_same_island(const struct object_id *trg_oid, const struct object_id *src_
>  	 * If we don't have a bitmap for the target, we can delta it
>  	 * against anything -- it's not an important object
>  	 */
> -	trg_pos = kh_get_sha1(island_marks, trg_oid->hash);
> +	trg_pos = kh_get_oid_map(island_marks, *trg_oid);

[snip]

> @@ -154,7 +154,7 @@ static struct island_bitmap *create_or_get_island_marks(struct object *obj)
>  	khiter_t pos;
>  	int hash_ret;
>  
> -	pos = kh_put_sha1(island_marks, obj->oid.hash, &hash_ret);
> +	pos = kh_put_oid_map(island_marks, obj->oid, &hash_ret);

Thanks for doing this cleanup. The entire series (17 patches) look good
to me. The only remotely surprising thing to me was that OIDs are passed
by value on the stack, both for kh_get_oid_map() and kh_put_oid_map(),
but I see that this is how things currently work (and anyway, changing
this is beyond the scope of this patch set).

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 16/17] hash.h: move object_id definition from cache.h
  2019-06-20  7:41     ` [PATCH 16/17] hash.h: move object_id definition from cache.h Jeff King
@ 2019-06-20 17:41       ` Junio C Hamano
  0 siblings, 0 replies; 37+ messages in thread
From: Junio C Hamano @ 2019-06-20 17:41 UTC (permalink / raw)
  To: Jeff King
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

Jeff King <peff@peff.net> writes:

> Our hashmap.h helpfully defines a sha1hash() function. But it cannot
> define a similar oidhash() without including all of cache.h, which
> itself wants to include hashmap.h! Let's break this circular dependency
> by moving the definition to hash.h, along with the remaining RAWSZ
> macros, etc. That will put them with the existing git_hash_algo
> definition.
>
> One alternative would be to move oidhash() into cache.h, but it's
> already quite bloated. We're better off moving things out than in.

Makes sense.

> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  cache.h | 24 ------------------------
>  hash.h  | 24 ++++++++++++++++++++++++
>  2 files changed, 24 insertions(+), 24 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index bf20337ef4..37e0b82064 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -43,30 +43,6 @@ int git_deflate_end_gently(git_zstream *);
>  int git_deflate(git_zstream *, int flush);
>  unsigned long git_deflate_bound(git_zstream *, unsigned long);
>  
> -/* The length in bytes and in hex digits of an object name (SHA-1 value). */
> -#define GIT_SHA1_RAWSZ 20
> -#define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)
> -/* The block size of SHA-1. */
> -#define GIT_SHA1_BLKSZ 64
> -
> -/* The length in bytes and in hex digits of an object name (SHA-256 value). */
> -#define GIT_SHA256_RAWSZ 32
> -#define GIT_SHA256_HEXSZ (2 * GIT_SHA256_RAWSZ)
> -/* The block size of SHA-256. */
> -#define GIT_SHA256_BLKSZ 64
> -
> -/* The length in byte and in hex digits of the largest possible hash value. */
> -#define GIT_MAX_RAWSZ GIT_SHA256_RAWSZ
> -#define GIT_MAX_HEXSZ GIT_SHA256_HEXSZ
> -/* The largest possible block size for any supported hash. */
> -#define GIT_MAX_BLKSZ GIT_SHA256_BLKSZ
> -
> -struct object_id {
> -	unsigned char hash[GIT_MAX_RAWSZ];
> -};
> -
> -#define the_hash_algo the_repository->hash_algo
> -
>  #if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT)
>  #define DTYPE(de)	((de)->d_type)
>  #else
> diff --git a/hash.h b/hash.h
> index 661c9f2281..52a4f1a3f4 100644
> --- a/hash.h
> +++ b/hash.h
> @@ -139,4 +139,28 @@ static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
>  	return p - hash_algos;
>  }
>  
> +/* The length in bytes and in hex digits of an object name (SHA-1 value). */
> +#define GIT_SHA1_RAWSZ 20
> +#define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)
> +/* The block size of SHA-1. */
> +#define GIT_SHA1_BLKSZ 64
> +
> +/* The length in bytes and in hex digits of an object name (SHA-256 value). */
> +#define GIT_SHA256_RAWSZ 32
> +#define GIT_SHA256_HEXSZ (2 * GIT_SHA256_RAWSZ)
> +/* The block size of SHA-256. */
> +#define GIT_SHA256_BLKSZ 64
> +
> +/* The length in byte and in hex digits of the largest possible hash value. */
> +#define GIT_MAX_RAWSZ GIT_SHA256_RAWSZ
> +#define GIT_MAX_HEXSZ GIT_SHA256_HEXSZ
> +/* The largest possible block size for any supported hash. */
> +#define GIT_MAX_BLKSZ GIT_SHA256_BLKSZ
> +
> +struct object_id {
> +	unsigned char hash[GIT_MAX_RAWSZ];
> +};
> +
> +#define the_hash_algo the_repository->hash_algo
> +
>  #endif

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 15/17] khash: rename oid helper functions
  2019-06-20  7:41     ` [PATCH 15/17] khash: rename oid helper functions Jeff King
@ 2019-06-20 17:44       ` Junio C Hamano
  2019-06-20 18:27         ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2019-06-20 17:44 UTC (permalink / raw)
  To: Jeff King
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

Jeff King <peff@peff.net> writes:

> For use in object_id hash tables, we have oid_hash() and oid_equal().
> But these are confusingly similar to the existing oideq() and the
> oidhash() we plan to add to replace sha1hash().
>
> The big difference from those functions is that rather than accepting a
> const pointer to the "struct object_id", we take the arguments by value
> (which is a khash internal convention). So let's make that obvious by
> calling them oidhash_by_value() and oideq_by_value().
>
> Those names are fairly horrendous to type, but we rarely need to do so;
> they are passed to the khash implementation macro and then only used
> internally. Callers get to use the nice kh_put_oid_map(), etc.

The pass-by-value interface feels a bit unforunate but hopefully
"static inline" would help us avoid actually copying the struct left
and right as we make calls to them X-<.


> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  khash.h | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/khash.h b/khash.h
> index cb2cd3d7e4..f911d2b005 100644
> --- a/khash.h
> +++ b/khash.h
> @@ -324,20 +324,20 @@ static const double __ac_HASH_UPPER = 0.77;
>  		code;												\
>  	} }
>  
> -static inline unsigned int oid_hash(struct object_id oid)
> +static inline unsigned int oidhash_by_value(struct object_id oid)
>  {
>  	return sha1hash(oid.hash);
>  }
>  
> -static inline int oid_equal(struct object_id a, struct object_id b)
> +static inline int oideq_by_value(struct object_id a, struct object_id b)
>  {
>  	return oideq(&a, &b);
>  }
>  
> -KHASH_INIT(oid_set, struct object_id, int, 0, oid_hash, oid_equal)
> +KHASH_INIT(oid_set, struct object_id, int, 0, oidhash_by_value, oideq_by_value)
>  
> -KHASH_INIT(oid_map, struct object_id, void *, 1, oid_hash, oid_equal)
> +KHASH_INIT(oid_map, struct object_id, void *, 1, oidhash_by_value, oideq_by_value)
>  
> -KHASH_INIT(oid_pos, struct object_id, int, 1, oid_hash, oid_equal)
> +KHASH_INIT(oid_pos, struct object_id, int, 1, oidhash_by_value, oideq_by_value)
>  
>  #endif /* __AC_KHASH_H */

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 09/17] object: convert create_object() to use object_id
  2019-06-20 14:21       ` Ramsay Jones
@ 2019-06-20 18:23         ` Jeff King
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20 18:23 UTC (permalink / raw)
  To: Ramsay Jones
  Cc: Christian Couder, git, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Jonathan Tan,
	SZEDER Gábor, Christian Couder

On Thu, Jun 20, 2019 at 03:21:43PM +0100, Ramsay Jones wrote:

> 
> 
> On 20/06/2019 08:41, Jeff King wrote:
> > There are no callers left of lookup_object() that aren't just passing us
> 
> s/lookup_object/create_object/

Heh, thanks. I took a final pass at these commit messages to normalize
them, since they are all basically doing the same thing, and obviously I
screwed up a bit of cut-and-paste.

(Normally I'd say that is a sign they should be lumped together, but the
diffs are so noisy due to the caller updates that I think it was worth
keeping them split).

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/17] describe: fix accidental oid/hash type-punning
  2019-06-20 16:32       ` Junio C Hamano
@ 2019-06-20 18:25         ` Jeff King
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20 18:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

On Thu, Jun 20, 2019 at 09:32:49AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > The find_commit_name() function passes an object_id.hash as the key of a
> > hashmap. That ends up in commit_name_neq(), which then feeds it to
> > oideq(). Which means we should actually be the whole "struct object_id".
> >
> > It works anyway because pointers to the two are interchangeable. And
> > because we're going through a layer of void pointers, the compiler
> > doesn't notice the type mismatch.
> 
> Wow.  Good eyes.  I wouldn't have noticed this (and for the reasons
> you stated, it is very tricky for any clever compiler to notice it).
> 
> Impressed.

It only looks impressive in retrospect. It became very obvious when
updating the reference to sha1hash(peeled->hash) in the same line, and
then scratching my head about why this one was looking at the hash
member. But history rewriting lets me re-order it to make myself look
good. Thanks, "rebase -i"! :)

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 15/17] khash: rename oid helper functions
  2019-06-20 17:44       ` Junio C Hamano
@ 2019-06-20 18:27         ` Jeff King
  2019-06-23 16:00           ` René Scharfe
  0 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2019-06-20 18:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

On Thu, Jun 20, 2019 at 10:44:17AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > For use in object_id hash tables, we have oid_hash() and oid_equal().
> > But these are confusingly similar to the existing oideq() and the
> > oidhash() we plan to add to replace sha1hash().
> >
> > The big difference from those functions is that rather than accepting a
> > const pointer to the "struct object_id", we take the arguments by value
> > (which is a khash internal convention). So let's make that obvious by
> > calling them oidhash_by_value() and oideq_by_value().
> >
> > Those names are fairly horrendous to type, but we rarely need to do so;
> > they are passed to the khash implementation macro and then only used
> > internally. Callers get to use the nice kh_put_oid_map(), etc.
> 
> The pass-by-value interface feels a bit unforunate but hopefully
> "static inline" would help us avoid actually copying the struct left
> and right as we make calls to them X-<.

Yeah, exactly. I think it should end up quite fast, though if anybody
(René?) wants to try tweaking khash and timing it, be my guest.

I do think if it took the more usual pass-by-const-pointer we'd probably
still end up with the exact same code from an optimizing compiler, since
all of the references and dereferences would cancel out once it was
inlined.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/17] delta-islands: convert island_marks khash to use oids
  2019-06-20 17:38       ` Jonathan Tan
@ 2019-06-20 18:29         ` Jeff King
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-20 18:29 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Thu, Jun 20, 2019 at 10:38:23AM -0700, Jonathan Tan wrote:

> > @@ -154,7 +154,7 @@ static struct island_bitmap *create_or_get_island_marks(struct object *obj)
> >  	khiter_t pos;
> >  	int hash_ret;
> >  
> > -	pos = kh_put_sha1(island_marks, obj->oid.hash, &hash_ret);
> > +	pos = kh_put_oid_map(island_marks, obj->oid, &hash_ret);
> 
> Thanks for doing this cleanup. The entire series (17 patches) look good
> to me. The only remotely surprising thing to me was that OIDs are passed
> by value on the stack, both for kh_get_oid_map() and kh_put_oid_map(),
> but I see that this is how things currently work (and anyway, changing
> this is beyond the scope of this patch set).

Thanks for reviewing. Yeah, the pass-by-value surprised me too, as it's
been a while since I've had to touch khash. I think it all cancels out
performance-wise because of the inlining, but it might be a fun thing to
experiment with.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 15/17] khash: rename oid helper functions
  2019-06-20 18:27         ` Jeff King
@ 2019-06-23 16:00           ` René Scharfe
  2019-06-23 22:46             ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: René Scharfe @ 2019-06-23 16:00 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, SZEDER Gábor, Christian Couder

Am 20.06.19 um 20:27 schrieb Jeff King:
> On Thu, Jun 20, 2019 at 10:44:17AM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>>
>>> For use in object_id hash tables, we have oid_hash() and oid_equal().
>>> But these are confusingly similar to the existing oideq() and the
>>> oidhash() we plan to add to replace sha1hash().
>>>
>>> The big difference from those functions is that rather than accepting a
>>> const pointer to the "struct object_id", we take the arguments by value
>>> (which is a khash internal convention). So let's make that obvious by
>>> calling them oidhash_by_value() and oideq_by_value().
>>>
>>> Those names are fairly horrendous to type, but we rarely need to do so;
>>> they are passed to the khash implementation macro and then only used
>>> internally. Callers get to use the nice kh_put_oid_map(), etc.
>>
>> The pass-by-value interface feels a bit unforunate but hopefully
>> "static inline" would help us avoid actually copying the struct left
>> and right as we make calls to them X-<.
>
> Yeah, exactly. I think it should end up quite fast, though if anybody
> (René?) wants to try tweaking khash and timing it, be my guest.
>
> I do think if it took the more usual pass-by-const-pointer we'd probably
> still end up with the exact same code from an optimizing compiler, since
> all of the references and dereferences would cancel out once it was
> inlined.

I suspect it depends on the compiler and the exact details.  Here's a
simple experiment: https://godbolt.org/z/kuv4NE.  It has a comparison
function for call by value and one for call by reference as well as a
wrapper for each with the opposite interface.

An enlightened compiler would emit the same code for functions with the
same interface, but none of the current ones do in all cases.  Clang
and MSVC do emit the same code for the two call by value functions, so
there's hope.  And moving to call by reference might make matters worse.
GCC adds some 128-bit moves to both wrappers for some reason.

Perhaps it doesn't matter much anyway e.g. due to caching, especially
for the branch-free variants.  A definite answer would require
measurements.  Your cleanup would make the necessary surgery on khash.h
a bit easier by reducing the number of functions and definitions.

René

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 15/17] khash: rename oid helper functions
  2019-06-23 16:00           ` René Scharfe
@ 2019-06-23 22:46             ` Jeff King
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2019-06-23 22:46 UTC (permalink / raw)
  To: René Scharfe
  Cc: Junio C Hamano, Christian Couder, git,
	Ævar Arnfjörð Bjarmason, Jonathan Tan,
	SZEDER Gábor, Christian Couder

On Sun, Jun 23, 2019 at 06:00:50PM +0200, René Scharfe wrote:

> > I do think if it took the more usual pass-by-const-pointer we'd probably
> > still end up with the exact same code from an optimizing compiler, since
> > all of the references and dereferences would cancel out once it was
> > inlined.
> 
> I suspect it depends on the compiler and the exact details.  Here's a
> simple experiment: https://godbolt.org/z/kuv4NE.  It has a comparison
> function for call by value and one for call by reference as well as a
> wrapper for each with the opposite interface.
> 
> An enlightened compiler would emit the same code for functions with the
> same interface, but none of the current ones do in all cases.  Clang
> and MSVC do emit the same code for the two call by value functions, so
> there's hope.  And moving to call by reference might make matters worse.
> GCC adds some 128-bit moves to both wrappers for some reason.

Hmm. I'm unsure whether your simplified setup is really showing
something interesting or whether we'd need to have true "static inline"
functions that are actually _used_ in a hash table to see if there are
any significant differences.

But...

> Perhaps it doesn't matter much anyway e.g. due to caching, especially
> for the branch-free variants.  A definite answer would require
> measurements.  Your cleanup would make the necessary surgery on khash.h
> a bit easier by reducing the number of functions and definitions.

Yeah, my gut feeling is it probably doesn't matter much to the overall
operation even if the inner code is slightly different (though I'm happy
to be overridden by real data). And it's definitely something we can
punt on for later (or never if nobody feels like dealing with it).

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2019-06-24  2:46 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-15 10:06 [PATCH v4 0/4] Test oidmap Christian Couder
2019-06-15 10:06 ` [PATCH v4 1/4] t/helper: add test-oidmap.c Christian Couder
2019-06-15 10:07 ` [PATCH v4 2/4] t: add t0016-oidmap.sh Christian Couder
2019-06-15 10:07 ` [PATCH v4 3/4] oidmap: use sha1hash() instead of static hash() function Christian Couder
2019-06-15 10:07 ` [PATCH v4 4/4] test-hashmap: remove 'hash' command Christian Couder
2019-06-19 21:42 ` [PATCH v4 0/4] Test oidmap Jeff King
2019-06-19 22:09   ` Jeff King
2019-06-19 22:25     ` Christian Couder
2019-06-20  7:39   ` [PATCH 0/17] drop non-object_id hashing Jeff King
2019-06-20  7:40     ` [PATCH 01/17] describe: fix accidental oid/hash type-punning Jeff King
2019-06-20 16:32       ` Junio C Hamano
2019-06-20 18:25         ` Jeff King
2019-06-20  7:40     ` [PATCH 02/17] upload-pack: rename a "sha1" variable to "oid" Jeff King
2019-06-20  7:40     ` [PATCH 03/17] pack-bitmap-write: convert some helpers to use object_id Jeff King
2019-06-20  7:41     ` [PATCH 04/17] pack-objects: convert packlist_find() " Jeff King
2019-06-20  7:41     ` [PATCH 05/17] pack-objects: convert locate_object_entry_hash() to object_id Jeff King
2019-06-20  7:41     ` [PATCH 06/17] object: convert lookup_unknown_object() to use object_id Jeff King
2019-06-20  7:41     ` [PATCH 07/17] object: convert lookup_object() " Jeff King
2019-06-20  7:41     ` [PATCH 08/17] object: convert internal hash_obj() to object_id Jeff King
2019-06-20  7:41     ` [PATCH 09/17] object: convert create_object() to use object_id Jeff King
2019-06-20 14:21       ` Ramsay Jones
2019-06-20 18:23         ` Jeff King
2019-06-20  7:41     ` [PATCH 10/17] khash: drop broken oid_map typedef Jeff King
2019-06-20  7:41     ` [PATCH 11/17] khash: rename kh_oid_t to kh_oid_set Jeff King
2019-06-20  7:41     ` [PATCH 12/17] delta-islands: convert island_marks khash to use oids Jeff King
2019-06-20 17:38       ` Jonathan Tan
2019-06-20 18:29         ` Jeff King
2019-06-20  7:41     ` [PATCH 13/17] pack-bitmap: convert khash_sha1 maps into kh_oid_map Jeff King
2019-06-20  7:41     ` [PATCH 14/17] khash: drop sha1-specific map types Jeff King
2019-06-20  7:41     ` [PATCH 15/17] khash: rename oid helper functions Jeff King
2019-06-20 17:44       ` Junio C Hamano
2019-06-20 18:27         ` Jeff King
2019-06-23 16:00           ` René Scharfe
2019-06-23 22:46             ` Jeff King
2019-06-20  7:41     ` [PATCH 16/17] hash.h: move object_id definition from cache.h Jeff King
2019-06-20 17:41       ` Junio C Hamano
2019-06-20  7:41     ` [PATCH 17/17] hashmap: convert sha1hash() to oidhash() Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).