[PATCH 0/7] oid_array cleanups

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH 0/7] oid_array cleanups
@ 2020-03-30 14:02 Jeff King
  2020-03-30 14:03 ` [PATCH 1/7] oid_array: use size_t for count and allocation Jeff King
                   ` (7 more replies)
  0 siblings, 8 replies; 12+ messages in thread
From: Jeff King @ 2020-03-30 14:02 UTC (permalink / raw)
  To: git

I recently encountered a repo in the wild that had over 2^31 objects,
and found that cat-file barfed:

  $ git cat-file --batch-all-objects --batch-check
  fatal: size_t overflow: 32 * 18446744071562067968

The issue is that we use an "int" to store the count and the allocation.
Fortunately our st_mult() protection kicks in before we end up with an
undersized buffer, so this shouldn't be dangerous. And while I'd argue
that having that many objects is probably silly and likely to cause
other problems, I'd just as soon we kept our allocating code as robust
as possible.

The first patch is the actual fix, followed by some related type
cleanup. The rest of it is just leftovers from the
s/sha1_array/oid_array/ transition a few years back.

  [1/7]: oid_array: use size_t for count and allocation
  [2/7]: oid_array: use size_t for iteration
  [3/7]: oid_array: rename source file from sha1-array
  [4/7]: test-tool: rename sha1-array to oid-array
  [5/7]: bisect: stop referring to sha1_array
  [6/7]: ref-filter: stop referring to "sha1 array"
  [7/7]: oidset: stop referring to sha1-array

 Makefile                                         |  4 ++--
 bisect.c                                         |  8 ++++----
 builtin/cat-file.c                               |  2 +-
 builtin/diff.c                                   |  2 +-
 builtin/fetch-pack.c                             |  2 +-
 builtin/pack-objects.c                           |  2 +-
 builtin/pull.c                                   |  2 +-
 builtin/receive-pack.c                           |  2 +-
 builtin/send-pack.c                              |  2 +-
 builtin/tag.c                                    |  2 +-
 cache.h                                          |  2 +-
 combine-diff.c                                   |  2 +-
 connect.c                                        |  2 +-
 delta-islands.c                                  |  2 +-
 fetch-pack.c                                     |  2 +-
 object-store.h                                   |  2 +-
 sha1-array.c => oid-array.c                      | 10 +++++-----
 sha1-array.h => oid-array.h                      |  6 +++---
 oidset.h                                         |  2 +-
 parse-options-cb.c                               |  2 +-
 ref-filter.c                                     |  7 +++----
 ref-filter.h                                     |  2 +-
 remote-curl.c                                    |  2 +-
 send-pack.c                                      |  2 +-
 sha1-name.c                                      |  2 +-
 shallow.c                                        |  2 +-
 submodule.c                                      |  2 +-
 t/helper/{test-sha1-array.c => test-oid-array.c} |  8 ++++----
 t/helper/test-tool.c                             |  2 +-
 t/helper/test-tool.h                             |  2 +-
 t/t0064-sha1-array.sh                            | 16 ++++++++--------
 transport.c                                      |  2 +-
 32 files changed, 54 insertions(+), 55 deletions(-)
 rename sha1-array.c => oid-array.c (93%)
 rename sha1-array.h => oid-array.h (97%)
 rename t/helper/{test-sha1-array.c => test-oid-array.c} (83%)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/7] oid_array: use size_t for count and allocation
  2020-03-30 14:02 [PATCH 0/7] oid_array cleanups Jeff King
@ 2020-03-30 14:03 ` Jeff King
  2020-03-30 14:09   ` Jeff King
  2020-04-15  0:27   ` Taylor Blau
  2020-03-30 14:03 ` [PATCH 2/7] oid_array: use size_t for iteration Jeff King
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 12+ messages in thread
From: Jeff King @ 2020-03-30 14:03 UTC (permalink / raw)
  To: git

The oid_array object uses an "int" to store the number of items and the
allocated size. It's rather unlikely for somebody to have more than 2^31
objects in a repository (the sha1's alone would be 40GB!), but if they
do, we'd overflow our alloc variable.

You can reproduce this case with something like:

  git init repo
  cd repo

  # make a pack with 2^24 objects
  perl -e '
    my $nr = 2**24;

    for (my $i = 0; $i < $nr; $i++) {
	    print "blob\n";
	    print "data 4\n";
	    print pack("N", $i);
    }
  ' | git fast-import

  # now make 256 copies of it; most of these objects will be duplicates,
  # but oid_array doesn't de-dup until all values are read and it can
  # sort the result.
  cd .git/objects/pack/
  pack=$(echo *.pack)
  idx=$(echo *.idx)
  for i in $(seq 0 255); do
    # no need to waste disk space
    ln "$pack" "pack-extra-$i.pack"
    ln "$idx" "pack-extra-$i.idx"
  done

  # and now force an oid_array to store all of it
  git cat-file --batch-all-objects --batch-check

which results in:

  fatal: size_t overflow: 32 * 18446744071562067968

So the good news is that st_mult() sees the problem (the large number is
because our int wraps negative, and then that gets cast to a size_t),
doing the job it was meant to: bailing in crazy situations rather than
causing an undersized buffer.

But we should avoid hitting this case at all, and instead limit
ourselves based on what malloc() is willing to give us. We can easily do
that by switching to size_t.

The cat-file process above made it to ~120GB virtual set size before the
integer overflow (our internal hash storage is 32-bytes now in
preparation for sha256, so we'd expect ~128GB total needed, plus
potentially more to copy from one realloc'd block to another)). After
this patch (and about 130GB of RAM+swap), it does eventually read in the
whole set. No test for obvious reasons.

Signed-off-by: Jeff King <peff@peff.net>
---
 sha1-array.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sha1-array.h b/sha1-array.h
index dc1bca9c9a..c5e4b9324f 100644
--- a/sha1-array.h
+++ b/sha1-array.h
@@ -49,8 +49,8 @@
  */
 struct oid_array {
 	struct object_id *oid;
-	int nr;
-	int alloc;
+	size_t nr;
+	size_t alloc;
 	int sorted;
 };

-- 
2.26.0.597.g7e08ed78ff

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/7] oid_array: use size_t for iteration
  2020-03-30 14:02 [PATCH 0/7] oid_array cleanups Jeff King
  2020-03-30 14:03 ` [PATCH 1/7] oid_array: use size_t for count and allocation Jeff King
@ 2020-03-30 14:03 ` Jeff King
  2020-03-30 14:03 ` [PATCH 3/7] oid_array: rename source file from sha1-array Jeff King
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2020-03-30 14:03 UTC (permalink / raw)
  To: git

The previous commit started using size_t for our allocations. There are
some iterations that use int or unsigned, though. These aren't dangerous
with respect to memory, but they could produce incorrect results.

Signed-off-by: Jeff King <peff@peff.net>
---
 sha1-array.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/sha1-array.c b/sha1-array.c
index 3eeadfede9..bada0c4353 100644
--- a/sha1-array.c
+++ b/sha1-array.c
@@ -46,7 +46,7 @@ int oid_array_for_each(struct oid_array *array,
 		       for_each_oid_fn fn,
 		       void *data)
 {
-	int i;
+	size_t i;
 
 	/* No oid_array_sort() here! See sha1-array.h */
 
@@ -62,7 +62,7 @@ int oid_array_for_each_unique(struct oid_array *array,
 			      for_each_oid_fn fn,
 			      void *data)
 {
-	int i;
+	size_t i;
 
 	if (!array->sorted)
 		oid_array_sort(array);
@@ -82,7 +82,7 @@ void oid_array_filter(struct oid_array *array,
 		      for_each_oid_fn want,
 		      void *cb_data)
 {
-	unsigned nr = array->nr, src, dst;
+	size_t nr = array->nr, src, dst;
 	struct object_id *oids = array->oid;
 
 	for (src = dst = 0; src < nr; src++) {
-- 
2.26.0.597.g7e08ed78ff


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/7] oid_array: rename source file from sha1-array
  2020-03-30 14:02 [PATCH 0/7] oid_array cleanups Jeff King
  2020-03-30 14:03 ` [PATCH 1/7] oid_array: use size_t for count and allocation Jeff King
  2020-03-30 14:03 ` [PATCH 2/7] oid_array: use size_t for iteration Jeff King
@ 2020-03-30 14:03 ` Jeff King
  2020-04-15  0:34   ` Taylor Blau
  2020-03-30 14:04 ` [PATCH 4/7] test-tool: rename sha1-array to oid-array Jeff King
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2020-03-30 14:03 UTC (permalink / raw)
  To: git

We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
oid_array, 2017-03-31), but the file is still called sha1-array. Besides
being slightly confusing, it makes it more annoying to grep for leftover
occurrences of "sha1" in various files, because the header is included
in so many places.

Let's complete the transition by renaming the source and header files
(and fixing up a few comment references).

I kept the "-" in the name, as that seems to be our style; cf.
fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
We also have oidmap.h and oidset.h without any punctuation, but those
are "struct oidmap" and "struct oidset" in the code. We _could_ make
this "oidarray" to match, but somehow it looks uglier to me because of
the length of "array" (plus it would be a very invasive patch for little
gain).

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile                    | 2 +-
 bisect.c                    | 2 +-
 builtin/cat-file.c          | 2 +-
 builtin/diff.c              | 2 +-
 builtin/fetch-pack.c        | 2 +-
 builtin/pack-objects.c      | 2 +-
 builtin/pull.c              | 2 +-
 builtin/receive-pack.c      | 2 +-
 builtin/send-pack.c         | 2 +-
 builtin/tag.c               | 2 +-
 cache.h                     | 2 +-
 combine-diff.c              | 2 +-
 connect.c                   | 2 +-
 delta-islands.c             | 2 +-
 fetch-pack.c                | 2 +-
 object-store.h              | 2 +-
 sha1-array.c => oid-array.c | 4 ++--
 sha1-array.h => oid-array.h | 2 +-
 parse-options-cb.c          | 2 +-
 ref-filter.h                | 2 +-
 remote-curl.c               | 2 +-
 send-pack.c                 | 2 +-
 sha1-name.c                 | 2 +-
 shallow.c                   | 2 +-
 submodule.c                 | 2 +-
 t/helper/test-sha1-array.c  | 2 +-
 transport.c                 | 2 +-
 27 files changed, 28 insertions(+), 28 deletions(-)
 rename sha1-array.c => oid-array.c (96%)
 rename sha1-array.h => oid-array.h (98%)

diff --git a/Makefile b/Makefile
index ef1ff2228f..abfe5cc29b 100644
--- a/Makefile
+++ b/Makefile
@@ -929,6 +929,7 @@ LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
 LIB_OBJS += oidmap.o
 LIB_OBJS += oidset.o
+LIB_OBJS += oid-array.o
 LIB_OBJS += packfile.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-bitmap-write.o
@@ -978,7 +979,6 @@ LIB_OBJS += sequencer.o
 LIB_OBJS += serve.o
 LIB_OBJS += server-info.o
 LIB_OBJS += setup.o
-LIB_OBJS += sha1-array.o
 LIB_OBJS += sha1-lookup.o
 LIB_OBJS += sha1-file.o
 LIB_OBJS += sha1-name.o
diff --git a/bisect.c b/bisect.c
index 9154f810f7..64b579b6ea 100644
--- a/bisect.c
+++ b/bisect.c
@@ -10,7 +10,7 @@
 #include "run-command.h"
 #include "log-tree.h"
 #include "bisect.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "argv-array.h"
 #include "commit-slab.h"
 #include "commit-reach.h"
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 6ecc8ee6dc..0d03fdac6e 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -12,7 +12,7 @@
 #include "userdiff.h"
 #include "streaming.h"
 #include "tree-walk.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "packfile.h"
 #include "object-store.h"
 #include "promisor-remote.h"
diff --git a/builtin/diff.c b/builtin/diff.c
index 42ac803091..8537b17bd5 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -17,7 +17,7 @@
 #include "log-tree.h"
 #include "builtin.h"
 #include "submodule.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 
 #define DIFF_NO_INDEX_EXPLICIT 1
 #define DIFF_NO_INDEX_IMPLICIT 2
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index dc1485c8aa..4771100072 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -3,7 +3,7 @@
 #include "fetch-pack.h"
 #include "remote.h"
 #include "connect.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "protocol.h"
 
 static const char fetch_pack_usage[] =
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index dc7c58ce3f..fdd18c7ccb 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -26,7 +26,7 @@
 #include "pack-bitmap.h"
 #include "delta-islands.h"
 #include "reachable.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "argv-array.h"
 #include "list.h"
 #include "packfile.h"
diff --git a/builtin/pull.c b/builtin/pull.c
index e42665b681..9429786604 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -12,7 +12,7 @@
 #include "parse-options.h"
 #include "exec-cmd.h"
 #include "run-command.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "remote.h"
 #include "dir.h"
 #include "rebase.h"
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 2cc18bbffd..d46147f709 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -13,7 +13,7 @@
 #include "remote.h"
 #include "connect.h"
 #include "string-list.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "connected.h"
 #include "argv-array.h"
 #include "version.h"
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 098ebf22d0..f2c5a34402 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -11,7 +11,7 @@
 #include "quote.h"
 #include "transport.h"
 #include "version.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "gpg-interface.h"
 #include "gettext.h"
 #include "protocol.h"
diff --git a/builtin/tag.c b/builtin/tag.c
index cc30d346f5..dd160b49c7 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -17,7 +17,7 @@
 #include "diff.h"
 #include "revision.h"
 #include "gpg-interface.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "column.h"
 #include "ref-filter.h"
 
diff --git a/cache.h b/cache.h
index c77b95870a..0f0485ecfe 100644
--- a/cache.h
+++ b/cache.h
@@ -14,7 +14,7 @@
 #include "pack-revindex.h"
 #include "hash.h"
 #include "path.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "repository.h"
 #include "mem-pool.h"
 
diff --git a/combine-diff.c b/combine-diff.c
index d5c4d839dc..002e0e5438 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -10,7 +10,7 @@
 #include "log-tree.h"
 #include "refs.h"
 #include "userdiff.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "revision.h"
 
 static int compare_paths(const struct combine_diff_path *one,
diff --git a/connect.c b/connect.c
index b6451ab5e8..23013c6344 100644
--- a/connect.c
+++ b/connect.c
@@ -9,7 +9,7 @@
 #include "connect.h"
 #include "url.h"
 #include "string-list.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "transport.h"
 #include "strbuf.h"
 #include "version.h"
diff --git a/delta-islands.c b/delta-islands.c
index 09dbd3cf72..aa98b2e541 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -17,7 +17,7 @@
 #include "pack-bitmap.h"
 #include "pack-objects.h"
 #include "delta-islands.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "config.h"
 
 KHASH_INIT(str, const char *, void *, 1, kh_str_hash_func, kh_str_hash_equal)
diff --git a/fetch-pack.c b/fetch-pack.c
index 1734a573b0..0b07b3ee73 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -15,7 +15,7 @@
 #include "connect.h"
 #include "transport.h"
 #include "version.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "oidset.h"
 #include "packfile.h"
 #include "object-store.h"
diff --git a/object-store.h b/object-store.h
index be72fee7d5..d1e490f203 100644
--- a/object-store.h
+++ b/object-store.h
@@ -4,7 +4,7 @@
 #include "cache.h"
 #include "oidmap.h"
 #include "list.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "strbuf.h"
 #include "thread-utils.h"
 
diff --git a/sha1-array.c b/oid-array.c
similarity index 96%
rename from sha1-array.c
rename to oid-array.c
index bada0c4353..8657a5cedf 100644
--- a/sha1-array.c
+++ b/oid-array.c
@@ -1,5 +1,5 @@
 #include "cache.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "sha1-lookup.h"
 
 void oid_array_append(struct oid_array *array, const struct object_id *oid)
@@ -48,7 +48,7 @@ int oid_array_for_each(struct oid_array *array,
 {
 	size_t i;
 
-	/* No oid_array_sort() here! See sha1-array.h */
+	/* No oid_array_sort() here! See oid-array.h */
 
 	for (i = 0; i < array->nr; i++) {
 		int ret = fn(array->oid + i, data);
diff --git a/sha1-array.h b/oid-array.h
similarity index 98%
rename from sha1-array.h
rename to oid-array.h
index c5e4b9324f..f28d322c90 100644
--- a/sha1-array.h
+++ b/oid-array.h
@@ -19,7 +19,7 @@
  *
  * void some_func(void)
  * {
- *     struct sha1_array hashes = OID_ARRAY_INIT;
+ *     struct oid_array hashes = OID_ARRAY_INIT;
  *     struct object_id oid;
  *
  *     // Read objects into our set
diff --git a/parse-options-cb.c b/parse-options-cb.c
index a28b55be48..86cd393013 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -5,7 +5,7 @@
 #include "color.h"
 #include "string-list.h"
 #include "argv-array.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 
 /*----- some often used options -----*/
 
diff --git a/ref-filter.h b/ref-filter.h
index f1dcff4c6e..64330e9601 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -1,7 +1,7 @@
 #ifndef REF_FILTER_H
 #define REF_FILTER_H
 
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "refs.h"
 #include "commit.h"
 #include "parse-options.h"
diff --git a/remote-curl.c b/remote-curl.c
index e4cd321844..1c9aa3d0ab 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -12,7 +12,7 @@
 #include "sideband.h"
 #include "argv-array.h"
 #include "credential.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "send-pack.h"
 #include "protocol.h"
 #include "quote.h"
diff --git a/send-pack.c b/send-pack.c
index 0407841ae8..da4741ce4a 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -12,7 +12,7 @@
 #include "quote.h"
 #include "transport.h"
 #include "version.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "gpg-interface.h"
 #include "cache.h"
 
diff --git a/sha1-name.c b/sha1-name.c
index 5bb006e5a9..6561cd9097 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -8,7 +8,7 @@
 #include "refs.h"
 #include "remote.h"
 #include "dir.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "packfile.h"
 #include "object-store.h"
 #include "repository.h"
diff --git a/shallow.c b/shallow.c
index 7fd04afed1..14f7fa6e27 100644
--- a/shallow.c
+++ b/shallow.c
@@ -8,7 +8,7 @@
 #include "pkt-line.h"
 #include "remote.h"
 #include "refs.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "diff.h"
 #include "revision.h"
 #include "commit-slab.h"
diff --git a/submodule.c b/submodule.c
index c3aadf3fff..e2ef5698c8 100644
--- a/submodule.c
+++ b/submodule.c
@@ -12,7 +12,7 @@
 #include "diffcore.h"
 #include "refs.h"
 #include "string-list.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "argv-array.h"
 #include "blob.h"
 #include "thread-utils.h"
diff --git a/t/helper/test-sha1-array.c b/t/helper/test-sha1-array.c
index ad5e69f9d3..6f7d3b939e 100644
--- a/t/helper/test-sha1-array.c
+++ b/t/helper/test-sha1-array.c
@@ -1,6 +1,6 @@
 #include "test-tool.h"
 #include "cache.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 
 static int print_oid(const struct object_id *oid, void *data)
 {
diff --git a/transport.c b/transport.c
index 1fdc7dac1a..471c5bd339 100644
--- a/transport.c
+++ b/transport.c
@@ -16,7 +16,7 @@
 #include "url.h"
 #include "submodule.h"
 #include "string-list.h"
-#include "sha1-array.h"
+#include "oid-array.h"
 #include "sigchain.h"
 #include "transport-internal.h"
 #include "protocol.h"
-- 
2.26.0.597.g7e08ed78ff


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/7] test-tool: rename sha1-array to oid-array
  2020-03-30 14:02 [PATCH 0/7] oid_array cleanups Jeff King
                   ` (2 preceding siblings ...)
  2020-03-30 14:03 ` [PATCH 3/7] oid_array: rename source file from sha1-array Jeff King
@ 2020-03-30 14:04 ` Jeff King
  2020-03-30 14:04 ` [PATCH 5/7] bisect: stop referring to sha1_array Jeff King
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2020-03-30 14:04 UTC (permalink / raw)
  To: git

This matches the actual data structure name, as well as the source file
that contains the code we're testing. The test scripts need updating to
use the new name, as well.

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile                                         |  2 +-
 t/helper/{test-sha1-array.c => test-oid-array.c} |  6 +++---
 t/helper/test-tool.c                             |  2 +-
 t/helper/test-tool.h                             |  2 +-
 t/t0064-sha1-array.sh                            | 16 ++++++++--------
 5 files changed, 14 insertions(+), 14 deletions(-)
 rename t/helper/{test-sha1-array.c => test-oid-array.c} (85%)

diff --git a/Makefile b/Makefile
index abfe5cc29b..dc356ce4dd 100644
--- a/Makefile
+++ b/Makefile
@@ -738,7 +738,7 @@ TEST_BUILTINS_OBJS += test-run-command.o
 TEST_BUILTINS_OBJS += test-scrap-cache-tree.o
 TEST_BUILTINS_OBJS += test-serve-v2.o
 TEST_BUILTINS_OBJS += test-sha1.o
-TEST_BUILTINS_OBJS += test-sha1-array.o
+TEST_BUILTINS_OBJS += test-oid-array.o
 TEST_BUILTINS_OBJS += test-sha256.o
 TEST_BUILTINS_OBJS += test-sigchain.o
 TEST_BUILTINS_OBJS += test-strcmp-offset.o
diff --git a/t/helper/test-sha1-array.c b/t/helper/test-oid-array.c
similarity index 85%
rename from t/helper/test-sha1-array.c
rename to t/helper/test-oid-array.c
index 6f7d3b939e..ce9fd5f091 100644
--- a/t/helper/test-sha1-array.c
+++ b/t/helper/test-oid-array.c
@@ -8,7 +8,7 @@ static int print_oid(const struct object_id *oid, void *data)
 	return 0;
 }
 
-int cmd__sha1_array(int argc, const char **argv)
+int cmd__oid_array(int argc, const char **argv)
 {
 	struct oid_array array = OID_ARRAY_INIT;
 	struct strbuf line = STRBUF_INIT;
@@ -19,11 +19,11 @@ int cmd__sha1_array(int argc, const char **argv)
 
 		if (skip_prefix(line.buf, "append ", &arg)) {
 			if (get_oid_hex(arg, &oid))
-				die("not a hexadecimal SHA1: %s", arg);
+				die("not a hexadecimal oid: %s", arg);
 			oid_array_append(&array, &oid);
 		} else if (skip_prefix(line.buf, "lookup ", &arg)) {
 			if (get_oid_hex(arg, &oid))
-				die("not a hexadecimal SHA1: %s", arg);
+				die("not a hexadecimal oid: %s", arg);
 			printf("%d\n", oid_array_lookup(&array, &oid));
 		} else if (!strcmp(line.buf, "clear"))
 			oid_array_clear(&array);
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 31eedcd241..c1b320c24c 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -37,6 +37,7 @@ static struct test_cmd cmds[] = {
 	{ "match-trees", cmd__match_trees },
 	{ "mergesort", cmd__mergesort },
 	{ "mktemp", cmd__mktemp },
+	{ "oid-array", cmd__oid_array },
 	{ "oidmap", cmd__oidmap },
 	{ "online-cpus", cmd__online_cpus },
 	{ "parse-options", cmd__parse_options },
@@ -57,7 +58,6 @@ static struct test_cmd cmds[] = {
 	{ "scrap-cache-tree", cmd__scrap_cache_tree },
 	{ "serve-v2", cmd__serve_v2 },
 	{ "sha1", cmd__sha1 },
-	{ "sha1-array", cmd__sha1_array },
 	{ "sha256", cmd__sha256 },
 	{ "sigchain", cmd__sigchain },
 	{ "strcmp-offset", cmd__strcmp_offset },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 4eb5e6609e..1cbaec02f3 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -47,7 +47,7 @@ int cmd__run_command(int argc, const char **argv);
 int cmd__scrap_cache_tree(int argc, const char **argv);
 int cmd__serve_v2(int argc, const char **argv);
 int cmd__sha1(int argc, const char **argv);
-int cmd__sha1_array(int argc, const char **argv);
+int cmd__oid_array(int argc, const char **argv);
 int cmd__sha256(int argc, const char **argv);
 int cmd__sigchain(int argc, const char **argv);
 int cmd__strcmp_offset(int argc, const char **argv);
diff --git a/t/t0064-sha1-array.sh b/t/t0064-sha1-array.sh
index 5dda570b9a..45685af2fd 100755
--- a/t/t0064-sha1-array.sh
+++ b/t/t0064-sha1-array.sh
@@ -18,7 +18,7 @@ test_expect_success 'ordered enumeration' '
 	{
 		echoid append 88 44 aa 55 &&
 		echo for_each_unique
-	} | test-tool sha1-array >actual &&
+	} | test-tool oid-array >actual &&
 	test_cmp expect actual
 '
 
@@ -28,15 +28,15 @@ test_expect_success 'ordered enumeration with duplicate suppression' '
 		echoid append 88 44 aa 55 &&
 		echoid append 88 44 aa 55 &&
 		echo for_each_unique
-	} | test-tool sha1-array >actual &&
+	} | test-tool oid-array >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'lookup' '
 	{
 		echoid append 88 44 aa 55 &&
 		echoid lookup 55
-	} | test-tool sha1-array >actual &&
+	} | test-tool oid-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -eq 1
 '
@@ -45,7 +45,7 @@ test_expect_success 'lookup non-existing entry' '
 	{
 		echoid append 88 44 aa 55 &&
 		echoid lookup 33
-	} | test-tool sha1-array >actual &&
+	} | test-tool oid-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -lt 0
 '
@@ -55,7 +55,7 @@ test_expect_success 'lookup with duplicates' '
 		echoid append 88 44 aa 55 &&
 		echoid append 88 44 aa 55 &&
 		echoid lookup 55
-	} | test-tool sha1-array >actual &&
+	} | test-tool oid-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -ge 2 &&
 	test "$n" -le 3
@@ -66,7 +66,7 @@ test_expect_success 'lookup non-existing entry with duplicates' '
 		echoid append 88 44 aa 55 &&
 		echoid append 88 44 aa 55 &&
 		echoid lookup 66
-	} | test-tool sha1-array >actual &&
+	} | test-tool oid-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -lt 0
 '
@@ -81,7 +81,7 @@ test_expect_success 'lookup with almost duplicate values' '
 		echo "append $id1" &&
 		echo "append $id2" &&
 		echoid lookup 55
-	} | test-tool sha1-array >actual &&
+	} | test-tool oid-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -eq 0
 '
@@ -90,7 +90,7 @@ test_expect_success 'lookup with single duplicate value' '
 	{
 		echoid append 55 55 &&
 		echoid lookup 55
-	} | test-tool sha1-array >actual &&
+	} | test-tool oid-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -ge 0 &&
 	test "$n" -le 1
-- 
2.26.0.597.g7e08ed78ff


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/7] bisect: stop referring to sha1_array
  2020-03-30 14:02 [PATCH 0/7] oid_array cleanups Jeff King
                   ` (3 preceding siblings ...)
  2020-03-30 14:04 ` [PATCH 4/7] test-tool: rename sha1-array to oid-array Jeff King
@ 2020-03-30 14:04 ` Jeff King
  2020-03-30 14:04 ` [PATCH 6/7] ref-filter: stop referring to "sha1 array" Jeff King
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2020-03-30 14:04 UTC (permalink / raw)
  To: git

Our join_sha1_array_hex() function long ago switched to using an
oid_array; let's change the name to match.

Signed-off-by: Jeff King <peff@peff.net>
---
 bisect.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/bisect.c b/bisect.c
index 64b579b6ea..d5e830410f 100644
--- a/bisect.c
+++ b/bisect.c
@@ -473,7 +473,7 @@ static void read_bisect_paths(struct argv_array *array)
 	fclose(fp);
 }
 
-static char *join_sha1_array_hex(struct oid_array *array, char delim)
+static char *join_oid_array_hex(struct oid_array *array, char delim)
 {
 	struct strbuf joined_hexs = STRBUF_INIT;
 	int i;
@@ -765,7 +765,7 @@ static enum bisect_error handle_bad_merge_base(void)
 {
 	if (is_expected_rev(current_bad_oid)) {
 		char *bad_hex = oid_to_hex(current_bad_oid);
-		char *good_hex = join_sha1_array_hex(&good_revs, ' ');
+		char *good_hex = join_oid_array_hex(&good_revs, ' ');
 		if (!strcmp(term_bad, "bad") && !strcmp(term_good, "good")) {
 			fprintf(stderr, _("The merge base %s is bad.\n"
 				"This means the bug has been fixed "
@@ -796,7 +796,7 @@ static void handle_skipped_merge_base(const struct object_id *mb)
 {
 	char *mb_hex = oid_to_hex(mb);
 	char *bad_hex = oid_to_hex(current_bad_oid);
-	char *good_hex = join_sha1_array_hex(&good_revs, ' ');
+	char *good_hex = join_oid_array_hex(&good_revs, ' ');
 
 	warning(_("the merge base between %s and [%s] "
 		"must be skipped.\n"
-- 
2.26.0.597.g7e08ed78ff


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/7] ref-filter: stop referring to "sha1 array"
  2020-03-30 14:02 [PATCH 0/7] oid_array cleanups Jeff King
                   ` (4 preceding siblings ...)
  2020-03-30 14:04 ` [PATCH 5/7] bisect: stop referring to sha1_array Jeff King
@ 2020-03-30 14:04 ` Jeff King
  2020-03-30 14:04 ` [PATCH 7/7] oidset: stop referring to sha1-array Jeff King
  2020-04-15  0:35 ` [PATCH 0/7] oid_array cleanups Taylor Blau
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2020-03-30 14:04 UTC (permalink / raw)
  To: git

A comment refers to the "sha1s in the given sha1 array". But this became
an oid_array along with everywhere else in 910650d2f8 (Rename sha1_array
to oid_array, 2017-03-31). Plus there's an extra line of leftover
editing cruft we can drop.

Signed-off-by: Jeff King <peff@peff.net>
---
 ref-filter.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index b1812cb69a..35776838f4 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1976,10 +1976,9 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 }
 
 /*
- * Given a ref (sha1, refname), check if the ref belongs to the array
- * of sha1s. If the given ref is a tag, check if the given tag points
- * at one of the sha1s in the given sha1 array.
- * the given sha1_array.
+ * Given a ref (oid, refname), check if the ref belongs to the array
+ * of oids. If the given ref is a tag, check if the given tag points
+ * at one of the oids in the given oid array.
  * NEEDSWORK:
  * 1. Only a single level of inderection is obtained, we might want to
  * change this to account for multiple levels (e.g. annotated tags
-- 
2.26.0.597.g7e08ed78ff


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/7] oidset: stop referring to sha1-array
  2020-03-30 14:02 [PATCH 0/7] oid_array cleanups Jeff King
                   ` (5 preceding siblings ...)
  2020-03-30 14:04 ` [PATCH 6/7] ref-filter: stop referring to "sha1 array" Jeff King
@ 2020-03-30 14:04 ` Jeff King
  2020-04-15  0:35 ` [PATCH 0/7] oid_array cleanups Taylor Blau
  7 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2020-03-30 14:04 UTC (permalink / raw)
  To: git

Ths has been oid_array for some time, though the source only recently
moved from sha1-array.[ch] to oid-array.[ch]. In either case, we should
say "oid-array" here.

Signed-off-by: Jeff King <peff@peff.net>
---
 oidset.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/oidset.h b/oidset.h
index d8a106b127..3a2d9d1115 100644
--- a/oidset.h
+++ b/oidset.h
@@ -4,7 +4,7 @@
 #include "khash.h"
 
 /**
- * This API is similar to sha1-array, in that it maintains a set of object ids
+ * This API is similar to oid-array, in that it maintains a set of object ids
  * in a memory-efficient way. The major differences are:
  *
  *   1. It uses a hash, so we can do online duplicate removal, rather than
-- 
2.26.0.597.g7e08ed78ff

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/7] oid_array: use size_t for count and allocation
  2020-03-30 14:03 ` [PATCH 1/7] oid_array: use size_t for count and allocation Jeff King
@ 2020-03-30 14:09   ` Jeff King
  2020-04-15  0:27   ` Taylor Blau
  1 sibling, 0 replies; 12+ messages in thread
From: Jeff King @ 2020-03-30 14:09 UTC (permalink / raw)
  To: git

On Mon, Mar 30, 2020 at 10:03:10AM -0400, Jeff King wrote:

> The oid_array object uses an "int" to store the number of items and the
> allocated size. It's rather unlikely for somebody to have more than 2^31
> objects in a repository (the sha1's alone would be 40GB!), but if they
> do, we'd overflow our alloc variable.
> 
> You can reproduce this case with something like:
> 
>   git init repo
>   cd repo
> 
>   # make a pack with 2^24 objects
>   perl -e '
>     my $nr = 2**24;
> 
>     for (my $i = 0; $i < $nr; $i++) {
> 	    print "blob\n";
> 	    print "data 4\n";
> 	    print pack("N", $i);
>     }
>   ' | git fast-import

I briefly tried to create a packfile manually that had a ton of objects
in it, thinking it would be faster. Conceptually it's pretty simple to
write a pack header, and then a series of pack-object headers (which are
the same single-byte each time) and then a 4-byte body. But you have to
zlib compress the body, which created a bunch of headaches.

So I arrived at this fast-import solution, which was...not super fast.
Profiling showed that we were spending 80% of the time inserting into
our custom hashtable, which is fixed at 2^16 entries and then chains
beyond that. Swapping it out for a khash proved much faster, but I'm not
sure if the memory games are too gross (see the comment in find_object
below).

I also didn't look into whether we could get rid of the extra allocating
pool (and store the structs right in the hash), or if it's necessary for
their pointers to be stable.

---
diff --git a/fast-import.c b/fast-import.c
index 202dda11a6..427dd73d26 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -39,12 +39,25 @@
 
 struct object_entry {
 	struct pack_idx_entry idx;
-	struct object_entry *next;
 	uint32_t type : TYPE_BITS,
 		pack_id : PACK_ID_BITS,
 		depth : DEPTH_BITS;
 };
 
+static inline unsigned int object_entry_hash(struct object_entry *oe)
+{
+	return oidhash(&oe->idx.oid);
+}
+
+static inline int object_entry_equal(struct object_entry *a,
+				     struct object_entry *b)
+{
+	return oideq(&a->idx.oid, &b->idx.oid);
+}
+
+KHASH_INIT(object_entry_set, struct object_entry *, int, 0,
+	   object_entry_hash, object_entry_equal);
+
 struct object_entry_pool {
 	struct object_entry_pool *next_pool;
 	struct object_entry *next_free;
@@ -178,7 +191,7 @@ static off_t pack_size;
 /* Table of objects we've written. */
 static unsigned int object_entry_alloc = 5000;
 static struct object_entry_pool *blocks;
-static struct object_entry *object_table[1 << 16];
+static kh_object_entry_set_t object_table;
 static struct mark_set *marks;
 static const char *export_marks_file;
 static const char *import_marks_file;
@@ -455,44 +468,40 @@ static struct object_entry *new_object(struct object_id *oid)
 
 static struct object_entry *find_object(struct object_id *oid)
 {
-	unsigned int h = oid->hash[0] << 8 | oid->hash[1];
-	struct object_entry *e;
-	for (e = object_table[h]; e; e = e->next)
-		if (oideq(oid, &e->idx.oid))
-			return e;
+	/*
+	 * this cast works because we only look at the oid part of the entry,
+	 * and it comes first in the struct
+	 */
+	khiter_t pos = kh_get_object_entry_set(&object_table,
+					       (struct object_entry *)oid);
+	if (pos != kh_end(&object_table))
+		return kh_key(&object_table, pos);
 	return NULL;
 }
 
 static struct object_entry *insert_object(struct object_id *oid)
 {
-	unsigned int h = oid->hash[0] << 8 | oid->hash[1];
-	struct object_entry *e = object_table[h];
-
-	while (e) {
-		if (oideq(oid, &e->idx.oid))
-			return e;
-		e = e->next;
-	}
+	struct object_entry *e;
+	int redundant;
 
 	e = new_object(oid);
-	e->next = object_table[h];
 	e->idx.offset = 0;
-	object_table[h] = e;
+	kh_put_object_entry_set(&object_table, e, &redundant);
 	return e;
 }
 
 static void invalidate_pack_id(unsigned int id)
 {
-	unsigned int h;
 	unsigned long lu;
 	struct tag *t;
+	khiter_t iter;
 
-	for (h = 0; h < ARRAY_SIZE(object_table); h++) {
-		struct object_entry *e;
-
-		for (e = object_table[h]; e; e = e->next)
+	for (iter = kh_begin(&object_table); iter != kh_end(&object_table); iter++) {
+		if (kh_exist(&object_table, iter)) {
+			struct object_entry *e = kh_key(&object_table, iter);
 			if (e->pack_id == id)
 				e->pack_id = MAX_PACK_ID;
+		}
 	}
 
 	for (lu = 0; lu < branch_table_sz; lu++) {

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/7] oid_array: use size_t for count and allocation
  2020-03-30 14:03 ` [PATCH 1/7] oid_array: use size_t for count and allocation Jeff King
  2020-03-30 14:09   ` Jeff King
@ 2020-04-15  0:27   ` Taylor Blau
  1 sibling, 0 replies; 12+ messages in thread
From: Taylor Blau @ 2020-04-15  0:27 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Mon, Mar 30, 2020 at 10:03:09AM -0400, Jeff King wrote:
> The oid_array object uses an "int" to store the number of items and the
> allocated size. It's rather unlikely for somebody to have more than 2^31
> objects in a repository (the sha1's alone would be 40GB!), but if they
> do, we'd overflow our alloc variable.
>
> You can reproduce this case with something like:
>
>   git init repo
>   cd repo
>
>   # make a pack with 2^24 objects
>   perl -e '
>     my $nr = 2**24;
>
>     for (my $i = 0; $i < $nr; $i++) {
> 	    print "blob\n";
> 	    print "data 4\n";
> 	    print pack("N", $i);
>     }
>   ' | git fast-import
>
>   # now make 256 copies of it; most of these objects will be duplicates,
>   # but oid_array doesn't de-dup until all values are read and it can
>   # sort the result.
>   cd .git/objects/pack/
>   pack=$(echo *.pack)
>   idx=$(echo *.idx)
>   for i in $(seq 0 255); do
>     # no need to waste disk space
>     ln "$pack" "pack-extra-$i.pack"
>     ln "$idx" "pack-extra-$i.idx"
>   done
>
>   # and now force an oid_array to store all of it
>   git cat-file --batch-all-objects --batch-check
>
> which results in:
>
>   fatal: size_t overflow: 32 * 18446744071562067968
>
> So the good news is that st_mult() sees the problem (the large number is
> because our int wraps negative, and then that gets cast to a size_t),
> doing the job it was meant to: bailing in crazy situations rather than
> causing an undersized buffer.
>
> But we should avoid hitting this case at all, and instead limit
> ourselves based on what malloc() is willing to give us. We can easily do
> that by switching to size_t.
>
> The cat-file process above made it to ~120GB virtual set size before the
> integer overflow (our internal hash storage is 32-bytes now in
> preparation for sha256, so we'd expect ~128GB total needed, plus
> potentially more to copy from one realloc'd block to another)). After
> this patch (and about 130GB of RAM+swap), it does eventually read in the
> whole set. No test for obvious reasons.

;). This patch looks good, and makes immediate sense given your
explanation.

> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  sha1-array.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/sha1-array.h b/sha1-array.h
> index dc1bca9c9a..c5e4b9324f 100644
> --- a/sha1-array.h
> +++ b/sha1-array.h
> @@ -49,8 +49,8 @@
>   */
>  struct oid_array {
>  	struct object_id *oid;
> -	int nr;
> -	int alloc;
> +	size_t nr;
> +	size_t alloc;
>  	int sorted;
>  };
>
> --
> 2.26.0.597.g7e08ed78ff

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/7] oid_array: rename source file from sha1-array
  2020-03-30 14:03 ` [PATCH 3/7] oid_array: rename source file from sha1-array Jeff King
@ 2020-04-15  0:34   ` Taylor Blau
  0 siblings, 0 replies; 12+ messages in thread
From: Taylor Blau @ 2020-04-15  0:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Mon, Mar 30, 2020 at 10:03:46AM -0400, Jeff King wrote:
> We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
> oid_array, 2017-03-31), but the file is still called sha1-array. Besides
> being slightly confusing, it makes it more annoying to grep for leftover
> occurrences of "sha1" in various files, because the header is included
> in so many places.
>
> Let's complete the transition by renaming the source and header files
> (and fixing up a few comment references).
>
> I kept the "-" in the name, as that seems to be our style; cf.
> fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
> We also have oidmap.h and oidset.h without any punctuation, but those
> are "struct oidmap" and "struct oidset" in the code. We _could_ make
> this "oidarray" to match, but somehow it looks uglier to me because of
> the length of "array" (plus it would be a very invasive patch for little
> gain).

I was wondering what you were planning to do about that. But I think
that what you did is the right move. Sure, perhaps it'd be more
consistent to call this "oidarray", or rename everything else to
"oid_map" and "oid_set".

I prefer what you did here to that, I think because I also find the
"oidarray" spelling to be weird.

>
> Signed-off-by: Jeff King <peff@peff.net>
> ---

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/7] oid_array cleanups
  2020-03-30 14:02 [PATCH 0/7] oid_array cleanups Jeff King
                   ` (6 preceding siblings ...)
  2020-03-30 14:04 ` [PATCH 7/7] oidset: stop referring to sha1-array Jeff King
@ 2020-04-15  0:35 ` Taylor Blau
  7 siblings, 0 replies; 12+ messages in thread
From: Taylor Blau @ 2020-04-15  0:35 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Hi Peff,

On Mon, Mar 30, 2020 at 10:02:47AM -0400, Jeff King wrote:
> I recently encountered a repo in the wild that had over 2^31 objects,
> and found that cat-file barfed:
>
>   $ git cat-file --batch-all-objects --batch-check
>   fatal: size_t overflow: 32 * 18446744071562067968
>
> The issue is that we use an "int" to store the count and the allocation.
> Fortunately our st_mult() protection kicks in before we end up with an
> undersized buffer, so this shouldn't be dangerous. And while I'd argue
> that having that many objects is probably silly and likely to cause
> other problems, I'd just as soon we kept our allocating code as robust
> as possible.
>
> The first patch is the actual fix, followed by some related type
> cleanup. The rest of it is just leftovers from the
> s/sha1_array/oid_array/ transition a few years back.
>
>   [1/7]: oid_array: use size_t for count and allocation
>   [2/7]: oid_array: use size_t for iteration
>   [3/7]: oid_array: rename source file from sha1-array
>   [4/7]: test-tool: rename sha1-array to oid-array
>   [5/7]: bisect: stop referring to sha1_array
>   [6/7]: ref-filter: stop referring to "sha1 array"
>   [7/7]: oidset: stop referring to sha1-array

Thanks for this. I reviewed the patch, which was a breeze thanks to how
you broke it out. I don't think that I said anything useful in my actual
review, which is to say that this looks good from my perspective.

Sorry that I let it sit in my inbox for a few days. Trying to get better
about that :).

  Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-04-15  0:35 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-30 14:02 [PATCH 0/7] oid_array cleanups Jeff King
2020-03-30 14:03 ` [PATCH 1/7] oid_array: use size_t for count and allocation Jeff King
2020-03-30 14:09   ` Jeff King
2020-04-15  0:27   ` Taylor Blau
2020-03-30 14:03 ` [PATCH 2/7] oid_array: use size_t for iteration Jeff King
2020-03-30 14:03 ` [PATCH 3/7] oid_array: rename source file from sha1-array Jeff King
2020-04-15  0:34   ` Taylor Blau
2020-03-30 14:04 ` [PATCH 4/7] test-tool: rename sha1-array to oid-array Jeff King
2020-03-30 14:04 ` [PATCH 5/7] bisect: stop referring to sha1_array Jeff King
2020-03-30 14:04 ` [PATCH 6/7] ref-filter: stop referring to "sha1 array" Jeff King
2020-03-30 14:04 ` [PATCH 7/7] oidset: stop referring to sha1-array Jeff King
2020-04-15  0:35 ` [PATCH 0/7] oid_array cleanups Taylor Blau

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).