git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/7] reftable: avoid reading and writing empty keys
@ 2022-01-12 18:07 Han-Wen Nienhuys via GitGitGadget
  2022-01-12 18:07 ` [PATCH 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
                   ` (7 more replies)
  0 siblings, 8 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-01-12 18:07 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys

this series makes sure that the object record does not have to consider
empty keys (and therefore, a NULL memcpy destination)

while we're at it add some more tests, and fix a naming mistake.

Han-Wen Nienhuys (7):
  Documentation: object_id_len goes up to 31
  reftable: reject 0 object_id_len
  reftable: add a test that verifies that writing empty keys fails
  reftable: avoid writing empty keys at the block layer
  reftable: ensure that obj_id_len is >= 2 on writing
  reftable: add test for length of disambiguating prefix
  reftable: rename writer_stats to reftable_writer_stats

 Documentation/technical/reftable.txt |   2 +-
 reftable/block.c                     |  27 ++++---
 reftable/block_test.c                |   5 ++
 reftable/reader.c                    |   5 ++
 reftable/readwrite_test.c            | 103 ++++++++++++++++++++++++++-
 reftable/reftable-writer.h           |   2 +-
 reftable/writer.c                    |   9 +--
 7 files changed, 135 insertions(+), 18 deletions(-)


base-commit: 90d242d36e248acfae0033274b524bfa55a947fd
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1185%2Fhanwen%2Fobj-id-len-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1185/hanwen/obj-id-len-v1
Pull-Request: https://github.com/git/git/pull/1185
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/7] Documentation: object_id_len goes up to 31
  2022-01-12 18:07 [PATCH 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
@ 2022-01-12 18:07 ` Han-Wen Nienhuys via GitGitGadget
  2022-01-12 18:07 ` [PATCH 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-01-12 18:07 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The value is stored in a 5-bit field, so we can't support more without
a format version upgrade.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Documentation/technical/reftable.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/reftable.txt b/Documentation/technical/reftable.txt
index d7c3b645cfb..6a67cc4174f 100644
--- a/Documentation/technical/reftable.txt
+++ b/Documentation/technical/reftable.txt
@@ -443,7 +443,7 @@ Obj block format
 Object blocks are optional. Writers may choose to omit object blocks,
 especially if readers will not use the object name to ref mapping.
 
-Object blocks use unique, abbreviated 2-32 object name keys, mapping to
+Object blocks use unique, abbreviated 2-31 byte object name keys, mapping to
 ref blocks containing references pointing to that object directly, or as
 the peeled value of an annotated tag. Like ref blocks, object blocks use
 the file's standard block size. The abbreviation length is available in
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/7] reftable: reject 0 object_id_len
  2022-01-12 18:07 [PATCH 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
  2022-01-12 18:07 ` [PATCH 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
@ 2022-01-12 18:07 ` Han-Wen Nienhuys via GitGitGadget
  2022-01-12 18:07 ` [PATCH 3/7] reftable: add a test that verifies that writing empty keys fails Han-Wen Nienhuys via GitGitGadget
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-01-12 18:07 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The spec says 2 <= object_id_len <= 31. We are lenient and allow 1,
but we forbid 0, so we can we can be sure that we never read a
0-length key.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/reader.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/reftable/reader.c b/reftable/reader.c
index 006709a645a..21843f5e935 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -155,6 +155,11 @@ static int parse_footer(struct reftable_reader *r, uint8_t *footer,
 	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
 				     r->log_offsets.offset > 0);
 	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	if (r->obj_offsets.is_present && !r->object_id_len) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
 	err = 0;
 done:
 	return err;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/7] reftable: add a test that verifies that writing empty keys fails
  2022-01-12 18:07 [PATCH 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
  2022-01-12 18:07 ` [PATCH 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
  2022-01-12 18:07 ` [PATCH 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
@ 2022-01-12 18:07 ` Han-Wen Nienhuys via GitGitGadget
  2022-01-12 18:07 ` [PATCH 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-01-12 18:07 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Empty keys can only be written as ref records with empty names. The
log record has a logical timestamp in the key, so the key is never
empty.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index 70c7aedba2c..a315c8992e8 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -602,6 +602,29 @@ static void test_write_empty_table(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_empty_key(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record ref = {
+		.refname = "",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_DELETION,
+	};
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+	err = reftable_writer_add_ref(w, &ref);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	err = reftable_writer_close(w);
+	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
 static void test_write_key_order(void)
 {
 	struct reftable_write_options opts = { 0 };
@@ -681,6 +704,7 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_table_read_write_seek_index);
 	RUN_TEST(test_table_refs_for_no_index);
 	RUN_TEST(test_table_refs_for_obj_index);
+	RUN_TEST(test_write_empty_key);
 	RUN_TEST(test_write_empty_table);
 	RUN_TEST(test_log_overflow);
 	return 0;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 4/7] reftable: avoid writing empty keys at the block layer
  2022-01-12 18:07 [PATCH 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-01-12 18:07 ` [PATCH 3/7] reftable: add a test that verifies that writing empty keys fails Han-Wen Nienhuys via GitGitGadget
@ 2022-01-12 18:07 ` Han-Wen Nienhuys via GitGitGadget
  2022-01-14  1:26   ` Junio C Hamano
  2022-01-12 18:07 ` [PATCH 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-01-12 18:07 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The public interface (reftable_writer) already ensures that keys are
written in strictly increasing order, and an empty key by definition
fails this check.

However, by also enforcing this at the block layer, it is easier to
verify that records (which are written into blocks) never have to
consider the possibility of empty keys.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/block.c      | 27 +++++++++++++++++----------
 reftable/block_test.c |  5 +++++
 reftable/writer.c     |  3 +--
 3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/reftable/block.c b/reftable/block.c
index 855e3f5c947..8725eaaf64f 100644
--- a/reftable/block.c
+++ b/reftable/block.c
@@ -88,8 +88,9 @@ uint8_t block_writer_type(struct block_writer *bw)
 	return bw->buf[bw->header_off];
 }
 
-/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
-   success */
+/* Adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success. Returns REFTABLE_API_ERROR if attempting to write a record with
+   empty key. */
 int block_writer_add(struct block_writer *w, struct reftable_record *rec)
 {
 	struct strbuf empty = STRBUF_INIT;
@@ -105,8 +106,14 @@ int block_writer_add(struct block_writer *w, struct reftable_record *rec)
 	int is_restart = 0;
 	struct strbuf key = STRBUF_INIT;
 	int n = 0;
+	int err = -1;
 
 	reftable_record_key(rec, &key);
+	if (!key.len) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
 	n = reftable_encode_key(&is_restart, out, last, key,
 				reftable_record_val_type(rec));
 	if (n < 0)
@@ -118,16 +125,11 @@ int block_writer_add(struct block_writer *w, struct reftable_record *rec)
 		goto done;
 	string_view_consume(&out, n);
 
-	if (block_writer_register_restart(w, start.len - out.len, is_restart,
-					  &key) < 0)
-		goto done;
-
-	strbuf_release(&key);
-	return 0;
-
+	err = block_writer_register_restart(w, start.len - out.len, is_restart,
+					    &key);
 done:
 	strbuf_release(&key);
-	return -1;
+	return err;
 }
 
 int block_writer_finish(struct block_writer *w)
@@ -324,6 +326,9 @@ int block_iter_next(struct block_iter *it, struct reftable_record *rec)
 	if (n < 0)
 		return -1;
 
+	if (!key.len)
+		return REFTABLE_FORMAT_ERROR;
+
 	string_view_consume(&in, n);
 	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
 	if (n < 0)
@@ -350,6 +355,8 @@ int block_reader_first_key(struct block_reader *br, struct strbuf *key)
 	int n = reftable_decode_key(key, &extra, empty, in);
 	if (n < 0)
 		return n;
+	if (!key->len)
+		return -1;
 
 	return 0;
 }
diff --git a/reftable/block_test.c b/reftable/block_test.c
index 4b3ea262dcb..5112ddbf468 100644
--- a/reftable/block_test.c
+++ b/reftable/block_test.c
@@ -42,6 +42,11 @@ static void test_block_read_write(void)
 			  header_off, hash_size(GIT_SHA1_FORMAT_ID));
 	reftable_record_from_ref(&rec, &ref);
 
+	ref.refname = "";
+	ref.value_type = REFTABLE_REF_DELETION;
+	n = block_writer_add(&bw, &rec);
+	EXPECT(n == REFTABLE_API_ERROR);
+
 	for (i = 0; i < N; i++) {
 		char name[100];
 		uint8_t hash[GIT_SHA1_RAWSZ];
diff --git a/reftable/writer.c b/reftable/writer.c
index 35c8649c9b7..e3c042b9d84 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -238,14 +238,13 @@ static int writer_add_record(struct reftable_writer *w,
 
 	writer_reinit_block_writer(w, reftable_record_type(rec));
 	err = block_writer_add(w->block_writer, rec);
-	if (err < 0) {
+	if (err == -1) {
 		/* we are writing into memory, so an error can only mean it
 		 * doesn't fit. */
 		err = REFTABLE_ENTRY_TOO_BIG_ERROR;
 		goto done;
 	}
 
-	err = 0;
 done:
 	strbuf_release(&key);
 	return err;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 5/7] reftable: ensure that obj_id_len is >= 2 on writing
  2022-01-12 18:07 [PATCH 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                   ` (3 preceding siblings ...)
  2022-01-12 18:07 ` [PATCH 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
@ 2022-01-12 18:07 ` Han-Wen Nienhuys via GitGitGadget
  2022-01-12 18:07 ` [PATCH 6/7] reftable: add test for length of disambiguating prefix Han-Wen Nienhuys via GitGitGadget
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-01-12 18:07 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

When writing the same hash many times, we might decide to use a
length-1 object ID prefix for the ObjectID => ref table, which is out
of spec.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c | 37 +++++++++++++++++++++++++++++++++++++
 reftable/writer.c         |  4 +++-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index a315c8992e8..b4371b75724 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -602,6 +602,42 @@ static void test_write_empty_table(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_object_id_min_length(void)
+{
+	struct reftable_write_options opts = {
+		.block_size = 75,
+	};
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	uint8_t hash[GIT_SHA1_RAWSZ] = {42};
+	struct reftable_ref_record ref = {
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash,
+	};
+	int err;
+	int i;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	/* Write the same hash in many refs. If there is only 1 hash, the
+	 * disambiguating prefix is length 0 */
+	for (i = 0; i < 256; i++) {
+		char name[256];
+		snprintf(name, sizeof(name), "ref%05d", i);
+		ref.refname = name;
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	EXPECT(writer_stats(w)->object_id_len == 2);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
 static void test_write_empty_key(void)
 {
 	struct reftable_write_options opts = { 0 };
@@ -707,5 +743,6 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_write_empty_key);
 	RUN_TEST(test_write_empty_table);
 	RUN_TEST(test_log_overflow);
+	RUN_TEST(test_write_object_id_min_length);
 	return 0;
 }
diff --git a/reftable/writer.c b/reftable/writer.c
index e3c042b9d84..f94af531351 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -508,7 +508,9 @@ static void object_record_free(void *void_arg, void *key)
 static int writer_dump_object_index(struct reftable_writer *w)
 {
 	struct write_record_arg closure = { .w = w };
-	struct common_prefix_arg common = { NULL };
+	struct common_prefix_arg common = {
+		.max = 1,		/* obj_id_len should be >= 2. */
+	};
 	if (w->obj_index_tree) {
 		infix_walk(w->obj_index_tree, &update_common, &common);
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 6/7] reftable: add test for length of disambiguating prefix
  2022-01-12 18:07 [PATCH 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                   ` (4 preceding siblings ...)
  2022-01-12 18:07 ` [PATCH 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
@ 2022-01-12 18:07 ` Han-Wen Nienhuys via GitGitGadget
  2022-01-12 18:07 ` [PATCH 7/7] reftable: rename writer_stats to reftable_writer_stats Han-Wen Nienhuys via GitGitGadget
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-01-12 18:07 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The ID => ref map is trimming object IDs to a disambiguating prefix.
Check that we are computing their length correctly.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index b4371b75724..5f45cee39d6 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -638,6 +638,43 @@ static void test_write_object_id_min_length(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_object_id_length(void)
+{
+	struct reftable_write_options opts = {
+		.block_size = 75,
+	};
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	uint8_t hash[GIT_SHA1_RAWSZ] = {42};
+	struct reftable_ref_record ref = {
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash,
+	};
+	int err;
+	int i;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	/* Write the same hash in many refs. If there is only 1 hash, the
+	 * disambiguating prefix is length 0 */
+	for (i = 0; i < 256; i++) {
+		char name[256];
+		snprintf(name, sizeof(name), "ref%05d", i);
+		ref.refname = name;
+		ref.value.val1[15] = i;
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	EXPECT(writer_stats(w)->object_id_len == 16);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
 static void test_write_empty_key(void)
 {
 	struct reftable_write_options opts = { 0 };
@@ -743,6 +780,7 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_write_empty_key);
 	RUN_TEST(test_write_empty_table);
 	RUN_TEST(test_log_overflow);
+	RUN_TEST(test_write_object_id_length);
 	RUN_TEST(test_write_object_id_min_length);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 7/7] reftable: rename writer_stats to reftable_writer_stats
  2022-01-12 18:07 [PATCH 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                   ` (5 preceding siblings ...)
  2022-01-12 18:07 ` [PATCH 6/7] reftable: add test for length of disambiguating prefix Han-Wen Nienhuys via GitGitGadget
@ 2022-01-12 18:07 ` Han-Wen Nienhuys via GitGitGadget
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-01-12 18:07 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This function is part of the reftable API, so it should use the
reftable_ prefix

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c  | 8 ++++----
 reftable/reftable-writer.h | 2 +-
 reftable/writer.c          | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index 5f45cee39d6..b1ff46c18a9 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -100,7 +100,7 @@ static void write_table(char ***names, struct strbuf *buf, int N,
 	n = reftable_writer_close(w);
 	EXPECT(n == 0);
 
-	stats = writer_stats(w);
+	stats = reftable_writer_stats(w);
 	for (i = 0; i < stats->ref_stats.blocks; i++) {
 		int off = i * opts.block_size;
 		if (off == 0) {
@@ -239,7 +239,7 @@ static void test_log_write_read(void)
 	n = reftable_writer_close(w);
 	EXPECT(n == 0);
 
-	stats = writer_stats(w);
+	stats = reftable_writer_stats(w);
 	EXPECT(stats->log_stats.blocks > 0);
 	reftable_writer_free(w);
 	w = NULL;
@@ -633,7 +633,7 @@ static void test_write_object_id_min_length(void)
 
 	err = reftable_writer_close(w);
 	EXPECT_ERR(err);
-	EXPECT(writer_stats(w)->object_id_len == 2);
+	EXPECT(reftable_writer_stats(w)->object_id_len == 2);
 	reftable_writer_free(w);
 	strbuf_release(&buf);
 }
@@ -670,7 +670,7 @@ static void test_write_object_id_length(void)
 
 	err = reftable_writer_close(w);
 	EXPECT_ERR(err);
-	EXPECT(writer_stats(w)->object_id_len == 16);
+	EXPECT(reftable_writer_stats(w)->object_id_len == 16);
 	reftable_writer_free(w);
 	strbuf_release(&buf);
 }
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index a560dc17255..db8de197f6c 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -143,7 +143,7 @@ int reftable_writer_close(struct reftable_writer *w);
 
    This struct becomes invalid when the writer is freed.
  */
-const struct reftable_stats *writer_stats(struct reftable_writer *w);
+const struct reftable_stats *reftable_writer_stats(struct reftable_writer *w);
 
 /* reftable_writer_free deallocates memory for the writer */
 void reftable_writer_free(struct reftable_writer *w);
diff --git a/reftable/writer.c b/reftable/writer.c
index f94af531351..495784e6294 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -688,7 +688,7 @@ static int writer_flush_block(struct reftable_writer *w)
 	return writer_flush_nonempty_block(w);
 }
 
-const struct reftable_stats *writer_stats(struct reftable_writer *w)
+const struct reftable_stats *reftable_writer_stats(struct reftable_writer *w)
 {
 	return &w->stats;
 }
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/7] reftable: avoid writing empty keys at the block layer
  2022-01-12 18:07 ` [PATCH 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
@ 2022-01-14  1:26   ` Junio C Hamano
  2022-01-17 13:10     ` Han-Wen Nienhuys
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2022-01-14  1:26 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget; +Cc: git, Han-Wen Nienhuys, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/reftable/block_test.c b/reftable/block_test.c
> index 4b3ea262dcb..5112ddbf468 100644
> --- a/reftable/block_test.c
> +++ b/reftable/block_test.c
> @@ -42,6 +42,11 @@ static void test_block_read_write(void)
>  			  header_off, hash_size(GIT_SHA1_FORMAT_ID));
>  	reftable_record_from_ref(&rec, &ref);
>  
> +	ref.refname = "";
> +	ref.value_type = REFTABLE_REF_DELETION;
> +	n = block_writer_add(&bw, &rec);
> +	EXPECT(n == REFTABLE_API_ERROR);
> +

The preimage of this hunk has been invalidated by your 9c498398
(reftable: make reftable_record a tagged union, 2021-12-22).

I see that the hn/reftable-coverity-fixes topic, which the commit is
a part of, has been expecting a reroll since last year---are you
plannning to rebuild that series after landing this series first?


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/7] reftable: avoid writing empty keys at the block layer
  2022-01-14  1:26   ` Junio C Hamano
@ 2022-01-17 13:10     ` Han-Wen Nienhuys
  2022-01-17 19:11       ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Han-Wen Nienhuys @ 2022-01-17 13:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

On Fri, Jan 14, 2022 at 2:26 AM Junio C Hamano <gitster@pobox.com> wrote:
> > +     ref.refname = "";
> > +     ref.value_type = REFTABLE_REF_DELETION;
> > +     n = block_writer_add(&bw, &rec);
> > +     EXPECT(n == REFTABLE_API_ERROR);
> > +
>
> The preimage of this hunk has been invalidated by your 9c498398
> (reftable: make reftable_record a tagged union, 2021-12-22).
>
> I see that the hn/reftable-coverity-fixes topic, which the commit is
> a part of, has been expecting a reroll since last year---are you

I sent a reroll on Jan 12. As that has had more scrutiny, it's best if
that lands first. This series is less urgent.

> plannning to rebuild that series after landing this series first?

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/7] reftable: avoid writing empty keys at the block layer
  2022-01-17 13:10     ` Han-Wen Nienhuys
@ 2022-01-17 19:11       ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2022-01-17 19:11 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

Han-Wen Nienhuys <hanwen@google.com> writes:

>> I see that the hn/reftable-coverity-fixes topic, which the commit is
>> a part of, has been expecting a reroll since last year---are you
>
> I sent a reroll on Jan 12. As that has had more scrutiny, it's best if
> that lands first. This series is less urgent.

Hmph, it probably was missed during pre-release freeze.

https://lore.kernel.org/git/e16bf0c5212ae85daa0d6aa2c78d551824b542bd.1640199396.git.gitgitgadget@gmail.com/

not showing the updated ones did not help, either but stuff outside
the upcoming release are not urgent right now, so it is OK.

Thanks.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 0/7] reftable: avoid reading and writing empty keys
  2022-01-12 18:07 [PATCH 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                   ` (6 preceding siblings ...)
  2022-01-12 18:07 ` [PATCH 7/7] reftable: rename writer_stats to reftable_writer_stats Han-Wen Nienhuys via GitGitGadget
@ 2022-02-17 13:55 ` Han-Wen Nienhuys via GitGitGadget
  2022-02-17 13:55   ` [PATCH v2 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
                     ` (8 more replies)
  7 siblings, 9 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-17 13:55 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

this series makes sure that the object record does not have to consider
empty keys (and therefore, a NULL memcpy destination)

while we're at it add some more tests, and fix a naming mistake.

Han-Wen Nienhuys (7):
  Documentation: object_id_len goes up to 31
  reftable: reject 0 object_id_len
  reftable: add a test that verifies that writing empty keys fails
  reftable: avoid writing empty keys at the block layer
  reftable: ensure that obj_id_len is >= 2 on writing
  reftable: add test for length of disambiguating prefix
  reftable: rename writer_stats to reftable_writer_stats

 Documentation/technical/reftable.txt |   2 +-
 reftable/block.c                     |  27 ++++---
 reftable/block_test.c                |   5 ++
 reftable/reader.c                    |   5 ++
 reftable/readwrite_test.c            | 105 ++++++++++++++++++++++++++-
 reftable/reftable-writer.h           |   2 +-
 reftable/writer.c                    |   9 ++-
 7 files changed, 136 insertions(+), 19 deletions(-)


base-commit: 45fe28c951c3e70666ee4ef8379772851a8e4d32
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1185%2Fhanwen%2Fobj-id-len-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1185/hanwen/obj-id-len-v2
Pull-Request: https://github.com/git/git/pull/1185

Range-diff vs v1:

 1:  2d2e1177ff7 = 1:  80d29e8f269 Documentation: object_id_len goes up to 31
 2:  747c9e9a4c8 = 2:  4c1a19fc4ae reftable: reject 0 object_id_len
 3:  4eefedb0d07 = 3:  600b115f8b1 reftable: add a test that verifies that writing empty keys fails
 4:  e4c1cc58265 ! 4:  ba036ee8543 reftable: avoid writing empty keys at the block layer
     @@ reftable/block.c: int block_reader_first_key(struct block_reader *br, struct str
      
       ## reftable/block_test.c ##
      @@ reftable/block_test.c: static void test_block_read_write(void)
     + 	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
       			  header_off, hash_size(GIT_SHA1_FORMAT_ID));
     - 	reftable_record_from_ref(&rec, &ref);
       
     -+	ref.refname = "";
     -+	ref.value_type = REFTABLE_REF_DELETION;
     ++	rec.u.ref.refname = "";
     ++	rec.u.ref.value_type = REFTABLE_REF_DELETION;
      +	n = block_writer_add(&bw, &rec);
      +	EXPECT(n == REFTABLE_API_ERROR);
      +
 5:  3a72aba447c = 5:  2bd3d44ba57 reftable: ensure that obj_id_len is >= 2 on writing
 6:  a5dfa048884 = 6:  82d36ee0e0d reftable: add test for length of disambiguating prefix
 7:  37aa7744c84 ! 7:  c6ffdb3471c reftable: rename writer_stats to reftable_writer_stats
     @@ reftable/readwrite_test.c: static void test_log_write_read(void)
       	n = reftable_writer_close(w);
       	EXPECT(n == 0);
       
     +-	stats = writer_stats(w);
     ++	stats = reftable_writer_stats(w);
     + 	EXPECT(stats->log_stats.blocks > 0);
     + 	reftable_writer_free(w);
     + 	w = NULL;
     +@@ reftable/readwrite_test.c: static void test_log_zlib_corruption(void)
     + 	n = reftable_writer_close(w);
     + 	EXPECT(n == 0);
     + 
      -	stats = writer_stats(w);
      +	stats = reftable_writer_stats(w);
       	EXPECT(stats->log_stats.blocks > 0);

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 1/7] Documentation: object_id_len goes up to 31
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
@ 2022-02-17 13:55   ` Han-Wen Nienhuys via GitGitGadget
  2022-02-17 13:55   ` [PATCH v2 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-17 13:55 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The value is stored in a 5-bit field, so we can't support more without
a format version upgrade.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Documentation/technical/reftable.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/reftable.txt b/Documentation/technical/reftable.txt
index d7c3b645cfb..6a67cc4174f 100644
--- a/Documentation/technical/reftable.txt
+++ b/Documentation/technical/reftable.txt
@@ -443,7 +443,7 @@ Obj block format
 Object blocks are optional. Writers may choose to omit object blocks,
 especially if readers will not use the object name to ref mapping.
 
-Object blocks use unique, abbreviated 2-32 object name keys, mapping to
+Object blocks use unique, abbreviated 2-31 byte object name keys, mapping to
 ref blocks containing references pointing to that object directly, or as
 the peeled value of an annotated tag. Like ref blocks, object blocks use
 the file's standard block size. The abbreviation length is available in
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 2/7] reftable: reject 0 object_id_len
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
  2022-02-17 13:55   ` [PATCH v2 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
@ 2022-02-17 13:55   ` Han-Wen Nienhuys via GitGitGadget
  2022-02-18  0:32     ` Junio C Hamano
  2022-02-17 13:55   ` [PATCH v2 3/7] reftable: add a test that verifies that writing empty keys fails Han-Wen Nienhuys via GitGitGadget
                     ` (6 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-17 13:55 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The spec says 2 <= object_id_len <= 31. We are lenient and allow 1,
but we forbid 0, so we can we can be sure that we never read a
0-length key.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/reader.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/reftable/reader.c b/reftable/reader.c
index 00906e7a2de..54b4025105c 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -155,6 +155,11 @@ static int parse_footer(struct reftable_reader *r, uint8_t *footer,
 	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
 				     r->log_offsets.offset > 0);
 	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	if (r->obj_offsets.is_present && !r->object_id_len) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
 	err = 0;
 done:
 	return err;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 3/7] reftable: add a test that verifies that writing empty keys fails
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
  2022-02-17 13:55   ` [PATCH v2 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
  2022-02-17 13:55   ` [PATCH v2 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
@ 2022-02-17 13:55   ` Han-Wen Nienhuys via GitGitGadget
  2022-02-17 13:55   ` [PATCH v2 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-17 13:55 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Empty keys can only be written as ref records with empty names. The
log record has a logical timestamp in the key, so the key is never
empty.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index 605ba0f9fd4..fd5922e55f6 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -667,6 +667,29 @@ static void test_write_empty_table(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_empty_key(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record ref = {
+		.refname = "",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_DELETION,
+	};
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+	err = reftable_writer_add_ref(w, &ref);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	err = reftable_writer_close(w);
+	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
 static void test_write_key_order(void)
 {
 	struct reftable_write_options opts = { 0 };
@@ -746,6 +769,7 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_table_read_write_seek_index);
 	RUN_TEST(test_table_refs_for_no_index);
 	RUN_TEST(test_table_refs_for_obj_index);
+	RUN_TEST(test_write_empty_key);
 	RUN_TEST(test_write_empty_table);
 	RUN_TEST(test_log_overflow);
 	return 0;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 4/7] reftable: avoid writing empty keys at the block layer
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-02-17 13:55   ` [PATCH v2 3/7] reftable: add a test that verifies that writing empty keys fails Han-Wen Nienhuys via GitGitGadget
@ 2022-02-17 13:55   ` Han-Wen Nienhuys via GitGitGadget
  2022-02-17 23:55     ` Junio C Hamano
  2022-02-17 13:55   ` [PATCH v2 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
                     ` (4 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-17 13:55 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The public interface (reftable_writer) already ensures that keys are
written in strictly increasing order, and an empty key by definition
fails this check.

However, by also enforcing this at the block layer, it is easier to
verify that records (which are written into blocks) never have to
consider the possibility of empty keys.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/block.c      | 27 +++++++++++++++++----------
 reftable/block_test.c |  5 +++++
 reftable/writer.c     |  3 +--
 3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/reftable/block.c b/reftable/block.c
index 2170748c5e9..4a095afe1e2 100644
--- a/reftable/block.c
+++ b/reftable/block.c
@@ -88,8 +88,9 @@ uint8_t block_writer_type(struct block_writer *bw)
 	return bw->buf[bw->header_off];
 }
 
-/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
-   success */
+/* Adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success. Returns REFTABLE_API_ERROR if attempting to write a record with
+   empty key. */
 int block_writer_add(struct block_writer *w, struct reftable_record *rec)
 {
 	struct strbuf empty = STRBUF_INIT;
@@ -105,8 +106,14 @@ int block_writer_add(struct block_writer *w, struct reftable_record *rec)
 	int is_restart = 0;
 	struct strbuf key = STRBUF_INIT;
 	int n = 0;
+	int err = -1;
 
 	reftable_record_key(rec, &key);
+	if (!key.len) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
 	n = reftable_encode_key(&is_restart, out, last, key,
 				reftable_record_val_type(rec));
 	if (n < 0)
@@ -118,16 +125,11 @@ int block_writer_add(struct block_writer *w, struct reftable_record *rec)
 		goto done;
 	string_view_consume(&out, n);
 
-	if (block_writer_register_restart(w, start.len - out.len, is_restart,
-					  &key) < 0)
-		goto done;
-
-	strbuf_release(&key);
-	return 0;
-
+	err = block_writer_register_restart(w, start.len - out.len, is_restart,
+					    &key);
 done:
 	strbuf_release(&key);
-	return -1;
+	return err;
 }
 
 int block_writer_finish(struct block_writer *w)
@@ -332,6 +334,9 @@ int block_iter_next(struct block_iter *it, struct reftable_record *rec)
 	if (n < 0)
 		return -1;
 
+	if (!key.len)
+		return REFTABLE_FORMAT_ERROR;
+
 	string_view_consume(&in, n);
 	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
 	if (n < 0)
@@ -358,6 +363,8 @@ int block_reader_first_key(struct block_reader *br, struct strbuf *key)
 	int n = reftable_decode_key(key, &extra, empty, in);
 	if (n < 0)
 		return n;
+	if (!key->len)
+		return -1;
 
 	return 0;
 }
diff --git a/reftable/block_test.c b/reftable/block_test.c
index fa2ee092ec0..cb88af4a563 100644
--- a/reftable/block_test.c
+++ b/reftable/block_test.c
@@ -42,6 +42,11 @@ static void test_block_read_write(void)
 	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
 			  header_off, hash_size(GIT_SHA1_FORMAT_ID));
 
+	rec.u.ref.refname = "";
+	rec.u.ref.value_type = REFTABLE_REF_DELETION;
+	n = block_writer_add(&bw, &rec);
+	EXPECT(n == REFTABLE_API_ERROR);
+
 	for (i = 0; i < N; i++) {
 		char name[100];
 		uint8_t hash[GIT_SHA1_RAWSZ];
diff --git a/reftable/writer.c b/reftable/writer.c
index 944c2329ab5..d54215a50dc 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -240,14 +240,13 @@ static int writer_add_record(struct reftable_writer *w,
 
 	writer_reinit_block_writer(w, reftable_record_type(rec));
 	err = block_writer_add(w->block_writer, rec);
-	if (err < 0) {
+	if (err == -1) {
 		/* we are writing into memory, so an error can only mean it
 		 * doesn't fit. */
 		err = REFTABLE_ENTRY_TOO_BIG_ERROR;
 		goto done;
 	}
 
-	err = 0;
 done:
 	strbuf_release(&key);
 	return err;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 5/7] reftable: ensure that obj_id_len is >= 2 on writing
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                     ` (3 preceding siblings ...)
  2022-02-17 13:55   ` [PATCH v2 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
@ 2022-02-17 13:55   ` Han-Wen Nienhuys via GitGitGadget
  2022-02-18  0:01     ` Junio C Hamano
  2022-02-17 13:55   ` [PATCH v2 6/7] reftable: add test for length of disambiguating prefix Han-Wen Nienhuys via GitGitGadget
                     ` (3 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-17 13:55 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

When writing the same hash many times, we might decide to use a
length-1 object ID prefix for the ObjectID => ref table, which is out
of spec.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c | 37 +++++++++++++++++++++++++++++++++++++
 reftable/writer.c         |  4 +++-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index fd5922e55f6..35142eb070e 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -667,6 +667,42 @@ static void test_write_empty_table(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_object_id_min_length(void)
+{
+	struct reftable_write_options opts = {
+		.block_size = 75,
+	};
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	uint8_t hash[GIT_SHA1_RAWSZ] = {42};
+	struct reftable_ref_record ref = {
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash,
+	};
+	int err;
+	int i;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	/* Write the same hash in many refs. If there is only 1 hash, the
+	 * disambiguating prefix is length 0 */
+	for (i = 0; i < 256; i++) {
+		char name[256];
+		snprintf(name, sizeof(name), "ref%05d", i);
+		ref.refname = name;
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	EXPECT(writer_stats(w)->object_id_len == 2);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
 static void test_write_empty_key(void)
 {
 	struct reftable_write_options opts = { 0 };
@@ -772,5 +808,6 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_write_empty_key);
 	RUN_TEST(test_write_empty_table);
 	RUN_TEST(test_log_overflow);
+	RUN_TEST(test_write_object_id_min_length);
 	return 0;
 }
diff --git a/reftable/writer.c b/reftable/writer.c
index d54215a50dc..5e4e6e93416 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -515,7 +515,9 @@ static void object_record_free(void *void_arg, void *key)
 static int writer_dump_object_index(struct reftable_writer *w)
 {
 	struct write_record_arg closure = { .w = w };
-	struct common_prefix_arg common = { NULL };
+	struct common_prefix_arg common = {
+		.max = 1,		/* obj_id_len should be >= 2. */
+	};
 	if (w->obj_index_tree) {
 		infix_walk(w->obj_index_tree, &update_common, &common);
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 6/7] reftable: add test for length of disambiguating prefix
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                     ` (4 preceding siblings ...)
  2022-02-17 13:55   ` [PATCH v2 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
@ 2022-02-17 13:55   ` Han-Wen Nienhuys via GitGitGadget
  2022-02-17 13:55   ` [PATCH v2 7/7] reftable: rename writer_stats to reftable_writer_stats Han-Wen Nienhuys via GitGitGadget
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-17 13:55 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The ID => ref map is trimming object IDs to a disambiguating prefix.
Check that we are computing their length correctly.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index 35142eb070e..a1b835785a3 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -703,6 +703,43 @@ static void test_write_object_id_min_length(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_object_id_length(void)
+{
+	struct reftable_write_options opts = {
+		.block_size = 75,
+	};
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	uint8_t hash[GIT_SHA1_RAWSZ] = {42};
+	struct reftable_ref_record ref = {
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash,
+	};
+	int err;
+	int i;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	/* Write the same hash in many refs. If there is only 1 hash, the
+	 * disambiguating prefix is length 0 */
+	for (i = 0; i < 256; i++) {
+		char name[256];
+		snprintf(name, sizeof(name), "ref%05d", i);
+		ref.refname = name;
+		ref.value.val1[15] = i;
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	EXPECT(writer_stats(w)->object_id_len == 16);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
 static void test_write_empty_key(void)
 {
 	struct reftable_write_options opts = { 0 };
@@ -808,6 +845,7 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_write_empty_key);
 	RUN_TEST(test_write_empty_table);
 	RUN_TEST(test_log_overflow);
+	RUN_TEST(test_write_object_id_length);
 	RUN_TEST(test_write_object_id_min_length);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 7/7] reftable: rename writer_stats to reftable_writer_stats
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                     ` (5 preceding siblings ...)
  2022-02-17 13:55   ` [PATCH v2 6/7] reftable: add test for length of disambiguating prefix Han-Wen Nienhuys via GitGitGadget
@ 2022-02-17 13:55   ` Han-Wen Nienhuys via GitGitGadget
  2022-02-18  0:02   ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Junio C Hamano
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
  8 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-17 13:55 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This function is part of the reftable API, so it should use the
reftable_ prefix

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c  | 10 +++++-----
 reftable/reftable-writer.h |  2 +-
 reftable/writer.c          |  2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index a1b835785a3..469ab79a5ad 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -100,7 +100,7 @@ static void write_table(char ***names, struct strbuf *buf, int N,
 	n = reftable_writer_close(w);
 	EXPECT(n == 0);
 
-	stats = writer_stats(w);
+	stats = reftable_writer_stats(w);
 	for (i = 0; i < stats->ref_stats.blocks; i++) {
 		int off = i * opts.block_size;
 		if (off == 0) {
@@ -239,7 +239,7 @@ static void test_log_write_read(void)
 	n = reftable_writer_close(w);
 	EXPECT(n == 0);
 
-	stats = writer_stats(w);
+	stats = reftable_writer_stats(w);
 	EXPECT(stats->log_stats.blocks > 0);
 	reftable_writer_free(w);
 	w = NULL;
@@ -330,7 +330,7 @@ static void test_log_zlib_corruption(void)
 	n = reftable_writer_close(w);
 	EXPECT(n == 0);
 
-	stats = writer_stats(w);
+	stats = reftable_writer_stats(w);
 	EXPECT(stats->log_stats.blocks > 0);
 	reftable_writer_free(w);
 	w = NULL;
@@ -698,7 +698,7 @@ static void test_write_object_id_min_length(void)
 
 	err = reftable_writer_close(w);
 	EXPECT_ERR(err);
-	EXPECT(writer_stats(w)->object_id_len == 2);
+	EXPECT(reftable_writer_stats(w)->object_id_len == 2);
 	reftable_writer_free(w);
 	strbuf_release(&buf);
 }
@@ -735,7 +735,7 @@ static void test_write_object_id_length(void)
 
 	err = reftable_writer_close(w);
 	EXPECT_ERR(err);
-	EXPECT(writer_stats(w)->object_id_len == 16);
+	EXPECT(reftable_writer_stats(w)->object_id_len == 16);
 	reftable_writer_free(w);
 	strbuf_release(&buf);
 }
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index a560dc17255..db8de197f6c 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -143,7 +143,7 @@ int reftable_writer_close(struct reftable_writer *w);
 
    This struct becomes invalid when the writer is freed.
  */
-const struct reftable_stats *writer_stats(struct reftable_writer *w);
+const struct reftable_stats *reftable_writer_stats(struct reftable_writer *w);
 
 /* reftable_writer_free deallocates memory for the writer */
 void reftable_writer_free(struct reftable_writer *w);
diff --git a/reftable/writer.c b/reftable/writer.c
index 5e4e6e93416..6d979e245ff 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -695,7 +695,7 @@ static int writer_flush_block(struct reftable_writer *w)
 	return writer_flush_nonempty_block(w);
 }
 
-const struct reftable_stats *writer_stats(struct reftable_writer *w)
+const struct reftable_stats *reftable_writer_stats(struct reftable_writer *w)
 {
 	return &w->stats;
 }
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/7] reftable: avoid writing empty keys at the block layer
  2022-02-17 13:55   ` [PATCH v2 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
@ 2022-02-17 23:55     ` Junio C Hamano
  2022-02-21 14:32       ` Han-Wen Nienhuys
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2022-02-17 23:55 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget; +Cc: git, Han-Wen Nienhuys, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> @@ -105,8 +106,14 @@ int block_writer_add(struct block_writer *w, struct reftable_record *rec)
>  	int is_restart = 0;
>  	struct strbuf key = STRBUF_INIT;
>  	int n = 0;
> +	int err = -1;
>  
>  	reftable_record_key(rec, &key);
> +	if (!key.len) {
> +		err = REFTABLE_API_ERROR;
> +		goto done;
> +	}

OK; we get an API_ERROR when trying to write a bad one.  And ...

> @@ -332,6 +334,9 @@ int block_iter_next(struct block_iter *it, struct reftable_record *rec)
>  	if (n < 0)
>  		return -1;
>  
> +	if (!key.len)
> +		return REFTABLE_FORMAT_ERROR;

... we get a FORMAT_ERROR when the data we try to read is bad
(i.e. not our fault).  OK.

> @@ -358,6 +363,8 @@ int block_reader_first_key(struct block_reader *br, struct strbuf *key)
>  	int n = reftable_decode_key(key, &extra, empty, in);
>  	if (n < 0)
>  		return n;
> +	if (!key->len)
> +		return -1;

It is curious that this gets a different error out of the same
sequence, i.e. decode-key did not return an error but the length of
the key happens to be 0, not FORMAT_ERROR.

> diff --git a/reftable/writer.c b/reftable/writer.c
> index 944c2329ab5..d54215a50dc 100644
> --- a/reftable/writer.c
> +++ b/reftable/writer.c
> @@ -240,14 +240,13 @@ static int writer_add_record(struct reftable_writer *w,
>  
>  	writer_reinit_block_writer(w, reftable_record_type(rec));
>  	err = block_writer_add(w->block_writer, rec);
> -	if (err < 0) {
> +	if (err == -1) {
>  		/* we are writing into memory, so an error can only mean it
>  		 * doesn't fit. */
>  		err = REFTABLE_ENTRY_TOO_BIG_ERROR;
>  		goto done;
>  	}
>  
> -	err = 0;

Is this "doesn't fit" related to "we catch 0-length keys", or an
unrelated fix was included in this step by "rebase -i" mistake?


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 5/7] reftable: ensure that obj_id_len is >= 2 on writing
  2022-02-17 13:55   ` [PATCH v2 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
@ 2022-02-18  0:01     ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2022-02-18  0:01 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget; +Cc: git, Han-Wen Nienhuys, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/reftable/writer.c b/reftable/writer.c
> index d54215a50dc..5e4e6e93416 100644
> --- a/reftable/writer.c
> +++ b/reftable/writer.c
> @@ -515,7 +515,9 @@ static void object_record_free(void *void_arg, void *key)
>  static int writer_dump_object_index(struct reftable_writer *w)
>  {
>  	struct write_record_arg closure = { .w = w };
> -	struct common_prefix_arg common = { NULL };
> +	struct common_prefix_arg common = {
> +		.max = 1,		/* obj_id_len should be >= 2. */
> +	};

It feels somewhat strange that we have to set .max to set the floor
for the minimum length, but given the way update_common() uses and
maintains the member, it is more like "max size we have seen so
far", and by pretending that we have already seen common prefix of
length 1, we'd force ourselves that we need at least 2 to
differentiate.

>  	if (w->obj_index_tree) {
>  		infix_walk(w->obj_index_tree, &update_common, &common);
>  	}

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 0/7] reftable: avoid reading and writing empty keys
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                     ` (6 preceding siblings ...)
  2022-02-17 13:55   ` [PATCH v2 7/7] reftable: rename writer_stats to reftable_writer_stats Han-Wen Nienhuys via GitGitGadget
@ 2022-02-18  0:02   ` Junio C Hamano
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
  8 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2022-02-18  0:02 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget; +Cc: git, Han-Wen Nienhuys, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> this series makes sure that the object record does not have to consider
> empty keys (and therefore, a NULL memcpy destination)
>
> while we're at it add some more tests, and fix a naming mistake.

Looking good.  Will queue.

Thanks.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 2/7] reftable: reject 0 object_id_len
  2022-02-17 13:55   ` [PATCH v2 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
@ 2022-02-18  0:32     ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2022-02-18  0:32 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget; +Cc: git, Han-Wen Nienhuys, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Han-Wen Nienhuys <hanwen@google.com>
>
> The spec says 2 <= object_id_len <= 31. We are lenient and allow 1,
> but we forbid 0, so we can we can be sure that we never read a

s/we can we can/we can/;

> 0-length key.
>
> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
> ---
>  reftable/reader.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/reftable/reader.c b/reftable/reader.c
> index 00906e7a2de..54b4025105c 100644
> --- a/reftable/reader.c
> +++ b/reftable/reader.c
> @@ -155,6 +155,11 @@ static int parse_footer(struct reftable_reader *r, uint8_t *footer,
>  	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
>  				     r->log_offsets.offset > 0);
>  	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
> +	if (r->obj_offsets.is_present && !r->object_id_len) {
> +		err = REFTABLE_FORMAT_ERROR;
> +		goto done;
> +	}
> +
>  	err = 0;
>  done:
>  	return err;

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/7] reftable: avoid writing empty keys at the block layer
  2022-02-17 23:55     ` Junio C Hamano
@ 2022-02-21 14:32       ` Han-Wen Nienhuys
  0 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys @ 2022-02-21 14:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Han-Wen Nienhuys via GitGitGadget, git, Han-Wen Nienhuys

On Fri, Feb 18, 2022 at 12:55 AM Junio C Hamano <gitster@pobox.com> wrote:
> > @@ -358,6 +363,8 @@ int block_reader_first_key(struct block_reader *br, struct strbuf *key)
> >       int n = reftable_decode_key(key, &extra, empty, in);
> >       if (n < 0)
> >               return n;
> > +     if (!key->len)
> > +             return -1;
>
> It is curious that this gets a different error out of the same
> sequence, i.e. decode-key did not return an error but the length of
> the key happens to be 0, not FORMAT_ERROR.

fixed.

> > --- a/reftable/writer.c
> > +++ b/reftable/writer.c
> > @@ -240,14 +240,13 @@ static int writer_add_record(struct reftable_writer *w,
> >
> >       writer_reinit_block_writer(w, reftable_record_type(rec));
> >       err = block_writer_add(w->block_writer, rec);
> > -     if (err < 0) {
> > +     if (err == -1) {
> >               /* we are writing into memory, so an error can only mean it
> >                * doesn't fit. */
> >               err = REFTABLE_ENTRY_TOO_BIG_ERROR;
> >               goto done;
> >       }
> >
> > -     err = 0;
>
> Is this "doesn't fit" related to "we catch 0-length keys", or an
> unrelated fix was included in this step by "rebase -i" mistake?

We don't want to reinterpret API_ERROR (from block_writer_add) as
ENTRY_TOO_BIG_ERROR, so we have to tweak the condition here.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Liana Sebastian

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v3 0/7] reftable: avoid reading and writing empty keys
  2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
                     ` (7 preceding siblings ...)
  2022-02-18  0:02   ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Junio C Hamano
@ 2022-02-21 18:46   ` Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
                       ` (7 more replies)
  8 siblings, 8 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-21 18:46 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys

this series makes sure that the object record does not have to consider
empty keys (and therefore, a NULL memcpy destination)

while we're at it add some more tests, and fix a naming mistake.

Han-Wen Nienhuys (7):
  Documentation: object_id_len goes up to 31
  reftable: reject 0 object_id_len
  reftable: add a test that verifies that writing empty keys fails
  reftable: avoid writing empty keys at the block layer
  reftable: ensure that obj_id_len is >= 2 on writing
  reftable: add test for length of disambiguating prefix
  reftable: rename writer_stats to reftable_writer_stats

 Documentation/technical/reftable.txt |   2 +-
 reftable/block.c                     |  27 ++++---
 reftable/block_test.c                |   5 ++
 reftable/reader.c                    |   5 ++
 reftable/readwrite_test.c            | 105 ++++++++++++++++++++++++++-
 reftable/reftable-writer.h           |   2 +-
 reftable/writer.c                    |   9 ++-
 7 files changed, 136 insertions(+), 19 deletions(-)


base-commit: 45fe28c951c3e70666ee4ef8379772851a8e4d32
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1185%2Fhanwen%2Fobj-id-len-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1185/hanwen/obj-id-len-v3
Pull-Request: https://github.com/git/git/pull/1185

Range-diff vs v2:

 1:  80d29e8f269 = 1:  80d29e8f269 Documentation: object_id_len goes up to 31
 2:  4c1a19fc4ae ! 2:  68e7bc32ff8 reftable: reject 0 object_id_len
     @@ Commit message
          reftable: reject 0 object_id_len
      
          The spec says 2 <= object_id_len <= 31. We are lenient and allow 1,
     -    but we forbid 0, so we can we can be sure that we never read a
     -    0-length key.
     +    but we forbid 0, so we can be sure that we never read a 0-length key.
      
          Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
      
 3:  600b115f8b1 = 3:  8b5aebdb07a reftable: add a test that verifies that writing empty keys fails
 4:  ba036ee8543 ! 4:  a9372cacd1b reftable: avoid writing empty keys at the block layer
     @@ reftable/block.c: int block_reader_first_key(struct block_reader *br, struct str
       	if (n < 0)
       		return n;
      +	if (!key->len)
     -+		return -1;
     ++		return REFTABLE_FORMAT_ERROR;
       
       	return 0;
       }
 5:  2bd3d44ba57 = 5:  0b8a42399dd reftable: ensure that obj_id_len is >= 2 on writing
 6:  82d36ee0e0d = 6:  bdccd969475 reftable: add test for length of disambiguating prefix
 7:  c6ffdb3471c = 7:  72499a14e38 reftable: rename writer_stats to reftable_writer_stats

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v3 1/7] Documentation: object_id_len goes up to 31
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
@ 2022-02-21 18:46     ` Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-21 18:46 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The value is stored in a 5-bit field, so we can't support more without
a format version upgrade.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 Documentation/technical/reftable.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/reftable.txt b/Documentation/technical/reftable.txt
index d7c3b645cfb..6a67cc4174f 100644
--- a/Documentation/technical/reftable.txt
+++ b/Documentation/technical/reftable.txt
@@ -443,7 +443,7 @@ Obj block format
 Object blocks are optional. Writers may choose to omit object blocks,
 especially if readers will not use the object name to ref mapping.
 
-Object blocks use unique, abbreviated 2-32 object name keys, mapping to
+Object blocks use unique, abbreviated 2-31 byte object name keys, mapping to
 ref blocks containing references pointing to that object directly, or as
 the peeled value of an annotated tag. Like ref blocks, object blocks use
 the file's standard block size. The abbreviation length is available in
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 2/7] reftable: reject 0 object_id_len
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
@ 2022-02-21 18:46     ` Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 3/7] reftable: add a test that verifies that writing empty keys fails Han-Wen Nienhuys via GitGitGadget
                       ` (5 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-21 18:46 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The spec says 2 <= object_id_len <= 31. We are lenient and allow 1,
but we forbid 0, so we can be sure that we never read a 0-length key.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/reader.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/reftable/reader.c b/reftable/reader.c
index 00906e7a2de..54b4025105c 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -155,6 +155,11 @@ static int parse_footer(struct reftable_reader *r, uint8_t *footer,
 	r->log_offsets.is_present = (first_block_typ == BLOCK_TYPE_LOG ||
 				     r->log_offsets.offset > 0);
 	r->obj_offsets.is_present = r->obj_offsets.offset > 0;
+	if (r->obj_offsets.is_present && !r->object_id_len) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
 	err = 0;
 done:
 	return err;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 3/7] reftable: add a test that verifies that writing empty keys fails
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
@ 2022-02-21 18:46     ` Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-21 18:46 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

Empty keys can only be written as ref records with empty names. The
log record has a logical timestamp in the key, so the key is never
empty.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index 605ba0f9fd4..fd5922e55f6 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -667,6 +667,29 @@ static void test_write_empty_table(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_empty_key(void)
+{
+	struct reftable_write_options opts = { 0 };
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	struct reftable_ref_record ref = {
+		.refname = "",
+		.update_index = 1,
+		.value_type = REFTABLE_REF_DELETION,
+	};
+	int err;
+
+	reftable_writer_set_limits(w, 1, 1);
+	err = reftable_writer_add_ref(w, &ref);
+	EXPECT(err == REFTABLE_API_ERROR);
+
+	err = reftable_writer_close(w);
+	EXPECT(err == REFTABLE_EMPTY_TABLE_ERROR);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
 static void test_write_key_order(void)
 {
 	struct reftable_write_options opts = { 0 };
@@ -746,6 +769,7 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_table_read_write_seek_index);
 	RUN_TEST(test_table_refs_for_no_index);
 	RUN_TEST(test_table_refs_for_obj_index);
+	RUN_TEST(test_write_empty_key);
 	RUN_TEST(test_write_empty_table);
 	RUN_TEST(test_log_overflow);
 	return 0;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 4/7] reftable: avoid writing empty keys at the block layer
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-02-21 18:46     ` [PATCH v3 3/7] reftable: add a test that verifies that writing empty keys fails Han-Wen Nienhuys via GitGitGadget
@ 2022-02-21 18:46     ` Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
                       ` (3 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-21 18:46 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The public interface (reftable_writer) already ensures that keys are
written in strictly increasing order, and an empty key by definition
fails this check.

However, by also enforcing this at the block layer, it is easier to
verify that records (which are written into blocks) never have to
consider the possibility of empty keys.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/block.c      | 27 +++++++++++++++++----------
 reftable/block_test.c |  5 +++++
 reftable/writer.c     |  3 +--
 3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/reftable/block.c b/reftable/block.c
index 2170748c5e9..34d4d073692 100644
--- a/reftable/block.c
+++ b/reftable/block.c
@@ -88,8 +88,9 @@ uint8_t block_writer_type(struct block_writer *bw)
 	return bw->buf[bw->header_off];
 }
 
-/* adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
-   success */
+/* Adds the reftable_record to the block. Returns -1 if it does not fit, 0 on
+   success. Returns REFTABLE_API_ERROR if attempting to write a record with
+   empty key. */
 int block_writer_add(struct block_writer *w, struct reftable_record *rec)
 {
 	struct strbuf empty = STRBUF_INIT;
@@ -105,8 +106,14 @@ int block_writer_add(struct block_writer *w, struct reftable_record *rec)
 	int is_restart = 0;
 	struct strbuf key = STRBUF_INIT;
 	int n = 0;
+	int err = -1;
 
 	reftable_record_key(rec, &key);
+	if (!key.len) {
+		err = REFTABLE_API_ERROR;
+		goto done;
+	}
+
 	n = reftable_encode_key(&is_restart, out, last, key,
 				reftable_record_val_type(rec));
 	if (n < 0)
@@ -118,16 +125,11 @@ int block_writer_add(struct block_writer *w, struct reftable_record *rec)
 		goto done;
 	string_view_consume(&out, n);
 
-	if (block_writer_register_restart(w, start.len - out.len, is_restart,
-					  &key) < 0)
-		goto done;
-
-	strbuf_release(&key);
-	return 0;
-
+	err = block_writer_register_restart(w, start.len - out.len, is_restart,
+					    &key);
 done:
 	strbuf_release(&key);
-	return -1;
+	return err;
 }
 
 int block_writer_finish(struct block_writer *w)
@@ -332,6 +334,9 @@ int block_iter_next(struct block_iter *it, struct reftable_record *rec)
 	if (n < 0)
 		return -1;
 
+	if (!key.len)
+		return REFTABLE_FORMAT_ERROR;
+
 	string_view_consume(&in, n);
 	n = reftable_record_decode(rec, key, extra, in, it->br->hash_size);
 	if (n < 0)
@@ -358,6 +363,8 @@ int block_reader_first_key(struct block_reader *br, struct strbuf *key)
 	int n = reftable_decode_key(key, &extra, empty, in);
 	if (n < 0)
 		return n;
+	if (!key->len)
+		return REFTABLE_FORMAT_ERROR;
 
 	return 0;
 }
diff --git a/reftable/block_test.c b/reftable/block_test.c
index fa2ee092ec0..cb88af4a563 100644
--- a/reftable/block_test.c
+++ b/reftable/block_test.c
@@ -42,6 +42,11 @@ static void test_block_read_write(void)
 	block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size,
 			  header_off, hash_size(GIT_SHA1_FORMAT_ID));
 
+	rec.u.ref.refname = "";
+	rec.u.ref.value_type = REFTABLE_REF_DELETION;
+	n = block_writer_add(&bw, &rec);
+	EXPECT(n == REFTABLE_API_ERROR);
+
 	for (i = 0; i < N; i++) {
 		char name[100];
 		uint8_t hash[GIT_SHA1_RAWSZ];
diff --git a/reftable/writer.c b/reftable/writer.c
index 944c2329ab5..d54215a50dc 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -240,14 +240,13 @@ static int writer_add_record(struct reftable_writer *w,
 
 	writer_reinit_block_writer(w, reftable_record_type(rec));
 	err = block_writer_add(w->block_writer, rec);
-	if (err < 0) {
+	if (err == -1) {
 		/* we are writing into memory, so an error can only mean it
 		 * doesn't fit. */
 		err = REFTABLE_ENTRY_TOO_BIG_ERROR;
 		goto done;
 	}
 
-	err = 0;
 done:
 	strbuf_release(&key);
 	return err;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 5/7] reftable: ensure that obj_id_len is >= 2 on writing
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-02-21 18:46     ` [PATCH v3 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
@ 2022-02-21 18:46     ` Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 6/7] reftable: add test for length of disambiguating prefix Han-Wen Nienhuys via GitGitGadget
                       ` (2 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-21 18:46 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

When writing the same hash many times, we might decide to use a
length-1 object ID prefix for the ObjectID => ref table, which is out
of spec.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c | 37 +++++++++++++++++++++++++++++++++++++
 reftable/writer.c         |  4 +++-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index fd5922e55f6..35142eb070e 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -667,6 +667,42 @@ static void test_write_empty_table(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_object_id_min_length(void)
+{
+	struct reftable_write_options opts = {
+		.block_size = 75,
+	};
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	uint8_t hash[GIT_SHA1_RAWSZ] = {42};
+	struct reftable_ref_record ref = {
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash,
+	};
+	int err;
+	int i;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	/* Write the same hash in many refs. If there is only 1 hash, the
+	 * disambiguating prefix is length 0 */
+	for (i = 0; i < 256; i++) {
+		char name[256];
+		snprintf(name, sizeof(name), "ref%05d", i);
+		ref.refname = name;
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	EXPECT(writer_stats(w)->object_id_len == 2);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
 static void test_write_empty_key(void)
 {
 	struct reftable_write_options opts = { 0 };
@@ -772,5 +808,6 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_write_empty_key);
 	RUN_TEST(test_write_empty_table);
 	RUN_TEST(test_log_overflow);
+	RUN_TEST(test_write_object_id_min_length);
 	return 0;
 }
diff --git a/reftable/writer.c b/reftable/writer.c
index d54215a50dc..5e4e6e93416 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -515,7 +515,9 @@ static void object_record_free(void *void_arg, void *key)
 static int writer_dump_object_index(struct reftable_writer *w)
 {
 	struct write_record_arg closure = { .w = w };
-	struct common_prefix_arg common = { NULL };
+	struct common_prefix_arg common = {
+		.max = 1,		/* obj_id_len should be >= 2. */
+	};
 	if (w->obj_index_tree) {
 		infix_walk(w->obj_index_tree, &update_common, &common);
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 6/7] reftable: add test for length of disambiguating prefix
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
                       ` (4 preceding siblings ...)
  2022-02-21 18:46     ` [PATCH v3 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
@ 2022-02-21 18:46     ` Han-Wen Nienhuys via GitGitGadget
  2022-02-21 18:46     ` [PATCH v3 7/7] reftable: rename writer_stats to reftable_writer_stats Han-Wen Nienhuys via GitGitGadget
  2022-02-23 21:37     ` [PATCH v3 0/7] reftable: avoid reading and writing empty keys Junio C Hamano
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-21 18:46 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

The ID => ref map is trimming object IDs to a disambiguating prefix.
Check that we are computing their length correctly.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index 35142eb070e..a1b835785a3 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -703,6 +703,43 @@ static void test_write_object_id_min_length(void)
 	strbuf_release(&buf);
 }
 
+static void test_write_object_id_length(void)
+{
+	struct reftable_write_options opts = {
+		.block_size = 75,
+	};
+	struct strbuf buf = STRBUF_INIT;
+	struct reftable_writer *w =
+		reftable_new_writer(&strbuf_add_void, &buf, &opts);
+	uint8_t hash[GIT_SHA1_RAWSZ] = {42};
+	struct reftable_ref_record ref = {
+		.update_index = 1,
+		.value_type = REFTABLE_REF_VAL1,
+		.value.val1 = hash,
+	};
+	int err;
+	int i;
+
+	reftable_writer_set_limits(w, 1, 1);
+
+	/* Write the same hash in many refs. If there is only 1 hash, the
+	 * disambiguating prefix is length 0 */
+	for (i = 0; i < 256; i++) {
+		char name[256];
+		snprintf(name, sizeof(name), "ref%05d", i);
+		ref.refname = name;
+		ref.value.val1[15] = i;
+		err = reftable_writer_add_ref(w, &ref);
+		EXPECT_ERR(err);
+	}
+
+	err = reftable_writer_close(w);
+	EXPECT_ERR(err);
+	EXPECT(writer_stats(w)->object_id_len == 16);
+	reftable_writer_free(w);
+	strbuf_release(&buf);
+}
+
 static void test_write_empty_key(void)
 {
 	struct reftable_write_options opts = { 0 };
@@ -808,6 +845,7 @@ int readwrite_test_main(int argc, const char *argv[])
 	RUN_TEST(test_write_empty_key);
 	RUN_TEST(test_write_empty_table);
 	RUN_TEST(test_log_overflow);
+	RUN_TEST(test_write_object_id_length);
 	RUN_TEST(test_write_object_id_min_length);
 	return 0;
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 7/7] reftable: rename writer_stats to reftable_writer_stats
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
                       ` (5 preceding siblings ...)
  2022-02-21 18:46     ` [PATCH v3 6/7] reftable: add test for length of disambiguating prefix Han-Wen Nienhuys via GitGitGadget
@ 2022-02-21 18:46     ` Han-Wen Nienhuys via GitGitGadget
  2022-02-23 21:37     ` [PATCH v3 0/7] reftable: avoid reading and writing empty keys Junio C Hamano
  7 siblings, 0 replies; 33+ messages in thread
From: Han-Wen Nienhuys via GitGitGadget @ 2022-02-21 18:46 UTC (permalink / raw)
  To: git; +Cc: Han-Wen Nienhuys, Han-Wen Nienhuys, Han-Wen Nienhuys

From: Han-Wen Nienhuys <hanwen@google.com>

This function is part of the reftable API, so it should use the
reftable_ prefix

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
---
 reftable/readwrite_test.c  | 10 +++++-----
 reftable/reftable-writer.h |  2 +-
 reftable/writer.c          |  2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c
index a1b835785a3..469ab79a5ad 100644
--- a/reftable/readwrite_test.c
+++ b/reftable/readwrite_test.c
@@ -100,7 +100,7 @@ static void write_table(char ***names, struct strbuf *buf, int N,
 	n = reftable_writer_close(w);
 	EXPECT(n == 0);
 
-	stats = writer_stats(w);
+	stats = reftable_writer_stats(w);
 	for (i = 0; i < stats->ref_stats.blocks; i++) {
 		int off = i * opts.block_size;
 		if (off == 0) {
@@ -239,7 +239,7 @@ static void test_log_write_read(void)
 	n = reftable_writer_close(w);
 	EXPECT(n == 0);
 
-	stats = writer_stats(w);
+	stats = reftable_writer_stats(w);
 	EXPECT(stats->log_stats.blocks > 0);
 	reftable_writer_free(w);
 	w = NULL;
@@ -330,7 +330,7 @@ static void test_log_zlib_corruption(void)
 	n = reftable_writer_close(w);
 	EXPECT(n == 0);
 
-	stats = writer_stats(w);
+	stats = reftable_writer_stats(w);
 	EXPECT(stats->log_stats.blocks > 0);
 	reftable_writer_free(w);
 	w = NULL;
@@ -698,7 +698,7 @@ static void test_write_object_id_min_length(void)
 
 	err = reftable_writer_close(w);
 	EXPECT_ERR(err);
-	EXPECT(writer_stats(w)->object_id_len == 2);
+	EXPECT(reftable_writer_stats(w)->object_id_len == 2);
 	reftable_writer_free(w);
 	strbuf_release(&buf);
 }
@@ -735,7 +735,7 @@ static void test_write_object_id_length(void)
 
 	err = reftable_writer_close(w);
 	EXPECT_ERR(err);
-	EXPECT(writer_stats(w)->object_id_len == 16);
+	EXPECT(reftable_writer_stats(w)->object_id_len == 16);
 	reftable_writer_free(w);
 	strbuf_release(&buf);
 }
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index a560dc17255..db8de197f6c 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -143,7 +143,7 @@ int reftable_writer_close(struct reftable_writer *w);
 
    This struct becomes invalid when the writer is freed.
  */
-const struct reftable_stats *writer_stats(struct reftable_writer *w);
+const struct reftable_stats *reftable_writer_stats(struct reftable_writer *w);
 
 /* reftable_writer_free deallocates memory for the writer */
 void reftable_writer_free(struct reftable_writer *w);
diff --git a/reftable/writer.c b/reftable/writer.c
index 5e4e6e93416..6d979e245ff 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -695,7 +695,7 @@ static int writer_flush_block(struct reftable_writer *w)
 	return writer_flush_nonempty_block(w);
 }
 
-const struct reftable_stats *writer_stats(struct reftable_writer *w)
+const struct reftable_stats *reftable_writer_stats(struct reftable_writer *w)
 {
 	return &w->stats;
 }
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 0/7] reftable: avoid reading and writing empty keys
  2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
                       ` (6 preceding siblings ...)
  2022-02-21 18:46     ` [PATCH v3 7/7] reftable: rename writer_stats to reftable_writer_stats Han-Wen Nienhuys via GitGitGadget
@ 2022-02-23 21:37     ` Junio C Hamano
  7 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2022-02-23 21:37 UTC (permalink / raw)
  To: Han-Wen Nienhuys via GitGitGadget; +Cc: git, Han-Wen Nienhuys, Han-Wen Nienhuys

"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes:

> this series makes sure that the object record does not have to consider
> empty keys (and therefore, a NULL memcpy destination)
>
> while we're at it add some more tests, and fix a naming mistake.

Looking good.  Let's mark it for 'next' and below soonish.

Thanks.

>
> Han-Wen Nienhuys (7):
>   Documentation: object_id_len goes up to 31
>   reftable: reject 0 object_id_len
>   reftable: add a test that verifies that writing empty keys fails
>   reftable: avoid writing empty keys at the block layer
>   reftable: ensure that obj_id_len is >= 2 on writing
>   reftable: add test for length of disambiguating prefix
>   reftable: rename writer_stats to reftable_writer_stats
>
>  Documentation/technical/reftable.txt |   2 +-
>  reftable/block.c                     |  27 ++++---
>  reftable/block_test.c                |   5 ++
>  reftable/reader.c                    |   5 ++
>  reftable/readwrite_test.c            | 105 ++++++++++++++++++++++++++-
>  reftable/reftable-writer.h           |   2 +-
>  reftable/writer.c                    |   9 ++-
>  7 files changed, 136 insertions(+), 19 deletions(-)
>
>
> base-commit: 45fe28c951c3e70666ee4ef8379772851a8e4d32
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1185%2Fhanwen%2Fobj-id-len-v3
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1185/hanwen/obj-id-len-v3
> Pull-Request: https://github.com/git/git/pull/1185
>
> Range-diff vs v2:
>
>  1:  80d29e8f269 = 1:  80d29e8f269 Documentation: object_id_len goes up to 31
>  2:  4c1a19fc4ae ! 2:  68e7bc32ff8 reftable: reject 0 object_id_len
>      @@ Commit message
>           reftable: reject 0 object_id_len
>       
>           The spec says 2 <= object_id_len <= 31. We are lenient and allow 1,
>      -    but we forbid 0, so we can we can be sure that we never read a
>      -    0-length key.
>      +    but we forbid 0, so we can be sure that we never read a 0-length key.
>       
>           Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
>       
>  3:  600b115f8b1 = 3:  8b5aebdb07a reftable: add a test that verifies that writing empty keys fails
>  4:  ba036ee8543 ! 4:  a9372cacd1b reftable: avoid writing empty keys at the block layer
>      @@ reftable/block.c: int block_reader_first_key(struct block_reader *br, struct str
>        	if (n < 0)
>        		return n;
>       +	if (!key->len)
>      -+		return -1;
>      ++		return REFTABLE_FORMAT_ERROR;
>        
>        	return 0;
>        }
>  5:  2bd3d44ba57 = 5:  0b8a42399dd reftable: ensure that obj_id_len is >= 2 on writing
>  6:  82d36ee0e0d = 6:  bdccd969475 reftable: add test for length of disambiguating prefix
>  7:  c6ffdb3471c = 7:  72499a14e38 reftable: rename writer_stats to reftable_writer_stats

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2022-02-23 21:37 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-12 18:07 [PATCH 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
2022-01-12 18:07 ` [PATCH 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
2022-01-12 18:07 ` [PATCH 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
2022-01-12 18:07 ` [PATCH 3/7] reftable: add a test that verifies that writing empty keys fails Han-Wen Nienhuys via GitGitGadget
2022-01-12 18:07 ` [PATCH 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
2022-01-14  1:26   ` Junio C Hamano
2022-01-17 13:10     ` Han-Wen Nienhuys
2022-01-17 19:11       ` Junio C Hamano
2022-01-12 18:07 ` [PATCH 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
2022-01-12 18:07 ` [PATCH 6/7] reftable: add test for length of disambiguating prefix Han-Wen Nienhuys via GitGitGadget
2022-01-12 18:07 ` [PATCH 7/7] reftable: rename writer_stats to reftable_writer_stats Han-Wen Nienhuys via GitGitGadget
2022-02-17 13:55 ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Han-Wen Nienhuys via GitGitGadget
2022-02-17 13:55   ` [PATCH v2 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
2022-02-17 13:55   ` [PATCH v2 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
2022-02-18  0:32     ` Junio C Hamano
2022-02-17 13:55   ` [PATCH v2 3/7] reftable: add a test that verifies that writing empty keys fails Han-Wen Nienhuys via GitGitGadget
2022-02-17 13:55   ` [PATCH v2 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
2022-02-17 23:55     ` Junio C Hamano
2022-02-21 14:32       ` Han-Wen Nienhuys
2022-02-17 13:55   ` [PATCH v2 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
2022-02-18  0:01     ` Junio C Hamano
2022-02-17 13:55   ` [PATCH v2 6/7] reftable: add test for length of disambiguating prefix Han-Wen Nienhuys via GitGitGadget
2022-02-17 13:55   ` [PATCH v2 7/7] reftable: rename writer_stats to reftable_writer_stats Han-Wen Nienhuys via GitGitGadget
2022-02-18  0:02   ` [PATCH v2 0/7] reftable: avoid reading and writing empty keys Junio C Hamano
2022-02-21 18:46   ` [PATCH v3 " Han-Wen Nienhuys via GitGitGadget
2022-02-21 18:46     ` [PATCH v3 1/7] Documentation: object_id_len goes up to 31 Han-Wen Nienhuys via GitGitGadget
2022-02-21 18:46     ` [PATCH v3 2/7] reftable: reject 0 object_id_len Han-Wen Nienhuys via GitGitGadget
2022-02-21 18:46     ` [PATCH v3 3/7] reftable: add a test that verifies that writing empty keys fails Han-Wen Nienhuys via GitGitGadget
2022-02-21 18:46     ` [PATCH v3 4/7] reftable: avoid writing empty keys at the block layer Han-Wen Nienhuys via GitGitGadget
2022-02-21 18:46     ` [PATCH v3 5/7] reftable: ensure that obj_id_len is >= 2 on writing Han-Wen Nienhuys via GitGitGadget
2022-02-21 18:46     ` [PATCH v3 6/7] reftable: add test for length of disambiguating prefix Han-Wen Nienhuys via GitGitGadget
2022-02-21 18:46     ` [PATCH v3 7/7] reftable: rename writer_stats to reftable_writer_stats Han-Wen Nienhuys via GitGitGadget
2022-02-23 21:37     ` [PATCH v3 0/7] reftable: avoid reading and writing empty keys Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).