git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/12] a rabbit hole of update-server-info fixes
@ 2019-04-04 23:21 Jeff King
  2019-04-04 23:22 ` [PATCH 01/12] t5319: fix bogus cat-file argument Jeff King
                   ` (13 more replies)
  0 siblings, 14 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:21 UTC (permalink / raw)
  To: git

This patch series started with patch 12: I just wanted to drop the
unused "force" parameter from update_info_refs().

But that made me look at its sibling update_info_packs(), and whether it
needs its force parameter. It _does_ in fact do something (though I am
slightly doubtful that anybody actually cares in practice).

However, I did see a memory bug in its parser, which is fixed by patch
6. And a bunch of other cleanups, which are patches 7-11. Fortunately
that buggy parser is not used to look at info/packs files we pull over
dumb-http; it has its own parser. So I looked at that, and of course it
had a bug, too (fixed in patch 5).

And in those cleanups, I needed to get the basename of a pack from its
packed_git. So I looked for other code that did the same, in case we
could benefit from a shared helper. And behold, the midx code does. But
you guessed it, there was a bug (actually 2). Fixed in patches 4 (with
necessary refactoring in patch 3).

And while constructing a test for that bug, I noticed that one of the
other tests in t5319 was buggy. So that's patch 1 (with some cleanup in
patch 2).

And here we are. I present them here in reverse rabbit-hole order (which
is also roughly important fixes first, and minor cleanups last). The
individual chunks are mostly independent, but the server-info cleanups
rely on the shared pack_basename() helper added as part of the midx fix.

  [01/12]: t5319: fix bogus cat-file argument
  [02/12]: t5319: drop useless --buffer from cat-file
  [03/12]: packfile: factor out .pack to .idx name conversion
  [04/12]: packfile: check midx coverage with .idx rather than .pack
  [05/12]: http: simplify parsing of remote objects/info/packs
  [06/12]: server-info: fix blind pointer arithmetic
  [07/12]: server-info: simplify cleanup in parse_pack_def()
  [08/12]: server-info: use strbuf to read old info/packs file
  [09/12]: server-info: drop nr_alloc struct member
  [10/12]: packfile.h: drop extern from function declarations
  [11/12]: server-info: drop objdirlen pointer arithmetic
  [12/12]: update_info_refs(): drop unused force parameter

 http.c                      | 35 ++++++---------
 packfile.c                  | 31 ++++++++++---
 packfile.h                  | 86 ++++++++++++++++++++-----------------
 server-info.c               | 57 +++++++++++-------------
 t/t5319-multi-pack-index.sh | 29 ++++++++++---
 5 files changed, 132 insertions(+), 106 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 01/12] t5319: fix bogus cat-file argument
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
@ 2019-04-04 23:22 ` Jeff King
  2019-04-05  0:44   ` Ramsay Jones
  2019-04-04 23:22 ` [PATCH 02/12] t5319: drop useless --buffer from cat-file Jeff King
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:22 UTC (permalink / raw)
  To: git

There's no such argument as "--unordered"; it's spelled "--unsorted".
But our test failed to notice that cat-file didn't run at all because:

  1. It lost the exit code of git on the left-hand side of a pipe.

  2. It was comparing two runs of the broken invocation with and without
     a particular config variable (and indeed, both cases produced no
     output!).

Let's fix the option, but also tweak the helper function to check the
exit code.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t5319-multi-pack-index.sh | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 70926b5bc0..42f4d6cd01 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -86,13 +86,14 @@ test_expect_success 'write midx with one v1 pack' '
 '
 
 midx_git_two_modes () {
+	git -c core.multiPackIndex=false $1 >expect &&
+	git -c core.multiPackIndex=true $1 >actual &&
 	if [ "$2" = "sorted" ]
 	then
-		git -c core.multiPackIndex=false $1 | sort >expect &&
-		git -c core.multiPackIndex=true $1 | sort >actual
-	else
-		git -c core.multiPackIndex=false $1 >expect &&
-		git -c core.multiPackIndex=true $1 >actual
+		sort <expect >expect.sorted &&
+		mv expect.sorted expect &&
+		sort <actual >actual.sorted &&
+		mv actual.sorted actual
 	fi &&
 	test_cmp expect actual
 }
@@ -104,7 +105,7 @@ compare_results_with_midx () {
 		midx_git_two_modes "log --raw" &&
 		midx_git_two_modes "count-objects --verbose" &&
 		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check" &&
-		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check --unsorted" sorted
+		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check --unordered" sorted
 	'
 }
 
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 02/12] t5319: drop useless --buffer from cat-file
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
  2019-04-04 23:22 ` [PATCH 01/12] t5319: fix bogus cat-file argument Jeff King
@ 2019-04-04 23:22 ` Jeff King
  2019-04-04 23:22 ` [PATCH 03/12] packfile: factor out .pack to .idx name conversion Jeff King
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:22 UTC (permalink / raw)
  To: git

The cat-file --buffer option is the default already when using
--batch-all-objects. It doesn't hurt to specify it, but it's nice for
the test scripts to model good usage.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t5319-multi-pack-index.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 42f4d6cd01..8c4d2bd849 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -104,8 +104,8 @@ compare_results_with_midx () {
 		midx_git_two_modes "rev-list --objects --all" &&
 		midx_git_two_modes "log --raw" &&
 		midx_git_two_modes "count-objects --verbose" &&
-		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check" &&
-		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check --unordered" sorted
+		midx_git_two_modes "cat-file --batch-all-objects --batch-check" &&
+		midx_git_two_modes "cat-file --batch-all-objects --batch-check --unordered" sorted
 	'
 }
 
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 03/12] packfile: factor out .pack to .idx name conversion
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
  2019-04-04 23:22 ` [PATCH 01/12] t5319: fix bogus cat-file argument Jeff King
  2019-04-04 23:22 ` [PATCH 02/12] t5319: drop useless --buffer from cat-file Jeff King
@ 2019-04-04 23:22 ` Jeff King
  2019-04-04 23:25 ` [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack Jeff King
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:22 UTC (permalink / raw)
  To: git

When we open a pack .idx, we have to convert the "foo.pack" name stored
in the packed_git struct to "foo.idx". This isn't too complicated, but
we do encode some policy in the form of a BUG(). Let's pull this into
its own function so it can be reused.

Signed-off-by: Jeff King <peff@peff.net>
---
 packfile.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/packfile.c b/packfile.c
index 16bcb75262..054269ae5d 100644
--- a/packfile.c
+++ b/packfile.c
@@ -186,18 +186,24 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
 	return 0;
 }
 
+static char *pack_name_to_idx(const char *pack_name)
+{
+	size_t len;
+
+	if (!strip_suffix(pack_name, ".pack", &len))
+		BUG("pack_name does not end in .pack");
+	return xstrfmt("%.*s.idx", (int)len, pack_name);
+}
+
 int open_pack_index(struct packed_git *p)
 {
 	char *idx_name;
-	size_t len;
 	int ret;
 
 	if (p->index_data)
 		return 0;
 
-	if (!strip_suffix(p->pack_name, ".pack", &len))
-		BUG("pack_name does not end in .pack");
-	idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name);
+	idx_name = pack_name_to_idx(p->pack_name);
 	ret = check_packed_git_idx(idx_name, p);
 	free(idx_name);
 	return ret;
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (2 preceding siblings ...)
  2019-04-04 23:22 ` [PATCH 03/12] packfile: factor out .pack to .idx name conversion Jeff King
@ 2019-04-04 23:25 ` Jeff King
  2019-04-05  8:05   ` René Scharfe
  2019-04-05 12:01   ` SZEDER Gábor
  2019-04-04 23:27 ` [PATCH 05/12] http: simplify parsing of remote objects/info/packs Jeff King
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:25 UTC (permalink / raw)
  To: git

When we have a .midx that covers many packfiles, we try to avoid opening
the .idx for those packfiles. However, there are a few problems with the
filename comparison we use:

  - we ask midx_contains_pack() about the .pack name, not the .idx name.
    But it compares to the latter.

  - we compute the basename of the pack using strrchr() to find the
    final slash. But that leaves an extra "/" at the start of our
    string; we need to advance past it.

    That also raises the question of what to do when the name does not
    have a slash at all. This should generally not happen (we always
    find files in "pack/"), but it doesn't hurt to be defensive here.

The tests don't notice because there's nothing about opening those .idx
files that would cause us to give incorrect output. It's just a little
slower. The new test checks this case by corrupting the covered .idx,
and then making sure we don't complain about it.

Signed-off-by: Jeff King <peff@peff.net>
---
 packfile.c                  | 17 ++++++++++++++---
 t/t5319-multi-pack-index.sh | 14 ++++++++++++++
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/packfile.c b/packfile.c
index 054269ae5d..e7ca135ed5 100644
--- a/packfile.c
+++ b/packfile.c
@@ -472,6 +472,16 @@ static unsigned int get_max_fd_limit(void)
 #endif
 }
 
+static const char *pack_basename(struct packed_git *p)
+{
+	const char *ret = strrchr(p->pack_name, '/');
+	if (ret)
+		ret = ret + 1; /* skip past slash */
+	else
+		ret = p->pack_name; /* we only have a base */
+	return ret;
+}
+
 /*
  * Do not call this directly as this leaks p->pack_fd on error return;
  * call open_packed_git() instead.
@@ -486,15 +496,16 @@ static int open_packed_git_1(struct packed_git *p)
 	ssize_t read_result;
 	const unsigned hashsz = the_hash_algo->rawsz;
 
-	if (!p->index_data) {
+	if (!p->index_data && the_repository->objects->multi_pack_index) {
 		struct multi_pack_index *m;
-		const char *pack_name = strrchr(p->pack_name, '/');
+		char *idx_name = pack_name_to_idx(pack_basename(p));
 
 		for (m = the_repository->objects->multi_pack_index;
 		     m; m = m->next) {
-			if (midx_contains_pack(m, pack_name))
+			if (midx_contains_pack(m, idx_name))
 				break;
 		}
+		free(idx_name);
 
 		if (!m && open_pack_index(p))
 			return error("packfile %s index unavailable", p->pack_name);
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 8c4d2bd849..1ebf19ec3c 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -117,6 +117,20 @@ test_expect_success 'write midx with one v2 pack' '
 
 compare_results_with_midx "one v2 pack"
 
+test_expect_success 'corrupt idx not opened' '
+	idx=$(test-tool read-midx $objdir | grep "\.idx\$") &&
+	mv $objdir/pack/$idx backup-$idx &&
+	test_when_finished "mv backup-\$idx \$objdir/pack/\$idx" &&
+
+	# This is the minimum size for a sha-1 based .idx; this lets
+	# us pass perfunctory tests, but anything that actually opens and reads
+	# the idx file will complain.
+	test_copy_bytes 1064 <backup-$idx >$objdir/pack/$idx &&
+
+	git -c core.multiPackIndex=true rev-list --objects --all 2>err &&
+	test_must_be_empty err
+'
+
 test_expect_success 'add more objects' '
 	for i in $(test_seq 6 10)
 	do
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 05/12] http: simplify parsing of remote objects/info/packs
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (3 preceding siblings ...)
  2019-04-04 23:25 ` [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack Jeff King
@ 2019-04-04 23:27 ` Jeff King
  2019-04-05 10:41   ` René Scharfe
  2019-04-04 23:27 ` [PATCH 06/12] server-info: fix blind pointer arithmetic Jeff King
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:27 UTC (permalink / raw)
  To: git

We can use skip_prefix() and parse_oid_hex() to continuously increment
our pointer, rather than dealing with magic numbers. This also fixes a
few small shortcomings:

  - if we see a 'P' line that does not match our expectations, we'll
    leave our "i" counter in the middle of the line. So we'll parse:
    "P P P pack-1234.pack" as if there was just one "P" which was not
    intentional (though probably not that harmful).

  - if we see a line with the right prefix, suffix, and length, i.e.
    matching /P pack-.{40}.pack\n/, we'll interpret the middle part as
    hex without checking if it could be parsed. This could lead to us
    looking at uninitialized garbage in the hash array. In practice this
    means we'll just make a garbage request to the server which will
    fail, though it's interesting that a malicious server could convince
    us to leak 40 bytes of uninitialized stack to them.

  - the current code is picky about seeing a newline at the end of file,
    but we can easily be more liberal

Signed-off-by: Jeff King <peff@peff.net>
---
 http.c | 35 ++++++++++++++---------------------
 1 file changed, 14 insertions(+), 21 deletions(-)

diff --git a/http.c b/http.c
index a32ad36ddf..2ef47bc779 100644
--- a/http.c
+++ b/http.c
@@ -2147,11 +2147,11 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
 int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
 {
 	struct http_get_options options = {0};
-	int ret = 0, i = 0;
-	char *url, *data;
+	int ret = 0;
+	char *url;
+	const char *data;
 	struct strbuf buf = STRBUF_INIT;
-	unsigned char hash[GIT_MAX_RAWSZ];
-	const unsigned hexsz = the_hash_algo->hexsz;
+	struct object_id oid;
 
 	end_url_with_slash(&buf, base_url);
 	strbuf_addstr(&buf, "objects/info/packs");
@@ -2163,24 +2163,17 @@ int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
 		goto cleanup;
 
 	data = buf.buf;
-	while (i < buf.len) {
-		switch (data[i]) {
-		case 'P':
-			i++;
-			if (i + hexsz + 12 <= buf.len &&
-			    starts_with(data + i, " pack-") &&
-			    starts_with(data + i + hexsz + 6, ".pack\n")) {
-				get_sha1_hex(data + i + 6, hash);
-				fetch_and_setup_pack_index(packs_head, hash,
-						      base_url);
-				i += hexsz + 11;
-				break;
-			}
-		default:
-			while (i < buf.len && data[i] != '\n')
-				i++;
+	while (*data) {
+		if (skip_prefix(data, "P pack-", &data) &&
+		    !parse_oid_hex(data, &oid, &data) &&
+		    skip_prefix(data, ".pack", &data) &&
+		    (*data == '\n' || *data == '\0')) {
+			fetch_and_setup_pack_index(packs_head, oid.hash, base_url);
+		} else {
+			data = strchrnul(data, '\n');
 		}
-		i++;
+		if (*data)
+			data++; /* skip past newline */
 	}
 
 cleanup:
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 06/12] server-info: fix blind pointer arithmetic
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (4 preceding siblings ...)
  2019-04-04 23:27 ` [PATCH 05/12] http: simplify parsing of remote objects/info/packs Jeff King
@ 2019-04-04 23:27 ` Jeff King
  2019-04-04 23:28 ` [PATCH 07/12] server-info: simplify cleanup in parse_pack_def() Jeff King
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:27 UTC (permalink / raw)
  To: git

When we're writing out a new objects/info/packs file, we read back the
old one to try to keep the ordering the same. When we see a line
starting with "P", we expect "P pack-1234..." and blindly jump to "line
+ 2" to parse the pack name. If we saw a line with _just_ "P" and
nothing else, we'd jump past the end of the buffer and start reading
arbitrary memory.

This shouldn't be a big attack vector, as the files are local to the
repository and written by us, but it's clearly worth fixing (we do read
remote copies of the file for dumb-http fetches, but using a totally
different parser!).

Let's instead use skip_prefix() here, which avoids pointer arithmetic
altogether. Note that this converts our switch statement to an if/else
chain, making it slightly more verbose. But it will also make it easier
to do a few follow-on cleanups.

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/server-info.c b/server-info.c
index e2b2d6a27a..b61a6be4c2 100644
--- a/server-info.c
+++ b/server-info.c
@@ -112,9 +112,9 @@ static struct pack_info *find_pack_by_name(const char *name)
 /* Returns non-zero when we detect that the info in the
  * old file is useless.
  */
-static int parse_pack_def(const char *line, int old_cnt)
+static int parse_pack_def(const char *packname, int old_cnt)
 {
-	struct pack_info *i = find_pack_by_name(line + 2);
+	struct pack_info *i = find_pack_by_name(packname);
 	if (i) {
 		i->old_num = old_cnt;
 		return 0;
@@ -139,24 +139,26 @@ static int read_pack_info_file(const char *infofile)
 		return 1; /* nonexistent is not an error. */
 
 	while (fgets(line, sizeof(line), fp)) {
+		const char *arg;
 		int len = strlen(line);
 		if (len && line[len-1] == '\n')
 			line[--len] = 0;
 
 		if (!len)
 			continue;
 
-		switch (line[0]) {
-		case 'P': /* P name */
-			if (parse_pack_def(line, old_cnt++))
+		if (skip_prefix(line, "P ", &arg)) {
+			/* P name */
+			if (parse_pack_def(arg, old_cnt++))
 				goto out_stale;
-			break;
-		case 'D': /* we used to emit D but that was misguided. */
-		case 'T': /* we used to emit T but nobody uses it. */
+		} else if (line[0] == 'D') {
+			/* we used to emit D but that was misguided. */
 			goto out_stale;
-		default:
+		} else if (line[0] == 'T') {
+			/* we used to emit T but nobody uses it. */
+			goto out_stale;
+		} else {
 			error("unrecognized: %s", line);
-			break;
 		}
 	}
 	fclose(fp);
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 07/12] server-info: simplify cleanup in parse_pack_def()
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (5 preceding siblings ...)
  2019-04-04 23:27 ` [PATCH 06/12] server-info: fix blind pointer arithmetic Jeff King
@ 2019-04-04 23:28 ` Jeff King
  2019-04-04 23:28 ` [PATCH 08/12] server-info: use strbuf to read old info/packs file Jeff King
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:28 UTC (permalink / raw)
  To: git

We have two exits from the function: either we jump to the out_stale
label or not. But in both exits we repeat our cleanup, and the only
difference is our return value. Let's just use a variable for the return
value to avoid repeating ourselves.

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/server-info.c b/server-info.c
index b61a6be4c2..ba44cece7f 100644
--- a/server-info.c
+++ b/server-info.c
@@ -133,6 +133,7 @@ static int read_pack_info_file(const char *infofile)
 	FILE *fp;
 	char line[1000];
 	int old_cnt = 0;
+	int stale = 1;
 
 	fp = fopen_or_warn(infofile, "r");
 	if (!fp)
@@ -161,11 +162,11 @@ static int read_pack_info_file(const char *infofile)
 			error("unrecognized: %s", line);
 		}
 	}
-	fclose(fp);
-	return 0;
+	stale = 0;
+
  out_stale:
 	fclose(fp);
-	return 1;
+	return stale;
 }
 
 static int compare_info(const void *a_, const void *b_)
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 08/12] server-info: use strbuf to read old info/packs file
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (6 preceding siblings ...)
  2019-04-04 23:28 ` [PATCH 07/12] server-info: simplify cleanup in parse_pack_def() Jeff King
@ 2019-04-04 23:28 ` Jeff King
  2019-04-04 23:29 ` [PATCH 09/12] server-info: drop nr_alloc struct member Jeff King
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:28 UTC (permalink / raw)
  To: git

This old code uses fgets with a fixed-size buffer. Let's use a strbuf
instead, so we don't have to wonder if "1000" is big enough, or what
happens if we see a long line.

This also lets us drop our custom code to trim the newline.

Probably nobody actually cares about the 1000-char limit (after all, the
lines generally only say "P pack-[0-9a-f]{40}.pack"), so this is mostly
just about cleanup/readability.

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/server-info.c b/server-info.c
index ba44cece7f..d4115fecbb 100644
--- a/server-info.c
+++ b/server-info.c
@@ -131,40 +131,38 @@ static int parse_pack_def(const char *packname, int old_cnt)
 static int read_pack_info_file(const char *infofile)
 {
 	FILE *fp;
-	char line[1000];
+	struct strbuf line = STRBUF_INIT;
 	int old_cnt = 0;
 	int stale = 1;
 
 	fp = fopen_or_warn(infofile, "r");
 	if (!fp)
 		return 1; /* nonexistent is not an error. */
 
-	while (fgets(line, sizeof(line), fp)) {
+	while (strbuf_getline(&line, fp) != EOF) {
 		const char *arg;
-		int len = strlen(line);
-		if (len && line[len-1] == '\n')
-			line[--len] = 0;
 
-		if (!len)
+		if (!line.len)
 			continue;
 
-		if (skip_prefix(line, "P ", &arg)) {
+		if (skip_prefix(line.buf, "P ", &arg)) {
 			/* P name */
 			if (parse_pack_def(arg, old_cnt++))
 				goto out_stale;
-		} else if (line[0] == 'D') {
+		} else if (line.buf[0] == 'D') {
 			/* we used to emit D but that was misguided. */
 			goto out_stale;
-		} else if (line[0] == 'T') {
+		} else if (line.buf[0] == 'T') {
 			/* we used to emit T but nobody uses it. */
 			goto out_stale;
 		} else {
-			error("unrecognized: %s", line);
+			error("unrecognized: %s", line.buf);
 		}
 	}
 	stale = 0;
 
  out_stale:
+	strbuf_release(&line);
 	fclose(fp);
 	return stale;
 }
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 09/12] server-info: drop nr_alloc struct member
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (7 preceding siblings ...)
  2019-04-04 23:28 ` [PATCH 08/12] server-info: use strbuf to read old info/packs file Jeff King
@ 2019-04-04 23:29 ` Jeff King
  2019-04-04 23:30 ` [PATCH 10/12] packfile.h: drop extern from function declarations Jeff King
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:29 UTC (permalink / raw)
  To: git

We keep an array of struct pointers, with each one representing a single
packfile. But for some reason there is a nr_alloc parameter inside each
struct, which has never been used.

This is probably cruft left over from development, where we might have
wanted a nr_alloc to dynamically grow the list. But as it turns out, we
do not dynamically grow the list at all, but rather count up the total
number of packs and use that as a maximum size. So while we're thinking
of this, let's add an assert() that documents (and checks!) that our
allocation and fill loops stay in sync.

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/server-info.c b/server-info.c
index d4115fecbb..c9fbfd3a51 100644
--- a/server-info.c
+++ b/server-info.c
@@ -91,7 +91,6 @@ static struct pack_info {
 	struct packed_git *p;
 	int old_num;
 	int new_num;
-	int nr_alloc;
 } **info;
 static int num_pack;
 static const char *objdir;
@@ -213,6 +212,7 @@ static void init_pack_info(const char *infofile, int force)
 	for (i = 0, p = get_all_packs(the_repository); p; p = p->next) {
 		if (!p->pack_local)
 			continue;
+		assert(i < num_pack);
 		info[i] = xcalloc(1, sizeof(struct pack_info));
 		info[i]->p = p;
 		info[i]->old_num = -1;
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 10/12] packfile.h: drop extern from function declarations
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (8 preceding siblings ...)
  2019-04-04 23:29 ` [PATCH 09/12] server-info: drop nr_alloc struct member Jeff King
@ 2019-04-04 23:30 ` Jeff King
  2019-04-04 23:30 ` [PATCH 11/12] server-info: drop objdirlen pointer arithmetic Jeff King
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:30 UTC (permalink / raw)
  To: git

As CodingGuidelines recommends, we do not need an "extern" when
declaring a public function. Let's drop these. Note that we leave the
extern on report_garbage(), as that is actually a function pointer, not
a function itself.

Signed-off-by: Jeff King <peff@peff.net>
---
I hesitated on this one, because it does have a minor conflict with pu.
But I just couldn't bring myself to add a new extern, nor to stick a
non-extern one amidst a bunch of others.

 packfile.h | 80 +++++++++++++++++++++++++++---------------------------
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/packfile.h b/packfile.h
index d70c6d9afb..b40fc34fb2 100644
--- a/packfile.h
+++ b/packfile.h
@@ -15,23 +15,23 @@ struct object_info;
  *
  * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx"
  */
-extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
+char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
 
 /*
  * Return the name of the (local) packfile with the specified sha1 in
  * its name.  The return value is a pointer to memory that is
  * overwritten each time this function is called.
  */
-extern char *sha1_pack_name(const unsigned char *sha1);
+char *sha1_pack_name(const unsigned char *sha1);
 
 /*
  * Return the name of the (local) pack index file with the specified
  * sha1 in its name.  The return value is a pointer to memory that is
  * overwritten each time this function is called.
  */
-extern char *sha1_pack_index_name(const unsigned char *sha1);
+char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
+struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
 typedef void each_file_in_pack_dir_fn(const char *full_path, size_t full_path_len,
 				      const char *file_pach, void *data);
@@ -45,8 +45,8 @@ void for_each_file_in_pack_dir(const char *objdir,
 #define PACKDIR_FILE_GARBAGE 4
 extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
-extern void reprepare_packed_git(struct repository *r);
-extern void install_packed_git(struct repository *r, struct packed_git *pack);
+void reprepare_packed_git(struct repository *r);
+void install_packed_git(struct repository *r, struct packed_git *pack);
 
 struct packed_git *get_packed_git(struct repository *r);
 struct list_head *get_packed_git_mru(struct repository *r);
@@ -59,32 +59,32 @@ struct packed_git *get_all_packs(struct repository *r);
  */
 unsigned long approximate_object_count(void);
 
-extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
-					 struct packed_git *packs);
+struct packed_git *find_sha1_pack(const unsigned char *sha1,
+				  struct packed_git *packs);
 
-extern void pack_report(void);
+void pack_report(void);
 
 /*
  * mmap the index file for the specified packfile (if it is not
  * already mmapped).  Return 0 on success.
  */
-extern int open_pack_index(struct packed_git *);
+int open_pack_index(struct packed_git *);
 
 /*
  * munmap the index file for the specified packfile (if it is
  * currently mmapped).
  */
-extern void close_pack_index(struct packed_git *);
+void close_pack_index(struct packed_git *);
 
-extern uint32_t get_pack_fanout(struct packed_git *p, uint32_t value);
+uint32_t get_pack_fanout(struct packed_git *p, uint32_t value);
 
-extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
-extern void close_pack_windows(struct packed_git *);
-extern void close_pack(struct packed_git *);
-extern void close_all_packs(struct raw_object_store *o);
-extern void unuse_pack(struct pack_window **);
-extern void clear_delta_base_cache(void);
-extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
+unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
+void close_pack_windows(struct packed_git *);
+void close_pack(struct packed_git *);
+void close_all_packs(struct raw_object_store *o);
+void unuse_pack(struct pack_window **);
+void clear_delta_base_cache(void);
+struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
 /*
  * Make sure that a pointer access into an mmap'd index file is within bounds,
@@ -94,7 +94,7 @@ extern struct packed_git *add_packed_git(const char *path, size_t path_len, int
  * (like the 64-bit extended offset table), as we compare the size to the
  * fixed-length parts when we open the file.
  */
-extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
+void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
 
 /*
  * Perform binary search on a pack-index for a given oid. Packfile is expected to
@@ -110,59 +110,59 @@ int bsearch_pack(const struct object_id *oid, const struct packed_git *p, uint32
  * at the SHA-1 within the mmapped index.  Return NULL if there is an
  * error.
  */
-extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n);
+const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n);
 /*
  * Like nth_packed_object_sha1, but write the data into the object specified by
  * the the first argument.  Returns the first argument on success, and NULL on
  * error.
  */
-extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
+const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
 
 /*
  * Return the offset of the nth object within the specified packfile.
  * The index must already be opened.
  */
-extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
+off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
 
 /*
  * If the object named sha1 is present in the specified packfile,
  * return its offset within the packfile; otherwise, return 0.
  */
-extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
+off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
 
-extern int is_pack_valid(struct packed_git *);
-extern void *unpack_entry(struct repository *r, struct packed_git *, off_t, enum object_type *, unsigned long *);
-extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
-extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
-extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
+int is_pack_valid(struct packed_git *);
+void *unpack_entry(struct repository *r, struct packed_git *, off_t, enum object_type *, unsigned long *);
+unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
+unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
+int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
-extern void release_pack_memory(size_t);
+void release_pack_memory(size_t);
 
 /* global flag to enable extra checks when accessing packed objects */
-extern int do_check_packed_object_crc;
+int do_check_packed_object_crc;
 
-extern int packed_object_info(struct repository *r,
+int packed_object_info(struct repository *r,
 			      struct packed_git *pack,
 			      off_t offset, struct object_info *);
 
-extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
-extern const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1);
+void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
+const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1);
 
 /*
  * Iff a pack file in the given repository contains the object named by sha1,
  * return true and store its location to e.
  */
-extern int find_pack_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e);
+int find_pack_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e);
 
-extern int has_object_pack(const struct object_id *oid);
+int has_object_pack(const struct object_id *oid);
 
-extern int has_pack_index(const unsigned char *sha1);
+int has_pack_index(const unsigned char *sha1);
 
 /*
  * Return 1 if an object in a promisor packfile is or refers to the given
  * object, 0 otherwise.
  */
-extern int is_promisor_object(const struct object_id *oid);
+int is_promisor_object(const struct object_id *oid);
 
 /*
  * Expose a function for fuzz testing.
@@ -174,7 +174,7 @@ extern int is_promisor_object(const struct object_id *oid);
  * have a convenient entry-point for fuzz testing. For real uses, you should
  * probably use open_pack_index() or parse_pack_index() instead.
  */
-extern int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
-		    size_t idx_size, struct packed_git *p);
+int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
+	     size_t idx_size, struct packed_git *p);
 
 #endif
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 11/12] server-info: drop objdirlen pointer arithmetic
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (9 preceding siblings ...)
  2019-04-04 23:30 ` [PATCH 10/12] packfile.h: drop extern from function declarations Jeff King
@ 2019-04-04 23:30 ` Jeff King
  2019-04-04 23:31 ` [PATCH 12/12] update_info_refs(): drop unused force parameter Jeff King
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:30 UTC (permalink / raw)
  To: git

When writing objects/info/packs, we use the basename of each pack
(i.e., just the "pack-1234abcd.pack" part). We compute that manually by
adding "objdirlen + 6" to the name.

This _should_ work consistently, as we do not include non-local packs,
meaning everything should be in $objdir/pack/. Before f13d7db4af
(server-info.c: use pack_local like everybody else., 2005-12-05), this
was definitely true, since we computed "local" based on comparing the
objdir string.  Since then, we're relying on the code on packfile.c to
match our expectations of p->pack_name and p->local.

I think our expectations do still hold today, but we can be a bit more
defensive by just using pack_basename() to get the base. That
future-proofs us, and should hopefully be more obviously safe to
somebody reading the code.

Signed-off-by: Jeff King <peff@peff.net>
---
 packfile.c    |  2 +-
 packfile.h    |  6 ++++++
 server-info.c | 10 ++--------
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/packfile.c b/packfile.c
index e7ca135ed5..a0f92b7eef 100644
--- a/packfile.c
+++ b/packfile.c
@@ -472,7 +472,7 @@ static unsigned int get_max_fd_limit(void)
 #endif
 }
 
-static const char *pack_basename(struct packed_git *p)
+const char *pack_basename(struct packed_git *p)
 {
 	const char *ret = strrchr(p->pack_name, '/');
 	if (ret)
diff --git a/packfile.h b/packfile.h
index b40fc34fb2..45bf792d79 100644
--- a/packfile.h
+++ b/packfile.h
@@ -31,6 +31,12 @@ char *sha1_pack_name(const unsigned char *sha1);
  */
 char *sha1_pack_index_name(const unsigned char *sha1);
 
+/*
+ * Return the basename of the packfile, omitting any containing directory
+ * (e.g., "pack-1234abcd[...].pack").
+ */
+const char *pack_basename(struct packed_git *p);
+
 struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
 typedef void each_file_in_pack_dir_fn(const char *full_path, size_t full_path_len,
diff --git a/server-info.c b/server-info.c
index c9fbfd3a51..ab03c1b3c2 100644
--- a/server-info.c
+++ b/server-info.c
@@ -93,16 +93,13 @@ static struct pack_info {
 	int new_num;
 } **info;
 static int num_pack;
-static const char *objdir;
-static int objdirlen;
 
 static struct pack_info *find_pack_by_name(const char *name)
 {
 	int i;
 	for (i = 0; i < num_pack; i++) {
 		struct packed_git *p = info[i]->p;
-		/* skip "/pack/" after ".git/objects" */
-		if (!strcmp(p->pack_name + objdirlen + 6, name))
+		if (!strcmp(pack_basename(p), name))
 			return info[i];
 	}
 	return NULL;
@@ -196,9 +193,6 @@ static void init_pack_info(const char *infofile, int force)
 	int stale;
 	int i = 0;
 
-	objdir = get_object_directory();
-	objdirlen = strlen(objdir);
-
 	for (p = get_all_packs(the_repository); p; p = p->next) {
 		/* we ignore things on alternate path since they are
 		 * not available to the pullers in general.
@@ -246,7 +240,7 @@ static int write_pack_info_file(FILE *fp)
 {
 	int i;
 	for (i = 0; i < num_pack; i++) {
-		if (fprintf(fp, "P %s\n", info[i]->p->pack_name + objdirlen + 6) < 0)
+		if (fprintf(fp, "P %s\n", pack_basename(info[i]->p)) < 0)
 			return -1;
 	}
 	if (fputc('\n', fp) == EOF)
-- 
2.21.0.714.gd1be1d035b


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 12/12] update_info_refs(): drop unused force parameter
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (10 preceding siblings ...)
  2019-04-04 23:30 ` [PATCH 11/12] server-info: drop objdirlen pointer arithmetic Jeff King
@ 2019-04-04 23:31 ` Jeff King
  2019-04-05  5:46 ` [PATCH 0/12] a rabbit hole of update-server-info fixes Junio C Hamano
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-04 23:31 UTC (permalink / raw)
  To: git

Once upon a time the force flag meant something when writing info/refs,
but it hasn't done anything since 60d0526aaa (Unoptimize info/refs
creation., 2005-09-14).

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/server-info.c b/server-info.c
index ab03c1b3c2..41274d098b 100644
--- a/server-info.c
+++ b/server-info.c
@@ -78,7 +78,7 @@ static int generate_info_refs(FILE *fp)
 	return for_each_ref(add_info_ref, fp);
 }
 
-static int update_info_refs(int force)
+static int update_info_refs(void)
 {
 	char *path = git_pathdup("info/refs");
 	int ret = update_info_file(path, generate_info_refs);
@@ -269,7 +269,7 @@ int update_server_info(int force)
 	 */
 	int errs = 0;
 
-	errs = errs | update_info_refs(force);
+	errs = errs | update_info_refs();
 	errs = errs | update_info_packs(force);
 
 	/* remove leftover rev-cache file if there is any */
-- 
2.21.0.714.gd1be1d035b

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/12] t5319: fix bogus cat-file argument
  2019-04-04 23:22 ` [PATCH 01/12] t5319: fix bogus cat-file argument Jeff King
@ 2019-04-05  0:44   ` Ramsay Jones
  2019-04-05  1:41     ` Jeff King
  0 siblings, 1 reply; 48+ messages in thread
From: Ramsay Jones @ 2019-04-05  0:44 UTC (permalink / raw)
  To: Jeff King, git



On 05/04/2019 00:22, Jeff King wrote:
> There's no such argument as "--unordered"; it's spelled "--unsorted".

Err, isn't this back-to-front? (i.e. cat-file has the _option_
"--unordered" but not "--unsorted").

I suspect that I am not reading that right! :-D

ATB,
Ramsay Jones


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/12] t5319: fix bogus cat-file argument
  2019-04-05  0:44   ` Ramsay Jones
@ 2019-04-05  1:41     ` Jeff King
  2019-04-05  1:46       ` Jeff King
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2019-04-05  1:41 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: git

On Fri, Apr 05, 2019 at 01:44:09AM +0100, Ramsay Jones wrote:

> 
> 
> On 05/04/2019 00:22, Jeff King wrote:
> > There's no such argument as "--unordered"; it's spelled "--unsorted".
> 
> Err, isn't this back-to-front? (i.e. cat-file has the _option_
> "--unordered" but not "--unsorted").

Oops, yes. The patch is right; we are going the other way.

Sorry for the confusion.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/12] t5319: fix bogus cat-file argument
  2019-04-05  1:41     ` Jeff King
@ 2019-04-05  1:46       ` Jeff King
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05  1:46 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: git

On Thu, Apr 04, 2019 at 09:41:10PM -0400, Jeff King wrote:

> On Fri, Apr 05, 2019 at 01:44:09AM +0100, Ramsay Jones wrote:
> 
> > 
> > 
> > On 05/04/2019 00:22, Jeff King wrote:
> > > There's no such argument as "--unordered"; it's spelled "--unsorted".
> > 
> > Err, isn't this back-to-front? (i.e. cat-file has the _option_
> > "--unordered" but not "--unsorted").
> 
> Oops, yes. The patch is right; we are going the other way.

Reading my response I might have been unclear again. :) I mean the
actual diff text of the patch is doing the right thing. It's just the
description here which is wrong.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 0/12] a rabbit hole of update-server-info fixes
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (11 preceding siblings ...)
  2019-04-04 23:31 ` [PATCH 12/12] update_info_refs(): drop unused force parameter Jeff King
@ 2019-04-05  5:46 ` Junio C Hamano
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
  13 siblings, 0 replies; 48+ messages in thread
From: Junio C Hamano @ 2019-04-05  5:46 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> This patch series started with patch 12: I just wanted to drop the
> unused "force" parameter from update_info_refs().
>
> But that made me look at its sibling update_info_packs(), and whether it
> ...
> And here we are. I present them here in reverse rabbit-hole order (which
> is also roughly important fixes first, and minor cleanups last). The
> individual chunks are mostly independent, but the server-info cleanups
> rely on the shared pack_basename() helper added as part of the midx fix.

A kind of cover letter to make readers chuckle.  Well written.

And of course, thanks.  It's a delight to read a nicely constructed
series like this one.

>
>   [01/12]: t5319: fix bogus cat-file argument
>   [02/12]: t5319: drop useless --buffer from cat-file
>   [03/12]: packfile: factor out .pack to .idx name conversion
>   [04/12]: packfile: check midx coverage with .idx rather than .pack
>   [05/12]: http: simplify parsing of remote objects/info/packs
>   [06/12]: server-info: fix blind pointer arithmetic
>   [07/12]: server-info: simplify cleanup in parse_pack_def()
>   [08/12]: server-info: use strbuf to read old info/packs file
>   [09/12]: server-info: drop nr_alloc struct member
>   [10/12]: packfile.h: drop extern from function declarations
>   [11/12]: server-info: drop objdirlen pointer arithmetic
>   [12/12]: update_info_refs(): drop unused force parameter
>
>  http.c                      | 35 ++++++---------
>  packfile.c                  | 31 ++++++++++---
>  packfile.h                  | 86 ++++++++++++++++++++-----------------
>  server-info.c               | 57 +++++++++++-------------
>  t/t5319-multi-pack-index.sh | 29 ++++++++++---
>  5 files changed, 132 insertions(+), 106 deletions(-)
>
> -Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack
  2019-04-04 23:25 ` [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack Jeff King
@ 2019-04-05  8:05   ` René Scharfe
  2019-04-05 13:21     ` Jeff King
  2019-04-05 12:01   ` SZEDER Gábor
  1 sibling, 1 reply; 48+ messages in thread
From: René Scharfe @ 2019-04-05  8:05 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Am 05.04.2019 um 01:25 schrieb Jeff King:
> When we have a .midx that covers many packfiles, we try to avoid opening
> the .idx for those packfiles. However, there are a few problems with the
> filename comparison we use:
>
>    - we ask midx_contains_pack() about the .pack name, not the .idx name.
>      But it compares to the latter.
>
>    - we compute the basename of the pack using strrchr() to find the
>      final slash. But that leaves an extra "/" at the start of our
>      string; we need to advance past it.
>
>      That also raises the question of what to do when the name does not
>      have a slash at all. This should generally not happen (we always
>      find files in "pack/"), but it doesn't hurt to be defensive here.
>
> The tests don't notice because there's nothing about opening those .idx
> files that would cause us to give incorrect output. It's just a little
> slower. The new test checks this case by corrupting the covered .idx,
> and then making sure we don't complain about it.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>   packfile.c                  | 17 ++++++++++++++---
>   t/t5319-multi-pack-index.sh | 14 ++++++++++++++
>   2 files changed, 28 insertions(+), 3 deletions(-)
>
> diff --git a/packfile.c b/packfile.c
> index 054269ae5d..e7ca135ed5 100644
> --- a/packfile.c
> +++ b/packfile.c

> @@ -486,15 +496,16 @@ static int open_packed_git_1(struct packed_git *p)
>   	ssize_t read_result;
>   	const unsigned hashsz = the_hash_algo->rawsz;
>
> -	if (!p->index_data) {
> +	if (!p->index_data && the_repository->objects->multi_pack_index) {

So if there is no multi_pack_index, we skip this block now...

>   		struct multi_pack_index *m;
> -		const char *pack_name = strrchr(p->pack_name, '/');
> +		char *idx_name = pack_name_to_idx(pack_basename(p));
>
>   		for (m = the_repository->objects->multi_pack_index;
>   		     m; m = m->next) {
> -			if (midx_contains_pack(m, pack_name))
> +			if (midx_contains_pack(m, idx_name))
>   				break;
>   		}
> +		free(idx_name);
>
>   		if (!m && open_pack_index(p))
>   			return error("packfile %s index unavailable", p->pack_name);

... which also means this open_pack_index() call isn't done anymore if
there's no .midx file at all.  You don't mention this change in the
commit message; is it intended?

And I wonder if it would be easier overall to let midx_contains_pack()
accept .pack names in addition to .idx names.  Perhaps with something
like this?

int cmp_idx_or_pack_name(const char *idx_or_pack_name, const char *idx_name)
{
	while (*idx_name && *idx_name == *idx_or_pack_name) {
		idx_name++;
		idx_or_pack_name++;
	}
	if (!strcmp(idx_name, ".idx") && !strcmp(idx_or_pack_name, ".pack"))
		return 0;
	return strcmp(idx_or_pack_name, idx_name);
}

René

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 05/12] http: simplify parsing of remote objects/info/packs
  2019-04-04 23:27 ` [PATCH 05/12] http: simplify parsing of remote objects/info/packs Jeff King
@ 2019-04-05 10:41   ` René Scharfe
  2019-04-05 18:11     ` Jeff King
  0 siblings, 1 reply; 48+ messages in thread
From: René Scharfe @ 2019-04-05 10:41 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Am 05.04.2019 um 01:27 schrieb Jeff King:
> We can use skip_prefix() and parse_oid_hex() to continuously increment
> our pointer, rather than dealing with magic numbers. This also fixes a
> few small shortcomings:
>
>   - if we see a 'P' line that does not match our expectations, we'll
>     leave our "i" counter in the middle of the line. So we'll parse:
>     "P P P pack-1234.pack" as if there was just one "P" which was not
>     intentional (though probably not that harmful).

How so?  The default case, which we'd fall through to, skips the rest
of such a line, doesn't it?

>
>   - if we see a line with the right prefix, suffix, and length, i.e.
>     matching /P pack-.{40}.pack\n/, we'll interpret the middle part as
>     hex without checking if it could be parsed. This could lead to us
>     looking at uninitialized garbage in the hash array. In practice this
>     means we'll just make a garbage request to the server which will
>     fail, though it's interesting that a malicious server could convince
>     us to leak 40 bytes of uninitialized stack to them.
>
>   - the current code is picky about seeing a newline at the end of file,
>     but we can easily be more liberal
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  http.c | 35 ++++++++++++++---------------------
>  1 file changed, 14 insertions(+), 21 deletions(-)
>
> diff --git a/http.c b/http.c
> index a32ad36ddf..2ef47bc779 100644
> --- a/http.c
> +++ b/http.c
> @@ -2147,11 +2147,11 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
>  int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
>  {
>  	struct http_get_options options = {0};
> -	int ret = 0, i = 0;
> -	char *url, *data;
> +	int ret = 0;
> +	char *url;
> +	const char *data;
>  	struct strbuf buf = STRBUF_INIT;
> -	unsigned char hash[GIT_MAX_RAWSZ];
> -	const unsigned hexsz = the_hash_algo->hexsz;
> +	struct object_id oid;
>
>  	end_url_with_slash(&buf, base_url);
>  	strbuf_addstr(&buf, "objects/info/packs");
> @@ -2163,24 +2163,17 @@ int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
>  		goto cleanup;
>
>  	data = buf.buf;
> -	while (i < buf.len) {
> -		switch (data[i]) {
> -		case 'P':
> -			i++;
> -			if (i + hexsz + 12 <= buf.len &&
> -			    starts_with(data + i, " pack-") &&
> -			    starts_with(data + i + hexsz + 6, ".pack\n")) {
> -				get_sha1_hex(data + i + 6, hash);
> -				fetch_and_setup_pack_index(packs_head, hash,
> -						      base_url);
> -				i += hexsz + 11;
> -				break;
> -			}
> -		default:
> -			while (i < buf.len && data[i] != '\n')
> -				i++;
> +	while (*data) {
> +		if (skip_prefix(data, "P pack-", &data) &&
> +		    !parse_oid_hex(data, &oid, &data) &&
> +		    skip_prefix(data, ".pack", &data) &&
> +		    (*data == '\n' || *data == '\0')) {
> +			fetch_and_setup_pack_index(packs_head, oid.hash, base_url);
> +		} else {
> +			data = strchrnul(data, '\n');
>  		}
> -		i++;
> +		if (*data)
> +			data++; /* skip past newline */

So much simpler, *and* converted to object_id -- I like it!

Parsing "P" and "pack-" together crosses logical token boundaries,
but that I don't mind it here.

René


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack
  2019-04-04 23:25 ` [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack Jeff King
  2019-04-05  8:05   ` René Scharfe
@ 2019-04-05 12:01   ` SZEDER Gábor
  2019-04-05 13:40     ` Jeff King
  1 sibling, 1 reply; 48+ messages in thread
From: SZEDER Gábor @ 2019-04-05 12:01 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Thu, Apr 04, 2019 at 07:25:46PM -0400, Jeff King wrote:
> When we have a .midx that covers many packfiles, we try to avoid opening
> the .idx for those packfiles. However, there are a few problems with the
> filename comparison we use:
> 
>   - we ask midx_contains_pack() about the .pack name, not the .idx name.
>     But it compares to the latter.
> 
>   - we compute the basename of the pack using strrchr() to find the
>     final slash. But that leaves an extra "/" at the start of our
>     string; we need to advance past it.
> 
>     That also raises the question of what to do when the name does not
>     have a slash at all. This should generally not happen (we always
>     find files in "pack/"), but it doesn't hurt to be defensive here.
> 
> The tests don't notice because there's nothing about opening those .idx
> files that would cause us to give incorrect output. It's just a little
> slower. The new test checks this case by corrupting the covered .idx,
> and then making sure we don't complain about it.

When the test suite is run with GIT_TEST_MULTI_PACK_INDEX=1, the
following tests fail with this patch:

  t1006-cat-file.sh                                (Wstat: 256 Tests: 111 Failed: 1)
    Failed test:  87
    Non-zero exit status: 1
  t5320-delta-islands.sh                           (Wstat: 256 Tests: 15 Failed: 10)
    Failed tests:  2-4, 6-10, 12-13
    Non-zero exit status: 1
  t5310-pack-bitmaps.sh                            (Wstat: 256 Tests: 46 Failed: 1)
    Failed test:  44
    Non-zero exit status: 1
  t5570-git-daemon.sh                              (Wstat: 256 Tests: 21 Failed: 1)
    Failed test:  10
    Non-zero exit status: 1


> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  packfile.c                  | 17 ++++++++++++++---
>  t/t5319-multi-pack-index.sh | 14 ++++++++++++++
>  2 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/packfile.c b/packfile.c
> index 054269ae5d..e7ca135ed5 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -472,6 +472,16 @@ static unsigned int get_max_fd_limit(void)
>  #endif
>  }
>  
> +static const char *pack_basename(struct packed_git *p)
> +{
> +	const char *ret = strrchr(p->pack_name, '/');
> +	if (ret)
> +		ret = ret + 1; /* skip past slash */
> +	else
> +		ret = p->pack_name; /* we only have a base */
> +	return ret;
> +}
> +
>  /*
>   * Do not call this directly as this leaks p->pack_fd on error return;
>   * call open_packed_git() instead.
> @@ -486,15 +496,16 @@ static int open_packed_git_1(struct packed_git *p)
>  	ssize_t read_result;
>  	const unsigned hashsz = the_hash_algo->rawsz;
>  
> -	if (!p->index_data) {
> +	if (!p->index_data && the_repository->objects->multi_pack_index) {
>  		struct multi_pack_index *m;
> -		const char *pack_name = strrchr(p->pack_name, '/');
> +		char *idx_name = pack_name_to_idx(pack_basename(p));
>  
>  		for (m = the_repository->objects->multi_pack_index;
>  		     m; m = m->next) {
> -			if (midx_contains_pack(m, pack_name))
> +			if (midx_contains_pack(m, idx_name))
>  				break;
>  		}
> +		free(idx_name);
>  
>  		if (!m && open_pack_index(p))
>  			return error("packfile %s index unavailable", p->pack_name);
> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> index 8c4d2bd849..1ebf19ec3c 100755
> --- a/t/t5319-multi-pack-index.sh
> +++ b/t/t5319-multi-pack-index.sh
> @@ -117,6 +117,20 @@ test_expect_success 'write midx with one v2 pack' '
>  
>  compare_results_with_midx "one v2 pack"
>  
> +test_expect_success 'corrupt idx not opened' '
> +	idx=$(test-tool read-midx $objdir | grep "\.idx\$") &&
> +	mv $objdir/pack/$idx backup-$idx &&
> +	test_when_finished "mv backup-\$idx \$objdir/pack/\$idx" &&
> +
> +	# This is the minimum size for a sha-1 based .idx; this lets
> +	# us pass perfunctory tests, but anything that actually opens and reads
> +	# the idx file will complain.
> +	test_copy_bytes 1064 <backup-$idx >$objdir/pack/$idx &&
> +
> +	git -c core.multiPackIndex=true rev-list --objects --all 2>err &&
> +	test_must_be_empty err
> +'
> +
>  test_expect_success 'add more objects' '
>  	for i in $(test_seq 6 10)
>  	do
> -- 
> 2.21.0.714.gd1be1d035b
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack
  2019-04-05  8:05   ` René Scharfe
@ 2019-04-05 13:21     ` Jeff King
  2019-04-05 13:30       ` Jeff King
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2019-04-05 13:21 UTC (permalink / raw)
  To: René Scharfe; +Cc: git

On Fri, Apr 05, 2019 at 10:05:29AM +0200, René Scharfe wrote:

> > @@ -486,15 +496,16 @@ static int open_packed_git_1(struct packed_git *p)
> >   	ssize_t read_result;
> >   	const unsigned hashsz = the_hash_algo->rawsz;
> >
> > -	if (!p->index_data) {
> > +	if (!p->index_data && the_repository->objects->multi_pack_index) {
> 
> So if there is no multi_pack_index, we skip this block now...
> 
> >   		struct multi_pack_index *m;
> > -		const char *pack_name = strrchr(p->pack_name, '/');
> > +		char *idx_name = pack_name_to_idx(pack_basename(p));
> >
> >   		for (m = the_repository->objects->multi_pack_index;
> >   		     m; m = m->next) {
> > -			if (midx_contains_pack(m, pack_name))
> > +			if (midx_contains_pack(m, idx_name))
> >   				break;
> >   		}
> > +		free(idx_name);
> >
> >   		if (!m && open_pack_index(p))
> >   			return error("packfile %s index unavailable", p->pack_name);
> 
> ... which also means this open_pack_index() call isn't done anymore if
> there's no .midx file at all.  You don't mention this change in the
> commit message; is it intended?

Doh, thank you for catching that. I made that switch at the last minute
because I didn't want to pay the malloc/free cost when we had no list to
compare to. I'm surprised it works at all. :-/

I guess it doesn't, from the other message in the thread.

> And I wonder if it would be easier overall to let midx_contains_pack()
> accept .pack names in addition to .idx names.  Perhaps with something
> like this?
> 
> int cmp_idx_or_pack_name(const char *idx_or_pack_name, const char *idx_name)
> {
> 	while (*idx_name && *idx_name == *idx_or_pack_name) {
> 		idx_name++;
> 		idx_or_pack_name++;
> 	}
> 	if (!strcmp(idx_name, ".idx") && !strcmp(idx_or_pack_name, ".pack"))
> 		return 0;
> 	return strcmp(idx_or_pack_name, idx_name);
> }

Hmm, maybe. It does a binary search, so I'd have to scratch my head for
a minute of whether this loose comparison is correct. I think it is
because of that final strcmp.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack
  2019-04-05 13:21     ` Jeff King
@ 2019-04-05 13:30       ` Jeff King
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 13:30 UTC (permalink / raw)
  To: René Scharfe; +Cc: git

On Fri, Apr 05, 2019 at 09:21:25AM -0400, Jeff King wrote:

> > ... which also means this open_pack_index() call isn't done anymore if
> > there's no .midx file at all.  You don't mention this change in the
> > commit message; is it intended?
> 
> Doh, thank you for catching that. I made that switch at the last minute
> because I didn't want to pay the malloc/free cost when we had no list to
> compare to. I'm surprised it works at all. :-/
> 
> I guess it doesn't, from the other message in the thread.

Hmm, no, that's a different problem. I think we'd generally already have
the pack open unless there _is_ a midx, so this conditional probably
never kicks in unless we have a midx anyway. Meaning my malloc-avoidance
was over-thought. So we could just get rid of it. Or do it like this:

diff --git a/packfile.c b/packfile.c
index e7ca135ed5..ef0f959311 100644
--- a/packfile.c
+++ b/packfile.c
@@ -496,17 +496,22 @@ static int open_packed_git_1(struct packed_git *p)
 	ssize_t read_result;
 	const unsigned hashsz = the_hash_algo->rawsz;
 
-	if (!p->index_data && the_repository->objects->multi_pack_index) {
-		struct multi_pack_index *m;
-		char *idx_name = pack_name_to_idx(pack_basename(p));
+	if (!p->index_data) {
+		struct multi_pack_index *m =
+			the_repository->objects->multi_pack_index;
 
-		for (m = the_repository->objects->multi_pack_index;
-		     m; m = m->next) {
-			if (midx_contains_pack(m, idx_name))
-				break;
+		if (m) {
+			char *idx_name = pack_name_to_idx(pack_basename(p));
+
+			for (; m; m = m->next) {
+				if (midx_contains_pack(m, idx_name))
+					break;
+			}
+			free(idx_name);
 		}
-		free(idx_name);
 
+		if (!m)
+			warning("opening anyway");
 		if (!m && open_pack_index(p))
 			return error("packfile %s index unavailable", p->pack_name);
 	}

I'll have to re-roll to address the other problem, though, so I'll give
it some thought.

-Peff

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack
  2019-04-05 12:01   ` SZEDER Gábor
@ 2019-04-05 13:40     ` Jeff King
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 13:40 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

On Fri, Apr 05, 2019 at 02:01:20PM +0200, SZEDER Gábor wrote:

> > The tests don't notice because there's nothing about opening those .idx
> > files that would cause us to give incorrect output. It's just a little
> > slower. The new test checks this case by corrupting the covered .idx,
> > and then making sure we don't complain about it.
> 
> When the test suite is run with GIT_TEST_MULTI_PACK_INDEX=1, the
> following tests fail with this patch:
> 
>   t1006-cat-file.sh                                (Wstat: 256 Tests: 111 Failed: 1)
>     Failed test:  87
>     Non-zero exit status: 1
>   t5320-delta-islands.sh                           (Wstat: 256 Tests: 15 Failed: 10)
>     Failed tests:  2-4, 6-10, 12-13
>     Non-zero exit status: 1
>   t5310-pack-bitmaps.sh                            (Wstat: 256 Tests: 46 Failed: 1)
>     Failed test:  44
>     Non-zero exit status: 1
>   t5570-git-daemon.sh                              (Wstat: 256 Tests: 21 Failed: 1)
>     Failed test:  10
>     Non-zero exit status: 1

Thanks. It looks like this is not a bug in my patch per se, but rather
that the feature of the midx code it is un-breaking is not actually a
good idea.

The problem is with generating the pack revindex. It needs the actual
index for each pack to be loaded, and now open_packed_git() will
sometimes quietly not do that.

Probably we _could_ be generating a big revindex off of the midx, but in
the meantime, the revindex code needs to make sure that the index is
loaded before looking at it.

This patch clears up those test failures (I'll re-roll the whole series
with it in the proper spot).

---
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 4695aaf6b4..3960ad94c8 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -308,7 +308,8 @@ static int load_pack_bitmap(struct bitmap_index *bitmap_git)
 
 	bitmap_git->bitmaps = kh_init_sha1();
 	bitmap_git->ext_index.positions = kh_init_sha1_pos();
-	load_pack_revindex(bitmap_git->pack);
+	if (load_pack_revindex(bitmap_git->pack))
+		goto failed;
 
 	if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) ||
 		!(bitmap_git->trees = read_bitmap_1(bitmap_git)) ||
diff --git a/pack-revindex.c b/pack-revindex.c
index 50891f77a2..d28a7e43d0 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "pack-revindex.h"
 #include "object-store.h"
+#include "packfile.h"
 
 /*
  * Pack index for existing packs give us easy access to the offsets into
@@ -158,10 +159,14 @@ static void create_pack_revindex(struct packed_git *p)
 	sort_revindex(p->revindex, num_ent, p->pack_size);
 }
 
-void load_pack_revindex(struct packed_git *p)
+int load_pack_revindex(struct packed_git *p)
 {
-	if (!p->revindex)
+	if (!p->revindex) {
+		if (open_pack_index(p))
+			return -1;
 		create_pack_revindex(p);
+	}
+	return 0;
 }
 
 int find_revindex_position(struct packed_git *p, off_t ofs)
@@ -188,7 +193,9 @@ struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs)
 {
 	int pos;
 
-	load_pack_revindex(p);
+	if (load_pack_revindex(p))
+		return NULL;
+
 	pos = find_revindex_position(p, ofs);
 
 	if (pos < 0)
diff --git a/pack-revindex.h b/pack-revindex.h
index e262f3efe8..848331d5d6 100644
--- a/pack-revindex.h
+++ b/pack-revindex.h
@@ -8,7 +8,7 @@ struct revindex_entry {
 	unsigned int nr;
 };
 
-void load_pack_revindex(struct packed_git *p);
+int load_pack_revindex(struct packed_git *p);
 int find_revindex_position(struct packed_git *p, off_t ofs);
 
 struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs);
diff --git a/packfile.c b/packfile.c
index e7ca135ed5..ca73a5b12a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2040,8 +2040,10 @@ int for_each_object_in_pack(struct packed_git *p,
 	uint32_t i;
 	int r = 0;
 
-	if (flags & FOR_EACH_OBJECT_PACK_ORDER)
-		load_pack_revindex(p);
+	if (flags & FOR_EACH_OBJECT_PACK_ORDER) {
+		if (load_pack_revindex(p))
+			return -1;
+	}
 
 	for (i = 0; i < p->num_objects; i++) {
 		uint32_t pos;

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes
  2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
                   ` (12 preceding siblings ...)
  2019-04-05  5:46 ` [PATCH 0/12] a rabbit hole of update-server-info fixes Junio C Hamano
@ 2019-04-05 18:03 ` Jeff King
  2019-04-05 18:03   ` [PATCH v2 01/13] packfile.h: drop extern from function declarations Jeff King
                     ` (13 more replies)
  13 siblings, 14 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:03 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

On Thu, Apr 04, 2019 at 07:21:04PM -0400, Jeff King wrote:

> This patch series started with patch 12: I just wanted to drop the
> unused "force" parameter from update_info_refs().
> [...]
> And here we are. I present them here in reverse rabbit-hole order (which
> is also roughly important fixes first, and minor cleanups last). The
> individual chunks are mostly independent, but the server-info cleanups
> rely on the shared pack_basename() helper added as part of the midx fix.

Here's a v2 which addresses the feedback on the last round (thank you
both). In particular:

  - the rabbit hole goes one deeper, as fixing the midx bug reveals that
    the revindex code just assumes that the pack .idx has been loaded.
    Patch 2 corrects this, which addresses most of the test failures
    with GIT_TEST_MULTI_PACK_INDEX.

  - There was one other failure, which is that t5570 corrupted a .idx
    and expected us to notice (which no longer happens when the midx is
    able to provide the answer). I've tweaked the test to address this.

  - I took René's suggestion of making midx_contains_pack() friendlier
    by handling either a .pack or a .idx argument. I had to tweak the
    suggested code just a little, and I added comments, since it took me
    a little staring to make sure everything was correct.

    I split that out from the basename fix, so that's now patches 5 and
    6. It also lets us drop the patch to factor out pack_name_to_idx().

Other than that, it's substantially the same as v1. I did move the
public declaration of pack_basename() to the patch where it is added,
rather than doing it later, which shows up in the range diff below.

  [01/13]: packfile.h: drop extern from function declarations
  [02/13]: pack-revindex: open index if necessary
  [03/13]: t5319: fix bogus cat-file argument
  [04/13]: t5319: drop useless --buffer from cat-file
  [05/13]: midx: check both pack and index names for containment
  [06/13]: packfile: fix pack basename computation
  [07/13]: http: simplify parsing of remote objects/info/packs
  [08/13]: server-info: fix blind pointer arithmetic
  [09/13]: server-info: simplify cleanup in parse_pack_def()
  [10/13]: server-info: use strbuf to read old info/packs file
  [11/13]: server-info: drop nr_alloc struct member
  [12/13]: server-info: drop objdirlen pointer arithmetic
  [13/13]: update_info_refs(): drop unused force parameter

 http.c                      | 35 ++++++---------
 midx.c                      | 36 +++++++++++++++-
 midx.h                      |  2 +-
 pack-bitmap.c               |  3 +-
 pack-revindex.c             | 13 ++++--
 pack-revindex.h             |  2 +-
 packfile.c                  | 18 ++++++--
 packfile.h                  | 86 ++++++++++++++++++++-----------------
 server-info.c               | 57 +++++++++++-------------
 t/t5319-multi-pack-index.sh | 29 ++++++++++---
 t/t5570-git-daemon.sh       |  1 +
 11 files changed, 172 insertions(+), 110 deletions(-)

10:  bead52a5dc =  1:  829e666ec9 packfile.h: drop extern from function declarations
 -:  ---------- >  2:  d46c0eafd0 pack-revindex: open index if necessary
 1:  3d1978ccaf =  3:  5b965de360 t5319: fix bogus cat-file argument
 2:  ce74312004 =  4:  32ef03e059 t5319: drop useless --buffer from cat-file
 3:  ea637c9cf1 <  -:  ---------- packfile: factor out .pack to .idx name conversion
 -:  ---------- >  5:  d87a2aa2bf midx: check both pack and index names for containment
 4:  4723e1b952 !  6:  c2bed91a3f packfile: check midx coverage with .idx rather than .pack
    @@ -1,35 +1,40 @@
     Author: Jeff King <peff@peff.net>
     
    -    packfile: check midx coverage with .idx rather than .pack
    +    packfile: fix pack basename computation
     
    -    When we have a .midx that covers many packfiles, we try to avoid opening
    -    the .idx for those packfiles. However, there are a few problems with the
    -    filename comparison we use:
    +    When we have a multi-pack-index that covers many packfiles, we try to
    +    avoid opening the .idx for those packfiles. To do that we feed the pack
    +    name to midx_contains_pack(). But that function wants to see only the
    +    basename, which we compute using strrchr() to find the final slash. But
    +    that leaves an extra "/" at the start of our string.
     
    -      - we ask midx_contains_pack() about the .pack name, not the .idx name.
    -        But it compares to the latter.
    +    We can fix this by incrementing the pointer. That also raises the
    +    question of what to do when the name does not have a '/' at all. This
    +    should generally not happen (we always find files in "pack/"), but it
    +    doesn't hurt to be defensive here.
     
    -      - we compute the basename of the pack using strrchr() to find the
    -        final slash. But that leaves an extra "/" at the start of our
    -        string; we need to advance past it.
    -
    -        That also raises the question of what to do when the name does not
    -        have a slash at all. This should generally not happen (we always
    -        find files in "pack/"), but it doesn't hurt to be defensive here.
    +    Let's wrap all of that up in a helper function and make it publicly
    +    available, since a later patch will need to use it, too.
     
         The tests don't notice because there's nothing about opening those .idx
         files that would cause us to give incorrect output. It's just a little
         slower. The new test checks this case by corrupting the covered .idx,
         and then making sure we don't complain about it.
     
    +    We also have to tweak t5570, which intentionally corrupts a .idx file
    +    and expects us to notice it. When run with GIT_TEST_MULTI_PACK_INDEX,
    +    this will fail since we now will (correctly) not bother opening the .idx
    +    at all. We can fix that by unconditionally dropping any midx that's
    +    there, which ensures we'll have to read the .idx.
    +
      diff --git a/packfile.c b/packfile.c
      --- a/packfile.c
      +++ b/packfile.c
     @@
      #endif
      }
      
    -+static const char *pack_basename(struct packed_git *p)
    ++const char *pack_basename(struct packed_git *p)
     +{
     +	const char *ret = strrchr(p->pack_name, '/');
     +	if (ret)
    @@ -43,25 +48,31 @@
       * Do not call this directly as this leaks p->pack_fd on error return;
       * call open_packed_git() instead.
     @@
    - 	ssize_t read_result;
    - 	const unsigned hashsz = the_hash_algo->rawsz;
      
    --	if (!p->index_data) {
    -+	if (!p->index_data && the_repository->objects->multi_pack_index) {
    + 	if (!p->index_data) {
      		struct multi_pack_index *m;
     -		const char *pack_name = strrchr(p->pack_name, '/');
    -+		char *idx_name = pack_name_to_idx(pack_basename(p));
    ++		const char *pack_name = pack_basename(p);
      
      		for (m = the_repository->objects->multi_pack_index;
      		     m; m = m->next) {
    --			if (midx_contains_pack(m, pack_name))
    -+			if (midx_contains_pack(m, idx_name))
    - 				break;
    - 		}
    -+		free(idx_name);
    +
    + diff --git a/packfile.h b/packfile.h
    + --- a/packfile.h
    + +++ b/packfile.h
    +@@
    +  */
    + char *sha1_pack_index_name(const unsigned char *sha1);
      
    - 		if (!m && open_pack_index(p))
    - 			return error("packfile %s index unavailable", p->pack_name);
    ++/*
    ++ * Return the basename of the packfile, omitting any containing directory
    ++ * (e.g., "pack-1234abcd[...].pack").
    ++ */
    ++const char *pack_basename(struct packed_git *p);
    ++
    + struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
    + 
    + typedef void each_file_in_pack_dir_fn(const char *full_path, size_t full_path_len,
     
      diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
      --- a/t/t5319-multi-pack-index.sh
    @@ -87,3 +98,15 @@
      test_expect_success 'add more objects' '
      	for i in $(test_seq 6 10)
      	do
    +
    + diff --git a/t/t5570-git-daemon.sh b/t/t5570-git-daemon.sh
    + --- a/t/t5570-git-daemon.sh
    + +++ b/t/t5570-git-daemon.sh
    +@@
    + test_expect_success 'fetch notices corrupt idx' '
    + 	cp -R "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_pack.git "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
    + 	(cd "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
    ++	 rm -f objects/pack/multi-pack-index &&
    + 	 p=$(ls objects/pack/pack-*.idx) &&
    + 	 chmod u+w $p &&
    + 	 printf %0256d 0 | dd of=$p bs=256 count=1 seek=1 conv=notrunc
 5:  0bf3fb7758 =  7:  15a7d0c690 http: simplify parsing of remote objects/info/packs
 6:  3f711f74f1 =  8:  aee572065c server-info: fix blind pointer arithmetic
 7:  5a77637699 =  9:  a9d226960a server-info: simplify cleanup in parse_pack_def()
 8:  cbc7b6fa71 = 10:  388e7bf2be server-info: use strbuf to read old info/packs file
 9:  9601aa240c = 11:  76ca72f4bf server-info: drop nr_alloc struct member
11:  23ff6ba3dc ! 12:  8753c537a7 server-info: drop objdirlen pointer arithmetic
    @@ -18,36 +18,6 @@
         future-proofs us, and should hopefully be more obviously safe to
         somebody reading the code.
     
    - diff --git a/packfile.c b/packfile.c
    - --- a/packfile.c
    - +++ b/packfile.c
    -@@
    - #endif
    - }
    - 
    --static const char *pack_basename(struct packed_git *p)
    -+const char *pack_basename(struct packed_git *p)
    - {
    - 	const char *ret = strrchr(p->pack_name, '/');
    - 	if (ret)
    -
    - diff --git a/packfile.h b/packfile.h
    - --- a/packfile.h
    - +++ b/packfile.h
    -@@
    -  */
    - char *sha1_pack_index_name(const unsigned char *sha1);
    - 
    -+/*
    -+ * Return the basename of the packfile, omitting any containing directory
    -+ * (e.g., "pack-1234abcd[...].pack").
    -+ */
    -+const char *pack_basename(struct packed_git *p);
    -+
    - struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
    - 
    - typedef void each_file_in_pack_dir_fn(const char *full_path, size_t full_path_len,
    -
      diff --git a/server-info.c b/server-info.c
      --- a/server-info.c
      +++ b/server-info.c
12:  3f9cff4360 = 13:  96c5a26a99 update_info_refs(): drop unused force parameter

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v2 01/13] packfile.h: drop extern from function declarations
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
@ 2019-04-05 18:03   ` Jeff King
  2019-04-05 19:19     ` Ramsay Jones
  2019-04-05 18:04   ` [PATCH v2 02/13] pack-revindex: open index if necessary Jeff King
                     ` (12 subsequent siblings)
  13 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:03 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

As CodingGuidelines recommends, we do not need an "extern" when
declaring a public function. Let's drop these. Note that we leave the
extern on report_garbage(), as that is actually a function pointer, not
a function itself.

Signed-off-by: Jeff King <peff@peff.net>
---
This is bumped to the front in v2 since we now add our public function
to it much earlier.

 packfile.h | 80 +++++++++++++++++++++++++++---------------------------
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/packfile.h b/packfile.h
index d70c6d9afb..b40fc34fb2 100644
--- a/packfile.h
+++ b/packfile.h
@@ -15,23 +15,23 @@ struct object_info;
  *
  * Example: odb_pack_name(out, sha1, "idx") => ".git/objects/pack/pack-1234..idx"
  */
-extern char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
+char *odb_pack_name(struct strbuf *buf, const unsigned char *sha1, const char *ext);
 
 /*
  * Return the name of the (local) packfile with the specified sha1 in
  * its name.  The return value is a pointer to memory that is
  * overwritten each time this function is called.
  */
-extern char *sha1_pack_name(const unsigned char *sha1);
+char *sha1_pack_name(const unsigned char *sha1);
 
 /*
  * Return the name of the (local) pack index file with the specified
  * sha1 in its name.  The return value is a pointer to memory that is
  * overwritten each time this function is called.
  */
-extern char *sha1_pack_index_name(const unsigned char *sha1);
+char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
+struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
 typedef void each_file_in_pack_dir_fn(const char *full_path, size_t full_path_len,
 				      const char *file_pach, void *data);
@@ -45,8 +45,8 @@ void for_each_file_in_pack_dir(const char *objdir,
 #define PACKDIR_FILE_GARBAGE 4
 extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
-extern void reprepare_packed_git(struct repository *r);
-extern void install_packed_git(struct repository *r, struct packed_git *pack);
+void reprepare_packed_git(struct repository *r);
+void install_packed_git(struct repository *r, struct packed_git *pack);
 
 struct packed_git *get_packed_git(struct repository *r);
 struct list_head *get_packed_git_mru(struct repository *r);
@@ -59,32 +59,32 @@ struct packed_git *get_all_packs(struct repository *r);
  */
 unsigned long approximate_object_count(void);
 
-extern struct packed_git *find_sha1_pack(const unsigned char *sha1,
-					 struct packed_git *packs);
+struct packed_git *find_sha1_pack(const unsigned char *sha1,
+				  struct packed_git *packs);
 
-extern void pack_report(void);
+void pack_report(void);
 
 /*
  * mmap the index file for the specified packfile (if it is not
  * already mmapped).  Return 0 on success.
  */
-extern int open_pack_index(struct packed_git *);
+int open_pack_index(struct packed_git *);
 
 /*
  * munmap the index file for the specified packfile (if it is
  * currently mmapped).
  */
-extern void close_pack_index(struct packed_git *);
+void close_pack_index(struct packed_git *);
 
-extern uint32_t get_pack_fanout(struct packed_git *p, uint32_t value);
+uint32_t get_pack_fanout(struct packed_git *p, uint32_t value);
 
-extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
-extern void close_pack_windows(struct packed_git *);
-extern void close_pack(struct packed_git *);
-extern void close_all_packs(struct raw_object_store *o);
-extern void unuse_pack(struct pack_window **);
-extern void clear_delta_base_cache(void);
-extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
+unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
+void close_pack_windows(struct packed_git *);
+void close_pack(struct packed_git *);
+void close_all_packs(struct raw_object_store *o);
+void unuse_pack(struct pack_window **);
+void clear_delta_base_cache(void);
+struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
 /*
  * Make sure that a pointer access into an mmap'd index file is within bounds,
@@ -94,7 +94,7 @@ extern struct packed_git *add_packed_git(const char *path, size_t path_len, int
  * (like the 64-bit extended offset table), as we compare the size to the
  * fixed-length parts when we open the file.
  */
-extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
+void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
 
 /*
  * Perform binary search on a pack-index for a given oid. Packfile is expected to
@@ -110,59 +110,59 @@ int bsearch_pack(const struct object_id *oid, const struct packed_git *p, uint32
  * at the SHA-1 within the mmapped index.  Return NULL if there is an
  * error.
  */
-extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n);
+const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t n);
 /*
  * Like nth_packed_object_sha1, but write the data into the object specified by
  * the the first argument.  Returns the first argument on success, and NULL on
  * error.
  */
-extern const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
+const struct object_id *nth_packed_object_oid(struct object_id *, struct packed_git *, uint32_t n);
 
 /*
  * Return the offset of the nth object within the specified packfile.
  * The index must already be opened.
  */
-extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
+off_t nth_packed_object_offset(const struct packed_git *, uint32_t n);
 
 /*
  * If the object named sha1 is present in the specified packfile,
  * return its offset within the packfile; otherwise, return 0.
  */
-extern off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
+off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *);
 
-extern int is_pack_valid(struct packed_git *);
-extern void *unpack_entry(struct repository *r, struct packed_git *, off_t, enum object_type *, unsigned long *);
-extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
-extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
-extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
+int is_pack_valid(struct packed_git *);
+void *unpack_entry(struct repository *r, struct packed_git *, off_t, enum object_type *, unsigned long *);
+unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
+unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
+int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
-extern void release_pack_memory(size_t);
+void release_pack_memory(size_t);
 
 /* global flag to enable extra checks when accessing packed objects */
-extern int do_check_packed_object_crc;
+int do_check_packed_object_crc;
 
-extern int packed_object_info(struct repository *r,
+int packed_object_info(struct repository *r,
 			      struct packed_git *pack,
 			      off_t offset, struct object_info *);
 
-extern void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
-extern const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1);
+void mark_bad_packed_object(struct packed_git *p, const unsigned char *sha1);
+const struct packed_git *has_packed_and_bad(struct repository *r, const unsigned char *sha1);
 
 /*
  * Iff a pack file in the given repository contains the object named by sha1,
  * return true and store its location to e.
  */
-extern int find_pack_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e);
+int find_pack_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e);
 
-extern int has_object_pack(const struct object_id *oid);
+int has_object_pack(const struct object_id *oid);
 
-extern int has_pack_index(const unsigned char *sha1);
+int has_pack_index(const unsigned char *sha1);
 
 /*
  * Return 1 if an object in a promisor packfile is or refers to the given
  * object, 0 otherwise.
  */
-extern int is_promisor_object(const struct object_id *oid);
+int is_promisor_object(const struct object_id *oid);
 
 /*
  * Expose a function for fuzz testing.
@@ -174,7 +174,7 @@ extern int is_promisor_object(const struct object_id *oid);
  * have a convenient entry-point for fuzz testing. For real uses, you should
  * probably use open_pack_index() or parse_pack_index() instead.
  */
-extern int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
-		    size_t idx_size, struct packed_git *p);
+int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
+	     size_t idx_size, struct packed_git *p);
 
 #endif
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 02/13] pack-revindex: open index if necessary
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
  2019-04-05 18:03   ` [PATCH v2 01/13] packfile.h: drop extern from function declarations Jeff King
@ 2019-04-05 18:04   ` Jeff King
  2019-04-05 18:04   ` [PATCH v2 03/13] t5319: fix bogus cat-file argument Jeff King
                     ` (11 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:04 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

We can't create a pack revindex if we haven't actually looked at the
index. Normally we would never get as far as creating a revindex without
having already been looking in the pack, so this code never bothered to
double-check that pack->index_data had been loaded.

But with the new multi-pack-index feature, many code paths might not
load the individual pack .idx at all (they'd find objects via the midx
and then open the .pack, but not its index).

This can't yet be triggered in practice, because a bug in the midx code
means we accidentally open up the individual .idx files anyway. But in
preparation for fixing that, let's have the revindex code check that
everything it needs has been loaded.

In most cases this will just be a quick noop. But note that this does
introduce a possibility of error (if we have to open the index and it's
corrupt), so load_pack_revindex() now returns a result code, and callers
need to handle the error.

Signed-off-by: Jeff King <peff@peff.net>
---
 pack-bitmap.c   |  3 ++-
 pack-revindex.c | 13 ++++++++++---
 pack-revindex.h |  2 +-
 packfile.c      |  6 ++++--
 4 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 4695aaf6b4..3960ad94c8 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -308,7 +308,8 @@ static int load_pack_bitmap(struct bitmap_index *bitmap_git)
 
 	bitmap_git->bitmaps = kh_init_sha1();
 	bitmap_git->ext_index.positions = kh_init_sha1_pos();
-	load_pack_revindex(bitmap_git->pack);
+	if (load_pack_revindex(bitmap_git->pack))
+		goto failed;
 
 	if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) ||
 		!(bitmap_git->trees = read_bitmap_1(bitmap_git)) ||
diff --git a/pack-revindex.c b/pack-revindex.c
index 50891f77a2..d28a7e43d0 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "pack-revindex.h"
 #include "object-store.h"
+#include "packfile.h"
 
 /*
  * Pack index for existing packs give us easy access to the offsets into
@@ -158,10 +159,14 @@ static void create_pack_revindex(struct packed_git *p)
 	sort_revindex(p->revindex, num_ent, p->pack_size);
 }
 
-void load_pack_revindex(struct packed_git *p)
+int load_pack_revindex(struct packed_git *p)
 {
-	if (!p->revindex)
+	if (!p->revindex) {
+		if (open_pack_index(p))
+			return -1;
 		create_pack_revindex(p);
+	}
+	return 0;
 }
 
 int find_revindex_position(struct packed_git *p, off_t ofs)
@@ -188,7 +193,9 @@ struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs)
 {
 	int pos;
 
-	load_pack_revindex(p);
+	if (load_pack_revindex(p))
+		return NULL;
+
 	pos = find_revindex_position(p, ofs);
 
 	if (pos < 0)
diff --git a/pack-revindex.h b/pack-revindex.h
index e262f3efe8..848331d5d6 100644
--- a/pack-revindex.h
+++ b/pack-revindex.h
@@ -8,7 +8,7 @@ struct revindex_entry {
 	unsigned int nr;
 };
 
-void load_pack_revindex(struct packed_git *p);
+int load_pack_revindex(struct packed_git *p);
 int find_revindex_position(struct packed_git *p, off_t ofs);
 
 struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs);
diff --git a/packfile.c b/packfile.c
index 16bcb75262..6e40bd89c7 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2023,8 +2023,10 @@ int for_each_object_in_pack(struct packed_git *p,
 	uint32_t i;
 	int r = 0;
 
-	if (flags & FOR_EACH_OBJECT_PACK_ORDER)
-		load_pack_revindex(p);
+	if (flags & FOR_EACH_OBJECT_PACK_ORDER) {
+		if (load_pack_revindex(p))
+			return -1;
+	}
 
 	for (i = 0; i < p->num_objects; i++) {
 		uint32_t pos;
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 03/13] t5319: fix bogus cat-file argument
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
  2019-04-05 18:03   ` [PATCH v2 01/13] packfile.h: drop extern from function declarations Jeff King
  2019-04-05 18:04   ` [PATCH v2 02/13] pack-revindex: open index if necessary Jeff King
@ 2019-04-05 18:04   ` Jeff King
  2019-04-05 18:05   ` [PATCH v2 04/13] t5319: drop useless --buffer from cat-file Jeff King
                     ` (10 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:04 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

There's no such argument as "--unsorted"; it's spelled "--unordered".
But our test failed to notice that cat-file didn't run at all because:

  1. It lost the exit code of git on the left-hand side of a pipe.

  2. It was comparing two runs of the broken invocation with and without
     a particular config variable (and indeed, both cases produced no
     output!).

Let's fix the option, but also tweak the helper function to check the
exit code.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t5319-multi-pack-index.sh | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 70926b5bc0..42f4d6cd01 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -86,13 +86,14 @@ test_expect_success 'write midx with one v1 pack' '
 '
 
 midx_git_two_modes () {
+	git -c core.multiPackIndex=false $1 >expect &&
+	git -c core.multiPackIndex=true $1 >actual &&
 	if [ "$2" = "sorted" ]
 	then
-		git -c core.multiPackIndex=false $1 | sort >expect &&
-		git -c core.multiPackIndex=true $1 | sort >actual
-	else
-		git -c core.multiPackIndex=false $1 >expect &&
-		git -c core.multiPackIndex=true $1 >actual
+		sort <expect >expect.sorted &&
+		mv expect.sorted expect &&
+		sort <actual >actual.sorted &&
+		mv actual.sorted actual
 	fi &&
 	test_cmp expect actual
 }
@@ -104,7 +105,7 @@ compare_results_with_midx () {
 		midx_git_two_modes "log --raw" &&
 		midx_git_two_modes "count-objects --verbose" &&
 		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check" &&
-		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check --unsorted" sorted
+		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check --unordered" sorted
 	'
 }
 
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 04/13] t5319: drop useless --buffer from cat-file
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (2 preceding siblings ...)
  2019-04-05 18:04   ` [PATCH v2 03/13] t5319: fix bogus cat-file argument Jeff King
@ 2019-04-05 18:05   ` Jeff King
  2019-04-05 18:06   ` [PATCH v2 05/13] midx: check both pack and index names for containment Jeff King
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:05 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

The cat-file --buffer option is the default already when using
--batch-all-objects. It doesn't hurt to specify it, but it's nice for
the test scripts to model good usage.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t5319-multi-pack-index.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 42f4d6cd01..8c4d2bd849 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -104,8 +104,8 @@ compare_results_with_midx () {
 		midx_git_two_modes "rev-list --objects --all" &&
 		midx_git_two_modes "log --raw" &&
 		midx_git_two_modes "count-objects --verbose" &&
-		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check" &&
-		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check --unordered" sorted
+		midx_git_two_modes "cat-file --batch-all-objects --batch-check" &&
+		midx_git_two_modes "cat-file --batch-all-objects --batch-check --unordered" sorted
 	'
 }
 
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 05/13] midx: check both pack and index names for containment
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (3 preceding siblings ...)
  2019-04-05 18:05   ` [PATCH v2 04/13] t5319: drop useless --buffer from cat-file Jeff King
@ 2019-04-05 18:06   ` Jeff King
  2019-04-05 20:18     ` René Scharfe
  2019-04-05 18:06   ` [PATCH v2 06/13] packfile: fix pack basename computation Jeff King
                     ` (8 subsequent siblings)
  13 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:06 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

A midx file (and the struct we parse from it) contains a list of all of
the covered packfiles, mentioned by their ".idx" names (e.g.,
"pack-1234.idx", etc). And thus calls to midx_contains_pack() expect
callers to provide the idx name.

This works for most of the calls, but the one in open_packed_git_1()
tries to feed a packed_git->pack_name, which is the ".pack" name,
meaning we'll never find a match (even if the pack is covered by the
midx).

We can fix this by converting the ".pack" to ".idx" in the caller.
However, that requires allocating a new string. Instead, let's make
midx_contains_pack() a bit friendlier, and allow it take _either_ the
.pack or .idx variant.

All cleverness in the matching code is credited to René. Bugs are mine.

There's no test here, because while this does fix _a_ bug, it's masked
by another bug in that same caller. That will be covered (with a test)
in the next patch.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
---
I was tempted to suggest that the midx struct just store the base name
without ".idx" at all, but having callers pass that is no less tricky
than passing ".idx" (they still have to allocate a new string).

 midx.c | 36 ++++++++++++++++++++++++++++++++++--
 midx.h |  2 +-
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/midx.c b/midx.c
index 8a505fd423..0ceca1938f 100644
--- a/midx.c
+++ b/midx.c
@@ -307,7 +307,39 @@ int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct mu
 	return nth_midxed_pack_entry(m, e, pos);
 }
 
-int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
+/* Match "foo.idx" against either "foo.pack" _or_ "foo.idx". */
+static int cmp_idx_or_pack_name(const char *idx_or_pack_name,
+				const char *idx_name)
+{
+	/* Skip past any initial matching prefix. */
+	while (*idx_name && *idx_name == *idx_or_pack_name) {
+		idx_name++;
+		idx_or_pack_name++;
+	}
+
+	/*
+	 * If we didn't match completely, we may have matched "pack-1234." and
+	 * be left with "idx" and "pack" respectively, which is also OK. We do
+	 * not have to check for "idx" and "idx", because that would have been
+	 * a complete match (and in that case these strcmps will be false, but
+	 * we'll correctly return 0 from the final strcmp() below.
+	 *
+	 * Technically this matches "fooidx" and "foopack", but we'd never have
+	 * such names in the first place.
+	 */
+	if (!strcmp(idx_name, "idx") && !strcmp(idx_or_pack_name, "pack"))
+		return 0;
+
+	/*
+	 * This not only checks for a complete match, but also orders based on
+	 * the first non-identical character, which means our ordering will
+	 * match a raw strcmp(). That makes it OK to use this to binary search
+	 * a naively-sorted list.
+	 */
+	return strcmp(idx_or_pack_name, idx_name);
+}
+
+int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name)
 {
 	uint32_t first = 0, last = m->num_packs;
 
@@ -317,7 +349,7 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
 		int cmp;
 
 		current = m->pack_names[mid];
-		cmp = strcmp(idx_name, current);
+		cmp = cmp_idx_or_pack_name(idx_or_pack_name, current);
 		if (!cmp)
 			return 1;
 		if (cmp > 0) {
diff --git a/midx.h b/midx.h
index 774f652530..26dd042d63 100644
--- a/midx.h
+++ b/midx.h
@@ -43,7 +43,7 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid,
 					struct multi_pack_index *m,
 					uint32_t n);
 int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
-int midx_contains_pack(struct multi_pack_index *m, const char *idx_name);
+int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
 int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
 
 int write_midx_file(const char *object_dir);
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 06/13] packfile: fix pack basename computation
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (4 preceding siblings ...)
  2019-04-05 18:06   ` [PATCH v2 05/13] midx: check both pack and index names for containment Jeff King
@ 2019-04-05 18:06   ` Jeff King
  2019-04-05 18:12   ` [PATCH v2 07/13] http: simplify parsing of remote objects/info/packs Jeff King
                     ` (7 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:06 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

When we have a multi-pack-index that covers many packfiles, we try to
avoid opening the .idx for those packfiles. To do that we feed the pack
name to midx_contains_pack(). But that function wants to see only the
basename, which we compute using strrchr() to find the final slash. But
that leaves an extra "/" at the start of our string.

We can fix this by incrementing the pointer. That also raises the
question of what to do when the name does not have a '/' at all. This
should generally not happen (we always find files in "pack/"), but it
doesn't hurt to be defensive here.

Let's wrap all of that up in a helper function and make it publicly
available, since a later patch will need to use it, too.

The tests don't notice because there's nothing about opening those .idx
files that would cause us to give incorrect output. It's just a little
slower. The new test checks this case by corrupting the covered .idx,
and then making sure we don't complain about it.

We also have to tweak t5570, which intentionally corrupts a .idx file
and expects us to notice it. When run with GIT_TEST_MULTI_PACK_INDEX,
this will fail since we now will (correctly) not bother opening the .idx
at all. We can fix that by unconditionally dropping any midx that's
there, which ensures we'll have to read the .idx.

Signed-off-by: Jeff King <peff@peff.net>
---
 packfile.c                  | 12 +++++++++++-
 packfile.h                  |  6 ++++++
 t/t5319-multi-pack-index.sh | 14 ++++++++++++++
 t/t5570-git-daemon.sh       |  1 +
 4 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/packfile.c b/packfile.c
index 6e40bd89c7..7a2dd2fdbe 100644
--- a/packfile.c
+++ b/packfile.c
@@ -466,6 +466,16 @@ static unsigned int get_max_fd_limit(void)
 #endif
 }
 
+const char *pack_basename(struct packed_git *p)
+{
+	const char *ret = strrchr(p->pack_name, '/');
+	if (ret)
+		ret = ret + 1; /* skip past slash */
+	else
+		ret = p->pack_name; /* we only have a base */
+	return ret;
+}
+
 /*
  * Do not call this directly as this leaks p->pack_fd on error return;
  * call open_packed_git() instead.
@@ -482,7 +492,7 @@ static int open_packed_git_1(struct packed_git *p)
 
 	if (!p->index_data) {
 		struct multi_pack_index *m;
-		const char *pack_name = strrchr(p->pack_name, '/');
+		const char *pack_name = pack_basename(p);
 
 		for (m = the_repository->objects->multi_pack_index;
 		     m; m = m->next) {
diff --git a/packfile.h b/packfile.h
index b40fc34fb2..45bf792d79 100644
--- a/packfile.h
+++ b/packfile.h
@@ -31,6 +31,12 @@ char *sha1_pack_name(const unsigned char *sha1);
  */
 char *sha1_pack_index_name(const unsigned char *sha1);
 
+/*
+ * Return the basename of the packfile, omitting any containing directory
+ * (e.g., "pack-1234abcd[...].pack").
+ */
+const char *pack_basename(struct packed_git *p);
+
 struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
 
 typedef void each_file_in_pack_dir_fn(const char *full_path, size_t full_path_len,
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 8c4d2bd849..1ebf19ec3c 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -117,6 +117,20 @@ test_expect_success 'write midx with one v2 pack' '
 
 compare_results_with_midx "one v2 pack"
 
+test_expect_success 'corrupt idx not opened' '
+	idx=$(test-tool read-midx $objdir | grep "\.idx\$") &&
+	mv $objdir/pack/$idx backup-$idx &&
+	test_when_finished "mv backup-\$idx \$objdir/pack/\$idx" &&
+
+	# This is the minimum size for a sha-1 based .idx; this lets
+	# us pass perfunctory tests, but anything that actually opens and reads
+	# the idx file will complain.
+	test_copy_bytes 1064 <backup-$idx >$objdir/pack/$idx &&
+
+	git -c core.multiPackIndex=true rev-list --objects --all 2>err &&
+	test_must_be_empty err
+'
+
 test_expect_success 'add more objects' '
 	for i in $(test_seq 6 10)
 	do
diff --git a/t/t5570-git-daemon.sh b/t/t5570-git-daemon.sh
index 58ee787685..19e271bda6 100755
--- a/t/t5570-git-daemon.sh
+++ b/t/t5570-git-daemon.sh
@@ -90,6 +90,7 @@ test_expect_success 'fetch notices corrupt pack' '
 test_expect_success 'fetch notices corrupt idx' '
 	cp -R "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_pack.git "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
 	(cd "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_bad2.git &&
+	 rm -f objects/pack/multi-pack-index &&
 	 p=$(ls objects/pack/pack-*.idx) &&
 	 chmod u+w $p &&
 	 printf %0256d 0 | dd of=$p bs=256 count=1 seek=1 conv=notrunc
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 05/12] http: simplify parsing of remote objects/info/packs
  2019-04-05 10:41   ` René Scharfe
@ 2019-04-05 18:11     ` Jeff King
  2019-04-05 20:17       ` René Scharfe
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:11 UTC (permalink / raw)
  To: René Scharfe; +Cc: git

On Fri, Apr 05, 2019 at 12:41:27PM +0200, René Scharfe wrote:

> Am 05.04.2019 um 01:27 schrieb Jeff King:
> > We can use skip_prefix() and parse_oid_hex() to continuously increment
> > our pointer, rather than dealing with magic numbers. This also fixes a
> > few small shortcomings:
> >
> >   - if we see a 'P' line that does not match our expectations, we'll
> >     leave our "i" counter in the middle of the line. So we'll parse:
> >     "P P P pack-1234.pack" as if there was just one "P" which was not
> >     intentional (though probably not that harmful).
> 
> How so?  The default case, which we'd fall through to, skips the rest
> of such a line, doesn't it?

Oh, you're right. I didn't notice the fall-through, which is quite
subtle (I'm actually surprised -Wimplicit-fallthrough doesn't complain
about this).

I'll fix up the commit message (the cleanup is still very much worth it
for the garbage-oid issue, IMHO).

> > +	while (*data) {
> > +		if (skip_prefix(data, "P pack-", &data) &&
> > +		    !parse_oid_hex(data, &oid, &data) &&
> > +		    skip_prefix(data, ".pack", &data) &&
> > +		    (*data == '\n' || *data == '\0')) {
> > +			fetch_and_setup_pack_index(packs_head, oid.hash, base_url);
> > +		} else {
> > +			data = strchrnul(data, '\n');
> >  		}
> > -		i++;
> > +		if (*data)
> > +			data++; /* skip past newline */
> 
> So much simpler, *and* converted to object_id -- I like it!
> 
> Parsing "P" and "pack-" together crosses logical token boundaries,
> but that I don't mind it here.

Yeah, I was tempted to write:

  if (skip_prefix(data, "P ", &data) &&
      skip_prefix(data, "pack-", &data) &&
      ...

but that felt a little silly. I dunno. I guess it is probably not any
less efficient, because we'd expect skip_prefix() and its loop to get
inlined here anyway.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v2 07/13] http: simplify parsing of remote objects/info/packs
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (5 preceding siblings ...)
  2019-04-05 18:06   ` [PATCH v2 06/13] packfile: fix pack basename computation Jeff King
@ 2019-04-05 18:12   ` Jeff King
  2019-04-05 18:13   ` [PATCH v2 08/13] server-info: fix blind pointer arithmetic Jeff King
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:12 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

We can use skip_prefix() and parse_oid_hex() to continuously increment
our pointer, rather than dealing with magic numbers. This also fixes a
few small shortcomings:

  - if we see a line with the right prefix, suffix, and length, i.e.
    matching /P pack-.{40}.pack\n/, we'll interpret the middle part as
    hex without checking if it could be parsed. This could lead to us
    looking at uninitialized garbage in the hash array. In practice this
    means we'll just make a garbage request to the server which will
    fail, though it's interesting that a malicious server could convince
    us to leak 40 bytes of uninitialized stack to them.

  - the current code is picky about seeing a newline at the end of file,
    but we can easily be more liberal

Signed-off-by: Jeff King <peff@peff.net>
---
This drops an incorrect claim from the v1 commit message. Sorry, I only
remembered to deal with it as I was sending the patch out, so it is not
reflected in the range-diff in the cover letter.

 http.c | 35 ++++++++++++++---------------------
 1 file changed, 14 insertions(+), 21 deletions(-)

diff --git a/http.c b/http.c
index a32ad36ddf..2ef47bc779 100644
--- a/http.c
+++ b/http.c
@@ -2147,11 +2147,11 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
 int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
 {
 	struct http_get_options options = {0};
-	int ret = 0, i = 0;
-	char *url, *data;
+	int ret = 0;
+	char *url;
+	const char *data;
 	struct strbuf buf = STRBUF_INIT;
-	unsigned char hash[GIT_MAX_RAWSZ];
-	const unsigned hexsz = the_hash_algo->hexsz;
+	struct object_id oid;
 
 	end_url_with_slash(&buf, base_url);
 	strbuf_addstr(&buf, "objects/info/packs");
@@ -2163,24 +2163,17 @@ int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
 		goto cleanup;
 
 	data = buf.buf;
-	while (i < buf.len) {
-		switch (data[i]) {
-		case 'P':
-			i++;
-			if (i + hexsz + 12 <= buf.len &&
-			    starts_with(data + i, " pack-") &&
-			    starts_with(data + i + hexsz + 6, ".pack\n")) {
-				get_sha1_hex(data + i + 6, hash);
-				fetch_and_setup_pack_index(packs_head, hash,
-						      base_url);
-				i += hexsz + 11;
-				break;
-			}
-		default:
-			while (i < buf.len && data[i] != '\n')
-				i++;
+	while (*data) {
+		if (skip_prefix(data, "P pack-", &data) &&
+		    !parse_oid_hex(data, &oid, &data) &&
+		    skip_prefix(data, ".pack", &data) &&
+		    (*data == '\n' || *data == '\0')) {
+			fetch_and_setup_pack_index(packs_head, oid.hash, base_url);
+		} else {
+			data = strchrnul(data, '\n');
 		}
-		i++;
+		if (*data)
+			data++; /* skip past newline */
 	}
 
 cleanup:
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 08/13] server-info: fix blind pointer arithmetic
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (6 preceding siblings ...)
  2019-04-05 18:12   ` [PATCH v2 07/13] http: simplify parsing of remote objects/info/packs Jeff King
@ 2019-04-05 18:13   ` Jeff King
  2019-04-05 18:13   ` [PATCH v2 09/13] server-info: simplify cleanup in parse_pack_def() Jeff King
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:13 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

When we're writing out a new objects/info/packs file, we read back the
old one to try to keep the ordering the same. When we see a line
starting with "P", we expect "P pack-1234..." and blindly jump to "line
+ 2" to parse the pack name. If we saw a line with _just_ "P" and
nothing else, we'd jump past the end of the buffer and start reading
arbitrary memory.

This shouldn't be a big attack vector, as the files are local to the
repository and written by us, but it's clearly worth fixing (we do read
remote copies of the file for dumb-http fetches, but using a totally
different parser!).

Let's instead use skip_prefix() here, which avoids pointer arithmetic
altogether. Note that this converts our switch statement to an if/else
chain, making it slightly more verbose. But it will also make it easier
to do a few follow-on cleanups.

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/server-info.c b/server-info.c
index e2b2d6a27a..b61a6be4c2 100644
--- a/server-info.c
+++ b/server-info.c
@@ -112,9 +112,9 @@ static struct pack_info *find_pack_by_name(const char *name)
 /* Returns non-zero when we detect that the info in the
  * old file is useless.
  */
-static int parse_pack_def(const char *line, int old_cnt)
+static int parse_pack_def(const char *packname, int old_cnt)
 {
-	struct pack_info *i = find_pack_by_name(line + 2);
+	struct pack_info *i = find_pack_by_name(packname);
 	if (i) {
 		i->old_num = old_cnt;
 		return 0;
@@ -139,24 +139,26 @@ static int read_pack_info_file(const char *infofile)
 		return 1; /* nonexistent is not an error. */
 
 	while (fgets(line, sizeof(line), fp)) {
+		const char *arg;
 		int len = strlen(line);
 		if (len && line[len-1] == '\n')
 			line[--len] = 0;
 
 		if (!len)
 			continue;
 
-		switch (line[0]) {
-		case 'P': /* P name */
-			if (parse_pack_def(line, old_cnt++))
+		if (skip_prefix(line, "P ", &arg)) {
+			/* P name */
+			if (parse_pack_def(arg, old_cnt++))
 				goto out_stale;
-			break;
-		case 'D': /* we used to emit D but that was misguided. */
-		case 'T': /* we used to emit T but nobody uses it. */
+		} else if (line[0] == 'D') {
+			/* we used to emit D but that was misguided. */
 			goto out_stale;
-		default:
+		} else if (line[0] == 'T') {
+			/* we used to emit T but nobody uses it. */
+			goto out_stale;
+		} else {
 			error("unrecognized: %s", line);
-			break;
 		}
 	}
 	fclose(fp);
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 09/13] server-info: simplify cleanup in parse_pack_def()
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (7 preceding siblings ...)
  2019-04-05 18:13   ` [PATCH v2 08/13] server-info: fix blind pointer arithmetic Jeff King
@ 2019-04-05 18:13   ` Jeff King
  2019-04-05 18:16     ` Jeff King
  2019-04-05 18:13   ` [PATCH v2 10/13] server-info: use strbuf to read old info/packs file Jeff King
                     ` (4 subsequent siblings)
  13 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:13 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

We have two exits from the function: either we jump to the out_stale
label or not. But in both exits we repeat our cleanup, and the only
difference is our return value. Let's just use a variable for the return
value to avoid repeating ourselves.

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/server-info.c b/server-info.c
index b61a6be4c2..ba44cece7f 100644
--- a/server-info.c
+++ b/server-info.c
@@ -133,6 +133,7 @@ static int read_pack_info_file(const char *infofile)
 	FILE *fp;
 	char line[1000];
 	int old_cnt = 0;
+	int stale = 1;
 
 	fp = fopen_or_warn(infofile, "r");
 	if (!fp)
@@ -161,11 +162,11 @@ static int read_pack_info_file(const char *infofile)
 			error("unrecognized: %s", line);
 		}
 	}
-	fclose(fp);
-	return 0;
+	stale = 0;
+
  out_stale:
 	fclose(fp);
-	return 1;
+	return stale;
 }
 
 static int compare_info(const void *a_, const void *b_)
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 10/13] server-info: use strbuf to read old info/packs file
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (8 preceding siblings ...)
  2019-04-05 18:13   ` [PATCH v2 09/13] server-info: simplify cleanup in parse_pack_def() Jeff King
@ 2019-04-05 18:13   ` Jeff King
  2019-04-05 18:14   ` [PATCH v2 11/13] server-info: drop nr_alloc struct member Jeff King
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:13 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

This old code uses fgets with a fixed-size buffer. Let's use a strbuf
instead, so we don't have to wonder if "1000" is big enough, or what
happens if we see a long line.

This also lets us drop our custom code to trim the newline.

Probably nobody actually cares about the 1000-char limit (after all, the
lines generally only say "P pack-[0-9a-f]{40}.pack"), so this is mostly
just about cleanup/readability.

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/server-info.c b/server-info.c
index ba44cece7f..d4115fecbb 100644
--- a/server-info.c
+++ b/server-info.c
@@ -131,40 +131,38 @@ static int parse_pack_def(const char *packname, int old_cnt)
 static int read_pack_info_file(const char *infofile)
 {
 	FILE *fp;
-	char line[1000];
+	struct strbuf line = STRBUF_INIT;
 	int old_cnt = 0;
 	int stale = 1;
 
 	fp = fopen_or_warn(infofile, "r");
 	if (!fp)
 		return 1; /* nonexistent is not an error. */
 
-	while (fgets(line, sizeof(line), fp)) {
+	while (strbuf_getline(&line, fp) != EOF) {
 		const char *arg;
-		int len = strlen(line);
-		if (len && line[len-1] == '\n')
-			line[--len] = 0;
 
-		if (!len)
+		if (!line.len)
 			continue;
 
-		if (skip_prefix(line, "P ", &arg)) {
+		if (skip_prefix(line.buf, "P ", &arg)) {
 			/* P name */
 			if (parse_pack_def(arg, old_cnt++))
 				goto out_stale;
-		} else if (line[0] == 'D') {
+		} else if (line.buf[0] == 'D') {
 			/* we used to emit D but that was misguided. */
 			goto out_stale;
-		} else if (line[0] == 'T') {
+		} else if (line.buf[0] == 'T') {
 			/* we used to emit T but nobody uses it. */
 			goto out_stale;
 		} else {
-			error("unrecognized: %s", line);
+			error("unrecognized: %s", line.buf);
 		}
 	}
 	stale = 0;
 
  out_stale:
+	strbuf_release(&line);
 	fclose(fp);
 	return stale;
 }
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 11/13] server-info: drop nr_alloc struct member
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (9 preceding siblings ...)
  2019-04-05 18:13   ` [PATCH v2 10/13] server-info: use strbuf to read old info/packs file Jeff King
@ 2019-04-05 18:14   ` Jeff King
  2019-04-05 18:14   ` [PATCH v2 12/13] server-info: drop objdirlen pointer arithmetic Jeff King
                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:14 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

We keep an array of struct pointers, with each one representing a single
packfile. But for some reason there is a nr_alloc parameter inside each
struct, which has never been used.

This is probably cruft left over from development, where we might have
wanted a nr_alloc to dynamically grow the list. But as it turns out, we
do not dynamically grow the list at all, but rather count up the total
number of packs and use that as a maximum size. So while we're thinking
of this, let's add an assert() that documents (and checks!) that our
allocation and fill loops stay in sync.

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/server-info.c b/server-info.c
index d4115fecbb..c9fbfd3a51 100644
--- a/server-info.c
+++ b/server-info.c
@@ -91,7 +91,6 @@ static struct pack_info {
 	struct packed_git *p;
 	int old_num;
 	int new_num;
-	int nr_alloc;
 } **info;
 static int num_pack;
 static const char *objdir;
@@ -213,6 +212,7 @@ static void init_pack_info(const char *infofile, int force)
 	for (i = 0, p = get_all_packs(the_repository); p; p = p->next) {
 		if (!p->pack_local)
 			continue;
+		assert(i < num_pack);
 		info[i] = xcalloc(1, sizeof(struct pack_info));
 		info[i]->p = p;
 		info[i]->old_num = -1;
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 12/13] server-info: drop objdirlen pointer arithmetic
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (10 preceding siblings ...)
  2019-04-05 18:14   ` [PATCH v2 11/13] server-info: drop nr_alloc struct member Jeff King
@ 2019-04-05 18:14   ` Jeff King
  2019-04-05 18:14   ` [PATCH v2 13/13] update_info_refs(): drop unused force parameter Jeff King
  2019-04-05 18:19   ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:14 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

When writing objects/info/packs, we use the basename of each pack
(i.e., just the "pack-1234abcd.pack" part). We compute that manually by
adding "objdirlen + 6" to the name.

This _should_ work consistently, as we do not include non-local packs,
meaning everything should be in $objdir/pack/. Before f13d7db4af
(server-info.c: use pack_local like everybody else., 2005-12-05), this
was definitely true, since we computed "local" based on comparing the
objdir string.  Since then, we're relying on the code on packfile.c to
match our expectations of p->pack_name and p->local.

I think our expectations do still hold today, but we can be a bit more
defensive by just using pack_basename() to get the base. That
future-proofs us, and should hopefully be more obviously safe to
somebody reading the code.

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/server-info.c b/server-info.c
index c9fbfd3a51..ab03c1b3c2 100644
--- a/server-info.c
+++ b/server-info.c
@@ -93,16 +93,13 @@ static struct pack_info {
 	int new_num;
 } **info;
 static int num_pack;
-static const char *objdir;
-static int objdirlen;
 
 static struct pack_info *find_pack_by_name(const char *name)
 {
 	int i;
 	for (i = 0; i < num_pack; i++) {
 		struct packed_git *p = info[i]->p;
-		/* skip "/pack/" after ".git/objects" */
-		if (!strcmp(p->pack_name + objdirlen + 6, name))
+		if (!strcmp(pack_basename(p), name))
 			return info[i];
 	}
 	return NULL;
@@ -196,9 +193,6 @@ static void init_pack_info(const char *infofile, int force)
 	int stale;
 	int i = 0;
 
-	objdir = get_object_directory();
-	objdirlen = strlen(objdir);
-
 	for (p = get_all_packs(the_repository); p; p = p->next) {
 		/* we ignore things on alternate path since they are
 		 * not available to the pullers in general.
@@ -246,7 +240,7 @@ static int write_pack_info_file(FILE *fp)
 {
 	int i;
 	for (i = 0; i < num_pack; i++) {
-		if (fprintf(fp, "P %s\n", info[i]->p->pack_name + objdirlen + 6) < 0)
+		if (fprintf(fp, "P %s\n", pack_basename(info[i]->p)) < 0)
 			return -1;
 	}
 	if (fputc('\n', fp) == EOF)
-- 
2.21.0.729.g7d31bf3764


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 13/13] update_info_refs(): drop unused force parameter
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (11 preceding siblings ...)
  2019-04-05 18:14   ` [PATCH v2 12/13] server-info: drop objdirlen pointer arithmetic Jeff King
@ 2019-04-05 18:14   ` Jeff King
  2019-04-05 18:19   ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:14 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

Once upon a time the force flag meant something when writing info/refs,
but it hasn't done anything since 60d0526aaa (Unoptimize info/refs
creation., 2005-09-14).

Signed-off-by: Jeff King <peff@peff.net>
---
 server-info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/server-info.c b/server-info.c
index ab03c1b3c2..41274d098b 100644
--- a/server-info.c
+++ b/server-info.c
@@ -78,7 +78,7 @@ static int generate_info_refs(FILE *fp)
 	return for_each_ref(add_info_ref, fp);
 }
 
-static int update_info_refs(int force)
+static int update_info_refs(void)
 {
 	char *path = git_pathdup("info/refs");
 	int ret = update_info_file(path, generate_info_refs);
@@ -269,7 +269,7 @@ int update_server_info(int force)
 	 */
 	int errs = 0;
 
-	errs = errs | update_info_refs(force);
+	errs = errs | update_info_refs();
 	errs = errs | update_info_packs(force);
 
 	/* remove leftover rev-cache file if there is any */
-- 
2.21.0.729.g7d31bf3764

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/13] server-info: simplify cleanup in parse_pack_def()
  2019-04-05 18:13   ` [PATCH v2 09/13] server-info: simplify cleanup in parse_pack_def() Jeff King
@ 2019-04-05 18:16     ` Jeff King
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:16 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor

On Fri, Apr 05, 2019 at 02:13:13PM -0400, Jeff King wrote:

> Subject: server-info: simplify cleanup in parse_pack_def()
> We have two exits from the function: either we jump to the out_stale
> label or not. But in both exits we repeat our cleanup, and the only
> difference is our return value. Let's just use a variable for the return
> value to avoid repeating ourselves.

The subject name is misleading here. The change is actually in
read_pack_info_file(), which sees an error when calling
parse_pack_def().

Probably not worth a re-roll by itself.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes
  2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
                     ` (12 preceding siblings ...)
  2019-04-05 18:14   ` [PATCH v2 13/13] update_info_refs(): drop unused force parameter Jeff King
@ 2019-04-05 18:19   ` Jeff King
  13 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 18:19 UTC (permalink / raw)
  To: git; +Cc: René Scharfe, SZEDER Gábor, Ramsay Jones

On Fri, Apr 05, 2019 at 02:03:06PM -0400, Jeff King wrote:

> Other than that, it's substantially the same as v1. I did move the
> public declaration of pack_basename() to the patch where it is added,
> rather than doing it later, which shows up in the range diff below.

Sorry, two things I noticed and fixed up while sending out the patches
themselves, but which were too late to be in the cover letter or range
diff:

>   [03/13]: t5319: fix bogus cat-file argument

This fixes the flipped unsorted/unordered words in the commit message,
noticed by Ramsay.

>   [07/13]: http: simplify parsing of remote objects/info/packs

This drops the incorrect claim in the commit message noticed by René.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/13] packfile.h: drop extern from function declarations
  2019-04-05 18:03   ` [PATCH v2 01/13] packfile.h: drop extern from function declarations Jeff King
@ 2019-04-05 19:19     ` Ramsay Jones
  2019-04-05 19:25       ` Jeff King
  0 siblings, 1 reply; 48+ messages in thread
From: Ramsay Jones @ 2019-04-05 19:19 UTC (permalink / raw)
  To: Jeff King, git; +Cc: René Scharfe, SZEDER Gábor



On 05/04/2019 19:03, Jeff King wrote:
> As CodingGuidelines recommends, we do not need an "extern" when
> declaring a public function. Let's drop these. Note that we leave the
> extern on report_garbage(), as that is actually a function pointer, not
> a function itself.

Hmm, perhaps we need to edit CodingGuidelines to make it clear
that the 'extern' keyword is not needed on _function_ declarations
only. ;-)

Because, ...

[snip]

>  /* global flag to enable extra checks when accessing packed objects */
> -extern int do_check_packed_object_crc;
> +int do_check_packed_object_crc;

... removing this 'extern' on an int variable sends 'sparse'
into a frenzy of warnings! :-D

[You didn't use a global s/extern// by any chance?]

ATB,
Ramsay Jones

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/13] packfile.h: drop extern from function declarations
  2019-04-05 19:19     ` Ramsay Jones
@ 2019-04-05 19:25       ` Jeff King
  2019-04-08  5:13         ` Junio C Hamano
  2019-04-09 15:08         ` Junio C Hamano
  0 siblings, 2 replies; 48+ messages in thread
From: Jeff King @ 2019-04-05 19:25 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: git, René Scharfe, SZEDER Gábor

On Fri, Apr 05, 2019 at 08:19:30PM +0100, Ramsay Jones wrote:

> >  /* global flag to enable extra checks when accessing packed objects */
> > -extern int do_check_packed_object_crc;
> > +int do_check_packed_object_crc;
> 
> ... removing this 'extern' on an int variable sends 'sparse'
> into a frenzy of warnings! :-D
> 
> [You didn't use a global s/extern// by any chance?]

Oh my. I did look at each one, but probably via replace-and-confirm in
vim. I don't know how I managed to botch that one so badly.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 05/12] http: simplify parsing of remote objects/info/packs
  2019-04-05 18:11     ` Jeff King
@ 2019-04-05 20:17       ` René Scharfe
  0 siblings, 0 replies; 48+ messages in thread
From: René Scharfe @ 2019-04-05 20:17 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Am 05.04.2019 um 20:11 schrieb Jeff King:
> On Fri, Apr 05, 2019 at 12:41:27PM +0200, René Scharfe wrote:
>> Parsing "P" and "pack-" together crosses logical token boundaries,
>> but that I don't mind it here.
>
> Yeah, I was tempted to write:
>
>   if (skip_prefix(data, "P ", &data) &&
>       skip_prefix(data, "pack-", &data) &&
>       ...
>
> but that felt a little silly. I dunno. I guess it is probably not any
> less efficient, because we'd expect skip_prefix() and its loop to get
> inlined here anyway.

Didn't think of inlining.  Clang unrolls the whole comparison (except on
powerpc64), but the other compilers available at the Compiler Explorer
website keep the consecutive calls separate: https://godbolt.org/z/7eTarV

René

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 05/13] midx: check both pack and index names for containment
  2019-04-05 18:06   ` [PATCH v2 05/13] midx: check both pack and index names for containment Jeff King
@ 2019-04-05 20:18     ` René Scharfe
  0 siblings, 0 replies; 48+ messages in thread
From: René Scharfe @ 2019-04-05 20:18 UTC (permalink / raw)
  To: Jeff King; +Cc: git, SZEDER Gábor

Am 05.04.2019 um 20:06 schrieb Jeff King:
> A midx file (and the struct we parse from it) contains a list of all of
> the covered packfiles, mentioned by their ".idx" names (e.g.,
> "pack-1234.idx", etc). And thus calls to midx_contains_pack() expect
> callers to provide the idx name.
>
> This works for most of the calls, but the one in open_packed_git_1()
> tries to feed a packed_git->pack_name, which is the ".pack" name,
> meaning we'll never find a match (even if the pack is covered by the
> midx).
>
> We can fix this by converting the ".pack" to ".idx" in the caller.
> However, that requires allocating a new string. Instead, let's make
> midx_contains_pack() a bit friendlier, and allow it take _either_ the
> .pack or .idx variant.
>
> All cleverness in the matching code is credited to René. Bugs are mine.

I didn't consider it to be particularly tricky -- but then kept the dots
in both filename extension strings, which is not going to fly, as they
are skipped by the common-prefix loop..  Thanks for fixing that.

> There's no test here, because while this does fix _a_ bug, it's masked
> by another bug in that same caller. That will be covered (with a test)
> in the next patch.
>
> Helped-by: René Scharfe <l.s.r@web.de>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I was tempted to suggest that the midx struct just store the base name
> without ".idx" at all, but having callers pass that is no less tricky
> than passing ".idx" (they still have to allocate a new string).

The middle part of the comparison function would become:

	if (!*idx_name && (!strcmp(idx_or_pack_name, ".idx") ||
			   !strcmp(idx_or_pack_name, ".pack"))
		return 0;

No allocations needed -- except when building the list.

(And this time we'd actually need the dots.)

>
>  midx.c | 36 ++++++++++++++++++++++++++++++++++--
>  midx.h |  2 +-
>  2 files changed, 35 insertions(+), 3 deletions(-)
>
> diff --git a/midx.c b/midx.c
> index 8a505fd423..0ceca1938f 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -307,7 +307,39 @@ int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct mu
>  	return nth_midxed_pack_entry(m, e, pos);
>  }
>
> -int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
> +/* Match "foo.idx" against either "foo.pack" _or_ "foo.idx". */
> +static int cmp_idx_or_pack_name(const char *idx_or_pack_name,
> +				const char *idx_name)
> +{
> +	/* Skip past any initial matching prefix. */
> +	while (*idx_name && *idx_name == *idx_or_pack_name) {
> +		idx_name++;
> +		idx_or_pack_name++;
> +	}
> +
> +	/*
> +	 * If we didn't match completely, we may have matched "pack-1234." and
> +	 * be left with "idx" and "pack" respectively, which is also OK. We do
> +	 * not have to check for "idx" and "idx", because that would have been
> +	 * a complete match (and in that case these strcmps will be false, but
> +	 * we'll correctly return 0 from the final strcmp() below.
> +	 *
> +	 * Technically this matches "fooidx" and "foopack", but we'd never have
> +	 * such names in the first place.
> +	 */
> +	if (!strcmp(idx_name, "idx") && !strcmp(idx_or_pack_name, "pack"))
> +		return 0;

This is asymmetric, and thus the function should not be used for sorting,
where it would be called for random pairs of values. For the binary search
it's fine, assuming the list contains only index filenames (ending with
".idx"), as the search string is always passed in as idx_or_pack_name.

And an extension-insensitive lookup works fine in a strcmp()-sorted list
because the order wouldn't change when sorting it again with a looser
stable extension-sensitive sort.

> +
> +	/*
> +	 * This not only checks for a complete match, but also orders based on
> +	 * the first non-identical character, which means our ordering will
> +	 * match a raw strcmp(). That makes it OK to use this to binary search
> +	 * a naively-sorted list.
> +	 */
> +	return strcmp(idx_or_pack_name, idx_name);

At this point we could also compare the chars like this:

	return (unsigned char)(*idx_or_pack_name) - (unsigned char)(*idx_name);

This avoids a function call, but doesn't look very pretty.  And I'm not
fully sure the casts are correct.

> +}
> +
> +int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name)
>  {
>  	uint32_t first = 0, last = m->num_packs;
>
> @@ -317,7 +349,7 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
>  		int cmp;
>
>  		current = m->pack_names[mid];
> -		cmp = strcmp(idx_name, current);
> +		cmp = cmp_idx_or_pack_name(idx_or_pack_name, current);
>  		if (!cmp)
>  			return 1;
>  		if (cmp > 0) {
> diff --git a/midx.h b/midx.h
> index 774f652530..26dd042d63 100644
> --- a/midx.h
> +++ b/midx.h
> @@ -43,7 +43,7 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid,
>  					struct multi_pack_index *m,
>  					uint32_t n);
>  int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
> -int midx_contains_pack(struct multi_pack_index *m, const char *idx_name);
> +int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
>  int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
>
>  int write_midx_file(const char *object_dir);
>


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/13] packfile.h: drop extern from function declarations
  2019-04-05 19:25       ` Jeff King
@ 2019-04-08  5:13         ` Junio C Hamano
  2019-04-08 20:32           ` Jeff King
  2019-04-09 15:08         ` Junio C Hamano
  1 sibling, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2019-04-08  5:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Ramsay Jones, git, René Scharfe, SZEDER Gábor

Jeff King <peff@peff.net> writes:

> On Fri, Apr 05, 2019 at 08:19:30PM +0100, Ramsay Jones wrote:
>
>> >  /* global flag to enable extra checks when accessing packed objects */
>> > -extern int do_check_packed_object_crc;
>> > +int do_check_packed_object_crc;
>> 
>> ... removing this 'extern' on an int variable sends 'sparse'
>> into a frenzy of warnings! :-D
>> 
>> [You didn't use a global s/extern// by any chance?]
>
> Oh my. I did look at each one, but probably via replace-and-confirm in
> vim. I don't know how I managed to botch that one so badly.

Perhaps we should keep 'extern' even when declaring (not defining) a
public function in the header file to avoid a gotcha like this?

What was the reasoning behind the insn in CodingGuidelines?  "As it
is already the default" does qualify as a reasonable justification
for telling "extern is not needed for functions" to our readers, but
not quite enough for "extern should not be used for functions".


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/13] packfile.h: drop extern from function declarations
  2019-04-08  5:13         ` Junio C Hamano
@ 2019-04-08 20:32           ` Jeff King
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-08 20:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ramsay Jones, git, René Scharfe, SZEDER Gábor

On Mon, Apr 08, 2019 at 02:13:17PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On Fri, Apr 05, 2019 at 08:19:30PM +0100, Ramsay Jones wrote:
> >
> >> >  /* global flag to enable extra checks when accessing packed objects */
> >> > -extern int do_check_packed_object_crc;
> >> > +int do_check_packed_object_crc;
> >> 
> >> ... removing this 'extern' on an int variable sends 'sparse'
> >> into a frenzy of warnings! :-D
> >> 
> >> [You didn't use a global s/extern// by any chance?]
> >
> > Oh my. I did look at each one, but probably via replace-and-confirm in
> > vim. I don't know how I managed to botch that one so badly.
> 
> Perhaps we should keep 'extern' even when declaring (not defining) a
> public function in the header file to avoid a gotcha like this?
> 
> What was the reasoning behind the insn in CodingGuidelines?  "As it
> is already the default" does qualify as a reasonable justification
> for telling "extern is not needed for functions" to our readers, but
> not quite enough for "extern should not be used for functions".

I think the reasoning is just that it's useless noise, so it makes the
resulting lines longer (which are often already too-long) and harder to
read.

For this particular patch, I don't care that much about the existing
functions, which I'm not touching, but rather was adding a new one. And
my options were:

  - use "extern" there to match; even if we don't want to go through the
    code churn of cleaning up existing cases, I think we shouldn't be
    encouraging it in new ones. Even more crazy to me would actually
    be review telling somebody to add an extern.

  - intermingle it with the existing extern ones. Low risk, but leaves
    people wondering why some have extern and some do not.

  - clean up the existing cases first

I dunno. Like all code churn, these kinds of clean-ups have the
possibility of accidentally screwing something up. But they are at least
a one-time pain, as long as we do not keep changing our mind back and
forth.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/13] packfile.h: drop extern from function declarations
  2019-04-05 19:25       ` Jeff King
  2019-04-08  5:13         ` Junio C Hamano
@ 2019-04-09 15:08         ` Junio C Hamano
  2019-04-09 15:15           ` Jeff King
  1 sibling, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2019-04-09 15:08 UTC (permalink / raw)
  To: Jeff King; +Cc: Ramsay Jones, git, René Scharfe, SZEDER Gábor

Jeff King <peff@peff.net> writes:

> On Fri, Apr 05, 2019 at 08:19:30PM +0100, Ramsay Jones wrote:
>
>> >  /* global flag to enable extra checks when accessing packed objects */
>> > -extern int do_check_packed_object_crc;
>> > +int do_check_packed_object_crc;
>> 
>> ... removing this 'extern' on an int variable sends 'sparse'
>> into a frenzy of warnings! :-D
>> 
>> [You didn't use a global s/extern// by any chance?]
>
> Oh my. I did look at each one, but probably via replace-and-confirm in
> vim. I don't know how I managed to botch that one so badly.

In any case, I think I have a correct SQUASH??? immediately on top
of this step queued.  If there is nothing else needed, I think we
are ready to squash that in and merge the whole thing to 'next'.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/13] packfile.h: drop extern from function declarations
  2019-04-09 15:08         ` Junio C Hamano
@ 2019-04-09 15:15           ` Jeff King
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2019-04-09 15:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ramsay Jones, git, René Scharfe, SZEDER Gábor

On Wed, Apr 10, 2019 at 12:08:45AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On Fri, Apr 05, 2019 at 08:19:30PM +0100, Ramsay Jones wrote:
> >
> >> >  /* global flag to enable extra checks when accessing packed objects */
> >> > -extern int do_check_packed_object_crc;
> >> > +int do_check_packed_object_crc;
> >> 
> >> ... removing this 'extern' on an int variable sends 'sparse'
> >> into a frenzy of warnings! :-D
> >> 
> >> [You didn't use a global s/extern// by any chance?]
> >
> > Oh my. I did look at each one, but probably via replace-and-confirm in
> > vim. I don't know how I managed to botch that one so badly.
> 
> In any case, I think I have a correct SQUASH??? immediately on top
> of this step queued.  If there is nothing else needed, I think we
> are ready to squash that in and merge the whole thing to 'next'.

Yes, your squash looks good, and I don't think there were any other
unaddressed comments. Thanks for fixing my bug.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2019-04-09 15:15 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
2019-04-04 23:22 ` [PATCH 01/12] t5319: fix bogus cat-file argument Jeff King
2019-04-05  0:44   ` Ramsay Jones
2019-04-05  1:41     ` Jeff King
2019-04-05  1:46       ` Jeff King
2019-04-04 23:22 ` [PATCH 02/12] t5319: drop useless --buffer from cat-file Jeff King
2019-04-04 23:22 ` [PATCH 03/12] packfile: factor out .pack to .idx name conversion Jeff King
2019-04-04 23:25 ` [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack Jeff King
2019-04-05  8:05   ` René Scharfe
2019-04-05 13:21     ` Jeff King
2019-04-05 13:30       ` Jeff King
2019-04-05 12:01   ` SZEDER Gábor
2019-04-05 13:40     ` Jeff King
2019-04-04 23:27 ` [PATCH 05/12] http: simplify parsing of remote objects/info/packs Jeff King
2019-04-05 10:41   ` René Scharfe
2019-04-05 18:11     ` Jeff King
2019-04-05 20:17       ` René Scharfe
2019-04-04 23:27 ` [PATCH 06/12] server-info: fix blind pointer arithmetic Jeff King
2019-04-04 23:28 ` [PATCH 07/12] server-info: simplify cleanup in parse_pack_def() Jeff King
2019-04-04 23:28 ` [PATCH 08/12] server-info: use strbuf to read old info/packs file Jeff King
2019-04-04 23:29 ` [PATCH 09/12] server-info: drop nr_alloc struct member Jeff King
2019-04-04 23:30 ` [PATCH 10/12] packfile.h: drop extern from function declarations Jeff King
2019-04-04 23:30 ` [PATCH 11/12] server-info: drop objdirlen pointer arithmetic Jeff King
2019-04-04 23:31 ` [PATCH 12/12] update_info_refs(): drop unused force parameter Jeff King
2019-04-05  5:46 ` [PATCH 0/12] a rabbit hole of update-server-info fixes Junio C Hamano
2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
2019-04-05 18:03   ` [PATCH v2 01/13] packfile.h: drop extern from function declarations Jeff King
2019-04-05 19:19     ` Ramsay Jones
2019-04-05 19:25       ` Jeff King
2019-04-08  5:13         ` Junio C Hamano
2019-04-08 20:32           ` Jeff King
2019-04-09 15:08         ` Junio C Hamano
2019-04-09 15:15           ` Jeff King
2019-04-05 18:04   ` [PATCH v2 02/13] pack-revindex: open index if necessary Jeff King
2019-04-05 18:04   ` [PATCH v2 03/13] t5319: fix bogus cat-file argument Jeff King
2019-04-05 18:05   ` [PATCH v2 04/13] t5319: drop useless --buffer from cat-file Jeff King
2019-04-05 18:06   ` [PATCH v2 05/13] midx: check both pack and index names for containment Jeff King
2019-04-05 20:18     ` René Scharfe
2019-04-05 18:06   ` [PATCH v2 06/13] packfile: fix pack basename computation Jeff King
2019-04-05 18:12   ` [PATCH v2 07/13] http: simplify parsing of remote objects/info/packs Jeff King
2019-04-05 18:13   ` [PATCH v2 08/13] server-info: fix blind pointer arithmetic Jeff King
2019-04-05 18:13   ` [PATCH v2 09/13] server-info: simplify cleanup in parse_pack_def() Jeff King
2019-04-05 18:16     ` Jeff King
2019-04-05 18:13   ` [PATCH v2 10/13] server-info: use strbuf to read old info/packs file Jeff King
2019-04-05 18:14   ` [PATCH v2 11/13] server-info: drop nr_alloc struct member Jeff King
2019-04-05 18:14   ` [PATCH v2 12/13] server-info: drop objdirlen pointer arithmetic Jeff King
2019-04-05 18:14   ` [PATCH v2 13/13] update_info_refs(): drop unused force parameter Jeff King
2019-04-05 18:19   ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).