git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/9] multi-pack-index cleanups
@ 2018-08-20 16:51 Derrick Stolee
  2018-08-20 16:51 ` [PATCH 1/9] multi-pack-index: provide more helpful usage info Derrick Stolee
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:51 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

This series is based on ds/multi-pack-index and
jk/for-each-object-iteration.

The multi-pack-index indexes objects across multiple pack-files. To
speed up object lookups and abbreviations, we do not place the pack-
files covered by the multi-pack-index into the packed_git linked list
or the packed_git_mru list. Existing test coverage focused on typical
uses and the main consumers of the multi-pack-index.

To better understand the implications of the multi-pack-index with
other scenarios, I ran the test suite after adding a step to 'git repack'
to write a multi-pack-index, and to default core.multiPackIndex to 'true'.
This commit is available as [1].

The following issues were discovered, and are fixed by this series:

1. The multi-pack-index did not distinguish between local and non-local
   pack-files.

2. A bad packed object was not inspected by object lookups in the multi-
   pack-index, so would loop infinitely trying to load the same object.

3. 'git count-objects --verbose' would not see the objects in the multi-
   pack-index and would report the multi-pack-index as garbage.

4. If the local object directory had a multi-pack-index but an alternate
   did not, then the multi-pack-index would be dropped.

5. If the multi-pack-index covered a pack-file that was paired with a
   reachability bitmap, then that bitmap would not be loaded.

Several issues were resolved simply by making a new 'all_packs' list in
the object store and replacing get_packed_git() calls with get_all_packs()
calls. The all_packs list is a linked list that starts with the pack-files
in multi-pack-indexes and then continues along the packed_git linked list.

Also: I simplified the usage reports in 'git multi-pack-index' to help
users who are entering parameters incorrectly.

[1] https://github.com/derrickstolee/git/commit/098dd1d515b592fb165a276241d7d68d1cde0036
    DO-NOT-MERGE: compute multi-pack-index on repack.
    I will send this commit as a separate patch so we can see the change
    I made and the one test I needed to fix (because it moves a pack-file,
    thereby making the multi-pack-index invalid).

[2] https://github.com/derrickstolee/git/pull/9
    A GitHub pull request containing this series.

Derrick Stolee (9):
  multi-pack-index: provide more helpful usage info
  multi-pack-index: store local property
  midx: mark bad packed objects
  midx: stop reporting garbage
  midx: fix bug that skips midx with alternates
  packfile: add all_packs list
  treewide: use get_all_packs
  midx: test a few commands that use get_all_packs
  pack-objects: consider packs in multi-pack-index

 builtin/count-objects.c     |  2 +-
 builtin/fsck.c              |  4 ++--
 builtin/gc.c                |  4 ++--
 builtin/multi-pack-index.c  | 16 +++++++-------
 builtin/pack-objects.c      | 42 +++++++++++++++++++++++++++++------
 builtin/pack-redundant.c    |  4 ++--
 fast-import.c               |  4 ++--
 http-backend.c              |  4 ++--
 midx.c                      | 32 ++++++++++++++++++---------
 midx.h                      |  7 ++++--
 object-store.h              |  6 +++++
 pack-bitmap.c               |  2 +-
 pack-objects.c              |  2 +-
 packfile.c                  | 40 ++++++++++++++++++++++++++++-----
 packfile.h                  |  1 +
 server-info.c               |  4 ++--
 t/helper/test-read-midx.c   |  2 +-
 t/t5319-multi-pack-index.sh | 44 ++++++++++++++++++++++++++++++++++---
 18 files changed, 168 insertions(+), 52 deletions(-)

-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/9] multi-pack-index: provide more helpful usage info
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
@ 2018-08-20 16:51 ` Derrick Stolee
  2018-08-20 16:51 ` [PATCH 2/9] multi-pack-index: store local property Derrick Stolee
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:51 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

The multi-pack-index builtin has a very simple command-line
interface. Instead of simply reporting usage, give the user a
hint to why the arguments failed.

Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/multi-pack-index.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 6a7aa00cf2..2633efd95d 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -32,16 +32,16 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		opts.object_dir = get_object_directory();
 
 	if (argc == 0)
-		goto usage;
+		usage_with_options(builtin_multi_pack_index_usage,
+				   builtin_multi_pack_index_options);
 
-	if (!strcmp(argv[0], "write")) {
-		if (argc > 1)
-			goto usage;
+	if (argc > 1) {
+		die(_("too many arguments"));
+		return 1;
+	}
 
+	if (!strcmp(argv[0], "write"))
 		return write_midx_file(opts.object_dir);
-	}
 
-usage:
-	usage_with_options(builtin_multi_pack_index_usage,
-			   builtin_multi_pack_index_options);
+	die(_("unrecognized verb: %s"), argv[0]);
 }
-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/9] multi-pack-index: store local property
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
  2018-08-20 16:51 ` [PATCH 1/9] multi-pack-index: provide more helpful usage info Derrick Stolee
@ 2018-08-20 16:51 ` Derrick Stolee
  2018-08-20 21:14   ` Stefan Beller
  2018-08-20 16:51 ` [PATCH 3/9] midx: mark bad packed objects Derrick Stolee
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:51 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

A pack-file is 'local' if it is stored within the usual object
directory. If it is stored in an alternate, it is non-local.

Pack-files are stored using a 'pack_local' member in the packed_git
struct. Add a similar 'local' member to the multi_pack_index struct
and 'local' parameters to the methods that load and prepare multi-
pack-indexes.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 midx.c                    | 11 ++++++-----
 midx.h                    |  6 ++++--
 packfile.c                |  4 ++--
 t/helper/test-read-midx.c |  2 +-
 4 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/midx.c b/midx.c
index 19b7df338e..6824acf5f8 100644
--- a/midx.c
+++ b/midx.c
@@ -37,7 +37,7 @@ static char *get_midx_filename(const char *object_dir)
 	return xstrfmt("%s/pack/multi-pack-index", object_dir);
 }
 
-struct multi_pack_index *load_multi_pack_index(const char *object_dir)
+struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local)
 {
 	struct multi_pack_index *m = NULL;
 	int fd;
@@ -73,6 +73,7 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir)
 	m->fd = fd;
 	m->data = midx_map;
 	m->data_len = midx_size;
+	m->local = local;
 
 	m->signature = get_be32(m->data);
 	if (m->signature != MIDX_SIGNATURE) {
@@ -209,7 +210,7 @@ static int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id)
 	strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
 		    m->pack_names[pack_int_id]);
 
-	m->packs[pack_int_id] = add_packed_git(pack_name.buf, pack_name.len, 1);
+	m->packs[pack_int_id] = add_packed_git(pack_name.buf, pack_name.len, m->local);
 	strbuf_release(&pack_name);
 	return !m->packs[pack_int_id];
 }
@@ -318,7 +319,7 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
 	return 0;
 }
 
-int prepare_multi_pack_index_one(struct repository *r, const char *object_dir)
+int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local)
 {
 	struct multi_pack_index *m = r->objects->multi_pack_index;
 	struct multi_pack_index *m_search;
@@ -332,7 +333,7 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir)
 		if (!strcmp(object_dir, m_search->object_dir))
 			return 1;
 
-	r->objects->multi_pack_index = load_multi_pack_index(object_dir);
+	r->objects->multi_pack_index = load_multi_pack_index(object_dir, local);
 
 	if (r->objects->multi_pack_index) {
 		r->objects->multi_pack_index->next = m;
@@ -746,7 +747,7 @@ int write_midx_file(const char *object_dir)
 			  midx_name);
 	}
 
-	packs.m = load_multi_pack_index(object_dir);
+	packs.m = load_multi_pack_index(object_dir, 1);
 
 	packs.nr = 0;
 	packs.alloc_list = packs.m ? packs.m->num_packs : 16;
diff --git a/midx.h b/midx.h
index e3b07f1586..8aa79f4b62 100644
--- a/midx.h
+++ b/midx.h
@@ -18,6 +18,8 @@ struct multi_pack_index {
 	uint32_t num_packs;
 	uint32_t num_objects;
 
+	int local;
+
 	const unsigned char *chunk_pack_names;
 	const uint32_t *chunk_oid_fanout;
 	const unsigned char *chunk_oid_lookup;
@@ -29,14 +31,14 @@ struct multi_pack_index {
 	char object_dir[FLEX_ARRAY];
 };
 
-struct multi_pack_index *load_multi_pack_index(const char *object_dir);
+struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
 int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);
 struct object_id *nth_midxed_object_oid(struct object_id *oid,
 					struct multi_pack_index *m,
 					uint32_t n);
 int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
 int midx_contains_pack(struct multi_pack_index *m, const char *idx_name);
-int prepare_multi_pack_index_one(struct repository *r, const char *object_dir);
+int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
 
 int write_midx_file(const char *object_dir);
 void clear_midx_file(const char *object_dir);
diff --git a/packfile.c b/packfile.c
index 12db1a9d7d..896da460ac 100644
--- a/packfile.c
+++ b/packfile.c
@@ -963,11 +963,11 @@ static void prepare_packed_git(struct repository *r)
 
 	if (r->objects->packed_git_initialized)
 		return;
-	prepare_multi_pack_index_one(r, r->objects->objectdir);
+	prepare_multi_pack_index_one(r, r->objects->objectdir, 1);
 	prepare_packed_git_one(r, r->objects->objectdir, 1);
 	prepare_alt_odb(r);
 	for (alt = r->objects->alt_odb_list; alt; alt = alt->next) {
-		prepare_multi_pack_index_one(r, alt->path);
+		prepare_multi_pack_index_one(r, alt->path, 0);
 		prepare_packed_git_one(r, alt->path, 0);
 	}
 	rearrange_packed_git(r);
diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c
index 8e19972e89..831b586d02 100644
--- a/t/helper/test-read-midx.c
+++ b/t/helper/test-read-midx.c
@@ -7,7 +7,7 @@
 static int read_midx_file(const char *object_dir)
 {
 	uint32_t i;
-	struct multi_pack_index *m = load_multi_pack_index(object_dir);
+	struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
 
 	if (!m)
 		return 1;
-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/9] midx: mark bad packed objects
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
  2018-08-20 16:51 ` [PATCH 1/9] multi-pack-index: provide more helpful usage info Derrick Stolee
  2018-08-20 16:51 ` [PATCH 2/9] multi-pack-index: store local property Derrick Stolee
@ 2018-08-20 16:51 ` Derrick Stolee
  2018-08-20 21:23   ` Stefan Beller
  2018-08-20 16:51 ` [PATCH 4/9] midx: stop reporting garbage Derrick Stolee
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:51 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

When an object fails to decompress from a pack-file, we mark the object
as 'bad' so we can retry with a different copy of the object (if such a
copy exists).

Before now, the multi-pack-index did not update the bad objects list for
the pack-files it contains, and we did not check the bad objects list
when reading an object. Now, do both.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 midx.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/midx.c b/midx.c
index 6824acf5f8..7fa75a37a3 100644
--- a/midx.c
+++ b/midx.c
@@ -280,6 +280,16 @@ static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *
 	if (!is_pack_valid(p))
 		return 0;
 
+	if (p->num_bad_objects) {
+		uint32_t i;
+		struct object_id oid;
+		nth_midxed_object_oid(&oid, m, pos);
+		for (i = 0; i < p->num_bad_objects; i++)
+			if (!hashcmp(oid.hash,
+				     p->bad_object_sha1 + the_hash_algo->rawsz * i))
+				return 0;
+	}
+
 	e->offset = nth_midxed_offset(m, pos);
 	e->p = p;
 
-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/9] midx: stop reporting garbage
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
                   ` (2 preceding siblings ...)
  2018-08-20 16:51 ` [PATCH 3/9] midx: mark bad packed objects Derrick Stolee
@ 2018-08-20 16:51 ` Derrick Stolee
  2018-08-20 16:52 ` [PATCH 5/9] midx: fix bug that skips midx with alternates Derrick Stolee
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:51 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

When prepare_packed_git is called with the report_garbage method
initialized, we report unexpected files in the objects directory
as garbage. Stop reporting the multi-pack-index and the pack-files
it covers as garbage.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 packfile.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/packfile.c b/packfile.c
index 896da460ac..fe713a0242 100644
--- a/packfile.c
+++ b/packfile.c
@@ -820,9 +820,8 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
 	struct packed_git *p;
 	size_t base_len = full_name_len;
 
-	if (strip_suffix_mem(full_name, &base_len, ".idx")) {
-		if (data->m && midx_contains_pack(data->m, file_name))
-			return;
+	if (strip_suffix_mem(full_name, &base_len, ".idx") &&
+	    !(data->m && midx_contains_pack(data->m, file_name))) {
 		/* Don't reopen a pack we already have. */
 		for (p = data->r->objects->packed_git; p; p = p->next) {
 			size_t len;
@@ -842,6 +841,8 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
 	if (!report_garbage)
 		return;
 
+	if (!strcmp(file_name, "multi-pack-index"))
+		return;
 	if (ends_with(file_name, ".idx") ||
 	    ends_with(file_name, ".pack") ||
 	    ends_with(file_name, ".bitmap") ||
-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/9] midx: fix bug that skips midx with alternates
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
                   ` (3 preceding siblings ...)
  2018-08-20 16:51 ` [PATCH 4/9] midx: stop reporting garbage Derrick Stolee
@ 2018-08-20 16:52 ` Derrick Stolee
  2018-08-20 16:52 ` [PATCH 6/9] packfile: add all_packs list Derrick Stolee
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:52 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

The logic for constructing the linked list of multi-pack-indexes
in the object store is incorrect. If the local object store has
a multi-pack-index, but an alternate does not, then the list is
dropped.

Add tests that would have revealed this bug.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 midx.c                      | 11 ++++++-----
 t/t5319-multi-pack-index.sh | 17 +++++++++++++++++
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/midx.c b/midx.c
index 7fa75a37a3..0710c4c175 100644
--- a/midx.c
+++ b/midx.c
@@ -331,7 +331,7 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
 
 int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local)
 {
-	struct multi_pack_index *m = r->objects->multi_pack_index;
+	struct multi_pack_index *m;
 	struct multi_pack_index *m_search;
 	int config_value;
 
@@ -339,14 +339,15 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i
 	    !config_value)
 		return 0;
 
-	for (m_search = m; m_search; m_search = m_search->next)
+	for (m_search = r->objects->multi_pack_index; m_search; m_search = m_search->next)
 		if (!strcmp(object_dir, m_search->object_dir))
 			return 1;
 
-	r->objects->multi_pack_index = load_multi_pack_index(object_dir, local);
+	m = load_multi_pack_index(object_dir, local);
 
-	if (r->objects->multi_pack_index) {
-		r->objects->multi_pack_index->next = m;
+	if (m) {
+		m->next = r->objects->multi_pack_index;
+		r->objects->multi_pack_index = m;
 		return 1;
 	}
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index ae1d5d4592..4b6e2825a6 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -149,6 +149,23 @@ test_expect_success 'repack removes multi-pack-index' '
 
 compare_results_with_midx "after repack"
 
+test_expect_success 'multi-pack-index and alternates' '
+	git init --bare alt.git &&
+	echo $(pwd)/alt.git/objects >.git/objects/info/alternates &&
+	echo content1 >file1 &&
+	altblob=$(GIT_DIR=alt.git git hash-object -w file1) &&
+	git cat-file blob $altblob &&
+	git rev-list --all
+'
+
+compare_results_with_midx "with alternate (local midx)"
+
+test_expect_success 'multi-pack-index in an alternate' '
+	mv .git/objects/pack/* alt.git/objects/pack
+'
+
+compare_results_with_midx "with alternate (remote midx)"
+
 
 # usage: corrupt_data <file> <pos> [<data>]
 corrupt_data () {
-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/9] packfile: add all_packs list
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
                   ` (4 preceding siblings ...)
  2018-08-20 16:52 ` [PATCH 5/9] midx: fix bug that skips midx with alternates Derrick Stolee
@ 2018-08-20 16:52 ` Derrick Stolee
  2018-08-20 16:52 ` [PATCH 7/9] treewide: use get_all_packs Derrick Stolee
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:52 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

If a repo contains a multi-pack-index, then the packed_git list
does not contain the packfiles that are covered by the multi-pack-index.
This is important for doing object lookups, abbreviations, and
approximating object count. However, there are many operations that
really want to iterate over all packfiles.

Create a new 'all_packs' linked list that contains this list, starting
with the packfiles in the multi-pack-index and then continuing along
the packed_git linked list.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 midx.c         |  2 +-
 midx.h         |  1 +
 object-store.h |  6 ++++++
 packfile.c     | 27 +++++++++++++++++++++++++++
 packfile.h     |  1 +
 5 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index 0710c4c175..f3e8dbc108 100644
--- a/midx.c
+++ b/midx.c
@@ -197,7 +197,7 @@ static void close_midx(struct multi_pack_index *m)
 	FREE_AND_NULL(m->pack_names);
 }
 
-static int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id)
+int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id)
 {
 	struct strbuf pack_name = STRBUF_INIT;
 
diff --git a/midx.h b/midx.h
index 8aa79f4b62..a210f1af2a 100644
--- a/midx.h
+++ b/midx.h
@@ -32,6 +32,7 @@ struct multi_pack_index {
 };
 
 struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
+int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id);
 int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);
 struct object_id *nth_midxed_object_oid(struct object_id *oid,
 					struct multi_pack_index *m,
diff --git a/object-store.h b/object-store.h
index f9c57e2c26..c69d546392 100644
--- a/object-store.h
+++ b/object-store.h
@@ -128,6 +128,12 @@ struct raw_object_store {
 	/* A most-recently-used ordered version of the packed_git list. */
 	struct list_head packed_git_mru;
 
+	/*
+	 * A linked list containing all packfiles, starting with those
+	 * contained in the multi_pack_index.
+	 */
+	struct packed_git *all_packs;
+
 	/*
 	 * A fast, rough count of the number of objects in the repository.
 	 * These two fields are not meant for direct access. Use
diff --git a/packfile.c b/packfile.c
index fe713a0242..adcf2e12a0 100644
--- a/packfile.c
+++ b/packfile.c
@@ -972,6 +972,9 @@ static void prepare_packed_git(struct repository *r)
 		prepare_packed_git_one(r, alt->path, 0);
 	}
 	rearrange_packed_git(r);
+
+	r->objects->all_packs = NULL;
+
 	prepare_packed_git_mru(r);
 	r->objects->packed_git_initialized = 1;
 }
@@ -995,6 +998,30 @@ struct multi_pack_index *get_multi_pack_index(struct repository *r)
 	return r->objects->multi_pack_index;
 }
 
+struct packed_git *get_all_packs(struct repository *r)
+{
+	prepare_packed_git(r);
+
+	if (!r->objects->all_packs) {
+		struct packed_git *p = r->objects->packed_git;
+		struct multi_pack_index *m;
+
+		for (m = r->objects->multi_pack_index; m; m = m->next) {
+			uint32_t i;
+			for (i = 0; i < m->num_packs; i++) {
+				if (!prepare_midx_pack(m, i)) {
+					m->packs[i]->next = p;
+					p = m->packs[i];
+				}
+			}
+		}
+
+		r->objects->all_packs = p;
+	}
+
+	return r->objects->all_packs;
+}
+
 struct list_head *get_packed_git_mru(struct repository *r)
 {
 	prepare_packed_git(r);
diff --git a/packfile.h b/packfile.h
index 7855556686..3b90e2864c 100644
--- a/packfile.h
+++ b/packfile.h
@@ -51,6 +51,7 @@ extern void install_packed_git(struct repository *r, struct packed_git *pack);
 struct packed_git *get_packed_git(struct repository *r);
 struct list_head *get_packed_git_mru(struct repository *r);
 struct multi_pack_index *get_multi_pack_index(struct repository *r);
+struct packed_git *get_all_packs(struct repository *r);
 
 /*
  * Give a rough count of objects in the repository. This sacrifices accuracy
-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 7/9] treewide: use get_all_packs
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
                   ` (5 preceding siblings ...)
  2018-08-20 16:52 ` [PATCH 6/9] packfile: add all_packs list Derrick Stolee
@ 2018-08-20 16:52 ` Derrick Stolee
  2018-08-20 22:01   ` Stefan Beller
  2018-08-20 16:52 ` [PATCH 8/9] midx: test a few commands that " Derrick Stolee
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:52 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

There are many places in the codebase that want to iterate over
all packfiles known to Git. The purposes are wide-ranging, and
those that can take advantage of the multi-pack-index already
do. So, use get_all_packs() instead of get_packed_git() to be
sure we are iterating over all packfiles.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/count-objects.c  |  2 +-
 builtin/fsck.c           |  4 ++--
 builtin/gc.c             |  4 ++--
 builtin/pack-objects.c   | 14 +++++++-------
 builtin/pack-redundant.c |  4 ++--
 fast-import.c            |  4 ++--
 http-backend.c           |  4 ++--
 pack-bitmap.c            |  2 +-
 pack-objects.c           |  2 +-
 packfile.c               |  2 +-
 server-info.c            |  4 ++--
 11 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index d51e2ce1ec..a7cad052c6 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -123,7 +123,7 @@ int cmd_count_objects(int argc, const char **argv, const char *prefix)
 		struct strbuf pack_buf = STRBUF_INIT;
 		struct strbuf garbage_buf = STRBUF_INIT;
 
-		for (p = get_packed_git(the_repository); p; p = p->next) {
+		for (p = get_all_packs(the_repository); p; p = p->next) {
 			if (!p->pack_local)
 				continue;
 			if (open_pack_index(p))
diff --git a/builtin/fsck.c b/builtin/fsck.c
index c96f3f4fcc..184d8e7f4e 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -740,7 +740,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 			struct progress *progress = NULL;
 
 			if (show_progress) {
-				for (p = get_packed_git(the_repository); p;
+				for (p = get_all_packs(the_repository); p;
 				     p = p->next) {
 					if (open_pack_index(p))
 						continue;
@@ -749,7 +749,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 				progress = start_progress(_("Checking objects"), total);
 			}
-			for (p = get_packed_git(the_repository); p;
+			for (p = get_all_packs(the_repository); p;
 			     p = p->next) {
 				/* verify gives error messages itself */
 				if (verify_pack(p, fsck_obj_buffer,
diff --git a/builtin/gc.c b/builtin/gc.c
index 57069442b0..2b592260e9 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -183,7 +183,7 @@ static struct packed_git *find_base_packs(struct string_list *packs,
 {
 	struct packed_git *p, *base = NULL;
 
-	for (p = get_packed_git(the_repository); p; p = p->next) {
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		if (!p->pack_local)
 			continue;
 		if (limit) {
@@ -208,7 +208,7 @@ static int too_many_packs(void)
 	if (gc_auto_pack_limit <= 0)
 		return 0;
 
-	for (cnt = 0, p = get_packed_git(the_repository); p; p = p->next) {
+	for (cnt = 0, p = get_all_packs(the_repository); p; p = p->next) {
 		if (!p->pack_local)
 			continue;
 		if (p->pack_keep)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4391504a91..1a896d7810 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2784,7 +2784,7 @@ static void add_objects_in_unpacked_packs(struct rev_info *revs)
 
 	memset(&in_pack, 0, sizeof(in_pack));
 
-	for (p = get_packed_git(the_repository); p; p = p->next) {
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		struct object_id oid;
 		struct object *o;
 
@@ -2848,7 +2848,7 @@ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
 	struct packed_git *p;
 
 	p = (last_found != (void *)1) ? last_found :
-					get_packed_git(the_repository);
+					get_all_packs(the_repository);
 
 	while (p) {
 		if ((!p->pack_local || p->pack_keep ||
@@ -2858,7 +2858,7 @@ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid)
 			return 1;
 		}
 		if (p == last_found)
-			p = get_packed_git(the_repository);
+			p = get_all_packs(the_repository);
 		else
 			p = p->next;
 		if (p == last_found)
@@ -2894,7 +2894,7 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
 	uint32_t i;
 	struct object_id oid;
 
-	for (p = get_packed_git(the_repository); p; p = p->next) {
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		if (!p->pack_local || p->pack_keep || p->pack_keep_in_core)
 			continue;
 
@@ -3041,7 +3041,7 @@ static void add_extra_kept_packs(const struct string_list *names)
 	if (!names->nr)
 		return;
 
-	for (p = get_packed_git(the_repository); p; p = p->next) {
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		const char *name = basename(p->pack_name);
 		int i;
 
@@ -3314,7 +3314,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	add_extra_kept_packs(&keep_pack_list);
 	if (ignore_packed_keep_on_disk) {
 		struct packed_git *p;
-		for (p = get_packed_git(the_repository); p; p = p->next)
+		for (p = get_all_packs(the_repository); p; p = p->next)
 			if (p->pack_local && p->pack_keep)
 				break;
 		if (!p) /* no keep-able packs found */
@@ -3327,7 +3327,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		 * it also covers non-local objects
 		 */
 		struct packed_git *p;
-		for (p = get_packed_git(the_repository); p; p = p->next) {
+		for (p = get_all_packs(the_repository); p; p = p->next) {
 			if (!p->pack_local) {
 				have_non_local_packs = 1;
 				break;
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index 0494dceff7..cf9a9aabd4 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -577,7 +577,7 @@ static struct pack_list * add_pack(struct packed_git *p)
 
 static struct pack_list * add_pack_file(const char *filename)
 {
-	struct packed_git *p = get_packed_git(the_repository);
+	struct packed_git *p = get_all_packs(the_repository);
 
 	if (strlen(filename) < 40)
 		die("Bad pack filename: %s", filename);
@@ -592,7 +592,7 @@ static struct pack_list * add_pack_file(const char *filename)
 
 static void load_all(void)
 {
-	struct packed_git *p = get_packed_git(the_repository);
+	struct packed_git *p = get_all_packs(the_repository);
 
 	while (p) {
 		add_pack(p);
diff --git a/fast-import.c b/fast-import.c
index 89bb0c9db3..f8c3acd3b5 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1068,7 +1068,7 @@ static int store_object(
 		duplicate_count_by_type[type]++;
 		return 1;
 	} else if (find_sha1_pack(oid.hash,
-				  get_packed_git(the_repository))) {
+				  get_all_packs(the_repository))) {
 		e->type = type;
 		e->pack_id = MAX_PACK_ID;
 		e->idx.offset = 1; /* just not zero! */
@@ -1266,7 +1266,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
 		truncate_pack(&checkpoint);
 
 	} else if (find_sha1_pack(oid.hash,
-				  get_packed_git(the_repository))) {
+				  get_all_packs(the_repository))) {
 		e->type = OBJ_BLOB;
 		e->pack_id = MAX_PACK_ID;
 		e->idx.offset = 1; /* just not zero! */
diff --git a/http-backend.c b/http-backend.c
index bd0442a805..5e879177ed 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -523,13 +523,13 @@ static void get_info_packs(struct strbuf *hdr, char *arg)
 	size_t cnt = 0;
 
 	select_getanyfile(hdr);
-	for (p = get_packed_git(the_repository); p; p = p->next) {
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		if (p->pack_local)
 			cnt++;
 	}
 
 	strbuf_grow(&buf, cnt * 53 + 2);
-	for (p = get_packed_git(the_repository); p; p = p->next) {
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		if (p->pack_local)
 			strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6);
 	}
diff --git a/pack-bitmap.c b/pack-bitmap.c
index f0a1937a1c..4e50ab391f 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -335,7 +335,7 @@ static int open_pack_bitmap(struct bitmap_index *bitmap_git)
 
 	assert(!bitmap_git->map && !bitmap_git->loaded);
 
-	for (p = get_packed_git(the_repository); p; p = p->next) {
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		if (open_pack_bitmap_1(bitmap_git, p) == 0)
 			ret = 0;
 	}
diff --git a/pack-objects.c b/pack-objects.c
index 92708522e7..832dcf7462 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -99,7 +99,7 @@ static void prepare_in_pack_by_idx(struct packing_data *pdata)
 	 * (i.e. in_pack_idx also zero) should return NULL.
 	 */
 	mapping[cnt++] = NULL;
-	for (p = get_packed_git(the_repository); p; p = p->next, cnt++) {
+	for (p = get_all_packs(the_repository); p; p = p->next, cnt++) {
 		if (cnt == nr) {
 			free(mapping);
 			return;
diff --git a/packfile.c b/packfile.c
index adcf2e12a0..cbef7033c3 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2036,7 +2036,7 @@ int for_each_packed_object(each_packed_object_fn cb, void *data,
 	int pack_errors = 0;
 
 	prepare_packed_git(the_repository);
-	for (p = the_repository->objects->packed_git; p; p = p->next) {
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local)
 			continue;
 		if ((flags & FOR_EACH_OBJECT_PROMISOR_ONLY) &&
diff --git a/server-info.c b/server-info.c
index 41050c2449..e2b2d6a27a 100644
--- a/server-info.c
+++ b/server-info.c
@@ -199,7 +199,7 @@ static void init_pack_info(const char *infofile, int force)
 	objdir = get_object_directory();
 	objdirlen = strlen(objdir);
 
-	for (p = get_packed_git(the_repository); p; p = p->next) {
+	for (p = get_all_packs(the_repository); p; p = p->next) {
 		/* we ignore things on alternate path since they are
 		 * not available to the pullers in general.
 		 */
@@ -209,7 +209,7 @@ static void init_pack_info(const char *infofile, int force)
 	}
 	num_pack = i;
 	info = xcalloc(num_pack, sizeof(struct pack_info *));
-	for (i = 0, p = get_packed_git(the_repository); p; p = p->next) {
+	for (i = 0, p = get_all_packs(the_repository); p; p = p->next) {
 		if (!p->pack_local)
 			continue;
 		info[i] = xcalloc(1, sizeof(struct pack_info));
-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 8/9] midx: test a few commands that use get_all_packs
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
                   ` (6 preceding siblings ...)
  2018-08-20 16:52 ` [PATCH 7/9] treewide: use get_all_packs Derrick Stolee
@ 2018-08-20 16:52 ` Derrick Stolee
  2018-08-20 22:03   ` Stefan Beller
  2018-08-20 16:52 ` [PATCH 9/9] pack-objects: consider packs in multi-pack-index Derrick Stolee
  2018-08-21 14:34 ` [PATCH 0/9] multi-pack-index cleanups Duy Nguyen
  9 siblings, 1 reply; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:52 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

The new get_all_packs() method exposed the packfiles coverede by
a multi-pack-index. Before, the 'git cat-file --batch' and
'git count-objects' commands would skip objects in an environment
with a multi-pack-index.

Further, a reachability bitmap would be ignored if its pack-file
was covered by a multi-pack-index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t5319-multi-pack-index.sh | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 4b6e2825a6..424d0c640f 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -86,8 +86,14 @@ test_expect_success 'write midx with one v1 pack' '
 '
 
 midx_git_two_modes () {
-	git -c core.multiPackIndex=false $1 >expect &&
-	git -c core.multiPackIndex=true $1 >actual &&
+	if [ "$2" = "sorted" ]
+	then
+		git -c core.multiPackIndex=false $1 | sort >expect &&
+		git -c core.multiPackIndex=true $1 | sort >actual
+	else
+		git -c core.multiPackIndex=false $1 >expect &&
+		git -c core.multiPackIndex=true $1 >actual
+	fi &&
 	test_cmp expect actual
 }
 
@@ -95,7 +101,10 @@ compare_results_with_midx () {
 	MSG=$1
 	test_expect_success "check normal git operations: $MSG" '
 		midx_git_two_modes "rev-list --objects --all" &&
-		midx_git_two_modes "log --raw"
+		midx_git_two_modes "log --raw" &&
+		midx_git_two_modes "count-objects --verbose" &&
+		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check" &&
+		midx_git_two_modes "cat-file --batch-all-objects --buffer --batch-check --unsorted" sorted
 	'
 }
 
@@ -149,6 +158,12 @@ test_expect_success 'repack removes multi-pack-index' '
 
 compare_results_with_midx "after repack"
 
+test_expect_success 'multi-pack-index and pack-bitmap' '
+	git -c repack.writeBitmaps=true repack -ad &&
+	git multi-pack-index write &&
+	git rev-list --test-bitmap HEAD
+'
+
 test_expect_success 'multi-pack-index and alternates' '
 	git init --bare alt.git &&
 	echo $(pwd)/alt.git/objects >.git/objects/info/alternates &&
-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 9/9] pack-objects: consider packs in multi-pack-index
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
                   ` (7 preceding siblings ...)
  2018-08-20 16:52 ` [PATCH 8/9] midx: test a few commands that " Derrick Stolee
@ 2018-08-20 16:52 ` Derrick Stolee
  2018-08-21 14:34 ` [PATCH 0/9] multi-pack-index cleanups Duy Nguyen
  9 siblings, 0 replies; 18+ messages in thread
From: Derrick Stolee @ 2018-08-20 16:52 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Derrick Stolee

When running 'git pack-objects --local', we want to avoid packing
objects that are in an alternate. Currently, we check for these
objects using the packed_git_mru list, which excludes the pack-files
covered by a multi-pack-index.

Add a new iteration over the multi-pack-indexes to find these
copies and mark them as unwanted.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/pack-objects.c      | 28 ++++++++++++++++++++++++++++
 t/t5319-multi-pack-index.sh |  8 +++++++-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1a896d7810..4a9a42d29a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -31,6 +31,7 @@
 #include "packfile.h"
 #include "object-store.h"
 #include "dir.h"
+#include "midx.h"
 
 #define IN_PACK(obj) oe_in_pack(&to_pack, obj)
 #define SIZE(obj) oe_size(&to_pack, obj)
@@ -1034,6 +1035,7 @@ static int want_object_in_pack(const struct object_id *oid,
 {
 	int want;
 	struct list_head *pos;
+	struct multi_pack_index *m;
 
 	if (!exclude && local && has_loose_object_nonlocal(oid))
 		return 0;
@@ -1048,6 +1050,32 @@ static int want_object_in_pack(const struct object_id *oid,
 		if (want != -1)
 			return want;
 	}
+
+	for (m = get_multi_pack_index(the_repository); m; m = m->next) {
+		struct pack_entry e;
+		if (fill_midx_entry(oid, &e, m)) {
+			struct packed_git *p = e.p;
+			off_t offset;
+			
+			if (p == *found_pack)
+				offset = *found_offset;
+			else
+				offset = find_pack_entry_one(oid->hash, p);
+
+			if (offset) {
+				if (!*found_pack) {
+					if (!is_pack_valid(p))
+						continue;
+					*found_offset = offset;
+					*found_pack = p;
+				}
+				want = want_found_object(exclude, p);
+				if (want != -1)
+					return want;
+			}
+		}
+	}
+
 	list_for_each(pos, get_packed_git_mru(the_repository)) {
 		struct packed_git *p = list_entry(pos, struct packed_git, mru);
 		off_t offset;
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 424d0c640f..6f56b38674 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -176,7 +176,13 @@ test_expect_success 'multi-pack-index and alternates' '
 compare_results_with_midx "with alternate (local midx)"
 
 test_expect_success 'multi-pack-index in an alternate' '
-	mv .git/objects/pack/* alt.git/objects/pack
+	mv .git/objects/pack/* alt.git/objects/pack &&
+	test_commit add_local_objects &&
+	git repack --local &&
+	git multi-pack-index write &&
+	midx_read_expect 1 3 4 $objdir &&
+	git reset --hard HEAD~1 &&
+	rm -f .git/objects/pack/*
 '
 
 compare_results_with_midx "with alternate (remote midx)"
-- 
2.18.0.118.gd4f65b8d14


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/9] multi-pack-index: store local property
  2018-08-20 16:51 ` [PATCH 2/9] multi-pack-index: store local property Derrick Stolee
@ 2018-08-20 21:14   ` Stefan Beller
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Beller @ 2018-08-20 21:14 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On Mon, Aug 20, 2018 at 9:52 AM Derrick Stolee <dstolee@microsoft.com> wrote:
>
> A pack-file is 'local' if it is stored within the usual object
> directory. If it is stored in an alternate, it is non-local.
>
> Pack-files are stored using a 'pack_local' member in the packed_git
> struct. Add a similar 'local' member to the multi_pack_index struct
> and 'local' parameters to the methods that load and prepare multi-
> pack-indexes.

Going by that example, maybe we'd want to have it be the first bit
of a bitfield, c.f.

    unsigned pack_local:1,
        pack_keep:1,
        pack_keep_in_core:1,
        freshened:1,
        do_not_close:1,
        pack_promisor:1;

in the definition of packed_git.

Is there value in documenting both packfiles as well as midx variables?
(When going with the bitfield example, it may be a bit more worthwhile,
as the commit message is harder to find, as it will need repeated
blaming in the future, if there is a commit on top that adds another
bit to the field; I am unsure, but I would lean towards documentation
as it becomes a bit unclear: Is the local flag for the midx file that
is read, or rather for the underlying packs? What exactly is local? )

AFAICT this patch alone doesn't have any effect, yet, as it only pipes
the flag thru, is that worth mentioning or is that obvious?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/9] midx: mark bad packed objects
  2018-08-20 16:51 ` [PATCH 3/9] midx: mark bad packed objects Derrick Stolee
@ 2018-08-20 21:23   ` Stefan Beller
  2018-08-21 13:53     ` Derrick Stolee
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Beller @ 2018-08-20 21:23 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On Mon, Aug 20, 2018 at 9:52 AM Derrick Stolee <dstolee@microsoft.com> wrote:
>
> When an object fails to decompress from a pack-file, we mark the object
> as 'bad' so we can retry with a different copy of the object (if such a
> copy exists).
>
> Before now, the multi-pack-index did not update the bad objects list for
> the pack-files it contains, and we did not check the bad objects list
> when reading an object. Now, do both.

This sounds like a bug fix unlike patches 1 & 2 that sound like
feature work(2) or making code more readable(1).
(After studying the code, this doesn't sound like a bug fix any more,
but a safety thing)

Is it worth having this on a separate track coming in
faster than the rest of this series?

>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  midx.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/midx.c b/midx.c
> index 6824acf5f8..7fa75a37a3 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -280,6 +280,16 @@ static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *
>         if (!is_pack_valid(p))
>                 return 0;
>
> +       if (p->num_bad_objects) {
> +               uint32_t i;

Is there a reason that i needs to be if 32 bits?
Would size_t (for iterating) or 'int' (as a default
like in many for loops) be ok, too?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 7/9] treewide: use get_all_packs
  2018-08-20 16:52 ` [PATCH 7/9] treewide: use get_all_packs Derrick Stolee
@ 2018-08-20 22:01   ` Stefan Beller
  2018-08-21 13:56     ` Derrick Stolee
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Beller @ 2018-08-20 22:01 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On Mon, Aug 20, 2018 at 9:52 AM Derrick Stolee <dstolee@microsoft.com> wrote:
>
> There are many places in the codebase that want to iterate over
> all packfiles known to Git. The purposes are wide-ranging, and
> those that can take advantage of the multi-pack-index already
> do. So, use get_all_packs() instead of get_packed_git() to be
> sure we are iterating over all packfiles.

So get_packed_git shouold not be used any further?
Do we want to annotate it as deprecated, to be deleted
in the future? Or even remove it as part of this series
(that might be an issue with other series in flight).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 8/9] midx: test a few commands that use get_all_packs
  2018-08-20 16:52 ` [PATCH 8/9] midx: test a few commands that " Derrick Stolee
@ 2018-08-20 22:03   ` Stefan Beller
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Beller @ 2018-08-20 22:03 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On Mon, Aug 20, 2018 at 9:52 AM Derrick Stolee <dstolee@microsoft.com> wrote:
>
> The new get_all_packs() method exposed the packfiles coverede by

s/coverede/covered/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/9] midx: mark bad packed objects
  2018-08-20 21:23   ` Stefan Beller
@ 2018-08-21 13:53     ` Derrick Stolee
  0 siblings, 0 replies; 18+ messages in thread
From: Derrick Stolee @ 2018-08-21 13:53 UTC (permalink / raw)
  To: Stefan Beller, Derrick Stolee; +Cc: git

On 8/20/2018 5:23 PM, Stefan Beller wrote:
> On Mon, Aug 20, 2018 at 9:52 AM Derrick Stolee <dstolee@microsoft.com> wrote:
>> When an object fails to decompress from a pack-file, we mark the object
>> as 'bad' so we can retry with a different copy of the object (if such a
>> copy exists).
>>
>> Before now, the multi-pack-index did not update the bad objects list for
>> the pack-files it contains, and we did not check the bad objects list
>> when reading an object. Now, do both.
> This sounds like a bug fix unlike patches 1 & 2 that sound like
> feature work(2) or making code more readable(1).
> (After studying the code, this doesn't sound like a bug fix any more,
> but a safety thing)

It is a safety thing. One codepath that needs this includes this comment:

             /*
              * We're probably in deep shit, but let's try to fetch
              * the required base anyway from another pack or loose.
              * This is costly but should happen only in the presence
              * of a corrupted pack, and is better than failing outright.
              */

This goes back to 8eca0b47: implement some resilience against pack 
corruptions. The logic is tested in t5303-pack-corruption-resilience.sh, 
with the test title '... and a redundant pack allows for full recovery too'.

> Is it worth having this on a separate track coming in
> faster than the rest of this series?

ds/multi-pack-index is cooking in 'next' until after 2.19.0, so I'm not 
sure it's worth splitting things up at this point.

>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>> ---
>>   midx.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/midx.c b/midx.c
>> index 6824acf5f8..7fa75a37a3 100644
>> --- a/midx.c
>> +++ b/midx.c
>> @@ -280,6 +280,16 @@ static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *
>>          if (!is_pack_valid(p))
>>                  return 0;
>>
>> +       if (p->num_bad_objects) {
>> +               uint32_t i;
> Is there a reason that i needs to be if 32 bits?
> Would size_t (for iterating) or 'int' (as a default
> like in many for loops) be ok, too?

i is bounded by p->num_bad_objects, which is also uint32_t.

Thanks,

-Stolee


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 7/9] treewide: use get_all_packs
  2018-08-20 22:01   ` Stefan Beller
@ 2018-08-21 13:56     ` Derrick Stolee
  0 siblings, 0 replies; 18+ messages in thread
From: Derrick Stolee @ 2018-08-21 13:56 UTC (permalink / raw)
  To: Stefan Beller, Derrick Stolee; +Cc: git

On 8/20/2018 6:01 PM, Stefan Beller wrote:
> On Mon, Aug 20, 2018 at 9:52 AM Derrick Stolee <dstolee@microsoft.com> wrote:
>> There are many places in the codebase that want to iterate over
>> all packfiles known to Git. The purposes are wide-ranging, and
>> those that can take advantage of the multi-pack-index already
>> do. So, use get_all_packs() instead of get_packed_git() to be
>> sure we are iterating over all packfiles.
> So get_packed_git shouold not be used any further?
> Do we want to annotate it as deprecated, to be deleted
> in the future? Or even remove it as part of this series
> (that might be an issue with other series in flight).

Perhaps the right thing to do would be to put a big warning over the 
definition to say "only use this if you really don't want 
get_all_packs()" or something. The reason I didn't remove it from 
packfile.h is that it is used in sha1-name.c alongside 
get_multi_pack_index() (so making it 'static' would be incorrect).


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/9] multi-pack-index cleanups
  2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
                   ` (8 preceding siblings ...)
  2018-08-20 16:52 ` [PATCH 9/9] pack-objects: consider packs in multi-pack-index Derrick Stolee
@ 2018-08-21 14:34 ` Duy Nguyen
  2018-08-21 14:44   ` Derrick Stolee
  9 siblings, 1 reply; 18+ messages in thread
From: Duy Nguyen @ 2018-08-21 14:34 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Git Mailing List

On Mon, Aug 20, 2018 at 6:53 PM Derrick Stolee <dstolee@microsoft.com> wrote:
> To better understand the implications of the multi-pack-index with
> other scenarios, I ran the test suite after adding a step to 'git repack'
> to write a multi-pack-index, and to default core.multiPackIndex to 'true'.
> This commit is available as [1].
>
> ...
>
> [1] https://github.com/derrickstolee/git/commit/098dd1d515b592fb165a276241d7d68d1cde0036
>     DO-NOT-MERGE: compute multi-pack-index on repack.

It should be able to _merge_ this patch, but only activate it when
some test environment variable is set. On Travis x86 we run the test
suite twice, once normal, and once with special features activated
(see "if test "$jobname" = "linux-gcc"" in ci/run-build-and-tests.sh).
I don't think midx is incompatible with any of those features, so we
could activate and run the whole test suite with it too.
-- 
Duy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/9] multi-pack-index cleanups
  2018-08-21 14:34 ` [PATCH 0/9] multi-pack-index cleanups Duy Nguyen
@ 2018-08-21 14:44   ` Derrick Stolee
  0 siblings, 0 replies; 18+ messages in thread
From: Derrick Stolee @ 2018-08-21 14:44 UTC (permalink / raw)
  To: Duy Nguyen, Derrick Stolee; +Cc: Git Mailing List

On 8/21/2018 10:34 AM, Duy Nguyen wrote:
> On Mon, Aug 20, 2018 at 6:53 PM Derrick Stolee <dstolee@microsoft.com> wrote:
>> To better understand the implications of the multi-pack-index with
>> other scenarios, I ran the test suite after adding a step to 'git repack'
>> to write a multi-pack-index, and to default core.multiPackIndex to 'true'.
>> This commit is available as [1].
>>
>> ...
>>
>> [1] https://github.com/derrickstolee/git/commit/098dd1d515b592fb165a276241d7d68d1cde0036
>>      DO-NOT-MERGE: compute multi-pack-index on repack.
> It should be able to _merge_ this patch, but only activate it when
> some test environment variable is set. On Travis x86 we run the test
> suite twice, once normal, and once with special features activated
> (see "if test "$jobname" = "linux-gcc"" in ci/run-build-and-tests.sh).
> I don't think midx is incompatible with any of those features, so we
> could activate and run the whole test suite with it too.

Interesting. I'll look into this. A similar approach may work for the 
commit-graph.

Thanks,

-Stolee


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-08-21 14:44 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-20 16:51 [PATCH 0/9] multi-pack-index cleanups Derrick Stolee
2018-08-20 16:51 ` [PATCH 1/9] multi-pack-index: provide more helpful usage info Derrick Stolee
2018-08-20 16:51 ` [PATCH 2/9] multi-pack-index: store local property Derrick Stolee
2018-08-20 21:14   ` Stefan Beller
2018-08-20 16:51 ` [PATCH 3/9] midx: mark bad packed objects Derrick Stolee
2018-08-20 21:23   ` Stefan Beller
2018-08-21 13:53     ` Derrick Stolee
2018-08-20 16:51 ` [PATCH 4/9] midx: stop reporting garbage Derrick Stolee
2018-08-20 16:52 ` [PATCH 5/9] midx: fix bug that skips midx with alternates Derrick Stolee
2018-08-20 16:52 ` [PATCH 6/9] packfile: add all_packs list Derrick Stolee
2018-08-20 16:52 ` [PATCH 7/9] treewide: use get_all_packs Derrick Stolee
2018-08-20 22:01   ` Stefan Beller
2018-08-21 13:56     ` Derrick Stolee
2018-08-20 16:52 ` [PATCH 8/9] midx: test a few commands that " Derrick Stolee
2018-08-20 22:03   ` Stefan Beller
2018-08-20 16:52 ` [PATCH 9/9] pack-objects: consider packs in multi-pack-index Derrick Stolee
2018-08-21 14:34 ` [PATCH 0/9] multi-pack-index cleanups Duy Nguyen
2018-08-21 14:44   ` Derrick Stolee

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).