git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] Multi-pack-index: Fix "too many file descriptors" bug
@ 2019-04-29 16:18 Derrick Stolee via GitGitGadget
  2019-04-29 16:18 ` [PATCH 1/2] midx: pass a repository pointer Derrick Stolee via GitGitGadget
  2019-04-29 16:18 ` [PATCH 2/2] midx: add packs to packed_git linked list Derrick Stolee via GitGitGadget
  0 siblings, 2 replies; 5+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2019-04-29 16:18 UTC (permalink / raw)
  To: git; +Cc: git, avarab, peff, Junio C Hamano

Thanks to Jeff H for finding the problem with the multi-pack-index regarding
many packs. Specifically: if we open too many packs, the close_one_pack()
method cannot find the packs from the multi-pack-index to close.

Jeff already fixed the problem explicitly in 'git multi-pack-index verify'
which would hit this issue 100% of the time we had 2000+ packs. This issue
could still happen in 'git rev-list --all --objects' if there are enough
packs containing commits and trees. This series fixes the issue.

The basic solution is to add packs from the multi-pack-index into the
packed_git struct as they are opened. To avoid performance issues, add a 
multi_pack_index bit to the packed_git struct. Midx-aware algorithms can
then ignore those packs.

There was a very subtle issue that happens during a 'git repack': we clear
the multi-pack-index after possibly reading some packs from it, thus leaving
some packs in the packed_git struct but having a NULL multi_pack_index in
the object store. This informs the change to close_midx().

I'm based on a recent 'master' commit that contains the following three
branches due to nearby changes causing conflicts if I pick only Jeff's
change as a base:

jh/midx-verify-too-many-packs jk/server-info-rabbit-hole
bc/hash-transition-16

Thanks, -Stolee

Derrick Stolee (2):
  midx: pass a repository pointer
  midx: add packs to packed_git linked list

 builtin/multi-pack-index.c |  2 +-
 builtin/pack-objects.c     |  2 +-
 midx.c                     | 42 +++++++++++++++++++++++++-------------
 midx.h                     |  7 ++++---
 object-store.h             |  9 ++------
 packfile.c                 | 30 ++++++++-------------------
 sha1-name.c                |  6 ++++++
 7 files changed, 51 insertions(+), 47 deletions(-)


base-commit: 83232e38648b51abbcbdb56c94632b6906cc85a6
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-182%2Fderrickstolee%2Fmany-packs-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-182/derrickstolee/many-packs-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/182
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] midx: pass a repository pointer
  2019-04-29 16:18 [PATCH 0/2] Multi-pack-index: Fix "too many file descriptors" bug Derrick Stolee via GitGitGadget
@ 2019-04-29 16:18 ` Derrick Stolee via GitGitGadget
  2019-05-07  8:31   ` Junio C Hamano
  2019-04-29 16:18 ` [PATCH 2/2] midx: add packs to packed_git linked list Derrick Stolee via GitGitGadget
  1 sibling, 1 reply; 5+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2019-04-29 16:18 UTC (permalink / raw)
  To: git; +Cc: git, avarab, peff, Junio C Hamano, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Much of the multi-pack-index code focuses on the multi_pack_index
struct, and so we only pass a pointer to the current one. However,
we will insert a dependency on the packed_git linked list in a
future change, so we will need a repository reference. Inserting
these parameters is a significant enough change to split out.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/multi-pack-index.c |  2 +-
 builtin/pack-objects.c     |  2 +-
 midx.c                     | 22 ++++++++++++++--------
 midx.h                     |  7 ++++---
 packfile.c                 |  4 ++--
 5 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index ae6e476ad5..72dfd3dadc 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -46,7 +46,7 @@ int cmd_multi_pack_index(int argc, const char **argv,
 	if (!strcmp(argv[0], "write"))
 		return write_midx_file(opts.object_dir);
 	if (!strcmp(argv[0], "verify"))
-		return verify_midx_file(opts.object_dir);
+		return verify_midx_file(the_repository, opts.object_dir);
 
 	die(_("unrecognized verb: %s"), argv[0]);
 }
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 2d9a3bdc9d..e606b9ef03 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1078,7 +1078,7 @@ static int want_object_in_pack(const struct object_id *oid,
 
 	for (m = get_multi_pack_index(the_repository); m; m = m->next) {
 		struct pack_entry e;
-		if (fill_midx_entry(oid, &e, m)) {
+		if (fill_midx_entry(the_repository, oid, &e, m)) {
 			struct packed_git *p = e.p;
 			off_t offset;
 
diff --git a/midx.c b/midx.c
index d5d2e9522f..8b8faec35a 100644
--- a/midx.c
+++ b/midx.c
@@ -201,7 +201,7 @@ void close_midx(struct multi_pack_index *m)
 	FREE_AND_NULL(m->pack_names);
 }
 
-int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id)
+int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id)
 {
 	struct strbuf pack_name = STRBUF_INIT;
 
@@ -261,7 +261,10 @@ static uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos)
 	return get_be32(m->chunk_object_offsets + pos * MIDX_CHUNK_OFFSET_WIDTH);
 }
 
-static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *e, uint32_t pos)
+static int nth_midxed_pack_entry(struct repository *r,
+				 struct multi_pack_index *m,
+				 struct pack_entry *e,
+				 uint32_t pos)
 {
 	uint32_t pack_int_id;
 	struct packed_git *p;
@@ -271,7 +274,7 @@ static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *
 
 	pack_int_id = nth_midxed_pack_int_id(m, pos);
 
-	if (prepare_midx_pack(m, pack_int_id))
+	if (prepare_midx_pack(r, m, pack_int_id))
 		die(_("error preparing packfile from multi-pack-index"));
 	p = m->packs[pack_int_id];
 
@@ -301,14 +304,17 @@ static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *
 	return 1;
 }
 
-int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m)
+int fill_midx_entry(struct repository * r,
+		    const struct object_id *oid,
+		    struct pack_entry *e,
+		    struct multi_pack_index *m)
 {
 	uint32_t pos;
 
 	if (!bsearch_midx(oid, m, &pos))
 		return 0;
 
-	return nth_midxed_pack_entry(m, e, pos);
+	return nth_midxed_pack_entry(r, m, e, pos);
 }
 
 /* Match "foo.idx" against either "foo.pack" _or_ "foo.idx". */
@@ -1020,7 +1026,7 @@ static int compare_pair_pos_vs_id(const void *_a, const void *_b)
 			display_progress(progress, _n); \
 	} while (0)
 
-int verify_midx_file(const char *object_dir)
+int verify_midx_file(struct repository *r, const char *object_dir)
 {
 	struct pair_pos_vs_id *pairs = NULL;
 	uint32_t i;
@@ -1034,7 +1040,7 @@ int verify_midx_file(const char *object_dir)
 	progress = start_progress(_("Looking for referenced packfiles"),
 				  m->num_packs);
 	for (i = 0; i < m->num_packs; i++) {
-		if (prepare_midx_pack(m, i))
+		if (prepare_midx_pack(r, m, i))
 			midx_report("failed to load pack in position %d", i);
 
 		display_progress(progress, i + 1);
@@ -1099,7 +1105,7 @@ int verify_midx_file(const char *object_dir)
 
 		nth_midxed_object_oid(&oid, m, pairs[i].pos);
 
-		if (!fill_midx_entry(&oid, &e, m)) {
+		if (!fill_midx_entry(r, &oid, &e, m)) {
 			midx_report(_("failed to load pack entry for oid[%d] = %s"),
 				    pairs[i].pos, oid_to_hex(&oid));
 			continue;
diff --git a/midx.h b/midx.h
index 26dd042d63..3eb29731f2 100644
--- a/midx.h
+++ b/midx.h
@@ -5,6 +5,7 @@
 
 struct object_id;
 struct pack_entry;
+struct repository;
 
 #define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX"
 
@@ -37,18 +38,18 @@ struct multi_pack_index {
 };
 
 struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
-int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id);
+int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);
 int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);
 struct object_id *nth_midxed_object_oid(struct object_id *oid,
 					struct multi_pack_index *m,
 					uint32_t n);
-int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
+int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
 int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
 int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
 
 int write_midx_file(const char *object_dir);
 void clear_midx_file(struct repository *r);
-int verify_midx_file(const char *object_dir);
+int verify_midx_file(struct repository *r, const char *object_dir);
 
 void close_midx(struct multi_pack_index *m);
 
diff --git a/packfile.c b/packfile.c
index cdf6b6ec34..7b94a14726 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1035,7 +1035,7 @@ struct packed_git *get_all_packs(struct repository *r)
 		for (m = r->objects->multi_pack_index; m; m = m->next) {
 			uint32_t i;
 			for (i = 0; i < m->num_packs; i++) {
-				if (!prepare_midx_pack(m, i)) {
+				if (!prepare_midx_pack(r, m, i)) {
 					m->packs[i]->next = p;
 					p = m->packs[i];
 				}
@@ -1998,7 +1998,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
 		return 0;
 
 	for (m = r->objects->multi_pack_index; m; m = m->next) {
-		if (fill_midx_entry(oid, e, m))
+		if (fill_midx_entry(r, oid, e, m))
 			return 1;
 	}
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 2/2] midx: add packs to packed_git linked list
  2019-04-29 16:18 [PATCH 0/2] Multi-pack-index: Fix "too many file descriptors" bug Derrick Stolee via GitGitGadget
  2019-04-29 16:18 ` [PATCH 1/2] midx: pass a repository pointer Derrick Stolee via GitGitGadget
@ 2019-04-29 16:18 ` Derrick Stolee via GitGitGadget
  1 sibling, 0 replies; 5+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2019-04-29 16:18 UTC (permalink / raw)
  To: git; +Cc: git, avarab, peff, Junio C Hamano, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The multi-pack-index allows searching for objects across multiple
packs using one object list. The original design gains many of
these performance benefits by keeping the packs in the
multi-pack-index out of the packed_git list.

Unfortunately, this has one major drawback. If the multi-pack-index
covers thousands of packs, and a command loads many of those packs,
then we can hit the limit for open file descriptors. The
close_one_pack() method is used to limit this resource, but it
only looks at the packed_git list, and uses an LRU cache to prevent
thrashing.

Instead of complicating this close_one_pack() logic to include
direct references to the multi-pack-index, simply add the packs
opened by the multi-pack-index to the packed_git list. This
immediately solves the file-descriptor limit problem, but requires
some extra steps to avoid performance issues or other problems:

1. Create a multi_pack_index bit in the packed_git struct that is
   one if and only if the pack was loaded from a multi-pack-index.

2. Skip packs with the multi_pack_index bit when doing object
   lookups and abbreviations. These algorithms already check the
   multi-pack-index before the packed_git struct. This has a very
   small performance hit, as we need to walk more packed_git
   structs. This is acceptable, since these operations run binary
   search on the other packs, so this walk-and-ignore logic is
   very fast by comparison.

3. When closing a multi-pack-index file, do not close its packs,
   as those packs will be closed using close_all_packs(). In some
   cases, such as 'git repack', we run 'close_midx()' without also
   closing the packs, so we need to un-set the multi_pack_index bit
   in those packs. This is necessary, and caught by running
   t6501-freshen-objects.sh with GIT_TEST_MULTI_PACK_INDEX=1.

To manually test this change, I inserted trace2 logging into
close_pack_fd() and set pack_max_fds to 10, then ran 'git rev-list
--all --objects' on a copy of the Git repo with 300+ pack-files and
a multi-pack-index. The logs verified the packs are closed as
we read them beyond the file descriptor limit.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 midx.c         | 20 ++++++++++++++------
 object-store.h |  9 ++-------
 packfile.c     | 28 ++++++++--------------------
 sha1-name.c    |  6 ++++++
 4 files changed, 30 insertions(+), 33 deletions(-)

diff --git a/midx.c b/midx.c
index 8b8faec35a..e7e1fe4d65 100644
--- a/midx.c
+++ b/midx.c
@@ -192,10 +192,8 @@ void close_midx(struct multi_pack_index *m)
 	m->fd = -1;
 
 	for (i = 0; i < m->num_packs; i++) {
-		if (m->packs[i]) {
-			close_pack(m->packs[i]);
-			free(m->packs[i]);
-		}
+		if (m->packs[i])
+			m->packs[i]->multi_pack_index = 0;
 	}
 	FREE_AND_NULL(m->packs);
 	FREE_AND_NULL(m->pack_names);
@@ -204,6 +202,7 @@ void close_midx(struct multi_pack_index *m)
 int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id)
 {
 	struct strbuf pack_name = STRBUF_INIT;
+	struct packed_git *p;
 
 	if (pack_int_id >= m->num_packs)
 		die(_("bad pack-int-id: %u (%u total packs)"),
@@ -215,9 +214,18 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t
 	strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
 		    m->pack_names[pack_int_id]);
 
-	m->packs[pack_int_id] = add_packed_git(pack_name.buf, pack_name.len, m->local);
+	p = add_packed_git(pack_name.buf, pack_name.len, m->local);
 	strbuf_release(&pack_name);
-	return !m->packs[pack_int_id];
+
+	if (!p)
+		return 1;
+
+	p->multi_pack_index = 1;
+	m->packs[pack_int_id] = p;
+	install_packed_git(r, p);
+	list_add_tail(&p->mru, &r->objects->packed_git_mru);
+
+	return 0;
 }
 
 int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result)
diff --git a/object-store.h b/object-store.h
index b086f5ecdb..7acbc7fffe 100644
--- a/object-store.h
+++ b/object-store.h
@@ -76,7 +76,8 @@ struct packed_git {
 		 pack_keep_in_core:1,
 		 freshened:1,
 		 do_not_close:1,
-		 pack_promisor:1;
+		 pack_promisor:1,
+		 multi_pack_index:1;
 	unsigned char hash[GIT_MAX_RAWSZ];
 	struct revindex_entry *revindex;
 	/* something like ".git/objects/pack/xxxxx.pack" */
@@ -128,12 +129,6 @@ struct raw_object_store {
 	/* A most-recently-used ordered version of the packed_git list. */
 	struct list_head packed_git_mru;
 
-	/*
-	 * A linked list containing all packfiles, starting with those
-	 * contained in the multi_pack_index.
-	 */
-	struct packed_git *all_packs;
-
 	/*
 	 * A fast, rough count of the number of objects in the repository.
 	 * These two fields are not meant for direct access. Use
diff --git a/packfile.c b/packfile.c
index 7b94a14726..060de420d1 100644
--- a/packfile.c
+++ b/packfile.c
@@ -994,8 +994,6 @@ static void prepare_packed_git(struct repository *r)
 	}
 	rearrange_packed_git(r);
 
-	r->objects->all_packs = NULL;
-
 	prepare_packed_git_mru(r);
 	r->objects->packed_git_initialized = 1;
 }
@@ -1026,26 +1024,16 @@ struct multi_pack_index *get_multi_pack_index(struct repository *r)
 
 struct packed_git *get_all_packs(struct repository *r)
 {
-	prepare_packed_git(r);
-
-	if (!r->objects->all_packs) {
-		struct packed_git *p = r->objects->packed_git;
-		struct multi_pack_index *m;
-
-		for (m = r->objects->multi_pack_index; m; m = m->next) {
-			uint32_t i;
-			for (i = 0; i < m->num_packs; i++) {
-				if (!prepare_midx_pack(r, m, i)) {
-					m->packs[i]->next = p;
-					p = m->packs[i];
-				}
-			}
-		}
+	struct multi_pack_index *m;
 
-		r->objects->all_packs = p;
+	prepare_packed_git(r);
+	for (m = r->objects->multi_pack_index; m; m = m->next) {
+		uint32_t i;
+		for (i = 0; i < m->num_packs; i++)
+			prepare_midx_pack(r, m, i);
 	}
 
-	return r->objects->all_packs;
+	return r->objects->packed_git;
 }
 
 struct list_head *get_packed_git_mru(struct repository *r)
@@ -2004,7 +1992,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
 
 	list_for_each(pos, &r->objects->packed_git_mru) {
 		struct packed_git *p = list_entry(pos, struct packed_git, mru);
-		if (fill_pack_entry(oid, e, p)) {
+		if (!p->multi_pack_index && fill_pack_entry(oid, e, p)) {
 			list_move(&p->mru, &r->objects->packed_git_mru);
 			return 1;
 		}
diff --git a/sha1-name.c b/sha1-name.c
index 07c71a7567..42ac1c5bb6 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -157,6 +157,9 @@ static void unique_in_pack(struct packed_git *p,
 	uint32_t num, i, first = 0;
 	const struct object_id *current = NULL;
 
+	if (p->multi_pack_index)
+		return;
+
 	if (open_pack_index(p) || !p->num_objects)
 		return;
 
@@ -589,6 +592,9 @@ static void find_abbrev_len_for_pack(struct packed_git *p,
 	struct object_id oid;
 	const struct object_id *mad_oid;
 
+	if (p->multi_pack_index)
+		return;
+
 	if (open_pack_index(p) || !p->num_objects)
 		return;
 
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] midx: pass a repository pointer
  2019-04-29 16:18 ` [PATCH 1/2] midx: pass a repository pointer Derrick Stolee via GitGitGadget
@ 2019-05-07  8:31   ` Junio C Hamano
  2019-05-08 14:22     ` Derrick Stolee
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2019-05-07  8:31 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, git, avarab, peff, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> Much of the multi-pack-index code focuses on the multi_pack_index
> struct, and so we only pass a pointer to the current one. However,
> we will insert a dependency on the packed_git linked list in a
> future change, so we will need a repository reference. Inserting
> these parameters is a significant enough change to split out.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---

This is a good move in the loger term, but not a happy thing to do
while your "expire" topic is still in flight, as the impact from
updating the signature of prepare_midx_pack() and friends will break
new callers in expire_midx_packs() etc.

I am tempted to queue this and eject ds/midx-expire-repack for now,
while checking how that topic would look like when rebased on top of
these two patches.  We'll see.

Thanks.

>  builtin/multi-pack-index.c |  2 +-
>  builtin/pack-objects.c     |  2 +-
>  midx.c                     | 22 ++++++++++++++--------
>  midx.h                     |  7 ++++---
>  packfile.c                 |  4 ++--
>  5 files changed, 22 insertions(+), 15 deletions(-)
>
> diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
> index ae6e476ad5..72dfd3dadc 100644
> --- a/builtin/multi-pack-index.c
> +++ b/builtin/multi-pack-index.c
> @@ -46,7 +46,7 @@ int cmd_multi_pack_index(int argc, const char **argv,
>  	if (!strcmp(argv[0], "write"))
>  		return write_midx_file(opts.object_dir);
>  	if (!strcmp(argv[0], "verify"))
> -		return verify_midx_file(opts.object_dir);
> +		return verify_midx_file(the_repository, opts.object_dir);
>  
>  	die(_("unrecognized verb: %s"), argv[0]);
>  }
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 2d9a3bdc9d..e606b9ef03 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1078,7 +1078,7 @@ static int want_object_in_pack(const struct object_id *oid,
>  
>  	for (m = get_multi_pack_index(the_repository); m; m = m->next) {
>  		struct pack_entry e;
> -		if (fill_midx_entry(oid, &e, m)) {
> +		if (fill_midx_entry(the_repository, oid, &e, m)) {
>  			struct packed_git *p = e.p;
>  			off_t offset;
>  
> diff --git a/midx.c b/midx.c
> index d5d2e9522f..8b8faec35a 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -201,7 +201,7 @@ void close_midx(struct multi_pack_index *m)
>  	FREE_AND_NULL(m->pack_names);
>  }
>  
> -int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id)
> +int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id)
>  {
>  	struct strbuf pack_name = STRBUF_INIT;
>  
> @@ -261,7 +261,10 @@ static uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos)
>  	return get_be32(m->chunk_object_offsets + pos * MIDX_CHUNK_OFFSET_WIDTH);
>  }
>  
> -static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *e, uint32_t pos)
> +static int nth_midxed_pack_entry(struct repository *r,
> +				 struct multi_pack_index *m,
> +				 struct pack_entry *e,
> +				 uint32_t pos)
>  {
>  	uint32_t pack_int_id;
>  	struct packed_git *p;
> @@ -271,7 +274,7 @@ static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *
>  
>  	pack_int_id = nth_midxed_pack_int_id(m, pos);
>  
> -	if (prepare_midx_pack(m, pack_int_id))
> +	if (prepare_midx_pack(r, m, pack_int_id))
>  		die(_("error preparing packfile from multi-pack-index"));
>  	p = m->packs[pack_int_id];
>  
> @@ -301,14 +304,17 @@ static int nth_midxed_pack_entry(struct multi_pack_index *m, struct pack_entry *
>  	return 1;
>  }
>  
> -int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m)
> +int fill_midx_entry(struct repository * r,
> +		    const struct object_id *oid,
> +		    struct pack_entry *e,
> +		    struct multi_pack_index *m)
>  {
>  	uint32_t pos;
>  
>  	if (!bsearch_midx(oid, m, &pos))
>  		return 0;
>  
> -	return nth_midxed_pack_entry(m, e, pos);
> +	return nth_midxed_pack_entry(r, m, e, pos);
>  }
>  
>  /* Match "foo.idx" against either "foo.pack" _or_ "foo.idx". */
> @@ -1020,7 +1026,7 @@ static int compare_pair_pos_vs_id(const void *_a, const void *_b)
>  			display_progress(progress, _n); \
>  	} while (0)
>  
> -int verify_midx_file(const char *object_dir)
> +int verify_midx_file(struct repository *r, const char *object_dir)
>  {
>  	struct pair_pos_vs_id *pairs = NULL;
>  	uint32_t i;
> @@ -1034,7 +1040,7 @@ int verify_midx_file(const char *object_dir)
>  	progress = start_progress(_("Looking for referenced packfiles"),
>  				  m->num_packs);
>  	for (i = 0; i < m->num_packs; i++) {
> -		if (prepare_midx_pack(m, i))
> +		if (prepare_midx_pack(r, m, i))
>  			midx_report("failed to load pack in position %d", i);
>  
>  		display_progress(progress, i + 1);
> @@ -1099,7 +1105,7 @@ int verify_midx_file(const char *object_dir)
>  
>  		nth_midxed_object_oid(&oid, m, pairs[i].pos);
>  
> -		if (!fill_midx_entry(&oid, &e, m)) {
> +		if (!fill_midx_entry(r, &oid, &e, m)) {
>  			midx_report(_("failed to load pack entry for oid[%d] = %s"),
>  				    pairs[i].pos, oid_to_hex(&oid));
>  			continue;
> diff --git a/midx.h b/midx.h
> index 26dd042d63..3eb29731f2 100644
> --- a/midx.h
> +++ b/midx.h
> @@ -5,6 +5,7 @@
>  
>  struct object_id;
>  struct pack_entry;
> +struct repository;
>  
>  #define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX"
>  
> @@ -37,18 +38,18 @@ struct multi_pack_index {
>  };
>  
>  struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
> -int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id);
> +int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);
>  int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);
>  struct object_id *nth_midxed_object_oid(struct object_id *oid,
>  					struct multi_pack_index *m,
>  					uint32_t n);
> -int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
> +int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
>  int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
>  int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
>  
>  int write_midx_file(const char *object_dir);
>  void clear_midx_file(struct repository *r);
> -int verify_midx_file(const char *object_dir);
> +int verify_midx_file(struct repository *r, const char *object_dir);
>  
>  void close_midx(struct multi_pack_index *m);
>  
> diff --git a/packfile.c b/packfile.c
> index cdf6b6ec34..7b94a14726 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1035,7 +1035,7 @@ struct packed_git *get_all_packs(struct repository *r)
>  		for (m = r->objects->multi_pack_index; m; m = m->next) {
>  			uint32_t i;
>  			for (i = 0; i < m->num_packs; i++) {
> -				if (!prepare_midx_pack(m, i)) {
> +				if (!prepare_midx_pack(r, m, i)) {
>  					m->packs[i]->next = p;
>  					p = m->packs[i];
>  				}
> @@ -1998,7 +1998,7 @@ int find_pack_entry(struct repository *r, const struct object_id *oid, struct pa
>  		return 0;
>  
>  	for (m = r->objects->multi_pack_index; m; m = m->next) {
> -		if (fill_midx_entry(oid, e, m))
> +		if (fill_midx_entry(r, oid, e, m))
>  			return 1;
>  	}

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] midx: pass a repository pointer
  2019-05-07  8:31   ` Junio C Hamano
@ 2019-05-08 14:22     ` Derrick Stolee
  0 siblings, 0 replies; 5+ messages in thread
From: Derrick Stolee @ 2019-05-08 14:22 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee via GitGitGadget
  Cc: git, git, avarab, peff, Derrick Stolee

On 5/7/2019 4:31 AM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> Much of the multi-pack-index code focuses on the multi_pack_index
>> struct, and so we only pass a pointer to the current one. However,
>> we will insert a dependency on the packed_git linked list in a
>> future change, so we will need a repository reference. Inserting
>> these parameters is a significant enough change to split out.
>>
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>> ---
> 
> This is a good move in the loger term, but not a happy thing to do
> while your "expire" topic is still in flight, as the impact from
> updating the signature of prepare_midx_pack() and friends will break
> new callers in expire_midx_packs() etc.
> 
> I am tempted to queue this and eject ds/midx-expire-repack for now,
> while checking how that topic would look like when rebased on top of
> these two patches.  We'll see.

Sorry for the noise. I consider this series a higher priority, as it
fixes a problem with the feature in existing releases.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-05-08 14:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-29 16:18 [PATCH 0/2] Multi-pack-index: Fix "too many file descriptors" bug Derrick Stolee via GitGitGadget
2019-04-29 16:18 ` [PATCH 1/2] midx: pass a repository pointer Derrick Stolee via GitGitGadget
2019-05-07  8:31   ` Junio C Hamano
2019-05-08 14:22     ` Derrick Stolee
2019-04-29 16:18 ` [PATCH 2/2] midx: add packs to packed_git linked list Derrick Stolee via GitGitGadget

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).