git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] delta-islands: free island-related data after use
@ 2022-11-16 10:50 Eric Wong
  2022-11-16 15:44 ` Ævar Arnfjörð Bjarmason
  2022-11-16 18:44 ` [PATCH] " Jeff King
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Wong @ 2022-11-16 10:50 UTC (permalink / raw)
  To: git

On my use case involving 771 islands of Linux on kernel.org,
this reduces memory usage by around 25MB.  The bulk of that
comes from free_remote_islands, since free_island_regexes only
saves around 40k.

This memory is saved early in the memory-intensive pack process,
making it available for the remainder of the long process.

Signed-off-by: Eric Wong <e@80x24.org>
---
  Note: I only noticed this when I screwed up up a pack.island RE,
  ended up with hundreds of thousands of islands instead of 771,
  and kept OOM-ing :x

  Memory savings were measured using the following patch which
  relies on a patched LD_PRELOAD-based malloc debugger:
  https://80x24.org/spew/20221116095404.3974691-1-e@80x24.org/raw

  Will try to hunt down more memory savings in the nearish future.

 delta-islands.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/delta-islands.c b/delta-islands.c
index 26f9e99e1a..391e947cc6 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -454,6 +454,31 @@ static void deduplicate_islands(struct repository *r)
 	free(list);
 }
 
+static void free_island_regexes(void)
+{
+	unsigned int i;
+
+	for (i = 0; i < island_regexes_nr; i++)
+		regfree(&island_regexes[i]);
+
+	FREE_AND_NULL(island_regexes);
+	island_regexes_alloc = island_regexes_nr = 0;
+}
+
+static void free_remote_islands(void)
+{
+	const char *island_name;
+	struct remote_island *rl;
+
+	kh_foreach(remote_islands, island_name, rl, {
+		free((void *)island_name);
+		oid_array_clear(&rl->oids);
+		free(rl);
+	});
+	kh_destroy_str(remote_islands);
+	remote_islands = NULL;
+}
+
 void load_delta_islands(struct repository *r, int progress)
 {
 	island_marks = kh_init_oid_map();
@@ -461,7 +486,9 @@ void load_delta_islands(struct repository *r, int progress)
 
 	git_config(island_config_callback, NULL);
 	for_each_ref(find_island_for_ref, NULL);
+	free_island_regexes();
 	deduplicate_islands(r);
+	free_remote_islands();
 
 	if (progress)
 		fprintf(stderr, _("Marked %d islands, done.\n"), island_counter);

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] delta-islands: free island-related data after use
  2022-11-16 10:50 [PATCH] delta-islands: free island-related data after use Eric Wong
@ 2022-11-16 15:44 ` Ævar Arnfjörð Bjarmason
  2022-11-17 23:06   ` [PATCH v2] " Eric Wong
  2022-11-16 18:44 ` [PATCH] " Jeff King
  1 sibling, 1 reply; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-16 15:44 UTC (permalink / raw)
  To: Eric Wong; +Cc: git


On Wed, Nov 16 2022, Eric Wong wrote:

> On my use case involving 771 islands of Linux on kernel.org,
> this reduces memory usage by around 25MB.  The bulk of that
> comes from free_remote_islands, since free_island_regexes only
> saves around 40k.
>
> This memory is saved early in the memory-intensive pack process,
> making it available for the remainder of the long process.
>
> Signed-off-by: Eric Wong <e@80x24.org>
> ---
>   Note: I only noticed this when I screwed up up a pack.island RE,
>   ended up with hundreds of thousands of islands instead of 771,
>   and kept OOM-ing :x
>
>   Memory savings were measured using the following patch which
>   relies on a patched LD_PRELOAD-based malloc debugger:
>   https://80x24.org/spew/20221116095404.3974691-1-e@80x24.org/raw

FWIW SANITIZE=leak will find this if you stick a "remote_islands = NULL"
and run e.g. t5320-delta-islands.sh, but maybe you needed this closer to
production.

Valgrind will also work, but of course be *much* slower.

>   Will try to hunt down more memory savings in the nearish future.
>
>  delta-islands.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
>
> diff --git a/delta-islands.c b/delta-islands.c
> index 26f9e99e1a..391e947cc6 100644
> --- a/delta-islands.c
> +++ b/delta-islands.c
> @@ -454,6 +454,31 @@ static void deduplicate_islands(struct repository *r)
>  	free(list);
>  }
>  
> +static void free_island_regexes(void)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < island_regexes_nr; i++)
> +		regfree(&island_regexes[i]);
> +
> +	FREE_AND_NULL(island_regexes);
> +	island_regexes_alloc = island_regexes_nr = 0;
> +}
> +
> +static void free_remote_islands(void)
> +{
> +	const char *island_name;
> +	struct remote_island *rl;
> +
> +	kh_foreach(remote_islands, island_name, rl, {
> +		free((void *)island_name);
> +		oid_array_clear(&rl->oids);
> +		free(rl);
> +	});
> +	kh_destroy_str(remote_islands);
> +	remote_islands = NULL;
> +}
> +
>  void load_delta_islands(struct repository *r, int progress)
>  {
>  	island_marks = kh_init_oid_map();
> @@ -461,7 +486,9 @@ void load_delta_islands(struct repository *r, int progress)
>  
>  	git_config(island_config_callback, NULL);
>  	for_each_ref(find_island_for_ref, NULL);
> +	free_island_regexes();
>  	deduplicate_islands(r);
> +	free_remote_islands();
>  
>  	if (progress)
>  		fprintf(stderr, _("Marked %d islands, done.\n"), island_counter);

Perfect shouldn't be the enemy of the good & all that, but in this case
it's not too much more effort to just give this data an appropriate
lifetime instead of the global, I tried that out for just the "regex"
part of this below.

The free_remote_islands() seems to be similarly alive between
"find_island_for_ref" and "deduplicate_islands".

Your version also works, but the root cause of this sort of thing is
these global lifetimes, which sometimes we do for a good reason, but in
this case we don't.

It's more lines, but e.g. your FREE_AND_NULL() and resetting "nr" and
"alloc" makes one wonder how the data might be used outside fo that
load_delta_islands() chain (which is needed, if we invoke it again),
when we can just init a stack variable to hold it instead...

diff --git a/delta-islands.c b/delta-islands.c
index 26f9e99e1a9..ef86a91059c 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -312,29 +312,41 @@ void resolve_tree_islands(struct repository *r,
 	free(todo);
 }
 
-static regex_t *island_regexes;
-static unsigned int island_regexes_alloc, island_regexes_nr;
+struct island_config_data {
+	regex_t *rx;
+	size_t nr;
+	size_t alloc;
+};
 static const char *core_island_name;
 
-static int island_config_callback(const char *k, const char *v, void *cb UNUSED)
+static void island_config_data_release(struct island_config_data *icd)
+{
+	for (size_t i = 0; i < icd->nr; i++)
+		regfree(&icd->rx[i]);
+	free(icd->rx);
+}
+
+static int island_config_callback(const char *k, const char *v, void *cb)
 {
+	struct island_config_data *data = cb;
+
 	if (!strcmp(k, "pack.island")) {
 		struct strbuf re = STRBUF_INIT;
 
 		if (!v)
 			return config_error_nonbool(k);
 
-		ALLOC_GROW(island_regexes, island_regexes_nr + 1, island_regexes_alloc);
+		ALLOC_GROW(data->rx, data->nr + 1, data->alloc);
 
 		if (*v != '^')
 			strbuf_addch(&re, '^');
 		strbuf_addstr(&re, v);
 
-		if (regcomp(&island_regexes[island_regexes_nr], re.buf, REG_EXTENDED))
+		if (regcomp(&data->rx[data->nr], re.buf, REG_EXTENDED))
 			die(_("failed to load island regex for '%s': %s"), k, re.buf);
 
 		strbuf_release(&re);
-		island_regexes_nr++;
+		data->nr++;
 		return 0;
 	}
 
@@ -365,8 +377,10 @@ static void add_ref_to_island(const char *island_name, const struct object_id *o
 }
 
 static int find_island_for_ref(const char *refname, const struct object_id *oid,
-			       int flags UNUSED, void *data UNUSED)
+			       int flags UNUSED, void *cb)
 {
+	struct island_config_data *data = cb;
+
 	/*
 	 * We should advertise 'ARRAY_SIZE(matches) - 2' as the max,
 	 * so we can diagnose below a config with more capture groups
@@ -377,8 +391,8 @@ static int find_island_for_ref(const char *refname, const struct object_id *oid,
 	struct strbuf island_name = STRBUF_INIT;
 
 	/* walk backwards to get last-one-wins ordering */
-	for (i = island_regexes_nr - 1; i >= 0; i--) {
-		if (!regexec(&island_regexes[i], refname,
+	for (i = data->nr - 1; i >= 0; i--) {
+		if (!regexec(&data->rx[i], refname,
 			     ARRAY_SIZE(matches), matches, 0))
 			break;
 	}
@@ -456,11 +470,14 @@ static void deduplicate_islands(struct repository *r)
 
 void load_delta_islands(struct repository *r, int progress)
 {
+	struct island_config_data data = { 0 };
+
 	island_marks = kh_init_oid_map();
 	remote_islands = kh_init_str();
 
-	git_config(island_config_callback, NULL);
-	for_each_ref(find_island_for_ref, NULL);
+	git_config(island_config_callback, &data);
+	for_each_ref(find_island_for_ref, &data);
+	island_config_data_release(&data);
 	deduplicate_islands(r);
 
 	if (progress)

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] delta-islands: free island-related data after use
  2022-11-16 10:50 [PATCH] delta-islands: free island-related data after use Eric Wong
  2022-11-16 15:44 ` Ævar Arnfjörð Bjarmason
@ 2022-11-16 18:44 ` Jeff King
  2023-02-01  9:20   ` pack-objects memory use observations [was: [PATCH] delta-islands: free island-related data after use] Eric Wong
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff King @ 2022-11-16 18:44 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

On Wed, Nov 16, 2022 at 10:50:13AM +0000, Eric Wong wrote:

> On my use case involving 771 islands of Linux on kernel.org,
> this reduces memory usage by around 25MB.  The bulk of that
> comes from free_remote_islands, since free_island_regexes only
> saves around 40k.
> 
> This memory is saved early in the memory-intensive pack process,
> making it available for the remainder of the long process.

I think this works and is a reasonable thing to do. The non-obvious
question is whether the island data is ever used later in the process.
Certainly the island bitmaps themselves are, but the initial mapping of
refs to islands doesn't.

I do agree with Ævar that it would be a lot easier to confirm that if
these variables were given a more appropriate scope (this was mostly
just laziness on my part writing the initial delta-island code; so much
of it has to stick around for the whole process and I didn't distinguish
between the two).

>   Will try to hunt down more memory savings in the nearish future.

Yes, you've probably noticed that pack-objects does not distinguish much
between what is necessary for the various phases. A few obvious things
to look at:

  1. After the write phase, you can probably ditch the island bitmaps,
     too. In many repacks we're basically done then, but if you're
     generating bitmaps, that happens afterwards in the same process.

  2. The object traversal for pack-objects is done in-process these
     days. But after it finishes, I suspect that we do not generally
     need those object structs anymore, because all of the book-keeping
     is done in the bit object_entry array in packing_data.

     The exception would be generating bitmaps, which does need to do
     some traversal (and I think may hang on to actual "struct commit"
     pointers).

     I also don't think we have any code that clears obj_hash or
     released the pooled object pointers.

     So it's probably pretty tricky, but I suspect would yield big
     savings, since it's a per-object cost on the order of 64 bytes (so
     ~640MB on the kernel).

  3. During the bitmap phase I'm not sure if we still care about the
     object_entry struct in packing_data. It's the other big
     bytes-per-object user of memory. We need it all the way through the
     write phase for obvious reasons, but if we could ditch it for the
     bitmap phase, that may reduce peak memory during that phase.

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2] delta-islands: free island-related data after use
  2022-11-16 15:44 ` Ævar Arnfjörð Bjarmason
@ 2022-11-17 23:06   ` Eric Wong
  2022-11-18  1:51     ` Ævar Arnfjörð Bjarmason
  2022-11-22 20:22     ` Jeff King
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Wong @ 2022-11-17 23:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King

Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Wed, Nov 16 2022, Eric Wong wrote:
> >   Memory savings were measured using the following patch which
> >   relies on a patched LD_PRELOAD-based malloc debugger:
> >   https://80x24.org/spew/20221116095404.3974691-1-e@80x24.org/raw
> 
> FWIW SANITIZE=leak will find this if you stick a "remote_islands = NULL"
> and run e.g. t5320-delta-islands.sh, but maybe you needed this closer to
> production.
> 
> Valgrind will also work, but of course be *much* slower.

Yeah, I run that LD_PRELOAD thing in production since it's
cheap compared to valgrind.

> Perfect shouldn't be the enemy of the good & all that, but in this case
> it's not too much more effort to just give this data an appropriate
> lifetime instead of the global, I tried that out for just the "regex"
> part of this below.
> 
> The free_remote_islands() seems to be similarly alive between
> "find_island_for_ref" and "deduplicate_islands".
> 
> Your version also works, but the root cause of this sort of thing is
> these global lifetimes, which sometimes we do for a good reason, but in
> this case we don't.

Agreed on all points.  Overall, the amount of globals in git has
long seemed excessive and offputting to me (and likely other
drive-by hackers).

> diff --git a/delta-islands.c b/delta-islands.c
> index 26f9e99e1a9..ef86a91059c 100644
> --- a/delta-islands.c
> +++ b/delta-islands.c
> @@ -312,29 +312,41 @@ void resolve_tree_islands(struct repository *r,
>  	free(todo);
>  }
>  
> -static regex_t *island_regexes;
> -static unsigned int island_regexes_alloc, island_regexes_nr;
> +struct island_config_data {
> +	regex_t *rx;
> +	size_t nr;
> +	size_t alloc;
> +};

I've added kh_str_t *remote_islands and renamed
s/island_config_data/island_load_data/ in the below version
to reflect the slightly different scope of remote_islands.

>  static const char *core_island_name;
>  
> -static int island_config_callback(const char *k, const char *v, void *cb UNUSED)
> +static void island_config_data_release(struct island_config_data *icd)
> +{
> +	for (size_t i = 0; i < icd->nr; i++)
> +		regfree(&icd->rx[i]);
> +	free(icd->rx);
> +}

icd => ild since config => load

> +static int island_config_callback(const char *k, const char *v, void *cb)
>  {
> +	struct island_config_data *data = cb;
> +

data => ild

I don't like the name `data' for a typed variable.

Aside from that, v2 below still frees the regex memory early on
in the hopes deduplicate_islands() can reuse some of the freed
regexp memory.

Anyways, here's v2, which seems to work.  I'm still trying to
figure out SATA errors+resets after replacing a CMOS battery,
but I really hope this patch isn't the cause.

-----8<-----
From: Eric Wong <e@80x24.org>
Subject: [PATCH] delta-islands: free island-related data after use

On my use case involving 771 islands of Linux on kernel.org,
this reduces memory usage by around 25MB.  The bulk of that
comes from free_remote_islands, since free_config_regexes only
saves around 40k.

This memory is saved early in the memory-intensive pack process,
making it available for the remainder of the long process.

Signed-off-by: Eric Wong <e@80x24.org>
Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 v2: reduce scope of load-time data structures with hints from Ævar

 delta-islands.c | 71 +++++++++++++++++++++++++++++++++++--------------
 1 file changed, 51 insertions(+), 20 deletions(-)

diff --git a/delta-islands.c b/delta-islands.c
index 26f9e99e1a..90c0d6958f 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -26,8 +26,6 @@ static kh_oid_map_t *island_marks;
 static unsigned island_counter;
 static unsigned island_counter_core;
 
-static kh_str_t *remote_islands;
-
 struct remote_island {
 	uint64_t hash;
 	struct oid_array oids;
@@ -312,29 +310,55 @@ void resolve_tree_islands(struct repository *r,
 	free(todo);
 }
 
-static regex_t *island_regexes;
-static unsigned int island_regexes_alloc, island_regexes_nr;
+struct island_load_data {
+	kh_str_t *remote_islands;
+	regex_t *rx;
+	size_t nr;
+	size_t alloc;
+};
 static const char *core_island_name;
 
-static int island_config_callback(const char *k, const char *v, void *cb UNUSED)
+static void free_config_regexes(struct island_load_data *ild)
 {
+	for (size_t i = 0; i < ild->nr; i++)
+		regfree(&ild->rx[i]);
+	free(ild->rx);
+}
+
+static void free_remote_islands(kh_str_t *remote_islands)
+{
+	const char *island_name;
+	struct remote_island *rl;
+
+	kh_foreach(remote_islands, island_name, rl, {
+		free((void *)island_name);
+		oid_array_clear(&rl->oids);
+		free(rl);
+	});
+	kh_destroy_str(remote_islands);
+}
+
+static int island_config_callback(const char *k, const char *v, void *cb)
+{
+	struct island_load_data *ild = cb;
+
 	if (!strcmp(k, "pack.island")) {
 		struct strbuf re = STRBUF_INIT;
 
 		if (!v)
 			return config_error_nonbool(k);
 
-		ALLOC_GROW(island_regexes, island_regexes_nr + 1, island_regexes_alloc);
+		ALLOC_GROW(ild->rx, ild->nr + 1, ild->alloc);
 
 		if (*v != '^')
 			strbuf_addch(&re, '^');
 		strbuf_addstr(&re, v);
 
-		if (regcomp(&island_regexes[island_regexes_nr], re.buf, REG_EXTENDED))
+		if (regcomp(&ild->rx[ild->nr], re.buf, REG_EXTENDED))
 			die(_("failed to load island regex for '%s': %s"), k, re.buf);
 
 		strbuf_release(&re);
-		island_regexes_nr++;
+		ild->nr++;
 		return 0;
 	}
 
@@ -344,7 +368,8 @@ static int island_config_callback(const char *k, const char *v, void *cb UNUSED)
 	return 0;
 }
 
-static void add_ref_to_island(const char *island_name, const struct object_id *oid)
+static void add_ref_to_island(kh_str_t *remote_islands, const char *island_name,
+				const struct object_id *oid)
 {
 	uint64_t sha_core;
 	struct remote_island *rl = NULL;
@@ -365,8 +390,10 @@ static void add_ref_to_island(const char *island_name, const struct object_id *o
 }
 
 static int find_island_for_ref(const char *refname, const struct object_id *oid,
-			       int flags UNUSED, void *data UNUSED)
+			       int flags UNUSED, void *cb)
 {
+	struct island_load_data *ild = cb;
+
 	/*
 	 * We should advertise 'ARRAY_SIZE(matches) - 2' as the max,
 	 * so we can diagnose below a config with more capture groups
@@ -377,8 +404,8 @@ static int find_island_for_ref(const char *refname, const struct object_id *oid,
 	struct strbuf island_name = STRBUF_INIT;
 
 	/* walk backwards to get last-one-wins ordering */
-	for (i = island_regexes_nr - 1; i >= 0; i--) {
-		if (!regexec(&island_regexes[i], refname,
+	for (i = ild->nr - 1; i >= 0; i--) {
+		if (!regexec(&ild->rx[i], refname,
 			     ARRAY_SIZE(matches), matches, 0))
 			break;
 	}
@@ -403,12 +430,12 @@ static int find_island_for_ref(const char *refname, const struct object_id *oid,
 		strbuf_add(&island_name, refname + match->rm_so, match->rm_eo - match->rm_so);
 	}
 
-	add_ref_to_island(island_name.buf, oid);
+	add_ref_to_island(ild->remote_islands, island_name.buf, oid);
 	strbuf_release(&island_name);
 	return 0;
 }
 
-static struct remote_island *get_core_island(void)
+static struct remote_island *get_core_island(kh_str_t *remote_islands)
 {
 	if (core_island_name) {
 		khiter_t pos = kh_get_str(remote_islands, core_island_name);
@@ -419,7 +446,7 @@ static struct remote_island *get_core_island(void)
 	return NULL;
 }
 
-static void deduplicate_islands(struct repository *r)
+static void deduplicate_islands(kh_str_t *remote_islands, struct repository *r)
 {
 	struct remote_island *island, *core = NULL, **list;
 	unsigned int island_count, dst, src, ref, i = 0;
@@ -445,7 +472,7 @@ static void deduplicate_islands(struct repository *r)
 	}
 
 	island_bitmap_size = (island_count / 32) + 1;
-	core = get_core_island();
+	core = get_core_island(remote_islands);
 
 	for (i = 0; i < island_count; ++i) {
 		mark_remote_island_1(r, list[i], core && list[i]->hash == core->hash);
@@ -456,12 +483,16 @@ static void deduplicate_islands(struct repository *r)
 
 void load_delta_islands(struct repository *r, int progress)
 {
+	struct island_load_data ild = { 0 };
+
 	island_marks = kh_init_oid_map();
-	remote_islands = kh_init_str();
 
-	git_config(island_config_callback, NULL);
-	for_each_ref(find_island_for_ref, NULL);
-	deduplicate_islands(r);
+	git_config(island_config_callback, &ild);
+	ild.remote_islands = kh_init_str();
+	for_each_ref(find_island_for_ref, &ild);
+	free_config_regexes(&ild);
+	deduplicate_islands(ild.remote_islands, r);
+	free_remote_islands(ild.remote_islands);
 
 	if (progress)
 		fprintf(stderr, _("Marked %d islands, done.\n"), island_counter);

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] delta-islands: free island-related data after use
  2022-11-17 23:06   ` [PATCH v2] " Eric Wong
@ 2022-11-18  1:51     ` Ævar Arnfjörð Bjarmason
  2022-11-22 20:22     ` Jeff King
  1 sibling, 0 replies; 9+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18  1:51 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, Jeff King


On Thu, Nov 17 2022, Eric Wong wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> On Wed, Nov 16 2022, Eric Wong wrote:
>> >   Memory savings were measured using the following patch which
>> >   relies on a patched LD_PRELOAD-based malloc debugger:
>> >   https://80x24.org/spew/20221116095404.3974691-1-e@80x24.org/raw
>> 
>> FWIW SANITIZE=leak will find this if you stick a "remote_islands = NULL"
>> and run e.g. t5320-delta-islands.sh, but maybe you needed this closer to
>> production.
>> 
>> Valgrind will also work, but of course be *much* slower.
>
> Yeah, I run that LD_PRELOAD thing in production since it's
> cheap compared to valgrind.
>
>> Perfect shouldn't be the enemy of the good & all that, but in this case
>> it's not too much more effort to just give this data an appropriate
>> lifetime instead of the global, I tried that out for just the "regex"
>> part of this below.
>> 
>> The free_remote_islands() seems to be similarly alive between
>> "find_island_for_ref" and "deduplicate_islands".
>> 
>> Your version also works, but the root cause of this sort of thing is
>> these global lifetimes, which sometimes we do for a good reason, but in
>> this case we don't.
>
> Agreed on all points.  Overall, the amount of globals in git has
> long seemed excessive and offputting to me (and likely other
> drive-by hackers).
>
>> diff --git a/delta-islands.c b/delta-islands.c
>> index 26f9e99e1a9..ef86a91059c 100644
>> --- a/delta-islands.c
>> +++ b/delta-islands.c
>> @@ -312,29 +312,41 @@ void resolve_tree_islands(struct repository *r,
>>  	free(todo);
>>  }
>>  
>> -static regex_t *island_regexes;
>> -static unsigned int island_regexes_alloc, island_regexes_nr;
>> +struct island_config_data {
>> +	regex_t *rx;
>> +	size_t nr;
>> +	size_t alloc;
>> +};
>
> I've added kh_str_t *remote_islands and renamed
> s/island_config_data/island_load_data/ in the below version
> to reflect the slightly different scope of remote_islands.
>
>>  static const char *core_island_name;
>>  
>> -static int island_config_callback(const char *k, const char *v, void *cb UNUSED)
>> +static void island_config_data_release(struct island_config_data *icd)
>> +{
>> +	for (size_t i = 0; i < icd->nr; i++)
>> +		regfree(&icd->rx[i]);
>> +	free(icd->rx);
>> +}
>
> icd => ild since config => load
>
>> +static int island_config_callback(const char *k, const char *v, void *cb)
>>  {
>> +	struct island_config_data *data = cb;
>> +
>
> data => ild
>
> I don't like the name `data' for a typed variable.

Hah! My thought process when deciding on it was "hrm, what *do* we call
the two variables when we have a void * and turn it into a 'util'?
data? cb? util? ... and which one was which?"

I started grepping, then decided I was wasting too much time on that for
a one-off reply, and just went with the first thing I found. The names
you picked are a lot better :)

> Aside from that, v2 below still frees the regex memory early on
> in the hopes deduplicate_islands() can reuse some of the freed
> regexp memory.
>
> Anyways, here's v2, which seems to work.  I'm still trying to
> figure out SATA errors+resets after replacing a CMOS battery,
> but I really hope this patch isn't the cause.
>
> -----8<-----
> From: Eric Wong <e@80x24.org>
> Subject: [PATCH] delta-islands: free island-related data after use
>
> On my use case involving 771 islands of Linux on kernel.org,
> this reduces memory usage by around 25MB.  The bulk of that
> comes from free_remote_islands, since free_config_regexes only
> saves around 40k.
>
> This memory is saved early in the memory-intensive pack process,
> making it available for the remainder of the long process.
>
> Signed-off-by: Eric Wong <e@80x24.org>
> Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

This all looks good to me, thanks a lot for the follow-up.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] delta-islands: free island-related data after use
  2022-11-17 23:06   ` [PATCH v2] " Eric Wong
  2022-11-18  1:51     ` Ævar Arnfjörð Bjarmason
@ 2022-11-22 20:22     ` Jeff King
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff King @ 2022-11-22 20:22 UTC (permalink / raw)
  To: Eric Wong; +Cc: Ævar Arnfjörð Bjarmason, git

On Thu, Nov 17, 2022 at 11:06:58PM +0000, Eric Wong wrote:

> Aside from that, v2 below still frees the regex memory early on
> in the hopes deduplicate_islands() can reuse some of the freed
> regexp memory.
> 
> Anyways, here's v2, which seems to work.  I'm still trying to
> figure out SATA errors+resets after replacing a CMOS battery,
> but I really hope this patch isn't the cause.

This looks OK to me, though I think it would have been easier to review
split into two patches (one pushing the globals into local variables and
the other adding the freeing).

Two small notes:

>  void load_delta_islands(struct repository *r, int progress)
>  {
> +	struct island_load_data ild = { 0 };
> +
>  	island_marks = kh_init_oid_map();
> -	remote_islands = kh_init_str();
>  
> -	git_config(island_config_callback, NULL);
> -	for_each_ref(find_island_for_ref, NULL);
> -	deduplicate_islands(r);
> +	git_config(island_config_callback, &ild);
> +	ild.remote_islands = kh_init_str();

The initialization of the remote_islands khash is now moved after we
read the config. That's OK, because our callback doesn't read it, but
it's not immediately obvious without going back to check the callback.

Splitting it into:

  struct island_load_data {
	kh_str_t *remote_islands;
	struct island_regexes regexes {
		regex_t *rx;
		size_t nr;
		size_t alloc;
	};
  };

lets you pass:

  git_config(island_config_callback, &ild.regexes);

which makes it clear that the khash part isn't touched. But you still
get to pass the whole &ild around later. Of course that's all going
through a void pointer, so you're praying that the callback expects the
right type anyway. ;)

And with your code we'd hopefully notice the problem right away since
the khash pointer is NULL. So it might not be that big a deal.

> +	for_each_ref(find_island_for_ref, &ild);
> +	free_config_regexes(&ild);
> +	deduplicate_islands(ild.remote_islands, r);
> +	free_remote_islands(ild.remote_islands);

Here we free the regexes, but they're pointing to garbage memory still.
But since we pass just the remote_islands part of the struct to those
functions, we know they can't look at the garbage regexes. Good.

I'd have said it ought to be two separate variables, but the
for_each_ref() callback forces your hand there.

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* pack-objects memory use observations [was: [PATCH] delta-islands: free island-related data after use]
  2022-11-16 18:44 ` [PATCH] " Jeff King
@ 2023-02-01  9:20   ` Eric Wong
  2023-02-01 22:09     ` Eric Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Wong @ 2023-02-01  9:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> wrote:
> On Wed, Nov 16, 2022 at 10:50:13AM +0000, Eric Wong wrote:
> >   Will try to hunt down more memory savings in the nearish future.
> 
> Yes, you've probably noticed that pack-objects does not distinguish much
> between what is necessary for the various phases. A few obvious things
> to look at:
> 
>   1. After the write phase, you can probably ditch the island bitmaps,
>      too. In many repacks we're basically done then, but if you're
>      generating bitmaps, that happens afterwards in the same process.

Also, island_marks oid_map gets pretty big itself (300M?), and
realloc gets painful when resizing a big khash especially on
non-GNU/Linux systems without MREMAP_MAYMOVE realloc.  Currently
experimenting with tweaks to make oidtree handle kh_oid_map
functionality to avoid resizes...[1]

>   2. The object traversal for pack-objects is done in-process these
>      days. But after it finishes, I suspect that we do not generally
>      need those object structs anymore, because all of the book-keeping
>      is done in the bit object_entry array in packing_data.

pdata->objects is 1.4G for me atm after many hours (still going).
I think packing_data could be split to avoid reallocs, but that
might need to touch a lot of code...

I need to get better data on my next attempts.  I suspect gcc
-O2 is throwing off mwrap-perl[2]+addr2line and I need to
rebuild w/ -O0.

[1] WIP oidtree map, but I feel like I forgot C, again :<

diff --git a/delta-islands.c b/delta-islands.c
index 90c0d6958f..9e824d7a0d 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -18,11 +18,13 @@
 #include "pack-objects.h"
 #include "delta-islands.h"
 #include "oid-array.h"
+#include "oidtree.h"
 #include "config.h"
 
 KHASH_INIT(str, const char *, void *, 1, kh_str_hash_func, kh_str_hash_equal)
 
-static kh_oid_map_t *island_marks;
+struct oidtree island_marks_storage;
+static struct oidtree *island_marks;
 static unsigned island_counter;
 static unsigned island_counter_core;
 
@@ -93,7 +95,7 @@ static int island_bitmap_get(struct island_bitmap *self, uint32_t i)
 
 int in_same_island(const struct object_id *trg_oid, const struct object_id *src_oid)
 {
-	khiter_t trg_pos, src_pos;
+	struct island_bitmap *trg, *src;
 
 	/* If we aren't using islands, assume everything goes together. */
 	if (!island_marks)
@@ -103,37 +105,30 @@ int in_same_island(const struct object_id *trg_oid, const struct object_id *src_
 	 * If we don't have a bitmap for the target, we can delta it
 	 * against anything -- it's not an important object
 	 */
-	trg_pos = kh_get_oid_map(island_marks, *trg_oid);
-	if (trg_pos >= kh_end(island_marks))
+	trg = oidtree_get(island_marks, trg_oid);
+	if (!trg)
 		return 1;
 
 	/*
 	 * if the source (our delta base) doesn't have a bitmap,
 	 * we don't want to base any deltas on it!
 	 */
-	src_pos = kh_get_oid_map(island_marks, *src_oid);
-	if (src_pos >= kh_end(island_marks))
+	src = oidtree_get(island_marks, src_oid);
+	if (!src)
 		return 0;
 
-	return island_bitmap_is_subset(kh_value(island_marks, trg_pos),
-				kh_value(island_marks, src_pos));
+	return island_bitmap_is_subset(trg, src);
 }
 
 int island_delta_cmp(const struct object_id *a, const struct object_id *b)
 {
-	khiter_t a_pos, b_pos;
-	struct island_bitmap *a_bitmap = NULL, *b_bitmap = NULL;
+	struct island_bitmap *a_bitmap, *b_bitmap;
 
 	if (!island_marks)
 		return 0;
 
-	a_pos = kh_get_oid_map(island_marks, *a);
-	if (a_pos < kh_end(island_marks))
-		a_bitmap = kh_value(island_marks, a_pos);
-
-	b_pos = kh_get_oid_map(island_marks, *b);
-	if (b_pos < kh_end(island_marks))
-		b_bitmap = kh_value(island_marks, b_pos);
+	a_bitmap = oidtree_get(island_marks, a);
+	b_bitmap = oidtree_get(island_marks, b);
 
 	if (a_bitmap) {
 		if (!b_bitmap || !island_bitmap_is_subset(a_bitmap, b_bitmap))
@@ -149,30 +144,28 @@ int island_delta_cmp(const struct object_id *a, const struct object_id *b)
 
 static struct island_bitmap *create_or_get_island_marks(struct object *obj)
 {
-	khiter_t pos;
-	int hash_ret;
+	void **val;
+	size_t n = sizeof(struct island_bitmap *);
 
-	pos = kh_put_oid_map(island_marks, obj->oid, &hash_ret);
-	if (hash_ret)
-		kh_value(island_marks, pos) = island_bitmap_new(NULL);
+	if (oidtree_put(island_marks, &obj->oid, &val, n))
+		*val = island_bitmap_new(NULL);
 
-	return kh_value(island_marks, pos);
+	return *val;
 }
 
 static void set_island_marks(struct object *obj, struct island_bitmap *marks)
 {
 	struct island_bitmap *b;
-	khiter_t pos;
-	int hash_ret;
+	void **val;
+	size_t n = sizeof(struct island_bitmap *);
 
-	pos = kh_put_oid_map(island_marks, obj->oid, &hash_ret);
-	if (hash_ret) {
+	if (oidtree_put(island_marks, &obj->oid, &val, n)) {
 		/*
 		 * We don't have one yet; make a copy-on-write of the
 		 * parent.
 		 */
 		marks->refcount++;
-		kh_value(island_marks, pos) = marks;
+		*val = marks;
 		return;
 	}
 
@@ -180,10 +173,10 @@ static void set_island_marks(struct object *obj, struct island_bitmap *marks)
 	 * We do have it. Make sure we split any copy-on-write before
 	 * updating.
 	 */
-	b = kh_value(island_marks, pos);
+	b = *val;
 	if (b->refcount > 1) {
 		b->refcount--;
-		b = kh_value(island_marks, pos) = island_bitmap_new(b);
+		*val = b = island_bitmap_new(b);
 	}
 	island_bitmap_or(b, marks);
 }
@@ -275,14 +268,11 @@ void resolve_tree_islands(struct repository *r,
 		struct tree *tree;
 		struct tree_desc desc;
 		struct name_entry entry;
-		khiter_t pos;
 
-		pos = kh_get_oid_map(island_marks, ent->idx.oid);
-		if (pos >= kh_end(island_marks))
+		root_marks = oidtree_get(island_marks, &ent->idx.oid);
+		if (!root_marks)
 			continue;
 
-		root_marks = kh_value(island_marks, pos);
-
 		tree = lookup_tree(r, &ent->idx.oid);
 		if (!tree || parse_tree(tree) < 0)
 			die(_("bad tree object %s"), oid_to_hex(&ent->idx.oid));
@@ -485,7 +475,8 @@ void load_delta_islands(struct repository *r, int progress)
 {
 	struct island_load_data ild = { 0 };
 
-	island_marks = kh_init_oid_map();
+	oidtree_init(&island_marks_storage);
+	island_marks = &island_marks_storage;
 
 	git_config(island_config_callback, &ild);
 	ild.remote_islands = kh_init_str();
@@ -500,11 +491,11 @@ void load_delta_islands(struct repository *r, int progress)
 
 void propagate_island_marks(struct commit *commit)
 {
-	khiter_t pos = kh_get_oid_map(island_marks, commit->object.oid);
+	struct island_bitmap *root_marks;
 
-	if (pos < kh_end(island_marks)) {
+	root_marks = oidtree_get(island_marks, &commit->object.oid);
+	if (root_marks) {
 		struct commit_list *p;
-		struct island_bitmap *root_marks = kh_value(island_marks, pos);
 
 		parse_commit(commit);
 		set_island_marks(&get_commit_tree(commit)->object, root_marks);
@@ -522,16 +513,13 @@ int compute_pack_layers(struct packing_data *to_pack)
 
 	for (i = 0; i < to_pack->nr_objects; ++i) {
 		struct object_entry *entry = &to_pack->objects[i];
-		khiter_t pos = kh_get_oid_map(island_marks, entry->idx.oid);
+		struct island_bitmap *bitmap;
 
 		oe_set_layer(to_pack, entry, 1);
 
-		if (pos < kh_end(island_marks)) {
-			struct island_bitmap *bitmap = kh_value(island_marks, pos);
-
-			if (island_bitmap_get(bitmap, island_counter_core))
-				oe_set_layer(to_pack, entry, 0);
-		}
+		bitmap = oidtree_get(island_marks, &entry->idx.oid);
+		if (bitmap && island_bitmap_get(bitmap, island_counter_core))
+			oe_set_layer(to_pack, entry, 0);
 	}
 
 	return 2;
diff --git a/oidtree.c b/oidtree.c
index 0d39389bee..eb7e76335e 100644
--- a/oidtree.c
+++ b/oidtree.c
@@ -28,15 +28,16 @@ void oidtree_clear(struct oidtree *ot)
 	}
 }
 
-void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
+static void **oidtree_insert3(struct oidtree *ot, const struct object_id *oid,
+				size_t extra)
 {
 	struct cb_node *on;
 	struct object_id k;
 
 	if (!oid->algo)
-		BUG("oidtree_insert requires oid->algo");
+		BUG("oidtree insertion requires oid->algo");
 
-	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on) + sizeof(*oid));
+	on = mem_pool_alloc(&ot->mem_pool, sizeof(*on) + sizeof(*oid) + extra);
 
 	/*
 	 * Clear the padding and copy the result in separate steps to
@@ -45,19 +46,22 @@ void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
 	oidcpy_with_padding(&k, oid);
 	memcpy(on->k, &k, sizeof(k));
 
-	/*
-	 * n.b. Current callers won't get us duplicates, here.  If a
-	 * future caller causes duplicates, there'll be a a small leak
-	 * that won't be freed until oidtree_clear.  Currently it's not
-	 * worth maintaining a free list
-	 */
-	cb_insert(&ot->tree, on, sizeof(*oid));
+	if (!cb_insert(&ot->tree, on, sizeof(*oid)))
+		return (void **)(on->k + sizeof(k)); /* success */
+
+	warning("oidtree leak (check contains/get before insert/put)");
+	return NULL;
 }
 
+void oidtree_insert(struct oidtree *ot, const struct object_id *oid)
+{
+	(void)oidtree_insert3(ot, oid, 0);
+}
 
-int oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+static void **oidtree_find(struct oidtree *ot, const struct object_id *oid)
 {
 	struct object_id k;
+	struct cb_node *on;
 	size_t klen = sizeof(k);
 
 	oidcpy_with_padding(&k, oid);
@@ -69,7 +73,31 @@ int oidtree_contains(struct oidtree *ot, const struct object_id *oid)
 	klen += BUILD_ASSERT_OR_ZERO(offsetof(struct object_id, hash) <
 				offsetof(struct object_id, algo));
 
-	return cb_lookup(&ot->tree, (const uint8_t *)&k, klen) ? 1 : 0;
+	on = cb_lookup(&ot->tree, (const uint8_t *)&k, klen);
+	return on ? (void **)(on->k + sizeof(k)) : NULL;
+}
+
+int oidtree_put(struct oidtree *ot, const struct object_id *oid,
+		void ***p, size_t n)
+{
+	*p = oidtree_find(ot, oid);
+	if (*p)
+		return 0;
+
+	*p = oidtree_insert3(ot, oid, n);
+	assert(*p);
+	return 1;
+}
+
+void *oidtree_get(struct oidtree *ot, const struct object_id *oid)
+{
+	void **p = oidtree_find(ot, oid);
+	return p ? *p : NULL;
+}
+
+int oidtree_contains(struct oidtree *ot, const struct object_id *oid)
+{
+	return oidtree_find(ot, oid) ? 1 : 0;
 }
 
 static enum cb_next iter(struct cb_node *n, void *arg)
diff --git a/oidtree.h b/oidtree.h
index 77898f510a..2f6e6f1beb 100644
--- a/oidtree.h
+++ b/oidtree.h
@@ -12,6 +12,8 @@ struct oidtree {
 
 void oidtree_init(struct oidtree *);
 void oidtree_clear(struct oidtree *);
+
+/* oid_set-like API */
 void oidtree_insert(struct oidtree *, const struct object_id *);
 int oidtree_contains(struct oidtree *, const struct object_id *);
 
@@ -19,4 +21,17 @@ typedef enum cb_next (*oidtree_iter)(const struct object_id *, void *data);
 void oidtree_each(struct oidtree *, const struct object_id *,
 			size_t oidhexsz, oidtree_iter, void *data);
 
+/* oid_map-like API */
+
+/* returns a pointer to the data payload associated with object_id */
+void *oidtree_get(struct oidtree *, const struct object_id *);
+
+/*
+ * points @p to the destination of the value
+ * @n must be consistent for the entire oidtree
+ * returns true if a new oidtree node was created,
+ * returns false if reusing an existing oidtree node
+ */
+int oidtree_put(struct oidtree *, const struct object_id *,
+		void ***p, size_t n);
 #endif /* OIDTREE_H */


[2] https://80x24.org/mwrap-perl.git

    # after install, run gc under mwrap-perl with backtrace 10
    MWRAP=socket_dir:/tmp/mwrap,bt:10 mwrap-perl git gc

    # recommended: use GNU addr2line 2.39+ (Aug 2022) for +OFFSET decoding

    # start HTTP reverse proxy
    ADDR2LINE='/path/to/addr2line -p -f -i' \
	mwrap-rproxy --socket-dir=/tmp/mwrap

    # the per-PID each/2000 URLs can get really expensive for browsers
    # even w3m struggles:
    w3m http://0:5000/ # follow per-PID links

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: pack-objects memory use observations [was: [PATCH] delta-islands: free island-related data after use]
  2023-02-01  9:20   ` pack-objects memory use observations [was: [PATCH] delta-islands: free island-related data after use] Eric Wong
@ 2023-02-01 22:09     ` Eric Wong
  2023-02-02  0:11       ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Wong @ 2023-02-01 22:09 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Eric Wong <e@80x24.org> wrote:
> [1] WIP oidtree map, but I feel like I forgot C, again :<

Well, it hasn't crashed.  It's just much slower compared to khash.

I'm thinking `struct object_id' should be pooled+deduplicated
like hash keys in the Perl/Ruby interpreters and we'd pass
4/8-byte pointers instead of 36-byte structs.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: pack-objects memory use observations [was: [PATCH] delta-islands: free island-related data after use]
  2023-02-01 22:09     ` Eric Wong
@ 2023-02-02  0:11       ` Jeff King
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff King @ 2023-02-02  0:11 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

On Wed, Feb 01, 2023 at 10:09:29PM +0000, Eric Wong wrote:

> Eric Wong <e@80x24.org> wrote:
> > [1] WIP oidtree map, but I feel like I forgot C, again :<
> 
> Well, it hasn't crashed.  It's just much slower compared to khash.

I experimented a bit with critbit trees several years ago, mostly for
the main obj_hash. I could never make them faster than hashing. I think
part of it is that "hashing" an oid is basically free, since we just
pull off the first 4 bytes. And we keep our table factors quite low.
Whereas for trie structures, even though the big-O behavior is good,
there's a lot of branching that ends up being slow.

> I'm thinking `struct object_id' should be pooled+deduplicated
> like hash keys in the Perl/Ruby interpreters and we'd pass
> 4/8-byte pointers instead of 36-byte structs.

I think that could work, but it feels like it would be pretty major
surgery, given how often object_ids appear in the code.

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-02-02  0:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-16 10:50 [PATCH] delta-islands: free island-related data after use Eric Wong
2022-11-16 15:44 ` Ævar Arnfjörð Bjarmason
2022-11-17 23:06   ` [PATCH v2] " Eric Wong
2022-11-18  1:51     ` Ævar Arnfjörð Bjarmason
2022-11-22 20:22     ` Jeff King
2022-11-16 18:44 ` [PATCH] " Jeff King
2023-02-01  9:20   ` pack-objects memory use observations [was: [PATCH] delta-islands: free island-related data after use] Eric Wong
2023-02-01 22:09     ` Eric Wong
2023-02-02  0:11       ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).