git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] clear_pattern_list(): clear embedded hashmaps
@ 2020-08-14 11:10 Jeff King
  2020-08-14 12:13 ` Derrick Stolee
  2020-08-17 16:55 ` Elijah Newren
  0 siblings, 2 replies; 5+ messages in thread
From: Jeff King @ 2020-08-14 11:10 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee

Commit 96cc8ab531 (sparse-checkout: use hashmaps for cone patterns,
2019-11-21) added some auxiliary hashmaps to the pattern_list struct,
but they're leaked when clear_pattern_list() is called.

Signed-off-by: Jeff King <peff@peff.net>
---
I have no idea how often this leak triggers in practice. I just noticed
it while poking at LSan output (which we remain depressingly far
from getting a clean run on).

 dir.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/dir.c b/dir.c
index fe64be30ed..9411b94e9b 100644
--- a/dir.c
+++ b/dir.c
@@ -916,6 +916,8 @@ void clear_pattern_list(struct pattern_list *pl)
 		free(pl->patterns[i]);
 	free(pl->patterns);
 	free(pl->filebuf);
+	hashmap_free_entries(&pl->recursive_hashmap, struct pattern_entry, ent);
+	hashmap_free_entries(&pl->parent_hashmap, struct pattern_entry, ent);
 
 	memset(pl, 0, sizeof(*pl));
 }
-- 
2.28.0.596.g9c08d63829

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] clear_pattern_list(): clear embedded hashmaps
  2020-08-14 11:10 [PATCH] clear_pattern_list(): clear embedded hashmaps Jeff King
@ 2020-08-14 12:13 ` Derrick Stolee
  2020-08-17 16:55 ` Elijah Newren
  1 sibling, 0 replies; 5+ messages in thread
From: Derrick Stolee @ 2020-08-14 12:13 UTC (permalink / raw)
  To: Jeff King, git; +Cc: Derrick Stolee

On 8/14/2020 7:10 AM, Jeff King wrote:
> Commit 96cc8ab531 (sparse-checkout: use hashmaps for cone patterns,
> 2019-11-21) added some auxiliary hashmaps to the pattern_list struct,
> but they're leaked when clear_pattern_list() is called.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I have no idea how often this leak triggers in practice. I just noticed
> it while poking at LSan output (which we remain depressingly far
> from getting a clean run on).

Good find. The impact of the leak is likely low since we don't create
multiple pattern_list structs per process (with these hashmaps) very
often. The sparse-checkout builtin is likely the only place where
multiple could be instantiated at the same time.

I also double-checked that hashmap_free_entries() handles a NULL
hashmap pointer or uninitialized hashmap, which is what happens
when cone mode is not enabled _or_ the pattern_list corresponds to
something like a .gitignore file.

Thanks,
-Stolee

>  dir.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/dir.c b/dir.c
> index fe64be30ed..9411b94e9b 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -916,6 +916,8 @@ void clear_pattern_list(struct pattern_list *pl)
>  		free(pl->patterns[i]);
>  	free(pl->patterns);
>  	free(pl->filebuf);
> +	hashmap_free_entries(&pl->recursive_hashmap, struct pattern_entry, ent);
> +	hashmap_free_entries(&pl->parent_hashmap, struct pattern_entry, ent);
>  
>  	memset(pl, 0, sizeof(*pl));
>  }
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] clear_pattern_list(): clear embedded hashmaps
  2020-08-14 11:10 [PATCH] clear_pattern_list(): clear embedded hashmaps Jeff King
  2020-08-14 12:13 ` Derrick Stolee
@ 2020-08-17 16:55 ` Elijah Newren
  2020-08-17 17:22   ` Elijah Newren
  1 sibling, 1 reply; 5+ messages in thread
From: Elijah Newren @ 2020-08-17 16:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Derrick Stolee

Hi,

On Fri, Aug 14, 2020 at 5:23 AM Jeff King <peff@peff.net> wrote:
>
> Commit 96cc8ab531 (sparse-checkout: use hashmaps for cone patterns,
> 2019-11-21) added some auxiliary hashmaps to the pattern_list struct,
> but they're leaked when clear_pattern_list() is called.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I have no idea how often this leak triggers in practice. I just noticed
> it while poking at LSan output (which we remain depressingly far
> from getting a clean run on).
>
>  dir.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/dir.c b/dir.c
> index fe64be30ed..9411b94e9b 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -916,6 +916,8 @@ void clear_pattern_list(struct pattern_list *pl)
>                 free(pl->patterns[i]);
>         free(pl->patterns);
>         free(pl->filebuf);
> +       hashmap_free_entries(&pl->recursive_hashmap, struct pattern_entry, ent);
> +       hashmap_free_entries(&pl->parent_hashmap, struct pattern_entry, ent);

This clears up the hash entries, but continues to leak the hash table.
Since you submitted first, can you fix this to use hashmap_free_()
instead, as per
https://lore.kernel.org/git/932741d7598ca2934dbca40f715ba2d3819fcc51.1597561152.git.gitgitgadget@gmail.com/?
 Then I'll rebase my series on yours and drop my first patch (since
it'll then be identical).

Thanks,
Elijah

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] clear_pattern_list(): clear embedded hashmaps
  2020-08-17 16:55 ` Elijah Newren
@ 2020-08-17 17:22   ` Elijah Newren
  2020-08-17 18:48     ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Elijah Newren @ 2020-08-17 17:22 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Derrick Stolee

On Mon, Aug 17, 2020 at 9:55 AM Elijah Newren <newren@gmail.com> wrote:
>
> Hi,
>
> On Fri, Aug 14, 2020 at 5:23 AM Jeff King <peff@peff.net> wrote:
> >
> > Commit 96cc8ab531 (sparse-checkout: use hashmaps for cone patterns,
> > 2019-11-21) added some auxiliary hashmaps to the pattern_list struct,
> > but they're leaked when clear_pattern_list() is called.
> >
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> > I have no idea how often this leak triggers in practice. I just noticed
> > it while poking at LSan output (which we remain depressingly far
> > from getting a clean run on).
> >
> >  dir.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/dir.c b/dir.c
> > index fe64be30ed..9411b94e9b 100644
> > --- a/dir.c
> > +++ b/dir.c
> > @@ -916,6 +916,8 @@ void clear_pattern_list(struct pattern_list *pl)
> >                 free(pl->patterns[i]);
> >         free(pl->patterns);
> >         free(pl->filebuf);
> > +       hashmap_free_entries(&pl->recursive_hashmap, struct pattern_entry, ent);
> > +       hashmap_free_entries(&pl->parent_hashmap, struct pattern_entry, ent);
>
> This clears up the hash entries, but continues to leak the hash table.
> Since you submitted first, can you fix this to use hashmap_free_()
> instead, as per
> https://lore.kernel.org/git/932741d7598ca2934dbca40f715ba2d3819fcc51.1597561152.git.gitgitgadget@gmail.com/?
>  Then I'll rebase my series on yours and drop my first patch (since
> it'll then be identical).

Nevermind, I got confused once again by the name.
hashmap_free_entries() doesn't mean just free the entries, it means
free what hashmap_free() would plus all the entries, i.e. do what
hashmap_free() *should* *have* *been* defined to do.  Such a confusing
API.  And hashmap_free() really perplexes me -- it seems like a
function that can't possibly be useful; it's sole purpose seems to be
a trap for the unwary.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] clear_pattern_list(): clear embedded hashmaps
  2020-08-17 17:22   ` Elijah Newren
@ 2020-08-17 18:48     ` Jeff King
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2020-08-17 18:48 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List, Derrick Stolee

On Mon, Aug 17, 2020 at 10:22:27AM -0700, Elijah Newren wrote:

> > > +       hashmap_free_entries(&pl->recursive_hashmap, struct pattern_entry, ent);
> > > +       hashmap_free_entries(&pl->parent_hashmap, struct pattern_entry, ent);
> >
> > This clears up the hash entries, but continues to leak the hash table.
> > Since you submitted first, can you fix this to use hashmap_free_()
> > instead, as per
> > https://lore.kernel.org/git/932741d7598ca2934dbca40f715ba2d3819fcc51.1597561152.git.gitgitgadget@gmail.com/?
> >  Then I'll rebase my series on yours and drop my first patch (since
> > it'll then be identical).
> 
> Nevermind, I got confused once again by the name.
> hashmap_free_entries() doesn't mean just free the entries, it means
> free what hashmap_free() would plus all the entries, i.e. do what
> hashmap_free() *should* *have* *been* defined to do.  Such a confusing
> API.  And hashmap_free() really perplexes me -- it seems like a
> function that can't possibly be useful; it's sole purpose seems to be
> a trap for the unwary.

There used to be an "also free entries" flag, but that got complicated
by the loosening of the "hashmap_entry must be at the front of the
struct to be freed" rule.

With this kind of embedded-entry data structure (and list.h is in the
same boat) it _is_ sometimes useful to be part of a data structure
without giving up ownership of the memory. But I agree that the more
normal case is to free items when the hashmap is destroyed.

Likewise, the whole "you have to define a struct that contains the map
entry" thing is flexible and efficient, but a pain to use.

I generally find khash's "map this type to that type, the hash owns the
memory" much more natural. And it doesn't lose efficiency (and indeed
sometimes even gains it) because it uses macros to store concrete types.
But of course macros create their own headaches. :)

Anyway, I'm definitely open to renaming to something more sensible. I
already mentioned the free/clear thing earlier, but
hashmap_clear_entries() ends up _very_ confusing. Because it's clearing
the hashmap but freeing the entries. hashmap_clear_and_free_entries() is
kind of long, but a lot more descriptive.

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-08-17 18:48 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-14 11:10 [PATCH] clear_pattern_list(): clear embedded hashmaps Jeff King
2020-08-14 12:13 ` Derrick Stolee
2020-08-17 16:55 ` Elijah Newren
2020-08-17 17:22   ` Elijah Newren
2020-08-17 18:48     ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).