git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org, phil.hord@gmail.com, dstolee@microsoft.com,
	jonathantanmy@google.com, stefanbeller@gmail.com
Subject: Re: [PATCH 2/2] packfile: fix memory leak in add_delta_base_cache()
Date: Mon, 28 Sep 2020 11:22:43 -0700
Message-ID: <xmqqlfgtn4lo.fsf@gitster.c.googlers.com>
In-Reply-To: <5b6e3019e08c6bccdee29018e99b0c6933fe05e0.1601311803.git.matheus.bernardino@usp.br> (Matheus Tavares's message of "Mon, 28 Sep 2020 13:50:35 -0300")

Matheus Tavares <matheus.bernardino@usp.br> writes:

> When add_delta_base_cache() is called with a base that is already in the
> cache, no operation is performed. But the check is done after allocating
> space for a new entry, so we end up leaking memory on the early return.

Wow, that's so obvious a leak that it is surprising it has gone
unnoticed, especially given that the runtime inflation of the
packfile was written such a long time ago and is a central part of
the system.

I had to dig to find out that the breakage is fairly recent: it was
introduced early this year, in 31877c9a (object-store: allow threaded
access to object reading, 2020-01-15).

> Also, the caller always expect that the base will be inserted, so it
> never free()'s it. To fix both of these memory leaks, let's move the
> allocation of a new entry further down in add_delta_base_cache(), and
> make the function return an integer to indicate whether the insertion
> was performed or not. Then, make the caller free() the base when needed.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>

> @@ -1841,8 +1843,10 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
>  		 * thread could free() it (e.g. to make space for another entry)
>  		 * before we are done using it.
>  		 */
> -		if (!external_base)
> -			add_delta_base_cache(p, base_obj_offset, base, base_size, type);
> +		if (!external_base && !add_delta_base_cache(p, base_obj_offset,
> +						base, base_size, type)) {
> +			free(base);
> +		}

When you have to wrap a long expression, try to split after an
operator near the root of the parse tree, e.g.

		if (!external_base &&
		    !add_delta_base_cache(p, base_obj_offset, base, base_size, type)) {

would make the result easier to follow.

I however suspect that it may be better to let add_delta_base_cache()
do the freeing.  There is only one caller, and from its point of
view, the moment it throws the base at the cache (after the
previous patch) is the moment it is done with it.

In other words we can think of the call to add_delta_base_cache() as
the caller saying: "I am done with this, but somebody else might
want to reuse it later, so do whatever you want to do with it".  
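
A minimal sketch of that ownership-transfer idea, using a hypothetical
toy cache rather than the real delta base cache (toy_cache_take(),
toy_lookup() and the fixed-size table are made up for illustration and
are not taken from packfile.c).  The helper checks for a duplicate
before allocating anything, and releases the buffer itself on every
path where it does not keep it:

```c
#include <stdlib.h>

/* Hypothetical toy cache for illustration; not git's delta base cache. */
#define TOY_SLOTS 8

struct toy_entry {
	long offset;	/* lookup key */
	void *data;	/* buffer owned by the cache once inserted */
};

static struct toy_entry *toy_slots[TOY_SLOTS];

static struct toy_entry *toy_lookup(long offset)
{
	int i;

	for (i = 0; i < TOY_SLOTS; i++)
		if (toy_slots[i] && toy_slots[i]->offset == offset)
			return toy_slots[i];
	return NULL;
}

/*
 * Takes ownership of "data" unconditionally: the caller must not use
 * or free it after this call.  Returns 1 if the buffer was cached and
 * 0 if it was released instead.
 */
static int toy_cache_take(long offset, void *data)
{
	struct toy_entry *ent;
	int i;

	/* Look for a duplicate before allocating, so nothing can leak. */
	if (toy_lookup(offset)) {
		free(data);
		return 0;
	}

	for (i = 0; i < TOY_SLOTS; i++)
		if (!toy_slots[i])
			break;
	if (i == TOY_SLOTS) {
		free(data);	/* table full; releasing is still our job */
		return 0;
	}

	ent = malloc(sizeof(*ent));
	if (!ent) {
		free(data);
		return 0;
	}
	ent->offset = offset;
	ent->data = data;
	toy_slots[i] = ent;
	return 1;
}
```

With this shape the caller just calls toy_cache_take(offset, buf) and
forgets about buf; it has no return value to check in order to avoid a
leak.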

If we were to go that route, it might even make sense to rename it
to reflect that mentality from the viewpoint of the caller, but for a
single-caller helper like this one it may not matter all that much.

Thanks.



