From: Matheus Tavares <matheus.bernardino@usp.br>
To: git@vger.kernel.org
Cc: phil.hord@gmail.com, dstolee@microsoft.com,
jonathantanmy@google.com, stefanbeller@gmail.com,
gitster@pobox.com
Subject: [PATCH v2 0/2] Fix race condition and memory leak in delta base cache
Date: Mon, 28 Sep 2020 21:01:51 -0300 [thread overview]
Message-ID: <cover.1601337543.git.matheus.bernardino@usp.br> (raw)
In-Reply-To: <cover.1601311803.git.matheus.bernardino@usp.br>
Changes since v1:
- Reworded patch one to remove misleading sentence about the locking
during step 2.
- In patch two, let add_delta_base_cache() take full ownership of the
base, properly releasing it when the insertion is skipped.
Matheus Tavares (2):
packfile: fix race condition on unpack_entry()
packfile: fix memory leak in add_delta_base_cache()
packfile.c | 48 +++++++++++++++++++++++++++++-------------------
1 file changed, 29 insertions(+), 19 deletions(-)
Range-diff against v1:
1: 42a7948f94 ! 1: 948d07673f packfile: fix race condition on unpack_entry()
@@ Commit message
objects.
To fix the race condition (and later segmentation fault), let's reorder
- the aforementioned steps so that the lock is not released between 1.
- and 3. An alternative solution which would not require the reordering
- would be to duplicate `base` before inserting it in the cache.
- However, as Phil Hord mentioned, memcpy()'ing large bases can negatively
- affect performance: in his experiments, this alternative approach slowed
- git-grep down by 10% to 20%.
+ the aforementioned steps so that `base` is only added to the cache at
+ the end. This will prevent the buffer from being released by another
+ thread while it is still in use. An alternative solution which would not
+ require the reordering would be to duplicate `base` before inserting it
+ in the cache. However, as Phil Hord mentioned, memcpy()'ing large bases
+ can negatively affect performance: in his experiments, this alternative
+ approach slowed git-grep down by 10% to 20%.
Reported-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
2: 5b6e3019e0 ! 2: f15f0c82fd packfile: fix memory leak in add_delta_base_cache()
@@ Commit message
When add_delta_base_cache() is called with a base that is already in the
cache, no operation is performed. But the check is done after allocating
space for a new entry, so we end up leaking memory on the early return.
- Also, the caller always expect that the base will be inserted, so it
- never free()'s it. To fix both of these memory leaks, let's move the
+ In addition, the caller never free()'s the base as it expects the
+ function to take ownership of it. But the base is not released when we
+ skip insertion, so it also gets leaked. To fix these problems, move the
allocation of a new entry further down in add_delta_base_cache(), and
- make the function return an integer to indicate whether the insertion
- was performed or not. Then, make the caller free() the base when needed.
+ free() the base on early return.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
## packfile.c ##
@@ packfile.c: void clear_delta_base_cache(void)
- }
- }
-
--static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
-+static int add_delta_base_cache(struct packed_git *p, off_t base_offset,
+ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
void *base, unsigned long base_size, enum object_type type)
{
- struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
@@ packfile.c: void clear_delta_base_cache(void)
/*
@@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
+ * is unpacking the same object, in unpack_entry() (since its phases I
* and III might run concurrently across multiple threads).
*/
- if (in_delta_base_cache(p, base_offset))
-- return;
-+ return 0;
+- if (in_delta_base_cache(p, base_offset))
++ if (in_delta_base_cache(p, base_offset)) {
++ free(base);
+ return;
++ }
delta_base_cached += base_size;
@@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_of
ent->key.p = p;
ent->key.base_offset = base_offset;
ent->type = type;
-@@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
- hashmap_init(&delta_base_cache, delta_base_cache_hash_cmp, NULL, 0);
- hashmap_entry_init(&ent->ent, pack_entry_hash(p, base_offset));
- hashmap_add(&delta_base_cache, &ent->ent);
-+ return 1;
- }
-
- int packed_object_info(struct repository *r, struct packed_git *p,
-@@ packfile.c: void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
- * thread could free() it (e.g. to make space for another entry)
- * before we are done using it.
- */
-- if (!external_base)
-- add_delta_base_cache(p, base_obj_offset, base, base_size, type);
-+ if (!external_base && !add_delta_base_cache(p, base_obj_offset,
-+ base, base_size, type)) {
-+ free(base);
-+ }
-
- free(delta_data);
- free(external_base);
--
2.28.0
next prev parent reply other threads:[~2020-09-29 0:02 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-25 2:36 RFC - concurrency causes segfault in git grep since 2.26.0 Phil Hord
2020-09-25 5:52 ` Matheus Tavares
2020-09-25 19:53 ` Phil Hord
2020-09-28 16:50 ` [PATCH 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
2020-09-28 16:50 ` [PATCH 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
2020-09-28 18:05 ` Junio C Hamano
2020-09-28 16:50 ` [PATCH 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares
2020-09-28 18:22 ` Junio C Hamano
2020-09-29 0:01 ` Matheus Tavares [this message]
2020-09-29 0:01 ` [PATCH v2 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
2020-10-02 20:06 ` Phil Hord
2020-09-29 0:01 ` [PATCH v2 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1601337543.git.matheus.bernardino@usp.br \
--to=matheus.bernardino@usp.br \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=phil.hord@gmail.com \
--cc=stefanbeller@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).