git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* RFC - concurrency causes segfault in git grep since 2.26.0
@ 2020-09-25  2:36 Phil Hord
  2020-09-25  5:52 ` Matheus Tavares
  0 siblings, 1 reply; 12+ messages in thread
From: Phil Hord @ 2020-09-25  2:36 UTC (permalink / raw)
  To: Git; +Cc: matheus.bernardino, Stefan Beller, Jonathan Tan, Derrick Stolee

Hi list.  It's been a while.

I can reliably reproduce a segfault in git grep when searching the
object DB. It is caused by a race when threads > 2.

I have a patch that fixes the problem, but I don't know exactly why.
Can someone else explain it and/or offer a better solution?

====

diff --git packfile.c packfile.c
index e69012e7f2..52b7b54aeb 100644
--- packfile.c
+++ packfile.c
@@ -1812,7 +1812,9 @@ void *unpack_entry(struct repository *r, struct
packed_git *p, off_t obj_offset,
                if (!base)
                        continue;

+               obj_read_lock();
                delta_data = unpack_compressed_entry(p, &w_curs,
curpos, delta_size);
+               obj_read_unlock();

                if (!delta_data) {
                        error("failed to unpack compressed delta "

====

My notes thus far:

I have seen it fail with `--threads 3`, but it fails very reliably for
me with `--threads 20`.

My repo is large (7GB) which gives grep enough to do. I'm not able to
make it fail on git.git; it seems like it's too small to run into this
problem.

The problem bisects to 1d1729caeb.  It was somewhat hard to find this,
though, because git grep disables --threads when reading from the
object store until another patch later in the same series.

====

$ git bisect run sh -c \
    "sed -i -e 's/list.nr || cached || show_in_pager/list.nr ||
show_in_pager/' builtin/grep.c || exit 129;
    make -j 30 || exit 125;
    git checkout HEAD -- builtin/grep.c ;
    ./bin-wrappers/git -C ~/git/purity grep --threads 20 --cached
'target-regex' || exit 1"

1d1729caebd41b340dd8dd61057f613da4df526c is the first bad commit
commit 1d1729caebd41b340dd8dd61057f613da4df526c
Author: Matheus Tavares <matheus.bernardino@usp.br>
Date:   Wed Jan 15 23:39:54 2020 -0300

    grep: replace grep_read_mutex by internal obj read lock

Here's what I'm testing with:
$ ./bin-wrappers/git --version
git version 2.28.0.585.ge1cfff6765

I have found that the crash always happens in cache_or_unpack_entry.
The thread that segfaults is always in patch_delta, and there are
always at least two threads inside unpack_compressed_entry at the time
of the failure.

I was able to get it to crash with just three threads one time.  I
copied a log with the stack trace of all (4) threads during that run
to a gist [1].

Another log showing four different segfaults when run with 40 threads
is also in that gist [1].

[1] https://gist.github.com/phord/02e84d003688baa493b978110225d443

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RFC - concurrency causes segfault in git grep since 2.26.0
  2020-09-25  2:36 RFC - concurrency causes segfault in git grep since 2.26.0 Phil Hord
@ 2020-09-25  5:52 ` Matheus Tavares
  2020-09-25 19:53   ` Phil Hord
  0 siblings, 1 reply; 12+ messages in thread
From: Matheus Tavares @ 2020-09-25  5:52 UTC (permalink / raw)
  To: phil.hord; +Cc: dstolee, git, jonathantanmy, stefanbeller

Hi, Phil

Thanks for the bug report and for the crash logs.

On Thu, Sep 24, 2020 at 11:36 PM Phil Hord <phil.hord@gmail.com> wrote:
>
> Hi list.  It's been a while.
>
> I can reliably reproduce a segfault in git grep when searching the
> object DB. It is caused by a race when threads > 2.
>
> I have a patch that fixes the problem, but I don't know exactly why.
> Can someone else explain it and/or offer a better solution?
>
> ====
>
> diff --git packfile.c packfile.c
> index e69012e7f2..52b7b54aeb 100644
> --- packfile.c
> +++ packfile.c
> @@ -1812,7 +1812,9 @@ void *unpack_entry(struct repository *r, struct
> packed_git *p, off_t obj_offset,
>                 if (!base)
>                         continue;
>
> +               obj_read_lock();
>                 delta_data = unpack_compressed_entry(p, &w_curs,
> curpos, delta_size);
> +               obj_read_unlock();
>
>                 if (!delta_data) {
>                         error("failed to unpack compressed delta "
>
> ====

Hmm, obj_read_mutex is a recursive lock, so by adding the
obj_read_lock() call here, we are incrementing the lock value to [at
least] 2 (because oid_object_info_extended() had already incremented
once). Therefore, when unpack_compressed_entry() calls obj_read_unlock()
before inflating the entry, the lock is not actually released, only
decremented. So the effect of this change is that the phase III of
unpack_entry() is performed entirely without releasing the lock.

Your crash log shows us that the segfault happened when trying to
memcpy() the string `base` (in unpack_entry()). I.e., the same string
that we had previously added to the delta base cache, right before
calling unpack_compressed_entry() and releasing the lock. The
problematic sequence is:
	add `base` to the cache -> release lock -> inflate ->
	acquire lock -> try to use `base`

Maybe, when a thread X releases the lock to perform decompression,
thread Y acquires the lock and tries to add another base to the cache.
The cache is full, so Y has to remove entries, and it ends up free()'ing
the base that was just inserted by X. Later, X tries to dereference
`base` in patch_entry(), which would cause a segfault. It would also
explain why your change solves the segfault.

I'm not sure, though, why this entry would be removed, since it was just
added... Maybe Y was adding a huge base to the cache, which required
removing all entries?

Anyways, you mentioned you can reproduce the failure quite reliably in your
repo with 20 threads, right? Could you please check with the following patch
applied? It adds a `base` copy to the cache instead of the original string,
allowing us to keep running decompression in parallel but without the risk of
having another thread free()'ing `base` in the mean time.

diff --git a/packfile.c b/packfile.c
index 9ef27508f2..79f83b6034 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1772,14 +1772,15 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 	while (delta_stack_nr) {
 		void *delta_data;
 		void *base = data;
-		void *external_base = NULL;
 		unsigned long delta_size, base_size = size;
 		int i;
 
 		data = NULL;
 
-		if (base)
-			add_delta_base_cache(p, obj_offset, base, base_size, type);
+		if (base) {
+			char *base_dup = memcpy(xmalloc(base_size), base, base_size);
+			add_delta_base_cache(p, obj_offset, base_dup, base_size, type);
+		}
 
 		if (!base) {
 			/*
@@ -1799,7 +1800,6 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 				      p->pack_name);
 				mark_bad_packed_object(p, base_oid.hash);
 				base = read_object(r, &base_oid, &type, &base_size);
-				external_base = base;
 			}
 		}
 
@@ -1818,7 +1818,7 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 			      "at offset %"PRIuMAX" from %s",
 			      (uintmax_t)curpos, p->pack_name);
 			data = NULL;
-			free(external_base);
+			free(base);
 			continue;
 		}
 
@@ -1838,7 +1838,7 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 			error("failed to apply delta");
 
 		free(delta_data);
-		free(external_base);
+		free(base);
 	}
 
 	if (final_type)
--

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: RFC - concurrency causes segfault in git grep since 2.26.0
  2020-09-25  5:52 ` Matheus Tavares
@ 2020-09-25 19:53   ` Phil Hord
  2020-09-28 16:50     ` [PATCH 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
  0 siblings, 1 reply; 12+ messages in thread
From: Phil Hord @ 2020-09-25 19:53 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: Derrick Stolee, Git, Jonathan Tan, Stefan Beller

Hi Matheus,

Thanks for the quick response.

On Thu, Sep 24, 2020 at 10:53 PM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> Hi, Phil
>
> Thanks for the bug report and for the crash logs.
>
> On Thu, Sep 24, 2020 at 11:36 PM Phil Hord <phil.hord@gmail.com> wrote:
> >
> > Hi list.  It's been a while.
> >
> > I can reliably reproduce a segfault in git grep when searching the
> > object DB. It is caused by a race when threads > 2.
> >
> > I have a patch that fixes the problem, but I don't know exactly why.
> > Can someone else explain it and/or offer a better solution?
> >
> > ====
> >
> > diff --git packfile.c packfile.c
> > index e69012e7f2..52b7b54aeb 100644
> > --- packfile.c
> > +++ packfile.c
> > @@ -1812,7 +1812,9 @@ void *unpack_entry(struct repository *r, struct
> > packed_git *p, off_t obj_offset,
> >                 if (!base)
> >                         continue;
> >
> > +               obj_read_lock();
> >                 delta_data = unpack_compressed_entry(p, &w_curs,
> > curpos, delta_size);
> > +               obj_read_unlock();
> >
> >                 if (!delta_data) {
> >                         error("failed to unpack compressed delta "
> >
> > ====
>
> Hmm, obj_read_mutex is a recursive lock, so by adding the
> obj_read_lock() call here, we are incrementing the lock value to [at
> least] 2 (because oid_object_info_extended() had already incremented
> once). Therefore, when unpack_compressed_entry() calls obj_read_unlock()
> before inflating the entry, the lock is not actually released, only
> decremented. So the effect of this change is that the phase III of
> unpack_entry() is performed entirely without releasing the lock.

Ah, there's the piece I was missing.  I didn't realize we temporarily
drop the lock inside unpack_compressed_entry.  And obviously I also
didn't notice that we were already holding a lock before entering
unpack_entry.  Thanks for the detailed explanation.

> Your crash log shows us that the segfault happened when trying to
> memcpy() the string `base` (in unpack_entry()). I.e., the same string
> that we had previously added to the delta base cache, right before
> calling unpack_compressed_entry() and releasing the lock. The
> problematic sequence is:
>         add `base` to the cache -> release lock -> inflate ->
>         acquire lock -> try to use `base`
>
> Maybe, when a thread X releases the lock to perform decompression,
> thread Y acquires the lock and tries to add another base to the cache.
> The cache is full, so Y has to remove entries, and it ends up free()'ing
> the base that was just inserted by X. Later, X tries to dereference
> `base` in patch_entry(), which would cause a segfault. It would also
> explain why your change solves the segfault.
>
> I'm not sure, though, why this entry would be removed, since it was just
> added... Maybe Y was adding a huge base to the cache, which required
> removing all entries?

I think your theory is correct here on both counts.  The repo I am
searching has some sinfully large objects, and when there is a cache
limit, they are likely to exceed it.

> Anyways, you mentioned you can reproduce the failure quite reliably in your
> repo with 20 threads, right? Could you please check with the following patch
> applied? It adds a `base` copy to the cache instead of the original string,
> allowing us to keep running decompression in parallel but without the risk of
> having another thread free()'ing `base` in the mean time.


Yes, your patch reliably solves the problem, too.  The performance is
only slightly improved over holding the lock during inflate in my
case, but it does save a bit of time.  Surprisingly, performance seems
to peak between 5 and 10 threads for me in both cases.  It reliably
gets slightly worse at 20 threads and 40 threads.

In all cases when threads > 2, this search appears to be 10%-20%
slower than running without this change.  Even when threads=1, though,
the result is about 10% slower.  Perhaps it is worth avoiding the
base_dup when !!obj_read_use_lock.

> diff --git a/packfile.c b/packfile.c
> index 9ef27508f2..79f83b6034 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1772,14 +1772,15 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
>         while (delta_stack_nr) {
>                 void *delta_data;
>                 void *base = data;
> -               void *external_base = NULL;
>                 unsigned long delta_size, base_size = size;
>                 int i;
>
>                 data = NULL;
>
> -               if (base)
> -                       add_delta_base_cache(p, obj_offset, base, base_size, type);
> +               if (base) {
> +                       char *base_dup = memcpy(xmalloc(base_size), base, base_size);
> +                       add_delta_base_cache(p, obj_offset, base_dup, base_size, type);
> +               }
>
>                 if (!base) {
>                         /*
> @@ -1799,7 +1800,6 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
>                                       p->pack_name);
>                                 mark_bad_packed_object(p, base_oid.hash);
>                                 base = read_object(r, &base_oid, &type, &base_size);
> -                               external_base = base;
>                         }
>                 }
>
> @@ -1818,7 +1818,7 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
>                               "at offset %"PRIuMAX" from %s",
>                               (uintmax_t)curpos, p->pack_name);
>                         data = NULL;
> -                       free(external_base);
> +                       free(base);
>                         continue;
>                 }
>
> @@ -1838,7 +1838,7 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
>                         error("failed to apply delta");
>
>                 free(delta_data);
> -               free(external_base);
> +               free(base);
>         }
>
>         if (final_type)
> --

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] Fix race condition and memory leak in delta base cache
  2020-09-25 19:53   ` Phil Hord
@ 2020-09-28 16:50     ` Matheus Tavares
  2020-09-28 16:50       ` [PATCH 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
                         ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Matheus Tavares @ 2020-09-28 16:50 UTC (permalink / raw)
  To: git; +Cc: phil.hord, dstolee, jonathantanmy, stefanbeller

The first patch fixes the race condition problem reported by Phil, which
ocasionally caused a segmentation fault during git-grep executions. The
second patch fixes a memory leak in the same code.

Matheus Tavares (2):
  packfile: fix race condition on unpack_entry()
  packfile: fix memory leak in add_delta_base_cache()

 packfile.c | 51 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 20 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] packfile: fix race condition on unpack_entry()
  2020-09-28 16:50     ` [PATCH 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
@ 2020-09-28 16:50       ` Matheus Tavares
  2020-09-28 18:05         ` Junio C Hamano
  2020-09-28 16:50       ` [PATCH 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares
  2020-09-29  0:01       ` [PATCH v2 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
  2 siblings, 1 reply; 12+ messages in thread
From: Matheus Tavares @ 2020-09-28 16:50 UTC (permalink / raw)
  To: git; +Cc: phil.hord, dstolee, jonathantanmy, stefanbeller

The third phase of unpack_entry() performs the following sequence in a
loop, until all the deltas enumerated in phase one are applied and the
entry is fully reconstructed:

1. Add the current base entry to the delta base cache
2. Unpack the next delta
3. Patch the unpacked delta on top of the base

When the optional object reading lock is enabled, the above steps will
be performed while holding the lock. However, step 2. momentarily
releases it so that inflation can be performed in parallel for increased
performance. Because the `base` buffer inserted in the cache at 1. is
not duplicated, another thread can potentially free() it while the lock
is released at 2. (e.g. when there is no space left in the cache to
insert another entry). In this case, the later attempt to dereference
`base` at 3. will cause a segmentation fault. This problem was observed
during a multithreaded git-grep execution on a repository with large
objects.

To fix the race condition (and later segmentation fault), let's reorder
the aforementioned steps so that the lock is not released between 1.
and 3. An alternative solution which would not require the reordering
would be to duplicate `base` before inserting it in the cache.
However, as Phil Hord mentioned, memcpy()'ing large bases can negatively
affect performance: in his experiments, this alternative approach slowed
git-grep down by 10% to 20%.

Reported-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 packfile.c | 41 ++++++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/packfile.c b/packfile.c
index 9ef27508f2..0319418d88 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1775,12 +1775,10 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 		void *external_base = NULL;
 		unsigned long delta_size, base_size = size;
 		int i;
+		off_t base_obj_offset = obj_offset;
 
 		data = NULL;
 
-		if (base)
-			add_delta_base_cache(p, obj_offset, base, base_size, type);
-
 		if (!base) {
 			/*
 			 * We're probably in deep shit, but let's try to fetch
@@ -1818,24 +1816,33 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 			      "at offset %"PRIuMAX" from %s",
 			      (uintmax_t)curpos, p->pack_name);
 			data = NULL;
-			free(external_base);
-			continue;
-		}
+		} else {
+			data = patch_delta(base, base_size, delta_data,
+					   delta_size, &size);
 
-		data = patch_delta(base, base_size,
-				   delta_data, delta_size,
-				   &size);
+			/*
+			 * We could not apply the delta; warn the user, but
+			 * keep going. Our failure will be noticed either in
+			 * the next iteration of the loop, or if this is the
+			 * final delta, in the caller when we return NULL.
+			 * Those code paths will take care of making a more
+			 * explicit warning and retrying with another copy of
+			 * the object.
+			 */
+			if (!data)
+				error("failed to apply delta");
+		}
 
 		/*
-		 * We could not apply the delta; warn the user, but keep going.
-		 * Our failure will be noticed either in the next iteration of
-		 * the loop, or if this is the final delta, in the caller when
-		 * we return NULL. Those code paths will take care of making
-		 * a more explicit warning and retrying with another copy of
-		 * the object.
+		 * We delay adding `base` to the cache until the end of the loop
+		 * because unpack_compressed_entry() momentarily releases the
+		 * obj_read_mutex, giving another thread the chance to access
+		 * the cache. Therefore, if `base` was already there, this other
+		 * thread could free() it (e.g. to make space for another entry)
+		 * before we are done using it.
 		 */
-		if (!data)
-			error("failed to apply delta");
+		if (!external_base)
+			add_delta_base_cache(p, base_obj_offset, base, base_size, type);
 
 		free(delta_data);
 		free(external_base);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] packfile: fix memory leak in add_delta_base_cache()
  2020-09-28 16:50     ` [PATCH 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
  2020-09-28 16:50       ` [PATCH 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
@ 2020-09-28 16:50       ` Matheus Tavares
  2020-09-28 18:22         ` Junio C Hamano
  2020-09-29  0:01       ` [PATCH v2 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
  2 siblings, 1 reply; 12+ messages in thread
From: Matheus Tavares @ 2020-09-28 16:50 UTC (permalink / raw)
  To: git; +Cc: phil.hord, dstolee, jonathantanmy, stefanbeller

When add_delta_base_cache() is called with a base that is already in the
cache, no operation is performed. But the check is done after allocating
space for a new entry, so we end up leaking memory on the early return.
Also, the caller always expect that the base will be inserted, so it
never free()'s it. To fix both of these memory leaks, let's move the
allocation of a new entry further down in add_delta_base_cache(), and
make the function return an integer to indicate whether the insertion
was performed or not. Then, make the caller free() the base when needed.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 packfile.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/packfile.c b/packfile.c
index 0319418d88..177793e01a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1471,10 +1471,10 @@ void clear_delta_base_cache(void)
 	}
 }
 
-static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
+static int add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	void *base, unsigned long base_size, enum object_type type)
 {
-	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
+	struct delta_base_cache_entry *ent;
 	struct list_head *lru, *tmp;
 
 	/*
@@ -1483,7 +1483,7 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	 * and III might run concurrently across multiple threads).
 	 */
 	if (in_delta_base_cache(p, base_offset))
-		return;
+		return 0;
 
 	delta_base_cached += base_size;
 
@@ -1495,6 +1495,7 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 		release_delta_base_cache(f);
 	}
 
+	ent = xmalloc(sizeof(*ent));
 	ent->key.p = p;
 	ent->key.base_offset = base_offset;
 	ent->type = type;
@@ -1506,6 +1507,7 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 		hashmap_init(&delta_base_cache, delta_base_cache_hash_cmp, NULL, 0);
 	hashmap_entry_init(&ent->ent, pack_entry_hash(p, base_offset));
 	hashmap_add(&delta_base_cache, &ent->ent);
+	return 1;
 }
 
 int packed_object_info(struct repository *r, struct packed_git *p,
@@ -1841,8 +1843,10 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 		 * thread could free() it (e.g. to make space for another entry)
 		 * before we are done using it.
 		 */
-		if (!external_base)
-			add_delta_base_cache(p, base_obj_offset, base, base_size, type);
+		if (!external_base && !add_delta_base_cache(p, base_obj_offset,
+						base, base_size, type)) {
+			free(base);
+		}
 
 		free(delta_data);
 		free(external_base);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] packfile: fix race condition on unpack_entry()
  2020-09-28 16:50       ` [PATCH 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
@ 2020-09-28 18:05         ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2020-09-28 18:05 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, phil.hord, dstolee, jonathantanmy, stefanbeller

Matheus Tavares <matheus.bernardino@usp.br> writes:

> To fix the race condition (and later segmentation fault), let's reorder
> the aforementioned steps so that the lock is not released between 1.
> and 3.

In other words, we hold the base in core only for ourselves without
adding it to the base cache, apply the delta to produce the result
and then place the base in the cache, and the reason why this change
fixes the breakage is because the base we have locally and not in
cache will not be seen by other people and will not be freed without
our consent?  Which does make sense.

But I was confused by the explanation "lock is not released".  We do
release the same lock when we call unpack_compressed_entry(), and
reaquire it before the unpack_compressed_entry() returns.  What the
reordering achieves is to protect the base from getting evicted when
the unlocking and relocing happens, no?

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] packfile: fix memory leak in add_delta_base_cache()
  2020-09-28 16:50       ` [PATCH 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares
@ 2020-09-28 18:22         ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2020-09-28 18:22 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: git, phil.hord, dstolee, jonathantanmy, stefanbeller

Matheus Tavares <matheus.bernardino@usp.br> writes:

> When add_delta_base_cache() is called with a base that is already in the
> cache, no operation is performed. But the check is done after allocating
> space for a new entry, so we end up leaking memory on the early return.

Wow, that's so obvious a leak that it is surprising it has been
unnoticed, especially given that the runtime inflation of the
packfile was written so long time ago and was a central part of the
system.

I had to dig and find out that the breakage was fairly recent from
early this year, made in 31877c9a (object-store: allow threaded
access to object reading, 2020-01-15).

> Also, the caller always expect that the base will be inserted, so it
> never free()'s it. To fix both of these memory leaks, let's move the
> allocation of a new entry further down in add_delta_base_cache(), and
> make the function return an integer to indicate whether the insertion
> was performed or not. Then, make the caller free() the base when needed.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>

> @@ -1841,8 +1843,10 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
>  		 * thread could free() it (e.g. to make space for another entry)
>  		 * before we are done using it.
>  		 */
> -		if (!external_base)
> -			add_delta_base_cache(p, base_obj_offset, base, base_size, type);
> +		if (!external_base && !add_delta_base_cache(p, base_obj_offset,
> +						base, base_size, type)) {
> +			free(base);
> +		}

When you have to wrap a long expression, try to split after an
operator near the root of the parse tree, e.g.

		if (!external_base &&
		    !add_delta_base_cache(p, base_obj_offset, base, base_size, type)) {

would make the result easier to follow.

I however suspect that it may be better let add_delta_base_cache()
do the freeing.  There is only one caller, and from its point of
view, the timing when it throws the base at the cache (after the
previous patch) is when it is done with it.

In other words we can think of the call to add_delta_base_cache() as
the caller saying: "I am done with this, but somebody else might
want to reuse it later, so do whatever you want to do with it".  

If we were to go that route, it might even make sense to rename it
to reflect that mentality from the viewpoint of the caller, but a
single-caller helper like this one it may not matter all that much.

Thanks.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 0/2] Fix race condition and memory leak in delta base cache
  2020-09-28 16:50     ` [PATCH 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
  2020-09-28 16:50       ` [PATCH 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
  2020-09-28 16:50       ` [PATCH 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares
@ 2020-09-29  0:01       ` Matheus Tavares
  2020-09-29  0:01         ` [PATCH v2 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
  2020-09-29  0:01         ` [PATCH v2 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares
  2 siblings, 2 replies; 12+ messages in thread
From: Matheus Tavares @ 2020-09-29  0:01 UTC (permalink / raw)
  To: git; +Cc: phil.hord, dstolee, jonathantanmy, stefanbeller, gitster

Changes since v1:

- Reworded patch one to remove misleading sentence about the locking
  during step 2.
- In patch two, let add_delta_base_cache() take full ownership of the
  base, properly releasing it when the insertion is skipped.

Matheus Tavares (2):
  packfile: fix race condition on unpack_entry()
  packfile: fix memory leak in add_delta_base_cache()

 packfile.c | 48 +++++++++++++++++++++++++++++-------------------
 1 file changed, 29 insertions(+), 19 deletions(-)

Range-diff against v1:
1:  42a7948f94 ! 1:  948d07673f packfile: fix race condition on unpack_entry()
    @@ Commit message
         objects.
     
         To fix the race condition (and later segmentation fault), let's reorder
    -    the aforementioned steps so that the lock is not released between 1.
    -    and 3. An alternative solution which would not require the reordering
    -    would be to duplicate `base` before inserting it in the cache.
    -    However, as Phil Hord mentioned, memcpy()'ing large bases can negatively
    -    affect performance: in his experiments, this alternative approach slowed
    -    git-grep down by 10% to 20%.
    +    the aforementioned steps so that `base` is only added to the cache at
    +    the end. This will prevent the buffer from being released by another
    +    thread while it is still in use. An alternative solution which would not
    +    require the reordering would be to duplicate `base` before inserting it
    +    in the cache. However, as Phil Hord mentioned, memcpy()'ing large bases
    +    can negatively affect performance: in his experiments, this alternative
    +    approach slowed git-grep down by 10% to 20%.
     
         Reported-by: Phil Hord <phil.hord@gmail.com>
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
2:  5b6e3019e0 ! 2:  f15f0c82fd packfile: fix memory leak in add_delta_base_cache()
    @@ Commit message
         When add_delta_base_cache() is called with a base that is already in the
         cache, no operation is performed. But the check is done after allocating
         space for a new entry, so we end up leaking memory on the early return.
    -    Also, the caller always expect that the base will be inserted, so it
    -    never free()'s it. To fix both of these memory leaks, let's move the
    +    In addition, the caller never free()'s the base as it expects the
    +    function to take ownership of it. But the base is not released when we
    +    skip insertion, so it also gets leaked. To fix these problems, move the
         allocation of a new entry further down in add_delta_base_cache(), and
    -    make the function return an integer to indicate whether the insertion
    -    was performed or not. Then, make the caller free() the base when needed.
    +    free() the base on early return.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
      ## packfile.c ##
     @@ packfile.c: void clear_delta_base_cache(void)
    - 	}
    - }
    - 
    --static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
    -+static int add_delta_base_cache(struct packed_git *p, off_t base_offset,
    + static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
      	void *base, unsigned long base_size, enum object_type type)
      {
     -	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
    @@ packfile.c: void clear_delta_base_cache(void)
      
      	/*
     @@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
    + 	 * is unpacking the same object, in unpack_entry() (since its phases I
      	 * and III might run concurrently across multiple threads).
      	 */
    - 	if (in_delta_base_cache(p, base_offset))
    --		return;
    -+		return 0;
    +-	if (in_delta_base_cache(p, base_offset))
    ++	if (in_delta_base_cache(p, base_offset)) {
    ++		free(base);
    + 		return;
    ++	}
      
      	delta_base_cached += base_size;
      
    @@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_of
      	ent->key.p = p;
      	ent->key.base_offset = base_offset;
      	ent->type = type;
    -@@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
    - 		hashmap_init(&delta_base_cache, delta_base_cache_hash_cmp, NULL, 0);
    - 	hashmap_entry_init(&ent->ent, pack_entry_hash(p, base_offset));
    - 	hashmap_add(&delta_base_cache, &ent->ent);
    -+	return 1;
    - }
    - 
    - int packed_object_info(struct repository *r, struct packed_git *p,
    -@@ packfile.c: void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
    - 		 * thread could free() it (e.g. to make space for another entry)
    - 		 * before we are done using it.
    - 		 */
    --		if (!external_base)
    --			add_delta_base_cache(p, base_obj_offset, base, base_size, type);
    -+		if (!external_base && !add_delta_base_cache(p, base_obj_offset,
    -+						base, base_size, type)) {
    -+			free(base);
    -+		}
    - 
    - 		free(delta_data);
    - 		free(external_base);
-- 
2.28.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/2] packfile: fix race condition on unpack_entry()
  2020-09-29  0:01       ` [PATCH v2 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
@ 2020-09-29  0:01         ` Matheus Tavares
  2020-10-02 20:06           ` Phil Hord
  2020-09-29  0:01         ` [PATCH v2 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares
  1 sibling, 1 reply; 12+ messages in thread
From: Matheus Tavares @ 2020-09-29  0:01 UTC (permalink / raw)
  To: git; +Cc: phil.hord, dstolee, jonathantanmy, stefanbeller, gitster

The third phase of unpack_entry() performs the following sequence in a
loop, until all the deltas enumerated in phase one are applied and the
entry is fully reconstructed:

1. Add the current base entry to the delta base cache
2. Unpack the next delta
3. Patch the unpacked delta on top of the base

When the optional object reading lock is enabled, the above steps will
be performed while holding the lock. However, step 2. momentarily
releases it so that inflation can be performed in parallel for increased
performance. Because the `base` buffer inserted in the cache at 1. is
not duplicated, another thread can potentially free() it while the lock
is released at 2. (e.g. when there is no space left in the cache to
insert another entry). In this case, the later attempt to dereference
`base` at 3. will cause a segmentation fault. This problem was observed
during a multithreaded git-grep execution on a repository with large
objects.

To fix the race condition (and later segmentation fault), let's reorder
the aforementioned steps so that `base` is only added to the cache at
the end. This will prevent the buffer from being released by another
thread while it is still in use. An alternative solution which would not
require the reordering would be to duplicate `base` before inserting it
in the cache. However, as Phil Hord mentioned, memcpy()'ing large bases
can negatively affect performance: in his experiments, this alternative
approach slowed git-grep down by 10% to 20%.

Reported-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 packfile.c | 41 ++++++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/packfile.c b/packfile.c
index 9ef27508f2..0319418d88 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1775,12 +1775,10 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 		void *external_base = NULL;
 		unsigned long delta_size, base_size = size;
 		int i;
+		off_t base_obj_offset = obj_offset;
 
 		data = NULL;
 
-		if (base)
-			add_delta_base_cache(p, obj_offset, base, base_size, type);
-
 		if (!base) {
 			/*
 			 * We're probably in deep shit, but let's try to fetch
@@ -1818,24 +1816,33 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 			      "at offset %"PRIuMAX" from %s",
 			      (uintmax_t)curpos, p->pack_name);
 			data = NULL;
-			free(external_base);
-			continue;
-		}
+		} else {
+			data = patch_delta(base, base_size, delta_data,
+					   delta_size, &size);
 
-		data = patch_delta(base, base_size,
-				   delta_data, delta_size,
-				   &size);
+			/*
+			 * We could not apply the delta; warn the user, but
+			 * keep going. Our failure will be noticed either in
+			 * the next iteration of the loop, or if this is the
+			 * final delta, in the caller when we return NULL.
+			 * Those code paths will take care of making a more
+			 * explicit warning and retrying with another copy of
+			 * the object.
+			 */
+			if (!data)
+				error("failed to apply delta");
+		}
 
 		/*
-		 * We could not apply the delta; warn the user, but keep going.
-		 * Our failure will be noticed either in the next iteration of
-		 * the loop, or if this is the final delta, in the caller when
-		 * we return NULL. Those code paths will take care of making
-		 * a more explicit warning and retrying with another copy of
-		 * the object.
+		 * We delay adding `base` to the cache until the end of the loop
+		 * because unpack_compressed_entry() momentarily releases the
+		 * obj_read_mutex, giving another thread the chance to access
+		 * the cache. Therefore, if `base` was already there, this other
+		 * thread could free() it (e.g. to make space for another entry)
+		 * before we are done using it.
 		 */
-		if (!data)
-			error("failed to apply delta");
+		if (!external_base)
+			add_delta_base_cache(p, base_obj_offset, base, base_size, type);
 
 		free(delta_data);
 		free(external_base);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/2] packfile: fix memory leak in add_delta_base_cache()
  2020-09-29  0:01       ` [PATCH v2 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
  2020-09-29  0:01         ` [PATCH v2 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
@ 2020-09-29  0:01         ` Matheus Tavares
  1 sibling, 0 replies; 12+ messages in thread
From: Matheus Tavares @ 2020-09-29  0:01 UTC (permalink / raw)
  To: git; +Cc: phil.hord, dstolee, jonathantanmy, stefanbeller, gitster

When add_delta_base_cache() is called with a base that is already in the
cache, no operation is performed. But the check is done after allocating
space for a new entry, so we end up leaking memory on the early return.
In addition, the caller never free()'s the base as it expects the
function to take ownership of it. But the base is not released when we
skip insertion, so it also gets leaked. To fix these problems, move the
allocation of a new entry further down in add_delta_base_cache(), and
free() the base on early return.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 packfile.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/packfile.c b/packfile.c
index 0319418d88..d31aaaeeaa 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1474,7 +1474,7 @@ void clear_delta_base_cache(void)
 static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	void *base, unsigned long base_size, enum object_type type)
 {
-	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
+	struct delta_base_cache_entry *ent;
 	struct list_head *lru, *tmp;
 
 	/*
@@ -1482,8 +1482,10 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	 * is unpacking the same object, in unpack_entry() (since its phases I
 	 * and III might run concurrently across multiple threads).
 	 */
-	if (in_delta_base_cache(p, base_offset))
+	if (in_delta_base_cache(p, base_offset)) {
+		free(base);
 		return;
+	}
 
 	delta_base_cached += base_size;
 
@@ -1495,6 +1497,7 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 		release_delta_base_cache(f);
 	}
 
+	ent = xmalloc(sizeof(*ent));
 	ent->key.p = p;
 	ent->key.base_offset = base_offset;
 	ent->type = type;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] packfile: fix race condition on unpack_entry()
  2020-09-29  0:01         ` [PATCH v2 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
@ 2020-10-02 20:06           ` Phil Hord
  0 siblings, 0 replies; 12+ messages in thread
From: Phil Hord @ 2020-10-02 20:06 UTC (permalink / raw)
  To: Matheus Tavares; +Cc: Git, Derrick Stolee, Jonathan Tan, Stefan Beller, gitster

On Mon, Sep 28, 2020 at 5:02 PM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> The third phase of unpack_entry() performs the following sequence in a
> loop, until all the deltas enumerated in phase one are applied and the
> entry is fully reconstructed:
>
> 1. Add the current base entry to the delta base cache
> 2. Unpack the next delta
> 3. Patch the unpacked delta on top of the base
>
> When the optional object reading lock is enabled, the above steps will
> be performed while holding the lock. However, step 2. momentarily
> releases it so that inflation can be performed in parallel for increased
> performance. Because the `base` buffer inserted in the cache at 1. is
> not duplicated, another thread can potentially free() it while the lock
> is released at 2. (e.g. when there is no space left in the cache to
> insert another entry). In this case, the later attempt to dereference
> `base` at 3. will cause a segmentation fault. This problem was observed
> during a multithreaded git-grep execution on a repository with large
> objects.
>
> To fix the race condition (and later segmentation fault), let's reorder
> the aforementioned steps so that `base` is only added to the cache at
> the end. This will prevent the buffer from being released by another
> thread while it is still in use. An alternative solution which would not
> require the reordering would be to duplicate `base` before inserting it
> in the cache. However, as Phil Hord mentioned, memcpy()'ing large bases
> can negatively affect performance: in his experiments, this alternative
> approach slowed git-grep down by 10% to 20%.
>
> Reported-by: Phil Hord <phil.hord@gmail.com>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---

Thanks for looking after this so quickly.  This all looks good to me,
and I confirmed it does fix the problems I was seeing.

Phil

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-10-02 20:07 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-25  2:36 RFC - concurrency causes segfault in git grep since 2.26.0 Phil Hord
2020-09-25  5:52 ` Matheus Tavares
2020-09-25 19:53   ` Phil Hord
2020-09-28 16:50     ` [PATCH 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
2020-09-28 16:50       ` [PATCH 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
2020-09-28 18:05         ` Junio C Hamano
2020-09-28 16:50       ` [PATCH 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares
2020-09-28 18:22         ` Junio C Hamano
2020-09-29  0:01       ` [PATCH v2 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
2020-09-29  0:01         ` [PATCH v2 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
2020-10-02 20:06           ` Phil Hord
2020-09-29  0:01         ` [PATCH v2 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).