git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Matheus Tavares <matheus.bernardino@usp.br>
To: phil.hord@gmail.com
Cc: dstolee@microsoft.com, git@vger.kernel.org,
	jonathantanmy@google.com, stefanbeller@gmail.com
Subject: Re: RFC - concurrency causes segfault in git grep since 2.26.0
Date: Fri, 25 Sep 2020 02:52:58 -0300	[thread overview]
Message-ID: <20200925055258.79347-1-matheus.bernardino@usp.br> (raw)
In-Reply-To: <CABURp0r_VPNeU5ugbJDQo=hV_aOit7W-pf9nhDF9Og=vVJG8tw@mail.gmail.com>

Hi, Phil

Thanks for the bug report and for the crash logs.

On Thu, Sep 24, 2020 at 11:36 PM Phil Hord <phil.hord@gmail.com> wrote:
>
> Hi list.  It's been a while.
>
> I can reliably reproduce a segfault in git grep when searching the
> object DB. It is caused by a race when threads > 2.
>
> I have a patch that fixes the problem, but I don't know exactly why.
> Can someone else explain it and/or offer a better solution?
>
> ====
>
> diff --git packfile.c packfile.c
> index e69012e7f2..52b7b54aeb 100644
> --- packfile.c
> +++ packfile.c
> @@ -1812,7 +1812,9 @@ void *unpack_entry(struct repository *r, struct
> packed_git *p, off_t obj_offset,
>                 if (!base)
>                         continue;
>
> +               obj_read_lock();
>                 delta_data = unpack_compressed_entry(p, &w_curs,
> curpos, delta_size);
> +               obj_read_unlock();
>
>                 if (!delta_data) {
>                         error("failed to unpack compressed delta "
>
> ====

Hmm, obj_read_mutex is a recursive lock, so by adding the
obj_read_lock() call here, we are incrementing the lock value to [at
least] 2 (because oid_object_info_extended() had already incremented
once). Therefore, when unpack_compressed_entry() calls obj_read_unlock()
before inflating the entry, the lock is not actually released, only
decremented. So the effect of this change is that the phase III of
unpack_entry() is performed entirely without releasing the lock.

Your crash log shows us that the segfault happened when trying to
memcpy() the string `base` (in unpack_entry()). I.e., the same string
that we had previously added to the delta base cache, right before
calling unpack_compressed_entry() and releasing the lock. The
problematic sequence is:
	add `base` to the cache -> release lock -> inflate ->
	acquire lock -> try to use `base`

Maybe, when a thread X releases the lock to perform decompression,
thread Y acquires the lock and tries to add another base to the cache.
The cache is full, so Y has to remove entries, and it ends up free()'ing
the base that was just inserted by X. Later, X tries to dereference
`base` in patch_entry(), which would cause a segfault. It would also
explain why your change solves the segfault.

I'm not sure, though, why this entry would be removed, since it was just
added... Maybe Y was adding a huge base to the cache, which required
removing all entries?

Anyways, you mentioned you can reproduce the failure quite reliably in your
repo with 20 threads, right? Could you please check with the following patch
applied? It adds a `base` copy to the cache instead of the original string,
allowing us to keep running decompression in parallel but without the risk of
having another thread free()'ing `base` in the mean time.

diff --git a/packfile.c b/packfile.c
index 9ef27508f2..79f83b6034 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1772,14 +1772,15 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 	while (delta_stack_nr) {
 		void *delta_data;
 		void *base = data;
-		void *external_base = NULL;
 		unsigned long delta_size, base_size = size;
 		int i;
 
 		data = NULL;
 
-		if (base)
-			add_delta_base_cache(p, obj_offset, base, base_size, type);
+		if (base) {
+			char *base_dup = memcpy(xmalloc(base_size), base, base_size);
+			add_delta_base_cache(p, obj_offset, base_dup, base_size, type);
+		}
 
 		if (!base) {
 			/*
@@ -1799,7 +1800,6 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 				      p->pack_name);
 				mark_bad_packed_object(p, base_oid.hash);
 				base = read_object(r, &base_oid, &type, &base_size);
-				external_base = base;
 			}
 		}
 
@@ -1818,7 +1818,7 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 			      "at offset %"PRIuMAX" from %s",
 			      (uintmax_t)curpos, p->pack_name);
 			data = NULL;
-			free(external_base);
+			free(base);
 			continue;
 		}
 
@@ -1838,7 +1838,7 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
 			error("failed to apply delta");
 
 		free(delta_data);
-		free(external_base);
+		free(base);
 	}
 
 	if (final_type)
--

  reply	other threads:[~2020-09-25  5:53 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-25  2:36 RFC - concurrency causes segfault in git grep since 2.26.0 Phil Hord
2020-09-25  5:52 ` Matheus Tavares [this message]
2020-09-25 19:53   ` Phil Hord
2020-09-28 16:50     ` [PATCH 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
2020-09-28 16:50       ` [PATCH 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
2020-09-28 18:05         ` Junio C Hamano
2020-09-28 16:50       ` [PATCH 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares
2020-09-28 18:22         ` Junio C Hamano
2020-09-29  0:01       ` [PATCH v2 0/2] Fix race condition and memory leak in delta base cache Matheus Tavares
2020-09-29  0:01         ` [PATCH v2 1/2] packfile: fix race condition on unpack_entry() Matheus Tavares
2020-10-02 20:06           ` Phil Hord
2020-09-29  0:01         ` [PATCH v2 2/2] packfile: fix memory leak in add_delta_base_cache() Matheus Tavares

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200925055258.79347-1-matheus.bernardino@usp.br \
    --to=matheus.bernardino@usp.br \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=phil.hord@gmail.com \
    --cc=stefanbeller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).