git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Matheus Tavares <matheus.bernardino@usp.br>
To: git@vger.kernel.org
Cc: christian.couder@gmail.com, gitster@pobox.com,
	jrnieder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com,
	jonathantanmy@google.com, peff@peff.net,
	Stefan Beller <stefanbeller@gmail.com>
Subject: [PATCH v3 04/12] replace-object: make replace operations thread-safe
Date: Wed, 15 Jan 2020 23:39:52 -0300	[thread overview]
Message-ID: <d7f739bc57b6f59cab7c718300c28b8c6b0a61a8.1579141989.git.matheus.bernardino@usp.br> (raw)
In-Reply-To: <cover.1579141989.git.matheus.bernardino@usp.br>

replace-object functions are very close to being thread-safe: the only
current racy section is the lazy initialization at
prepare_replace_object(). The following patches will protect some object
reading operations to be called threaded, but before that, replace
functions must be protected. To do so, add a mutex to struct
raw_object_store and acquire it before lazy initializing the
replace_map. This won't cause any noticeable performance drop as the
mutex will no longer be used after the replace_map is initialized.

Later, when the replace functions are called in parallel, thread
debuggers might point our use of the added replace_map_initialized flag
as a data race. However, as this boolean variable is initialized as
false and it's only updated once, there's no real harm. It's perfectly
fine if the value is updated right after a thread read it in
replace-map.h:lookup_replace_object() (there'll only be a performance
penalty for the affected threads at that moment). We could cease the
debugger warning protecting the variable reading at the said function.
However, this would negatively affect performance for all threads
calling it, at any time, so it's not really worthy since the warning
doesn't represent a real problem. Instead, to make sure we don't get
false positives (at ThreadSanitizer, at least) an entry for the
respective function is added to .tsan-suppressions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 .tsan-suppressions |  6 ++++++
 object-store.h     |  2 ++
 object.c           |  2 ++
 replace-object.c   | 11 ++++++++++-
 replace-object.h   |  7 ++++++-
 5 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/.tsan-suppressions b/.tsan-suppressions
index 8c85014a0a..5ba86d6845 100644
--- a/.tsan-suppressions
+++ b/.tsan-suppressions
@@ -8,3 +8,9 @@
 # in practice it (hopefully!) doesn't matter.
 race:^want_color$
 race:^transfer_debug$
+
+# A boolean value, which tells whether the replace_map has been initialized or
+# not, is read racily with an update. As this variable is written to only once,
+# and it's OK if the value change right after reading it, this shouldn't be a
+# problem.
+race:^lookup_replace_object$
diff --git a/object-store.h b/object-store.h
index 55ee639350..33739c9dee 100644
--- a/object-store.h
+++ b/object-store.h
@@ -125,6 +125,8 @@ struct raw_object_store {
 	 * (see git-replace(1)).
 	 */
 	struct oidmap *replace_map;
+	unsigned replace_map_initialized : 1;
+	pthread_mutex_t replace_mutex; /* protect object replace functions */
 
 	struct commit_graph *commit_graph;
 	unsigned commit_graph_attempted : 1; /* if loading has been attempted */
diff --git a/object.c b/object.c
index 142ef69399..b4e1d3db3c 100644
--- a/object.c
+++ b/object.c
@@ -480,6 +480,7 @@ struct raw_object_store *raw_object_store_new(void)
 	memset(o, 0, sizeof(*o));
 	INIT_LIST_HEAD(&o->packed_git_mru);
 	hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
+	pthread_mutex_init(&o->replace_mutex, NULL);
 	return o;
 }
 
@@ -507,6 +508,7 @@ void raw_object_store_clear(struct raw_object_store *o)
 
 	oidmap_free(o->replace_map, 1);
 	FREE_AND_NULL(o->replace_map);
+	pthread_mutex_destroy(&o->replace_mutex);
 
 	free_commit_graph(o->commit_graph);
 	o->commit_graph = NULL;
diff --git a/replace-object.c b/replace-object.c
index e295e87943..7bd9aba6ee 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -34,14 +34,23 @@ static int register_replace_ref(struct repository *r,
 
 void prepare_replace_object(struct repository *r)
 {
-	if (r->objects->replace_map)
+	if (r->objects->replace_map_initialized)
 		return;
 
+	pthread_mutex_lock(&r->objects->replace_mutex);
+	if (r->objects->replace_map_initialized) {
+		pthread_mutex_unlock(&r->objects->replace_mutex);
+		return;
+	}
+
 	r->objects->replace_map =
 		xmalloc(sizeof(*r->objects->replace_map));
 	oidmap_init(r->objects->replace_map, 0);
 
 	for_each_replace_ref(r, register_replace_ref, NULL);
+	r->objects->replace_map_initialized = 1;
+
+	pthread_mutex_unlock(&r->objects->replace_mutex);
 }
 
 /* We allow "recursive" replacement. Only within reason, though */
diff --git a/replace-object.h b/replace-object.h
index 04ed7a85a2..3fbc32eb7b 100644
--- a/replace-object.h
+++ b/replace-object.h
@@ -24,12 +24,17 @@ const struct object_id *do_lookup_replace_object(struct repository *r,
  * name (replaced recursively, if necessary).  The return value is
  * either sha1 or a pointer to a permanently-allocated value.  When
  * object replacement is suppressed, always return sha1.
+ *
+ * Note: some thread debuggers might point a data race on the
+ * replace_map_initialized reading in this function. However, we know there's no
+ * problem in the value being updated by one thread right after another one read
+ * it here (and it should be written to only once, anyway).
  */
 static inline const struct object_id *lookup_replace_object(struct repository *r,
 							    const struct object_id *oid)
 {
 	if (!read_replace_refs ||
-	    (r->objects->replace_map &&
+	    (r->objects->replace_map_initialized &&
 	     r->objects->replace_map->map.tablesize == 0))
 		return oid;
 	return do_lookup_replace_object(r, oid);
-- 
2.24.1


  parent reply	other threads:[~2020-01-16  2:40 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 2/4] grep: allow locks to be enabled individually Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases Matheus Tavares
2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 01/11] grep: fix race conditions on userdiff calls Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 02/11] grep: fix race conditions at grep_submodule() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 03/11] grep: fix racy calls in grep_objects() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 04/11] replace-object: make replace operations thread-safe Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 05/11] object-store: allow threaded access to object reading Matheus Tavares
2019-11-12  2:54     ` Jonathan Tan
2019-11-13  5:20       ` Jeff King
2019-11-14  5:57         ` Matheus Tavares Bernardino
2019-11-14  6:01           ` Jeff King
2019-11-14 18:15             ` Jonathan Tan
2019-11-15  4:12               ` Jeff King
2019-12-19 22:27                 ` Matheus Tavares Bernardino
2020-01-09 22:02                   ` Matheus Tavares Bernardino
2020-01-10 19:07                     ` Christian Couder
2019-09-30  1:50   ` [PATCH v2 06/11] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2019-10-01 19:23     ` [PATCH] squash! " Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 07/11] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 08/11] grep: allow submodule functions to run in parallel Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 09/11] grep: protect packed_git [re-]initialization Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 10/11] grep: re-enable threads in non-worktree case Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 11/11] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
2020-01-16  2:39     ` Matheus Tavares [this message]
2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
2020-01-29 11:26       ` SZEDER Gábor
2020-01-29 18:49         ` Junio C Hamano
2020-01-29 18:57         ` Junio C Hamano
2020-01-29 20:42           ` Matheus Tavares Bernardino
2020-01-30 13:28             ` Philippe Blain
2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
2020-01-16 13:11       ` Victor Leschuk
2020-01-16 14:47         ` [PATCH] " Matheus Tavares

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d7f739bc57b6f59cab7c718300c28b8c6b0a61a8.1579141989.git.matheus.bernardino@usp.br \
    --to=matheus.bernardino@usp.br \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=olyatelezhnaya@gmail.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=stefanbeller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).