git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation
@ 2019-08-10 20:27 Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
                   ` (4 more replies)
  0 siblings, 5 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy

This series focuses on allowing parallel access to zlib inflation and
using that to perform a faster git-grep in the non-worktree case.

Threads were disabled for this case at 53b8d93 ("grep: disable
threading in non-worktree case", 12-12-2011), due to performance drops.

However, by allowing threads to perform inflation in parallel, we can
regain the speedup. This is a good hotspot for parallelism as some test
cases[1] showed that it can account for up to 48% of execution time.
And besides that, inflation tasks are already independent of each other.

As a result, grepping 'abcd[02]' ("Regex 1") and
'(static|extern) (int|double) \*' ("Regex 2") at chromium's
repository[2], I got (means of 30 executions):

     Threads |   Regex 1  |  Regex 2
    ---------|------------|-----------
        1    |  17.3557s  | 20.8410s
        2    |   9.7170s  | 11.2415s
        8    |   6.1723s  |  6.9378s

As a reference, just enabling threads in the non-worktree case,
without parallel inflation, I got:

     Threads |   Regex 1  |  Regex 2
    ---------|------------|-----------
        1    |  17.1359s  | 20.8306s
        2    |  14.5036s  | 15.4172s
        8    |  13.6304s  | 13.8659s

For now, the optimization is not supported when --textconv or
--recurse-submodules are used, but I hope to send another patchset for
that still during GSoC. We may also try to allow even more parallelism,
refining the added 'obj_read_lock'.

[1]: https://matheustavares.gitlab.io/posts/week-6-working-at-zlib-inflation#multithreading-zlib-inflation
[2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

travis build: https://travis-ci.org/matheustavares/git/builds/570255029

Matheus Tavares (4):
  object-store: add lock to read_object_file_extended()
  grep: allow locks to be enabled individually
  grep: disable grep_read_mutex when possible
  grep: re-enable threads in some non-worktree cases

 Documentation/git-grep.txt | 12 ++++++++
 builtin/grep.c             | 22 +++++++++++---
 grep.c                     |  4 +--
 grep.h                     |  8 +++--
 object-store.h             |  4 +++
 packfile.c                 |  7 +++++
 sha1-file.c                | 61 ++++++++++++++++++++++++++++++++++----
 7 files changed, 105 insertions(+), 13 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended()
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
@ 2019-08-10 20:27 ` Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 2/4] grep: allow locks to be enabled individually Matheus Tavares
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, Stefan Beller, Denton Liu,
	Jeff King, Junio C Hamano

Allow read_object_file_extended() to be called by multiple threads
protecting it with a lock. The lock usage can be toggled with
enable_obj_read_lock() and disable_obj_read_lock().

Probably there are many spots in read_object_file_extended()'s call
chain that could be executed unlocked (and thus, in parallel). But, for
now, we are only interested in allowing parallel access to zlib
inflation. This is one of the sections where object reading spends most
of the time and it's already thread-safe. So, to take advantage of that,
the lock is released when entering it and re-acquired right after. We
may refine the lock to also exploit other possible parallel spots in the
future, but threaded zlib inflation should already give great speedups.

Note that add_delta_base_cache() was also modified to skip adding
already present entries to the cache. This wasn't possible before, but
now it is since phase I and phase III of unpack_entry() may execute
concurrently.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 object-store.h |  4 ++++
 packfile.c     |  7 ++++++
 sha1-file.c    | 61 +++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/object-store.h b/object-store.h
index 7f7b3cdd80..cfc9484995 100644
--- a/object-store.h
+++ b/object-store.h
@@ -159,6 +159,10 @@ const char *loose_object_path(struct repository *r, struct strbuf *buf,
 void *map_loose_object(struct repository *r, const struct object_id *oid,
 		       unsigned long *size);
 
+void enable_obj_read_lock(void);
+void disable_obj_read_lock(void);
+void obj_read_lock(void);
+void obj_read_unlock(void);
 void *read_object_file_extended(struct repository *r,
 				const struct object_id *oid,
 				enum object_type *type,
diff --git a/packfile.c b/packfile.c
index fc43a6c52c..de93dc50e2 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1115,7 +1115,9 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1468,6 +1470,9 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
 	struct list_head *lru, *tmp;
 
+	if (get_delta_base_cache_entry(p, base_offset))
+		return;
+
 	delta_base_cached += base_size;
 
 	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
@@ -1597,7 +1602,9 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		if (!stream.avail_out)
 			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
diff --git a/sha1-file.c b/sha1-file.c
index 84fd02f107..f5ff51aedb 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1560,16 +1560,54 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type,
 	return 0;
 }
 
+static pthread_mutex_t obj_read_mutex;
+static int obj_read_use_lock = 0;
+
+/*
+ * Enabling the object read lock allows multiple threads to safely call the
+ * following functions in parallel: repo_read_object_file(), read_object_file()
+ * and read_object_file_extended().
+ */
+void enable_obj_read_lock(void)
+{
+	if (obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 1;
+	pthread_mutex_init(&obj_read_mutex, NULL);
+}
+
+void disable_obj_read_lock(void)
+{
+	if (!obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 0;
+	pthread_mutex_destroy(&obj_read_mutex);
+}
+
+void obj_read_lock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_lock(&obj_read_mutex);
+}
+
+void obj_read_unlock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_unlock(&obj_read_mutex);
+}
+
 /*
  * This function dies on corrupt objects; the callers who want to
  * deal with them should arrange to call read_object() and give error
  * messages themselves.
  */
-void *read_object_file_extended(struct repository *r,
-				const struct object_id *oid,
-				enum object_type *type,
-				unsigned long *size,
-				int lookup_replace)
+static void *do_read_object_file_extended(struct repository *r,
+					  const struct object_id *oid,
+					  enum object_type *type,
+					  unsigned long *size,
+					  int lookup_replace)
 {
 	void *data;
 	const struct packed_git *p;
@@ -1602,6 +1640,19 @@ void *read_object_file_extended(struct repository *r,
 	return NULL;
 }
 
+void *read_object_file_extended(struct repository *r,
+				const struct object_id *oid,
+				enum object_type *type,
+				unsigned long *size,
+				int lookup_replace)
+{
+	void *data;
+	obj_read_lock();
+	data = do_read_object_file_extended(r, oid, type, size, lookup_replace);
+	obj_read_unlock();
+	return data;
+}
+
 void *read_object_with_reference(struct repository *r,
 				 const struct object_id *oid,
 				 const char *required_type_name,
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [GSoC][PATCH 2/4] grep: allow locks to be enabled individually
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
@ 2019-08-10 20:27 ` Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible Matheus Tavares
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, Junio C Hamano, Denton Liu,
	brian m. carlson

git-grep has some internal locks to protect thread-unsafe operations
when running with threads. The usage of these locks can be toggled
through the variable 'grep_use_locks'. However, it's not currently
possible to enable each lock individually. And since object reading has
its own locks now, it is desirable to disable the respective grep lock
(and only that) in cases where we can do so. To do that, transform
'grep_use_locks' from a binary variable to a bitmask, which controls
each lock individually.

The actual disabling of grep_read_lock, when possible, will be done in
the following patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 2 +-
 grep.c         | 4 ++--
 grep.h         | 8 ++++++--
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 560051784e..a871bad8ad 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -205,7 +205,7 @@ static void start_threads(struct grep_opt *opt)
 	pthread_cond_init(&cond_add, NULL);
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
-	grep_use_locks = 1;
+	grep_use_locks = GREP_USE_ALL_LOCKS;
 
 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
diff --git a/grep.c b/grep.c
index cd952ef5d3..3aca0db435 100644
--- a/grep.c
+++ b/grep.c
@@ -1523,13 +1523,13 @@ pthread_mutex_t grep_attr_mutex;
 
 static inline void grep_attr_lock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_ATTR_LOCK)
 		pthread_mutex_lock(&grep_attr_mutex);
 }
 
 static inline void grep_attr_unlock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_ATTR_LOCK)
 		pthread_mutex_unlock(&grep_attr_mutex);
 }
 
diff --git a/grep.h b/grep.h
index 1875880f37..02bffacfa2 100644
--- a/grep.h
+++ b/grep.h
@@ -229,6 +229,10 @@ int grep_source(struct grep_opt *opt, struct grep_source *gs);
 struct grep_opt *grep_opt_dup(const struct grep_opt *opt);
 int grep_threads_ok(const struct grep_opt *opt);
 
+#define GREP_USE_READ_LOCK (1 << 0)
+#define GREP_USE_ATTR_LOCK (1 << 1)
+#define GREP_USE_ALL_LOCKS (~0)
+
 /*
  * Mutex used around access to the attributes machinery if
  * opt->use_threads.  Must be initialized/destroyed by callers!
@@ -239,13 +243,13 @@ extern pthread_mutex_t grep_read_mutex;
 
 static inline void grep_read_lock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_READ_LOCK)
 		pthread_mutex_lock(&grep_read_mutex);
 }
 
 static inline void grep_read_unlock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_READ_LOCK)
 		pthread_mutex_unlock(&grep_read_mutex);
 }
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 2/4] grep: allow locks to be enabled individually Matheus Tavares
@ 2019-08-10 20:27 ` Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases Matheus Tavares
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
  4 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, Brandon Williams,
	Ævar Arnfjörð Bjarmason, Junio C Hamano

git-grep uses 'grep_read_mutex' to protect some object reading
operations. But these have their own internal lock now, which ensure a
better performance (with more parallel regions). So, disable the former
when it's possible to use the latter, with enable_obj_read_lock().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index a871bad8ad..fa51392222 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -205,7 +205,17 @@ static void start_threads(struct grep_opt *opt)
 	pthread_cond_init(&cond_add, NULL);
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
-	grep_use_locks = GREP_USE_ALL_LOCKS;
+	if (recurse_submodules || opt->allow_textconv) {
+		/*
+		 * textconv and submodules' operations are not thread-safe yet
+		 * so we must use grep_read_lock when grepping multithreaded
+		 * with these options.
+		 */
+		grep_use_locks = GREP_USE_ALL_LOCKS;
+	} else {
+		grep_use_locks = GREP_USE_ATTR_LOCK;
+		enable_obj_read_lock();
+	}
 
 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
@@ -227,7 +237,7 @@ static void start_threads(struct grep_opt *opt)
 	}
 }
 
-static int wait_all(void)
+static int wait_all(struct grep_opt *opt)
 {
 	int hit = 0;
 	int i;
@@ -263,6 +273,9 @@ static int wait_all(void)
 	pthread_cond_destroy(&cond_write);
 	pthread_cond_destroy(&cond_result);
 	grep_use_locks = 0;
+	if (!recurse_submodules && !opt->allow_textconv) {
+		disable_obj_read_lock();
+	}
 
 	return hit;
 }
@@ -1140,7 +1153,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	}
 
 	if (num_threads > 1)
-		hit |= wait_all();
+		hit |= wait_all(&opt);
 	if (hit && show_in_pager)
 		run_pager(&opt, prefix);
 	clear_pathspec(&pathspec);
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
                   ` (2 preceding siblings ...)
  2019-08-10 20:27 ` [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible Matheus Tavares
@ 2019-08-10 20:27 ` Matheus Tavares
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
  4 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, Junio C Hamano, Jeff King,
	Brandon Williams, Manav Rathi

They were disabled at 53b8d93 ("grep: disable threading in non-worktree
case", 12-12-2011), due to observable performance drops. But now that
zlib inflation can be performed in parallel, for some of git-grep's
options, we can regain the speedup.

Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  17.3557s  | 20.8410s
    2    |   9.7170s  | 11.2415s
    8    |   6.1723s  |  6.9378s

These are all means of 30 executions after 2 warmup runs. All tests were
executed on an i7-7700HQ with 16GB of RAM and SSD. But to make sure the
optimization also performs well on HDD, the tests were repeated on an
AMD Turion 64 X2 TL-62 (dual-core) with 4GB of RAM and HDD (SATA-150,
5400 rpm):

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  40.3347s  |  47.6173s
    2    |  27.6547s  |  35.1797s

Unfortunately, textconv and submodules' operations remain thread-unsafe,
needing locks to be safely executed when threaded. Because of that, it's
not currently worthy to grep in parallel with them. So, when --textconv
or --recurse-submodules are given for a non-worktree case, threads are
kept disabled. In order to clarify this behavior, let's also add a
"NOTES" section to Documentation/git-grep.txt explaining the thread
usage details.

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 12 ++++++++++++
 builtin/grep.c             |  3 ++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 2d27969057..9686875fbc 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -330,6 +330,18 @@ EXAMPLES
 `git grep solution -- :^Documentation`::
 	Looks for `solution`, excluding files in `Documentation`.
 
+NOTES
+-----
+
+The --threads option (and grep.threads configuration) will be ignored when
+--open-files-in-pager is used, forcing a single-threaded execution.
+
+When grepping the index file (with --cached or giving tree objects), the
+following options will also suppress thread creation:
+
+	--recurse_submodules
+	--textconv
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/grep.c b/builtin/grep.c
index fa51392222..e5a9da471a 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1073,7 +1073,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	pathspec.recursive = 1;
 	pathspec.recurse_submodules = !!recurse_submodules;
 
-	if (list.nr || cached || show_in_pager) {
+	if (show_in_pager ||
+	   ((list.nr || cached) && (recurse_submodules || opt.allow_textconv))) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
 		num_threads = 1;
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 00/11] grep: improve threading and fix race conditions
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
                   ` (3 preceding siblings ...)
  2019-08-10 20:27 ` [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases Matheus Tavares
@ 2019-09-30  1:50 ` Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 01/11] grep: fix race conditions on userdiff calls Matheus Tavares
                     ` (11 more replies)
  4 siblings, 12 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git; +Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder

This series focus on re-enabling threads at git-grep for the
non-worktree case. They are currently disabled due to being slower than
single-threaded grep in this case. However, by allowing parallel zlib
inflation when reading objects, speedups of up to 3x were observed.

The patchset also contains some fixes for race conditions found in the
worktree git-grep and thread optimizations to hopefully increase
overall performance.

This version was almost entirely re-written from scratch so I thought a
range-diff wouldn't be very useful. The major differences from the first
one are the race condition fixes and being able to run --textconv and
--recurse-submodules threaded now.

Matheus Tavares (11):
  grep: fix race conditions on userdiff calls
  grep: fix race conditions at grep_submodule()
  grep: fix racy calls in grep_objects()
  replace-object: make replace operations thread-safe
  object-store: allow threaded access to object reading
  grep: replace grep_read_mutex by internal obj read lock
  submodule-config: add skip_if_read option to repo_read_gitmodules()
  grep: allow submodule functions to run in parallel
  grep: protect packed_git [re-]initialization
  grep: re-enable threads in non-worktree case
  grep: move driver pre-load out of critical section

 .tsan-suppressions         |  6 +++
 Documentation/git-grep.txt | 11 +++++
 builtin/grep.c             | 90 +++++++++++++++++++-------------------
 grep.c                     | 32 ++++++++------
 grep.h                     | 13 ------
 object-store.h             | 37 ++++++++++++++++
 object.c                   |  2 +
 packfile.c                 |  9 ++++
 replace-object.c           | 11 ++++-
 replace-object.h           |  7 ++-
 sha1-file.c                | 57 +++++++++++++++++++++---
 submodule-config.c         | 18 +++-----
 submodule-config.h         |  2 +-
 unpack-trees.c             |  4 +-
 14 files changed, 205 insertions(+), 94 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v2 01/11] grep: fix race conditions on userdiff calls
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 02/11] grep: fix race conditions at grep_submodule() Matheus Tavares
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git; +Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder

git-grep uses an internal grep_read_mutex to protect object reading
operations. Similarly, there's a grep_attr_mutex to protect calls to the
gitattributes machinery. However, two of the three functions protected
by the last mutex may also perform object reading, as seen bellow:

- userdiff_get_textconv() > notes_cache_init() >
  notes_cache_match_validity() > lookup_commit_reference_gently() >
  parse_object() > repo_has_object_file() >
  repo_has_object_file_with_flags() > oid_object_info_extended()

- userdiff_find_by_path() > git_check_attr() > collect_some_attrs() >
  prepare_attr_stack() > read_attr() > read_attr_from_index() >
  read_blob_data_from_index() > read_object_file()

As these calls are not protected by grep_read_mutex, there might be race
conditions with other threads performing object reading (e.g. threads
calling fill_textconv() at grep.c:fill_textconv_grep()). To prevent
that, let's make sure to acquire the lock before both of these calls.

Note: this patch might slow down the threaded grep in worktree, for the
sake of thread-safeness. However, in the following patches we should
regain performance by replacing grep_read_mutex for an internal object
reading lock and allowing parallel inflation during object reading.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 grep.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index cd952ef5d3..b29946def2 100644
--- a/grep.c
+++ b/grep.c
@@ -1809,7 +1809,9 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		 * is not thread-safe.
 		 */
 		grep_attr_lock();
+		grep_read_lock();
 		textconv = userdiff_get_textconv(opt->repo, gs->driver);
+		grep_read_unlock();
 		grep_attr_unlock();
 	}
 
@@ -2177,8 +2179,11 @@ void grep_source_load_driver(struct grep_source *gs,
 		return;
 
 	grep_attr_lock();
-	if (gs->path)
+	if (gs->path) {
+		grep_read_lock();
 		gs->driver = userdiff_find_by_path(istate, gs->path);
+		grep_read_unlock();
+	}
 	if (!gs->driver)
 		gs->driver = userdiff_find_by_name("default");
 	grep_attr_unlock();
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 02/11] grep: fix race conditions at grep_submodule()
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 01/11] grep: fix race conditions on userdiff calls Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 03/11] grep: fix racy calls in grep_objects() Matheus Tavares
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Antonio Ospite, Brandon Williams, Jonathan Tan, Stefan Beller

There're currently two function calls in builtin/grep.c:grep_submodule()
which might result in race conditions:

- submodule_from_path(): it has config_with_options() in its call stack
  which, in turn, may have read_object_file() in its own. Therefore,
  calling the first function without acquiring grep_read_mutex may end
  up causing a race condition with other object read operations
  performed by worker threads (for example, at the fill_textconv()
  call in grep.c:fill_textconv_grep()).
- parse_object_or_die(): it falls into the same problem, having
  repo_has_object_file(the_repository, ...) in its call stack. Besides
  that, parse_object(), which is also called by parse_object_or_die(),
  is thread-unsafe and also called by object reading functions.

It's unlikely to really fall into a data race with these operations as
the volume of calls to them is usually very low. But we better protect
ourselves against this possibility, anyway. So, to solve these issues,
move both of these function calls into the critical section of
grep_read_mutex.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 2699001fbd..626dbe554d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -407,8 +407,7 @@ static int grep_submodule(struct grep_opt *opt,
 {
 	struct repository subrepo;
 	struct repository *superproject = opt->repo;
-	const struct submodule *sub = submodule_from_path(superproject,
-							  &null_oid, path);
+	const struct submodule *sub;
 	struct grep_opt subopt;
 	int hit;
 
@@ -419,6 +418,7 @@ static int grep_submodule(struct grep_opt *opt,
 	 * object.
 	 */
 	grep_read_lock();
+	sub = submodule_from_path(superproject, &null_oid, path);
 
 	if (!is_submodule_active(superproject, path)) {
 		grep_read_unlock();
@@ -455,9 +455,8 @@ static int grep_submodule(struct grep_opt *opt,
 		unsigned long size;
 		struct strbuf base = STRBUF_INIT;
 
-		object = parse_object_or_die(oid, oid_to_hex(oid));
-
 		grep_read_lock();
+		object = parse_object_or_die(oid, oid_to_hex(oid));
 		data = read_object_with_reference(&subrepo,
 						  &object->oid, tree_type,
 						  &size, NULL);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 03/11] grep: fix racy calls in grep_objects()
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 01/11] grep: fix race conditions on userdiff calls Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 02/11] grep: fix race conditions at grep_submodule() Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 04/11] replace-object: make replace operations thread-safe Matheus Tavares
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Jonathan Tan, Stefan Beller, Brandon Williams, brian m. carlson

deref_tag() calls is_promisor_object() and parse_object(), both of which
perform lazy initializations and other thread-unsafe operations. If it
was only called by grep_objects() this wouldn't be a problem as the
latter is only executed by the main thread. However, deref_tag() is also
present in read_object_file()'s call stack. So calling deref_tag() in
grep_objects() without acquiring the grep_read_mutex may incur in a race
condition with object reading operations (such as the ones internally
performed by fill_textconv(), called at fill_textconv_grep()). The same
problem happens with the call to gitmodules_config_oid() which also has
parse_object() in its call stack. Fix that protecting both call with the
said grep_read_mutex.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 626dbe554d..fa8b9996d1 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -658,13 +658,18 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 
 	for (i = 0; i < nr; i++) {
 		struct object *real_obj;
+
+		grep_read_lock();
 		real_obj = deref_tag(opt->repo, list->objects[i].item,
 				     NULL, 0);
+		grep_read_unlock();
 
 		/* load the gitmodules file for this rev */
 		if (recurse_submodules) {
 			submodule_free(opt->repo);
+			grep_read_lock();
 			gitmodules_config_oid(&real_obj->oid);
+			grep_read_unlock();
 		}
 		if (grep_object(opt, pathspec, real_obj, list->objects[i].name,
 				list->objects[i].path)) {
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 04/11] replace-object: make replace operations thread-safe
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
                     ` (2 preceding siblings ...)
  2019-09-30  1:50   ` [PATCH v2 03/11] grep: fix racy calls in grep_objects() Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 05/11] object-store: allow threaded access to object reading Matheus Tavares
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Stefan Beller

replace-object functions are very close to being thread-safe: the only
current racy section is the lazy initialization at
prepare_replace_object(). The following patches will protect some object
reading operations to be called threaded, but before that, replace
functions must be protected. To do so, add a mutex to struct
raw_object_store and acquire it before lazy initializing the
replace_map. This won't cause any noticeable performance drop as the
mutex will no longer be used after the replace_map is initialized.

Later, when the replace functions are called in parallel, thread
debuggers might point our use of the added replace_map_initialized flag
as a data race. However, as this boolean variable is initialized as
false and it's only updated once, there's no real harm. It's perfectly
fine if the value is updated right after a thread read it in
replace-map.h:lookup_replace_object() (there'll only be a performance
penalty for the affected threads at that moment). We could cease the
debugger warning protecting the variable reading at the said function.
However, this would negatively affect performance for all threads
calling it, at any time, so it's not really worthy since the warning
doesn't represent a real problem. Instead, to make sure we don't get
false positives (at ThreadSanitizer, at least) an entry for the
respective function is added to .tsan-suppressions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 .tsan-suppressions |  6 ++++++
 object-store.h     |  2 ++
 object.c           |  2 ++
 replace-object.c   | 11 ++++++++++-
 replace-object.h   |  7 ++++++-
 5 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/.tsan-suppressions b/.tsan-suppressions
index 8c85014a0a..5ba86d6845 100644
--- a/.tsan-suppressions
+++ b/.tsan-suppressions
@@ -8,3 +8,9 @@
 # in practice it (hopefully!) doesn't matter.
 race:^want_color$
 race:^transfer_debug$
+
+# A boolean value, which tells whether the replace_map has been initialized or
+# not, is read racily with an update. As this variable is written to only once,
+# and it's OK if the value change right after reading it, this shouldn't be a
+# problem.
+race:^lookup_replace_object$
diff --git a/object-store.h b/object-store.h
index 7f7b3cdd80..b22e20ad7d 100644
--- a/object-store.h
+++ b/object-store.h
@@ -110,6 +110,8 @@ struct raw_object_store {
 	 * (see git-replace(1)).
 	 */
 	struct oidmap *replace_map;
+	unsigned replace_map_initialized : 1;
+	pthread_mutex_t replace_mutex; /* protect object replace functions */
 
 	struct commit_graph *commit_graph;
 	unsigned commit_graph_attempted : 1; /* if loading has been attempted */
diff --git a/object.c b/object.c
index 07bdd5b26e..7ef5856f57 100644
--- a/object.c
+++ b/object.c
@@ -480,6 +480,7 @@ struct raw_object_store *raw_object_store_new(void)
 
 	memset(o, 0, sizeof(*o));
 	INIT_LIST_HEAD(&o->packed_git_mru);
+	pthread_mutex_init(&o->replace_mutex, NULL);
 	return o;
 }
 
@@ -507,6 +508,7 @@ void raw_object_store_clear(struct raw_object_store *o)
 
 	oidmap_free(o->replace_map, 1);
 	FREE_AND_NULL(o->replace_map);
+	pthread_mutex_destroy(&o->replace_mutex);
 
 	free_commit_graph(o->commit_graph);
 	o->commit_graph = NULL;
diff --git a/replace-object.c b/replace-object.c
index e295e87943..7bd9aba6ee 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -34,14 +34,23 @@ static int register_replace_ref(struct repository *r,
 
 void prepare_replace_object(struct repository *r)
 {
-	if (r->objects->replace_map)
+	if (r->objects->replace_map_initialized)
 		return;
 
+	pthread_mutex_lock(&r->objects->replace_mutex);
+	if (r->objects->replace_map_initialized) {
+		pthread_mutex_unlock(&r->objects->replace_mutex);
+		return;
+	}
+
 	r->objects->replace_map =
 		xmalloc(sizeof(*r->objects->replace_map));
 	oidmap_init(r->objects->replace_map, 0);
 
 	for_each_replace_ref(r, register_replace_ref, NULL);
+	r->objects->replace_map_initialized = 1;
+
+	pthread_mutex_unlock(&r->objects->replace_mutex);
 }
 
 /* We allow "recursive" replacement. Only within reason, though */
diff --git a/replace-object.h b/replace-object.h
index 04ed7a85a2..3fbc32eb7b 100644
--- a/replace-object.h
+++ b/replace-object.h
@@ -24,12 +24,17 @@ const struct object_id *do_lookup_replace_object(struct repository *r,
  * name (replaced recursively, if necessary).  The return value is
  * either sha1 or a pointer to a permanently-allocated value.  When
  * object replacement is suppressed, always return sha1.
+ *
+ * Note: some thread debuggers might point a data race on the
+ * replace_map_initialized reading in this function. However, we know there's no
+ * problem in the value being updated by one thread right after another one read
+ * it here (and it should be written to only once, anyway).
  */
 static inline const struct object_id *lookup_replace_object(struct repository *r,
 							    const struct object_id *oid)
 {
 	if (!read_replace_refs ||
-	    (r->objects->replace_map &&
+	    (r->objects->replace_map_initialized &&
 	     r->objects->replace_map->map.tablesize == 0))
 		return oid;
 	return do_lookup_replace_object(r, oid);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 05/11] object-store: allow threaded access to object reading
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
                     ` (3 preceding siblings ...)
  2019-09-30  1:50   ` [PATCH v2 04/11] replace-object: make replace operations thread-safe Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-11-12  2:54     ` Jonathan Tan
  2019-09-30  1:50   ` [PATCH v2 06/11] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
                     ` (6 subsequent siblings)
  11 siblings, 1 reply; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Stefan Beller, Jonathan Tan, Jeff King

Allow object reading to be performed by multiple threads protecting it
with an internal lock. The lock usage can be toggled with
enable_obj_read_lock() and disable_obj_read_lock(). Currently, the
functions which can be safely called in parallel are:
read_object_file_extended(), repo_read_object_file(),
read_object_file(), read_object_with_reference(), read_object(),
oid_object_info() and oid_object_info_extended(). It's also possible to
use obj_read_lock() and obj_read_unlock() to protect other sections that
cannot execute in parallel with object reading.

Probably there are many spots in the functions listed above that could
be executed unlocked (and thus, in parallel). But, for now, we are most
interested in allowing parallel access to zlib inflation. This is one of
the sections where object reading spends most of the time and it's
already thread-safe. So, to take advantage of that, the respective lock
is released when calling git_inflate() and re-acquired right after, for
every calling spot in oid_object_info_extended()'s call chain. We may
refine the lock to also exploit other possible parallel spots in the
future, but threaded zlib inflation should already give great speedups
for now.

Note that add_delta_base_cache() was also modified to skip adding
already present entries to the cache. This wasn't possible before, but
now it is since phase I and phase III of unpack_entry() may execute
concurrently.

Another important thing to notice is that the object reading lock only
works in conjunction with the 'struct raw_object_store's replace_mutex.
Otherwise, there would still be racy spots in object reading
functions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 object-store.h | 35 +++++++++++++++++++++++++++++++
 packfile.c     |  7 +++++++
 sha1-file.c    | 57 +++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 94 insertions(+), 5 deletions(-)

diff --git a/object-store.h b/object-store.h
index b22e20ad7d..8f63f21ad2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -6,6 +6,7 @@
 #include "list.h"
 #include "sha1-array.h"
 #include "strbuf.h"
+#include "thread-utils.h"
 
 struct object_directory {
 	struct object_directory *next;
@@ -230,6 +231,40 @@ int has_loose_object_nonlocal(const struct object_id *);
 
 void assert_oid_type(const struct object_id *oid, enum object_type expect);
 
+/*
+ * Enabling the object read lock allows multiple threads to safely call the
+ * following functions in parallel: repo_read_object_file(), read_object_file(),
+ * read_object_file_extended(), read_object_with_reference(), read_object(),
+ * oid_object_info() and oid_object_info_extended().
+ *
+ * obj_read_lock() and obj_read_unlock() may also be used to protect other
+ * section which cannot execute in parallel with object reading. Since the used
+ * lock is a recursive mutex, these sections can even contain calls to object
+ * reading functions. However, beware that in these cases zlib inflation won't
+ * be performed in parallel, losing performance.
+ *
+ * TODO: oid_object_info_extended()'s call stack has a recursive behavior. If
+ * any of its callees end up calling it, this recursive call won't benefit from
+ * parallel inflation.
+ */
+void enable_obj_read_lock(void);
+void disable_obj_read_lock(void);
+
+extern int obj_read_use_lock;
+extern pthread_mutex_t obj_read_mutex;
+
+static inline void obj_read_lock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_lock(&obj_read_mutex);
+}
+
+static inline void obj_read_unlock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_unlock(&obj_read_mutex);
+}
+
 struct object_info {
 	/* Request */
 	enum object_type *typep;
diff --git a/packfile.c b/packfile.c
index 1a7d69fe32..a336972174 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1098,7 +1098,9 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1451,6 +1453,9 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
 	struct list_head *lru, *tmp;
 
+	if (get_delta_base_cache_entry(p, base_offset))
+		return;
+
 	delta_base_cached += base_size;
 
 	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
@@ -1580,7 +1585,9 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		if (!stream.avail_out)
 			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
diff --git a/sha1-file.c b/sha1-file.c
index e85f249a5d..b4f2f5cb94 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1148,6 +1148,8 @@ static int unpack_loose_short_header(git_zstream *stream,
 				     unsigned char *map, unsigned long mapsize,
 				     void *buffer, unsigned long bufsiz)
 {
+	int ret;
+
 	/* Get the data stream */
 	memset(stream, 0, sizeof(*stream));
 	stream->next_in = map;
@@ -1156,7 +1158,11 @@ static int unpack_loose_short_header(git_zstream *stream,
 	stream->avail_out = bufsiz;
 
 	git_inflate_init(stream);
-	return git_inflate(stream, 0);
+	obj_read_unlock();
+	ret = git_inflate(stream, 0);
+	obj_read_lock();
+
+	return ret;
 }
 
 int unpack_loose_header(git_zstream *stream,
@@ -1201,7 +1207,9 @@ static int unpack_loose_header_to_strbuf(git_zstream *stream, unsigned char *map
 	stream->avail_out = bufsiz;
 
 	do {
+		obj_read_unlock();
 		status = git_inflate(stream, 0);
+		obj_read_lock();
 		strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
 		if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
 			return 0;
@@ -1241,8 +1249,11 @@ static void *unpack_loose_rest(git_zstream *stream,
 		 */
 		stream->next_out = buf + bytes;
 		stream->avail_out = size - bytes;
-		while (status == Z_OK)
+		while (status == Z_OK) {
+			obj_read_unlock();
 			status = git_inflate(stream, Z_FINISH);
+			obj_read_lock();
+		}
 	}
 	if (status == Z_STREAM_END && !stream->avail_in) {
 		git_inflate_end(stream);
@@ -1412,10 +1423,32 @@ static int loose_object_info(struct repository *r,
 	return (status < 0) ? status : 0;
 }
 
+int obj_read_use_lock = 0;
+pthread_mutex_t obj_read_mutex;
+
+void enable_obj_read_lock(void)
+{
+	if (obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 1;
+	init_recursive_mutex(&obj_read_mutex);
+}
+
+void disable_obj_read_lock(void)
+{
+	if (!obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 0;
+	pthread_mutex_destroy(&obj_read_mutex);
+}
+
 int fetch_if_missing = 1;
 
-int oid_object_info_extended(struct repository *r, const struct object_id *oid,
-			     struct object_info *oi, unsigned flags)
+static int do_oid_object_info_extended(struct repository *r,
+				       const struct object_id *oid,
+				       struct object_info *oi, unsigned flags)
 {
 	static struct object_info blank_oi = OBJECT_INFO_INIT;
 	struct pack_entry e;
@@ -1423,6 +1456,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 	const struct object_id *real = oid;
 	int already_retried = 0;
 
+
 	if (flags & OBJECT_INFO_LOOKUP_REPLACE)
 		real = lookup_replace_object(r, oid);
 
@@ -1498,7 +1532,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 	rtype = packed_object_info(r, e.p, e.offset, oi);
 	if (rtype < 0) {
 		mark_bad_packed_object(e.p, real->hash);
-		return oid_object_info_extended(r, real, oi, 0);
+		return do_oid_object_info_extended(r, real, oi, 0);
 	} else if (oi->whence == OI_PACKED) {
 		oi->u.packed.offset = e.offset;
 		oi->u.packed.pack = e.p;
@@ -1509,6 +1543,17 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 	return 0;
 }
 
+int oid_object_info_extended(struct repository *r, const struct object_id *oid,
+			     struct object_info *oi, unsigned flags)
+{
+	int ret;
+	obj_read_lock();
+	ret = do_oid_object_info_extended(r, oid, oi, flags);
+	obj_read_unlock();
+	return ret;
+}
+
+
 /* returns enum object_type or negative */
 int oid_object_info(struct repository *r,
 		    const struct object_id *oid,
@@ -1581,6 +1626,7 @@ void *read_object_file_extended(struct repository *r,
 	if (data)
 		return data;
 
+	obj_read_lock();
 	if (errno && errno != ENOENT)
 		die_errno(_("failed to read object %s"), oid_to_hex(oid));
 
@@ -1596,6 +1642,7 @@ void *read_object_file_extended(struct repository *r,
 	if ((p = has_packed_and_bad(r, repl->hash)) != NULL)
 		die(_("packed object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), p->pack_name);
+	obj_read_unlock();
 
 	return NULL;
 }
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 06/11] grep: replace grep_read_mutex by internal obj read lock
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
                     ` (4 preceding siblings ...)
  2019-09-30  1:50   ` [PATCH v2 05/11] object-store: allow threaded access to object reading Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-10-01 19:23     ` [PATCH] squash! " Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 07/11] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
                     ` (5 subsequent siblings)
  11 siblings, 1 reply; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Stefan Beller, brian m. carlson, Brandon Williams

git-grep uses 'grep_read_mutex' to protect its calls to object reading
operations. But these have their own internal lock now, which ensures a
better performance (allowing parallel access to more regions). So, let's
remove the former and, instead, activate the latter with
enable_obj_read_lock().

Sections that are currently protected by 'grep_read_mutex' but are not
internally protected by the object reading lock should be surrounded by
obj_read_lock() and obj_read_unlock(). These guarantee mutual exclusion
with object reading operations, keeping the current behavior and
avoiding race conditions. Namely, these places are:

  In grep.c:

  - fill_textconv() at fill_textconv_grep().
  - userdiff_get_textconv() at grep_source_1().

  In builtin/grep.c:

  - parse_object_or_die() and the submodule functions at
    grep_submodule().
  - deref_tag() and gitmodules_config_oid() at grep_objects().

If these functions become thread-safe, in the future, we might remove
the locking and probably get some speedup.

Note that some of the submodule functions will already be thread-safe
(or close to being thread-safe) with the internal object reading lock.
However, as some of them will require additional modifications to be
removed from the critical section, this will be done in its own patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 46 ++++++++++++++++------------------------------
 grep.c         | 39 +++++++++++++++++++--------------------
 grep.h         | 13 -------------
 3 files changed, 35 insertions(+), 63 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index fa8b9996d1..5a404ee1db 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -200,12 +200,12 @@ static void start_threads(struct grep_opt *opt)
 	int i;
 
 	pthread_mutex_init(&grep_mutex, NULL);
-	pthread_mutex_init(&grep_read_mutex, NULL);
 	pthread_mutex_init(&grep_attr_mutex, NULL);
 	pthread_cond_init(&cond_add, NULL);
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
 	grep_use_locks = 1;
+	enable_obj_read_lock();
 
 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
@@ -257,12 +257,12 @@ static int wait_all(void)
 	free(threads);
 
 	pthread_mutex_destroy(&grep_mutex);
-	pthread_mutex_destroy(&grep_read_mutex);
 	pthread_mutex_destroy(&grep_attr_mutex);
 	pthread_cond_destroy(&cond_add);
 	pthread_cond_destroy(&cond_write);
 	pthread_cond_destroy(&cond_result);
 	grep_use_locks = 0;
+	disable_obj_read_lock();
 
 	return hit;
 }
@@ -295,16 +295,6 @@ static int grep_cmd_config(const char *var, const char *value, void *cb)
 	return st;
 }
 
-static void *lock_and_read_oid_file(const struct object_id *oid, enum object_type *type, unsigned long *size)
-{
-	void *data;
-
-	grep_read_lock();
-	data = read_object_file(oid, type, size);
-	grep_read_unlock();
-	return data;
-}
-
 static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
 		     const char *filename, int tree_name_len,
 		     const char *path)
@@ -413,20 +403,20 @@ static int grep_submodule(struct grep_opt *opt,
 
 	/*
 	 * NEEDSWORK: submodules functions need to be protected because they
-	 * access the object store via config_from_gitmodules(): the latter
-	 * uses get_oid() which, for now, relies on the global the_repository
-	 * object.
+	 * call config_from_gitmodules(): the latter contains in its call stack
+	 * many thread-unsafe operations that are racy with object reading, such
+	 * as parse_object() and is_promisor_object().
 	 */
-	grep_read_lock();
+	obj_read_lock();
 	sub = submodule_from_path(superproject, &null_oid, path);
 
 	if (!is_submodule_active(superproject, path)) {
-		grep_read_unlock();
+		obj_read_unlock();
 		return 0;
 	}
 
 	if (repo_submodule_init(&subrepo, superproject, sub)) {
-		grep_read_unlock();
+		obj_read_unlock();
 		return 0;
 	}
 
@@ -443,7 +433,7 @@ static int grep_submodule(struct grep_opt *opt,
 	 * object.
 	 */
 	add_to_alternates_memory(subrepo.objects->odb->path);
-	grep_read_unlock();
+	obj_read_unlock();
 
 	memcpy(&subopt, opt, sizeof(subopt));
 	subopt.repo = &subrepo;
@@ -455,13 +445,12 @@ static int grep_submodule(struct grep_opt *opt,
 		unsigned long size;
 		struct strbuf base = STRBUF_INIT;
 
-		grep_read_lock();
+		obj_read_lock();
 		object = parse_object_or_die(oid, oid_to_hex(oid));
+		obj_read_unlock();
 		data = read_object_with_reference(&subrepo,
 						  &object->oid, tree_type,
 						  &size, NULL);
-		grep_read_unlock();
-
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&object->oid));
 
@@ -586,7 +575,7 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 			void *data;
 			unsigned long size;
 
-			data = lock_and_read_oid_file(&entry.oid, &type, &size);
+			data = read_object_file(&entry.oid, &type, &size);
 			if (!data)
 				die(_("unable to read tree (%s)"),
 				    oid_to_hex(&entry.oid));
@@ -624,12 +613,9 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		struct strbuf base;
 		int hit, len;
 
-		grep_read_lock();
 		data = read_object_with_reference(opt->repo,
 						  &obj->oid, tree_type,
 						  &size, NULL);
-		grep_read_unlock();
-
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
 
@@ -659,17 +645,17 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 	for (i = 0; i < nr; i++) {
 		struct object *real_obj;
 
-		grep_read_lock();
+		obj_read_lock();
 		real_obj = deref_tag(opt->repo, list->objects[i].item,
 				     NULL, 0);
-		grep_read_unlock();
+		obj_read_unlock();
 
 		/* load the gitmodules file for this rev */
 		if (recurse_submodules) {
 			submodule_free(opt->repo);
-			grep_read_lock();
+			obj_read_lock();
 			gitmodules_config_oid(&real_obj->oid);
-			grep_read_unlock();
+			obj_read_unlock();
 		}
 		if (grep_object(opt, pathspec, real_obj, list->objects[i].name,
 				list->objects[i].path)) {
diff --git a/grep.c b/grep.c
index b29946def2..0ca400f7b6 100644
--- a/grep.c
+++ b/grep.c
@@ -1533,11 +1533,6 @@ static inline void grep_attr_unlock(void)
 		pthread_mutex_unlock(&grep_attr_mutex);
 }
 
-/*
- * Same as git_attr_mutex, but protecting the thread-unsafe object db access.
- */
-pthread_mutex_t grep_read_mutex;
-
 static int match_funcname(struct grep_opt *opt, struct grep_source *gs, char *bol, char *eol)
 {
 	xdemitconf_t *xecfg = opt->priv;
@@ -1734,13 +1729,20 @@ static int fill_textconv_grep(struct repository *r,
 	}
 
 	/*
-	 * fill_textconv is not remotely thread-safe; it may load objects
-	 * behind the scenes, and it modifies the global diff tempfile
-	 * structure.
+	 * fill_textconv is not remotely thread-safe; it modifies the global
+	 * diff tempfile structure, writes to the_repo's odb and might
+	 * internally call thread-unsafe functions such as the
+	 * prepare_packed_git() lazy-initializator. Because of the last two, we
+	 * must ensure mutual exclusion between this call and the object reading
+	 * API, thus we use obj_read_lock() here.
+	 *
+	 * TODO: allowing text conversion to run in parallel with object
+	 * reading operations might increase performance in the multithreaded
+	 * non-worktreee git-grep with --textconv.
 	 */
-	grep_read_lock();
+	obj_read_lock();
 	size = fill_textconv(r, driver, df, &buf);
-	grep_read_unlock();
+	obj_read_unlock();
 	free_filespec(df);
 
 	/*
@@ -1806,13 +1808,16 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		grep_source_load_driver(gs, opt->repo->index);
 		/*
 		 * We might set up the shared textconv cache data here, which
-		 * is not thread-safe.
+		 * is not thread-safe. Also, get_oid_with_context() and
+		 * parse_object() might be internally called. As they are not
+		 * currenty thread-safe and might be racy with object reading,
+		 * obj_read_lock() must be called.
 		 */
+		obj_read_lock();
 		grep_attr_lock();
-		grep_read_lock();
 		textconv = userdiff_get_textconv(opt->repo, gs->driver);
-		grep_read_unlock();
 		grep_attr_unlock();
+		obj_read_unlock();
 	}
 
 	/*
@@ -2111,10 +2116,7 @@ static int grep_source_load_oid(struct grep_source *gs)
 {
 	enum object_type type;
 
-	grep_read_lock();
 	gs->buf = read_object_file(gs->identifier, &type, &gs->size);
-	grep_read_unlock();
-
 	if (!gs->buf)
 		return error(_("'%s': unable to read %s"),
 			     gs->name,
@@ -2179,11 +2181,8 @@ void grep_source_load_driver(struct grep_source *gs,
 		return;
 
 	grep_attr_lock();
-	if (gs->path) {
-		grep_read_lock();
+	if (gs->path)
 		gs->driver = userdiff_find_by_path(istate, gs->path);
-		grep_read_unlock();
-	}
 	if (!gs->driver)
 		gs->driver = userdiff_find_by_name("default");
 	grep_attr_unlock();
diff --git a/grep.h b/grep.h
index 1875880f37..54bf3a1ed4 100644
--- a/grep.h
+++ b/grep.h
@@ -235,18 +235,5 @@ int grep_threads_ok(const struct grep_opt *opt);
  */
 extern int grep_use_locks;
 extern pthread_mutex_t grep_attr_mutex;
-extern pthread_mutex_t grep_read_mutex;
-
-static inline void grep_read_lock(void)
-{
-	if (grep_use_locks)
-		pthread_mutex_lock(&grep_read_mutex);
-}
-
-static inline void grep_read_unlock(void)
-{
-	if (grep_use_locks)
-		pthread_mutex_unlock(&grep_read_mutex);
-}
 
 #endif
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 07/11] submodule-config: add skip_if_read option to repo_read_gitmodules()
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
                     ` (5 preceding siblings ...)
  2019-09-30  1:50   ` [PATCH v2 06/11] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 08/11] grep: allow submodule functions to run in parallel Matheus Tavares
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Stefan Beller, Brandon Williams, Jonathan Tan

Currently, submodule-config.c doesn't have an externally acessible
function to read gitmodules only if it wasn't already read. But this
exactly behavior is internally implemented by gitmodules_read_check(),
to perform a lazy load. Let's merge this function with
repo_read_gitmodules() adding an 'skip_if_read' which allow both
internal and external callers to access this functionality. This
simplifies a little the code. The added option will also be used in the
following patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c     |  2 +-
 submodule-config.c | 18 ++++++------------
 submodule-config.h |  2 +-
 unpack-trees.c     |  4 ++--
 4 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5a404ee1db..1c4ff4a75f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -420,7 +420,7 @@ static int grep_submodule(struct grep_opt *opt,
 		return 0;
 	}
 
-	repo_read_gitmodules(&subrepo);
+	repo_read_gitmodules(&subrepo, 0);
 
 	/*
 	 * NEEDSWORK: This adds the submodule's object directory to the list of
diff --git a/submodule-config.c b/submodule-config.c
index 4264ee216f..8c4333120a 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -660,10 +660,13 @@ static int gitmodules_cb(const char *var, const char *value, void *data)
 	return parse_config(var, value, &parameter);
 }
 
-void repo_read_gitmodules(struct repository *repo)
+void repo_read_gitmodules(struct repository *repo, int skip_if_read)
 {
 	submodule_cache_check_init(repo);
 
+	if (repo->submodule_cache->gitmodules_read && skip_if_read)
+		return;
+
 	if (repo_read_index(repo) < 0)
 		return;
 
@@ -689,20 +692,11 @@ void gitmodules_config_oid(const struct object_id *commit_oid)
 	the_repository->submodule_cache->gitmodules_read = 1;
 }
 
-static void gitmodules_read_check(struct repository *repo)
-{
-	submodule_cache_check_init(repo);
-
-	/* read the repo's .gitmodules file if it hasn't been already */
-	if (!repo->submodule_cache->gitmodules_read)
-		repo_read_gitmodules(repo);
-}
-
 const struct submodule *submodule_from_name(struct repository *r,
 					    const struct object_id *treeish_name,
 		const char *name)
 {
-	gitmodules_read_check(r);
+	repo_read_gitmodules(r, 1);
 	return config_from(r->submodule_cache, treeish_name, name, lookup_name);
 }
 
@@ -710,7 +704,7 @@ const struct submodule *submodule_from_path(struct repository *r,
 					    const struct object_id *treeish_name,
 		const char *path)
 {
-	gitmodules_read_check(r);
+	repo_read_gitmodules(r, 1);
 	return config_from(r->submodule_cache, treeish_name, path, lookup_path);
 }
 
diff --git a/submodule-config.h b/submodule-config.h
index 1b4e2da658..7a76ef8cd8 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -39,7 +39,7 @@ int option_fetch_parse_recurse_submodules(const struct option *opt,
 					  const char *arg, int unset);
 int parse_update_recurse_submodules_arg(const char *opt, const char *arg);
 int parse_push_recurse_submodules_arg(const char *opt, const char *arg);
-void repo_read_gitmodules(struct repository *repo);
+void repo_read_gitmodules(struct repository *repo, int skip_if_read);
 void gitmodules_config_oid(const struct object_id *commit_oid);
 const struct submodule *submodule_from_name(struct repository *r,
 					    const struct object_id *commit_or_tree,
diff --git a/unpack-trees.c b/unpack-trees.c
index 9c25126aec..689575944c 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -292,11 +292,11 @@ static void load_gitmodules_file(struct index_state *index,
 	if (pos >= 0) {
 		struct cache_entry *ce = index->cache[pos];
 		if (!state && ce->ce_flags & CE_WT_REMOVE) {
-			repo_read_gitmodules(the_repository);
+			repo_read_gitmodules(the_repository, 0);
 		} else if (state && (ce->ce_flags & CE_UPDATE)) {
 			submodule_free(the_repository);
 			checkout_entry(ce, state, NULL, NULL);
-			repo_read_gitmodules(the_repository);
+			repo_read_gitmodules(the_repository, 0);
 		}
 	}
 }
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 08/11] grep: allow submodule functions to run in parallel
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
                     ` (6 preceding siblings ...)
  2019-09-30  1:50   ` [PATCH v2 07/11] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 09/11] grep: protect packed_git [re-]initialization Matheus Tavares
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Brandon Williams, Stefan Beller, Jeff King

Now that object reading operations are internally protected, the
submodule initialization functions at builtin/grep.c:grep_submodule()
are very close to being thread-safe. Let's take a look at each call and
remove from the critical section what we can, for better performance:

- submodule_from_path() and is_submodule_active() cannot be called in
  parallel yet only because they call repo_read_gitmodules() which
  contains, in its call stack, operations that would otherwise be in
  race condition with object reading (for example parse_object() and
  is_promisor_remote()). However, they only call repo_read_gitmodules()
  if it wasn't read before. So let's pre-read it before firing the
  threads and allow these two functions to safelly be called in
  parallel.

- repo_submodule_init() is already thread-safe, so remove it from the
  critical section without other necessary changes.

- The repo_read_gitmodules(&subrepo) call at grep_submodule() is safe as
  no other thread is performing object reading operations in the subrepo
  yet. However, threads might be working in the superproject, and this
  function calls add_to_alternates_memory() internally, which is racy
  with object readings in the superproject. So it must be kept
  protected for now. Let's add a "NEEDSWORK" to it, informing why it
  cannot be removed from the critical section yet.

- Finally, add_to_alternates_memory() must be kept protected by the same
  reason of the above item.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 1c4ff4a75f..c973ac46a7 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -401,25 +401,23 @@ static int grep_submodule(struct grep_opt *opt,
 	struct grep_opt subopt;
 	int hit;
 
-	/*
-	 * NEEDSWORK: submodules functions need to be protected because they
-	 * call config_from_gitmodules(): the latter contains in its call stack
-	 * many thread-unsafe operations that are racy with object reading, such
-	 * as parse_object() and is_promisor_object().
-	 */
-	obj_read_lock();
 	sub = submodule_from_path(superproject, &null_oid, path);
 
-	if (!is_submodule_active(superproject, path)) {
-		obj_read_unlock();
+	if (!is_submodule_active(superproject, path))
 		return 0;
-	}
 
-	if (repo_submodule_init(&subrepo, superproject, sub)) {
-		obj_read_unlock();
+	if (repo_submodule_init(&subrepo, superproject, sub))
 		return 0;
-	}
 
+	/*
+	 * NEEDSWORK: repo_read_gitmodules() might call
+	 * add_to_alternates_memory() via config_from_gitmodules(). This
+	 * operation causes a race condition with concurrent object readings
+	 * performed by the worker threads. That's why we need obj_read_lock()
+	 * here. It should be removed once it's no longer necessary to add the
+	 * subrepo's odbs to the in-memory alternates list.
+	 */
+	obj_read_lock();
 	repo_read_gitmodules(&subrepo, 0);
 
 	/*
@@ -1052,6 +1050,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	pathspec.recursive = 1;
 	pathspec.recurse_submodules = !!recurse_submodules;
 
+	if (recurse_submodules && (!use_index || untracked))
+		die(_("option not supported with --recurse-submodules"));
+
 	if (list.nr || cached || show_in_pager) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
@@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		    && (opt.pre_context || opt.post_context ||
 			opt.file_break || opt.funcbody))
 			skip_first_line = 1;
+
+		/*
+		 * Pre-read gitmodules (if not read already) to prevent racy
+		 * lazy reading in worker threads.
+		 */
+		if (recurse_submodules)
+			repo_read_gitmodules(the_repository, 1);
+
 		start_threads(&opt);
 	} else {
 		/*
@@ -1105,9 +1114,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (recurse_submodules && (!use_index || untracked))
-		die(_("option not supported with --recurse-submodules"));
-
 	if (!show_in_pager && !opt.status_only)
 		setup_pager();
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 09/11] grep: protect packed_git [re-]initialization
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
                     ` (7 preceding siblings ...)
  2019-09-30  1:50   ` [PATCH v2 08/11] grep: allow submodule functions to run in parallel Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 10/11] grep: re-enable threads in non-worktree case Matheus Tavares
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Brandon Williams, René Scharfe, Jonathan Tan, Jeff King,
	Stefan Beller

Some fields in struct raw_object_store are lazy initialized by the
thread-unsafe packfile.c:prepare_packed_git(). Although this function is
present in the call stack of git-grep threads, all paths to it are
currently protected by obj_read_lock() (and the main thread usually
indirectly calls it before firing the worker threads, anyway). However,
it's possible that future modifications add new unprotected paths to it,
introducing a race condition. Because errors derived from it wouldn't
happen often, it could be hard to detect. So to prevent future
headaches, let's force eager initialization of packed_git when setting
git-grep up. There'll be a small overhead in the cases where we didn't
really needed to prepare packed_git during execution but this shouldn't
be very noticeable.

Also, packed_git may be re-initialized by
packfile.c:reprepare_packed_git(). Again, all paths to it in git-grep
are already protected by obj_read_lock() but it may suffer from the same
problem in the future. So let's also internally protect it with
obj_read_lock() (which is a recursive mutex).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 8 ++++++--
 packfile.c     | 2 ++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index c973ac46a7..0947596bcd 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -24,6 +24,7 @@
 #include "submodule.h"
 #include "submodule-config.h"
 #include "object-store.h"
+#include "packfile.h"
 
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
@@ -1074,11 +1075,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			skip_first_line = 1;
 
 		/*
-		 * Pre-read gitmodules (if not read already) to prevent racy
-		 * lazy reading in worker threads.
+		 * Pre-read gitmodules (if not read already) and force eager
+		 * initialization of packed_git to prevent racy lazy
+		 * reading/initialization once worker threads are started.
 		 */
 		if (recurse_submodules)
 			repo_read_gitmodules(the_repository, 1);
+		if (startup_info->have_repository)
+			(void)get_packed_git(the_repository);
 
 		start_threads(&opt);
 	} else {
diff --git a/packfile.c b/packfile.c
index a336972174..5b32dac4ce 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1016,12 +1016,14 @@ void reprepare_packed_git(struct repository *r)
 {
 	struct object_directory *odb;
 
+	obj_read_lock();
 	for (odb = r->objects->odb; odb; odb = odb->next)
 		odb_clear_loose_cache(odb);
 
 	r->objects->approximate_object_count_valid = 0;
 	r->objects->packed_git_initialized = 0;
 	prepare_packed_git(r);
+	obj_read_unlock();
 }
 
 struct packed_git *get_packed_git(struct repository *r)
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 10/11] grep: re-enable threads in non-worktree case
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
                     ` (8 preceding siblings ...)
  2019-09-30  1:50   ` [PATCH v2 09/11] grep: protect packed_git [re-]initialization Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2019-09-30  1:50   ` [PATCH v2 11/11] grep: move driver pre-load out of critical section Matheus Tavares
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Brandon Williams, Manav Rathi

They were disabled at 53b8d93 ("grep: disable threading in non-worktree
case", 12-12-2011), due to observable performance drops (to the point
that using a single thread would be faster than multiple threads). But
now that zlib inflation can be performed in parallel we can regain the
speedup, so let's re-enable threads in non-worktree grep.

Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  17.2920s  |  20.9624s
    2    |   9.6512s  |  11.3184s
    4    |   6.7723s  |   7.6268s
    8**  |   6.2886s  |   6.9843s

These are all means of 30 executions after 2 warmup runs. All tests were
executed on an i7-7700HQ (quad core w/ hyper-threading), 16GB of RAM and
SSD, running Manjaro Linux. But to make sure the optimization also
performs well on HDD, the tests were repeated on another machine with an
i5-4210U (dual core w/ hyper-threading), 8GB of RAM and HDD (SATA III,
5400 rpm), also running Manjaro Linux:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  18.4035s  |  22.5368s
    2    |  12.5063s  |  14.6409s
    4**  |  10.9136s  |  12.7106s

** Note that in these cases we relied on hyper-threading, and that's
   probably why we don't see a big difference in time.

Unfortunately, multithreaded git-grep might be slow in the non-worktree
case when --textconv is used and there're too many text conversions.
Probably the reason for this is that the object read lock is used to
protect fill_textconv() and therefore there is a mutual exclusion
between textconv execution and object reading. Because both are time
consuming operations, not being able to perform them in parallel can
cause performance drops. To inform the users about this (and other
threading detais), let's also add a "NOTES ON THREADS" section to
Documentation/git-grep.txt.

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 11 +++++++++++
 builtin/grep.c             |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 2d27969057..00fc59d565 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -330,6 +330,17 @@ EXAMPLES
 `git grep solution -- :^Documentation`::
 	Looks for `solution`, excluding files in `Documentation`.
 
+NOTES ON THREADS
+----------------
+
+The `--threads` option (and the grep.threads configuration) will be ignored when
+`--open-files-in-pager` is used, forcing a single-threaded execution.
+
+When grepping the object store (with `--cached` or giving tree objects), running
+with multiple threads might perform slower than single threaded if `--textconv`
+is given and there're too many text conversions. So if you experience low
+performance in this case, it might be desirable to use `--threads=1`.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/grep.c b/builtin/grep.c
index 0947596bcd..163f14b60d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1054,7 +1054,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	if (recurse_submodules && (!use_index || untracked))
 		die(_("option not supported with --recurse-submodules"));
 
-	if (list.nr || cached || show_in_pager) {
+	if (show_in_pager) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
 		num_threads = 1;
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 11/11] grep: move driver pre-load out of critical section
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
                     ` (9 preceding siblings ...)
  2019-09-30  1:50   ` [PATCH v2 10/11] grep: re-enable threads in non-worktree case Matheus Tavares
@ 2019-09-30  1:50   ` Matheus Tavares
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git
  Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	Rasmus Villemoes, Jeff King

In builtin/grep.c:add_work() we pre-load the userdiff drivers before
adding the grep_source in the todo list. This operation is currently
being performed after acquiring the grep_mutex, but as it's already
thread-safe, we don't need to protect it here. So let's move it out of
the critical section which should avoid thread contention and improve
performance.

Running[1] `git grep --threads=8 abcd[02] HEAD` on chromium's
repository[2], I got the following mean times for 30 executions after 2
warmups:

        Original         |  6.2886s
-------------------------|-----------
 Out of critical section |  5.7852s

[1]: Tests performed on an i7-7700HQ with 16GB of RAM and SSD, running
     Manjaro Linux.
[2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
         04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 163f14b60d..d275b76647 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -92,8 +92,11 @@ static pthread_cond_t cond_result;
 
 static int skip_first_line;
 
-static void add_work(struct grep_opt *opt, const struct grep_source *gs)
+static void add_work(struct grep_opt *opt, struct grep_source *gs)
 {
+	if (opt->binary != GREP_BINARY_TEXT)
+		grep_source_load_driver(gs, opt->repo->index);
+
 	grep_lock();
 
 	while ((todo_end+1) % ARRAY_SIZE(todo) == todo_done) {
@@ -101,9 +104,6 @@ static void add_work(struct grep_opt *opt, const struct grep_source *gs)
 	}
 
 	todo[todo_end].source = *gs;
-	if (opt->binary != GREP_BINARY_TEXT)
-		grep_source_load_driver(&todo[todo_end].source,
-					opt->repo->index);
 	todo[todo_end].done = 0;
 	strbuf_reset(&todo[todo_end].out);
 	todo_end = (todo_end + 1) % ARRAY_SIZE(todo);
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH] squash! grep: replace grep_read_mutex by internal obj read lock
  2019-09-30  1:50   ` [PATCH v2 06/11] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
@ 2019-10-01 19:23     ` Matheus Tavares
  0 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-10-01 19:23 UTC (permalink / raw)
  To: git
  Cc: bwilliams.eng, christian.couder, gitster, jrnieder,
	olyatelezhnaya, pclouds, sandals, stefanbeller

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---

This is just a small fixup to be squashed into patch 6: with multiple
locks, the locking order must be consistent across all critical sections
to avoid dead-lock. Since grep_attr_lock() is called before
obj_read_lock() in grep_source_load_driver(), it must also be called
first here.


 grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/grep.c b/grep.c
index 0ca400f7b6..85ee89b5d6 100644
--- a/grep.c
+++ b/grep.c
@@ -1813,11 +1813,11 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		 * currenty thread-safe and might be racy with object reading,
 		 * obj_read_lock() must be called.
 		 */
-		obj_read_lock();
 		grep_attr_lock();
+		obj_read_lock();
 		textconv = userdiff_get_textconv(opt->repo, gs->driver);
-		grep_attr_unlock();
 		obj_read_unlock();
+		grep_attr_unlock();
 	}
 
 	/*
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/11] object-store: allow threaded access to object reading
  2019-09-30  1:50   ` [PATCH v2 05/11] object-store: allow threaded access to object reading Matheus Tavares
@ 2019-11-12  2:54     ` Jonathan Tan
  2019-11-13  5:20       ` Jeff King
  0 siblings, 1 reply; 47+ messages in thread
From: Jonathan Tan @ 2019-11-12  2:54 UTC (permalink / raw)
  To: matheus.bernardino
  Cc: git, christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder,
	stefanbeller, jonathantanmy, peff

> @@ -1580,7 +1585,9 @@ static void *unpack_compressed_entry(struct packed_git *p,
>  	do {
>  		in = use_pack(p, w_curs, curpos, &stream.avail_in);
>  		stream.next_in = in;
> +		obj_read_unlock();
>  		st = git_inflate(&stream, Z_FINISH);
> +		obj_read_lock();
>  		if (!stream.avail_out)
>  			break; /* the payload is larger than it should be */
>  		curpos += stream.next_in - in;

As I see it, the main purpose of this patch set is to move the mutex
guarding object reading from builtin/grep.c (grep_read_mutex) to
object-store.h (obj_read_mutex), so that we can add "holes" (non-mutex
sections) such as the one quoted above, in order that zlib inflation can
happen outside the mutex.

My concern is that the presence of these "holes" make object reading
non-thread-safe, defeating the purpose of obj_read_mutex. In particular,
the section quoted above assumes that the window section returned by
use_pack() is still valid throughout the inflation, but that window
could have been invalidated by things like an excess of windows open,
reprepare_packed_git(), etc.

I thought of this for a while but couldn't think of a good solution. If
we introduced a reference counting mechanism into Git, that would allow
us to hold the window open outside the mutex, but things like
reprepare_packed_git() would still be difficult.

If there's a good solution that only works for unpack_compressed_entry()
and not the other parts that also inflate, that might be sufficient (we
can inflate outside mutex just for this function, and inflate inside
mutex for the rest) - we would have to benchmark to be sure, but I think
that Matheus's main use case concerns grepping over a repo with a
packfile, not loose objects.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/11] object-store: allow threaded access to object reading
  2019-11-12  2:54     ` Jonathan Tan
@ 2019-11-13  5:20       ` Jeff King
  2019-11-14  5:57         ` Matheus Tavares Bernardino
  0 siblings, 1 reply; 47+ messages in thread
From: Jeff King @ 2019-11-13  5:20 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: matheus.bernardino, git, christian.couder, olyatelezhnaya,
	pclouds, gitster, jrnieder, stefanbeller

On Mon, Nov 11, 2019 at 06:54:18PM -0800, Jonathan Tan wrote:

> > @@ -1580,7 +1585,9 @@ static void *unpack_compressed_entry(struct packed_git *p,
> >  	do {
> >  		in = use_pack(p, w_curs, curpos, &stream.avail_in);
> >  		stream.next_in = in;
> > +		obj_read_unlock();
> >  		st = git_inflate(&stream, Z_FINISH);
> > +		obj_read_lock();
> >  		if (!stream.avail_out)
> >  			break; /* the payload is larger than it should be */
> >  		curpos += stream.next_in - in;
> 
> As I see it, the main purpose of this patch set is to move the mutex
> guarding object reading from builtin/grep.c (grep_read_mutex) to
> object-store.h (obj_read_mutex), so that we can add "holes" (non-mutex
> sections) such as the one quoted above, in order that zlib inflation can
> happen outside the mutex.
> 
> My concern is that the presence of these "holes" make object reading
> non-thread-safe, defeating the purpose of obj_read_mutex. In particular,
> the section quoted above assumes that the window section returned by
> use_pack() is still valid throughout the inflation, but that window
> could have been invalidated by things like an excess of windows open,
> reprepare_packed_git(), etc.

Yeah, I don't think the code above is safe. The map window can be
modified by other threads.

> I thought of this for a while but couldn't think of a good solution. If
> we introduced a reference counting mechanism into Git, that would allow
> us to hold the window open outside the mutex, but things like
> reprepare_packed_git() would still be difficult.

I think you could put a reader-writer lock into each window. The code
here would take the reader lock, and multiple readers could use it at
the same time. Any time the window needs to be shifted, resized, or
discarded, that code would take the writer lock, waiting for (and then
blocking out) any readers.

A pthread_rwlock would work, but it would be the first use in Git. I
think we'd need to find an equivalent for compat/win32/pthread.h.

-Peff

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/11] object-store: allow threaded access to object reading
  2019-11-13  5:20       ` Jeff King
@ 2019-11-14  5:57         ` Matheus Tavares Bernardino
  2019-11-14  6:01           ` Jeff King
  0 siblings, 1 reply; 47+ messages in thread
From: Matheus Tavares Bernardino @ 2019-11-14  5:57 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Tan, git, Christian Couder,
	Оля Тележная,
	Nguyễn Thái Ngọc Duy, Junio C Hamano,
	Jonathan Nieder, Stefan Beller

Hi, Peff and Jonathan

On Wed, Nov 13, 2019 at 2:20 AM Jeff King <peff@peff.net> wrote:
>
> On Mon, Nov 11, 2019 at 06:54:18PM -0800, Jonathan Tan wrote:
[...]
> > My concern is that the presence of these "holes" make object reading
> > non-thread-safe, defeating the purpose of obj_read_mutex. In particular,
> > the section quoted above assumes that the window section returned by
> > use_pack() is still valid throughout the inflation, but that window
> > could have been invalidated by things like an excess of windows open,
> > reprepare_packed_git(), etc.

Thank you for spotting this issue!

> > I thought of this for a while but couldn't think of a good solution. If
> > we introduced a reference counting mechanism into Git, that would allow
> > us to hold the window open outside the mutex, but things like
> > reprepare_packed_git() would still be difficult.
>
> I think you could put a reader-writer lock into each window. The code
> here would take the reader lock, and multiple readers could use it at
> the same time. Any time the window needs to be shifted, resized, or
> discarded, that code would take the writer lock, waiting for (and then
> blocking out) any readers.

Great idea, I'll work on that. Thanks!

About the other parallel inflation calls on loose objects at
unpack_loose_short_header(), unpack_loose_header_to_strbuf() and
unpack_loose_rest(): could they suffer from a similar race problem?
FWIU, the inflation input used in these cases comes from
map_loose_object(), and it's not referenced outside this scope. So I
think there's no risk of one thread munmapping the object file while
other is inflating it. Is that right?

> A pthread_rwlock would work, but it would be the first use in Git. I
> think we'd need to find an equivalent for compat/win32/pthread.h.

These[1][2] seems to be the equivalent options on Windows. I'll have
to read these docs more carefully, but [2] seems to be more
interesting in terms of speed. Also, the extra features of [1] are not
really needed for our use case here.

[1]: https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/reader-writer-spin-locks
[2]: https://docs.microsoft.com/en-us/windows/win32/sync/slim-reader-writer--srw--locks

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/11] object-store: allow threaded access to object reading
  2019-11-14  5:57         ` Matheus Tavares Bernardino
@ 2019-11-14  6:01           ` Jeff King
  2019-11-14 18:15             ` Jonathan Tan
  0 siblings, 1 reply; 47+ messages in thread
From: Jeff King @ 2019-11-14  6:01 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: Jonathan Tan, git, Christian Couder,
	Оля Тележная,
	Nguyễn Thái Ngọc Duy, Junio C Hamano,
	Jonathan Nieder, Stefan Beller

On Thu, Nov 14, 2019 at 02:57:42AM -0300, Matheus Tavares Bernardino wrote:

> About the other parallel inflation calls on loose objects at
> unpack_loose_short_header(), unpack_loose_header_to_strbuf() and
> unpack_loose_rest(): could they suffer from a similar race problem?
> FWIU, the inflation input used in these cases comes from
> map_loose_object(), and it's not referenced outside this scope. So I
> think there's no risk of one thread munmapping the object file while
> other is inflating it. Is that right?

Right, I think loose objects would be fine, because the mmap'd content
is local to that stack variable.

> > A pthread_rwlock would work, but it would be the first use in Git. I
> > think we'd need to find an equivalent for compat/win32/pthread.h.
> 
> These[1][2] seems to be the equivalent options on Windows. I'll have
> to read these docs more carefully, but [2] seems to be more
> interesting in terms of speed. Also, the extra features of [1] are not
> really needed for our use case here.
> 
> [1]: https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/reader-writer-spin-locks
> [2]: https://docs.microsoft.com/en-us/windows/win32/sync/slim-reader-writer--srw--locks

Yeah, looks like it, but I don't have any expertise there (nor a Windows
system to test on).

-Peff

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/11] object-store: allow threaded access to object reading
  2019-11-14  6:01           ` Jeff King
@ 2019-11-14 18:15             ` Jonathan Tan
  2019-11-15  4:12               ` Jeff King
  0 siblings, 1 reply; 47+ messages in thread
From: Jonathan Tan @ 2019-11-14 18:15 UTC (permalink / raw)
  To: peff
  Cc: matheus.bernardino, jonathantanmy, git, christian.couder,
	olyatelezhnaya, pclouds, gitster, jrnieder, stefanbeller

> > > A pthread_rwlock would work, but it would be the first use in Git. I
> > > think we'd need to find an equivalent for compat/win32/pthread.h.
> > 
> > These[1][2] seems to be the equivalent options on Windows. I'll have
> > to read these docs more carefully, but [2] seems to be more
> > interesting in terms of speed. Also, the extra features of [1] are not
> > really needed for our use case here.
> > 
> > [1]: https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/reader-writer-spin-locks
> > [2]: https://docs.microsoft.com/en-us/windows/win32/sync/slim-reader-writer--srw--locks
> 
> Yeah, looks like it, but I don't have any expertise there (nor a Windows
> system to test on).

One thing to note is that that if we do this, we'll be having one rwlock
per pack window. I couldn't find out what the Windows limits were, but
it seems that pthreads does not mandate having no limit [1]:

> Defining symbols for the maximum number of mutexes and condition
> variables was considered but rejected because the number of these
> objects may change dynamically. Furthermore, many implementations
> place these objects into application memory; thus, there is no
> explicit maximum.

[1] https://linux.die.net/man/3/pthread_mutex_init

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/11] object-store: allow threaded access to object reading
  2019-11-14 18:15             ` Jonathan Tan
@ 2019-11-15  4:12               ` Jeff King
  2019-12-19 22:27                 ` Matheus Tavares Bernardino
  0 siblings, 1 reply; 47+ messages in thread
From: Jeff King @ 2019-11-15  4:12 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: matheus.bernardino, git, christian.couder, olyatelezhnaya,
	pclouds, gitster, jrnieder, stefanbeller

On Thu, Nov 14, 2019 at 10:15:52AM -0800, Jonathan Tan wrote:

> > > > A pthread_rwlock would work, but it would be the first use in Git. I
> > > > think we'd need to find an equivalent for compat/win32/pthread.h.
> > > 
> > > These[1][2] seems to be the equivalent options on Windows. I'll have
> > > to read these docs more carefully, but [2] seems to be more
> > > interesting in terms of speed. Also, the extra features of [1] are not
> > > really needed for our use case here.
> > > 
> > > [1]: https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/reader-writer-spin-locks
> > > [2]: https://docs.microsoft.com/en-us/windows/win32/sync/slim-reader-writer--srw--locks
> > 
> > Yeah, looks like it, but I don't have any expertise there (nor a Windows
> > system to test on).
> 
> One thing to note is that that if we do this, we'll be having one rwlock
> per pack window. I couldn't find out what the Windows limits were, but
> it seems that pthreads does not mandate having no limit [1]:

Yeah, interesting point. We shouldn't _usually_ have too many windows at
once, but I think this would be the first place where we allocate a
non-constant number of thread mechanisms. And there are degenerate cases
where you might have tens of thousands of packs.

I suspect it's a non-issue in practice, though. Any limits there are
likely related to kernel resources like descriptors. And those
degenerate cases already run into issues there (did you know that Linux
systems limit the number of mmaps a single process can have? I didn't
until I tried repacking a repo with 35,000 packs).

-Peff

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/11] object-store: allow threaded access to object reading
  2019-11-15  4:12               ` Jeff King
@ 2019-12-19 22:27                 ` Matheus Tavares Bernardino
  2020-01-09 22:02                   ` Matheus Tavares Bernardino
  0 siblings, 1 reply; 47+ messages in thread
From: Matheus Tavares Bernardino @ 2019-12-19 22:27 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Tan, git, Christian Couder,
	Оля Тележная,
	Nguyễn Thái Ngọc Duy, Junio C Hamano,
	Jonathan Nieder, Stefan Beller

Hi, Peff and Jonathan

Sorry for the delay in re-rolling this series, it was a very busy end
of semester. But I finally completed my degree and had time to get
back to it :)

I tried the rwlock approach, but there were some subtle difficulties.
For example, we should hold the lock in write mode when free()-ing the
window, and thus, the lock couldn't be declared inside the struct
pack_window.

Also, it seemed that protecting the window reading at git_inflate()
wouldn't be enough: suppose a thread has just read from a window in
git_inflate() (holding the rwlock) and is now waiting for the
obj_read_mutex to continue its object reading operation. If another
thread (that already has the obj_read_mutex) acquires the rwlock in
sequence, it could free() the said window. It might not sound like a
problem since the first thread has already finished reading from it.
But since a pointer to the window would still be in the first thread's
stack as a window cursor, it could be later passed down to use_pack()
leading to a segfault. I couldn't come up with a solution for this
yet.

However, re-inspecting the code, it seemed to me that we might already
have a thread-safe mechanism. The window disposal operations (at
close_pack_windows() and unuse_one_window()) are only performed if
window.inuse_cnt == 0. So as a thread which reads from the window will
also previously increment its inuse_cnt, wouldn't the reading
operation be already protected?

Another concern would be close_pack_fd(), which can close packs even
with in-use windows. However, as the mmap docs[1] says: "closing the
file descriptor does not unmap the region".

Finally, we also considered reprepare_packed_git() as a possible
conflicting operation. But the function called by it to handle
packfile opening, prepare_pack(), won't reopen already available
packs. Therefore, IIUC, it will leave the opened windows intact.

So, aren't perhaps the window readings performed outside the
obj_read_mutex critical section already thread-safe?

Thanks,
Matheus

[1]: https://linux.die.net/man/2/mmap

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/11] object-store: allow threaded access to object reading
  2019-12-19 22:27                 ` Matheus Tavares Bernardino
@ 2020-01-09 22:02                   ` Matheus Tavares Bernardino
  2020-01-10 19:07                     ` Christian Couder
  0 siblings, 1 reply; 47+ messages in thread
From: Matheus Tavares Bernardino @ 2020-01-09 22:02 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Tan, git, Christian Couder,
	Оля Тележная,
	Nguyễn Thái Ngọc Duy, Junio C Hamano,
	Jonathan Nieder, Stefan Beller

Hi, folks

On Thu, Dec 19, 2019 at 5:27 PM Matheus Tavares Bernardino
<matheus.bernardino@usp.br> wrote:
>
[...]
> However, re-inspecting the code, it seemed to me that we might already
> have a thread-safe mechanism. The window disposal operations (at
> close_pack_windows() and unuse_one_window()) are only performed if
> window.inuse_cnt == 0. So as a thread which reads from the window will
> also previously increment its inuse_cnt, wouldn't the reading
> operation be already protected?
>
> Another concern would be close_pack_fd(), which can close packs even
> with in-use windows. However, as the mmap docs[1] says: "closing the
> file descriptor does not unmap the region".
>
> Finally, we also considered reprepare_packed_git() as a possible
> conflicting operation. But the function called by it to handle
> packfile opening, prepare_pack(), won't reopen already available
> packs. Therefore, IIUC, it will leave the opened windows intact.
>
> So, aren't perhaps the window readings performed outside the
> obj_read_mutex critical section already thread-safe?

Any thoughts on this?

> Thanks,
> Matheus
>
> [1]: https://linux.die.net/man/2/mmap

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/11] object-store: allow threaded access to object reading
  2020-01-09 22:02                   ` Matheus Tavares Bernardino
@ 2020-01-10 19:07                     ` Christian Couder
  0 siblings, 0 replies; 47+ messages in thread
From: Christian Couder @ 2020-01-10 19:07 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: Jeff King, Jonathan Tan, git,
	Оля Тележная,
	Nguyễn Thái Ngọc Duy, Junio C Hamano,
	Jonathan Nieder, Stefan Beller

Hi Matheus,

On Thu, Jan 9, 2020 at 11:02 PM Matheus Tavares Bernardino
<matheus.bernardino@usp.br> wrote:
>
> On Thu, Dec 19, 2019 at 5:27 PM Matheus Tavares Bernardino
> <matheus.bernardino@usp.br> wrote:

> > However, re-inspecting the code, it seemed to me that we might already
> > have a thread-safe mechanism. The window disposal operations (at
> > close_pack_windows() and unuse_one_window()) are only performed if
> > window.inuse_cnt == 0. So as a thread which reads from the window will
> > also previously increment its inuse_cnt, wouldn't the reading
> > operation be already protected?
> >
> > Another concern would be close_pack_fd(), which can close packs even
> > with in-use windows. However, as the mmap docs[1] says: "closing the
> > file descriptor does not unmap the region".
> >
> > Finally, we also considered reprepare_packed_git() as a possible
> > conflicting operation. But the function called by it to handle
> > packfile opening, prepare_pack(), won't reopen already available
> > packs. Therefore, IIUC, it will leave the opened windows intact.
> >
> > So, aren't perhaps the window readings performed outside the
> > obj_read_mutex critical section already thread-safe?
>
> Any thoughts on this?

My advice, if you don't hear from anyone in the next few days, is to
update the commit message, the cover letter and the comments inside
the patches with the information you researched and to resubmit a new
version of the patch series.

Thanks,
Christian.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v3 00/12] grep: improve threading and fix race conditions
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
                     ` (10 preceding siblings ...)
  2019-09-30  1:50   ` [PATCH v2 11/11] grep: move driver pre-load out of critical section Matheus Tavares
@ 2020-01-16  2:39   ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
                       ` (11 more replies)
  11 siblings, 12 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff

This series focus on re-enabling threads at git-grep for the
object store case. They are currently disabled due to being slower than
single-threaded grep in this case. However, by allowing parallel zlib
inflation when reading objects, speedups of up to 3.3x were observed.

The patchset also contains some fixes for race conditions found in the
working tree git-grep and other thread optimizations. With them, the
working tree case reached speedups of up to 1.52x.

Changes since v2:
- Updated commit message of patch 6 and added comment in the code to
  document how we ensure that the unlocked inflation code is thread-safe
  (as suggested by Christian[1]).
- Added patch to adjust the default number of threads in git-grep
  according to the number of available cores.
- Fixed some typos in the commit messages
- Rebased with master (d0654dc)

Note: this patchset is part of my GSoC project, and also of my
bachelor's capstone project (which had the same theme/goal). In case
someone might want to take a look, the project's final essay can be
found here[2] :) Besides extra details, it contains more performance
tests and plots of the resulting speedups.


[1]: https://public-inbox.org/git/CAP8UFD3QHNoePGsc9t0fVxTJpPy+KBCbPfXKGoNOx_2bCf1Hxg@mail.gmail.com/
[2]: https://matheustavares.gitlab.io/assets/tavares-final-essay.pdf
travis build: https://travis-ci.org/matheustavares/git/builds/637708218

Matheus Tavares (12):
  grep: fix race conditions on userdiff calls
  grep: fix race conditions at grep_submodule()
  grep: fix racy calls in grep_objects()
  replace-object: make replace operations thread-safe
  object-store: allow threaded access to object reading
  grep: replace grep_read_mutex by internal obj read lock
  submodule-config: add skip_if_read option to repo_read_gitmodules()
  grep: allow submodule functions to run in parallel
  grep: protect packed_git [re-]initialization
  grep: re-enable threads in non-worktree case
  grep: move driver pre-load out of critical section
  grep: use no. of cores as the default no. of threads

 .tsan-suppressions         |  6 +++
 Documentation/git-grep.txt | 15 +++++-
 builtin/grep.c             | 93 +++++++++++++++++++-------------------
 grep.c                     | 32 +++++++------
 grep.h                     | 13 ------
 object-store.h             | 37 +++++++++++++++
 object.c                   |  2 +
 packfile.c                 | 34 ++++++++++++++
 replace-object.c           | 11 ++++-
 replace-object.h           |  7 ++-
 sha1-file.c                | 57 +++++++++++++++++++++--
 submodule-config.c         | 18 +++-----
 submodule-config.h         |  2 +-
 unpack-trees.c             |  4 +-
 14 files changed, 233 insertions(+), 98 deletions(-)

Range-diff against v2:
 1:  0f31cb0c12 !  1:  e2f3d377f5 grep: fix race conditions on userdiff calls
    @@ Commit message
         git-grep uses an internal grep_read_mutex to protect object reading
         operations. Similarly, there's a grep_attr_mutex to protect calls to the
         gitattributes machinery. However, two of the three functions protected
    -    by the last mutex may also perform object reading, as seen bellow:
    +    by the last mutex may also perform object reading, as seen below:
     
         - userdiff_get_textconv() > notes_cache_init() >
           notes_cache_match_validity() > lookup_commit_reference_gently() >
    @@ Commit message
         that, let's make sure to acquire the lock before both of these calls.
     
         Note: this patch might slow down the threaded grep in worktree, for the
    -    sake of thread-safeness. However, in the following patches we should
    +    sake of thread-safeness. However, in the following patches, we should
         regain performance by replacing grep_read_mutex for an internal object
         reading lock and allowing parallel inflation during object reading.
     
 2:  be32683f1d =  2:  6f0899701b grep: fix race conditions at grep_submodule()
 3:  34aeb218bf !  3:  5295c892ee grep: fix racy calls in grep_objects()
    @@ Commit message
         condition with object reading operations (such as the ones internally
         performed by fill_textconv(), called at fill_textconv_grep()). The same
         problem happens with the call to gitmodules_config_oid() which also has
    -    parse_object() in its call stack. Fix that protecting both call with the
    -    said grep_read_mutex.
    +    parse_object() in its call stack. Fix that protecting both calls with
    +    the said grep_read_mutex.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
 4:  5deee3cf11 !  4:  d7f739bc57 replace-object: make replace operations thread-safe
    @@ object-store.h: struct raw_object_store {
     
      ## object.c ##
     @@ object.c: struct raw_object_store *raw_object_store_new(void)
    - 
      	memset(o, 0, sizeof(*o));
      	INIT_LIST_HEAD(&o->packed_git_mru);
    + 	hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
     +	pthread_mutex_init(&o->replace_mutex, NULL);
      	return o;
      }
 5:  4c5652ab34 !  5:  b72e90f229 object-store: allow threaded access to object reading
    @@ Commit message
         object-store: allow threaded access to object reading
     
         Allow object reading to be performed by multiple threads protecting it
    -    with an internal lock. The lock usage can be toggled with
    -    enable_obj_read_lock() and disable_obj_read_lock(). Currently, the
    +    with an internal lock, the obj_read_mutex. The lock usage can be toggled
    +    with enable_obj_read_lock() and disable_obj_read_lock(). Currently, the
         functions which can be safely called in parallel are:
         read_object_file_extended(), repo_read_object_file(),
         read_object_file(), read_object_with_reference(), read_object(),
    -    oid_object_info() and oid_object_info_extended(). It's also possible to
    -    use obj_read_lock() and obj_read_unlock() to protect other sections that
    -    cannot execute in parallel with object reading.
    +    oid_object_info() and oid_object_info_extended(). It's also possible
    +    to use obj_read_lock() and obj_read_unlock() to protect other sections
    +    that cannot execute in parallel with object reading.
     
         Probably there are many spots in the functions listed above that could
         be executed unlocked (and thus, in parallel). But, for now, we are most
         interested in allowing parallel access to zlib inflation. This is one of
    -    the sections where object reading spends most of the time and it's
    -    already thread-safe. So, to take advantage of that, the respective lock
    -    is released when calling git_inflate() and re-acquired right after, for
    -    every calling spot in oid_object_info_extended()'s call chain. We may
    -    refine the lock to also exploit other possible parallel spots in the
    -    future, but threaded zlib inflation should already give great speedups
    -    for now.
    +    the sections where object reading spends most of the time in (e.g. up to
    +    one-third of git-grep's execution time in the chromium repo corresponds
    +    to inflation) and it's already thread-safe. So, to take advantage of
    +    that, the obj_read_mutex is released when calling git_inflate() and
    +    re-acquired right after, for every calling spot in
    +    oid_object_info_extended()'s call chain. We may refine this lock to also
    +    exploit other possible parallel spots in the future, but for now,
    +    threaded zlib inflation should already give great speedups for threaded
    +    object reading callers.
     
         Note that add_delta_base_cache() was also modified to skip adding
         already present entries to the cache. This wasn't possible before, but
    -    now it is since phase I and phase III of unpack_entry() may execute
    -    concurrently.
    +    it would be now, with the parallel inflation. Take for example the
    +    following situation, where two threads - A and B - are executing the
    +    code at unpack_entry():
     
    -    Another important thing to notice is that the object reading lock only
    -    works in conjunction with the 'struct raw_object_store's replace_mutex.
    -    Otherwise, there would still be racy spots in object reading
    -    functions.
    +    1. Thread A is performing the decompression of a base O (which is not
    +       yet in the cache) at PHASE II. Thread B is simultaneously trying to
    +       unpack O, but just starting at PHASE I.
    +    2. Since O is not yet in the cache, B will go to PHASE II to also
    +       perform the decompression.
    +    3. When they finish decompressing, one of them will get the object
    +       reading mutex and go to PHASE III while the other waits for the
    +       mutex. Let’s say A got the mutex first.
    +    4. Thread A will add O to the cache, go throughout the rest of PHASE III
    +       and return.
    +    5. Thread B gets the mutex, also add O to the cache (if the check wasn't
    +       there) and returns.
    +
    +    Finally, it is also important to highlight that the object reading lock
    +    can only ensure thread-safety in the mentioned functions thanks to two
    +    complementary mechanisms: the use of 'struct raw_object_store's
    +    replace_mutex, which guards sections in the object reading machinery
    +    that would otherwise be thread-unsafe; and the 'struct pack_window's
    +    inuse_cnt, which protects window reading operations (such as the one
    +    performed during the inflation of a packed object), allowing them to
    +    execute without the acquisition of the obj_read_mutex.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
    @@ packfile.c: unsigned long get_size_from_delta(struct packed_git *p,
      	do {
      		in = use_pack(p, w_curs, curpos, &stream.avail_in);
      		stream.next_in = in;
    ++		/*
    ++		 * Note: the window section returned by use_pack() must be
    ++		 * available throughout git_inflate()'s unlocked execution. To
    ++		 * ensure no other thread will modify the window in the
    ++		 * meantime, we rely on the packed_window.inuse_cnt. This
    ++		 * counter is incremented before window reading and checked
    ++		 * before window disposal.
    ++		 *
    ++		 * Other worrying sections could be the call to close_pack_fd(),
    ++		 * which can close packs even with in-use windows, and to
    ++		 * reprepare_packed_git(). Regarding the former, mmap doc says:
    ++		 * "closing the file descriptor does not unmap the region". And
    ++		 * for the latter, it won't re-open already available packs.
    ++		 */
     +		obj_read_unlock();
      		st = git_inflate(&stream, Z_FINISH);
     +		obj_read_lock();
    @@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_of
      	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
      	struct list_head *lru, *tmp;
      
    -+	if (get_delta_base_cache_entry(p, base_offset))
    ++	/*
    ++	 * Check required to avoid redundant entries when more than one thread
    ++	 * is unpacking the same object, in unpack_entry() (since its phases I
    ++	 * and III might run concurrently across multiple threads).
    ++	 */
    ++	if (in_delta_base_cache(p, base_offset))
     +		return;
     +
      	delta_base_cached += base_size;
    @@ packfile.c: static void *unpack_compressed_entry(struct packed_git *p,
      	do {
      		in = use_pack(p, w_curs, curpos, &stream.avail_in);
      		stream.next_in = in;
    ++		/*
    ++		 * Note: we must ensure the window section returned by
    ++		 * use_pack() will be available throughout git_inflate()'s
    ++		 * unlocked execution. Please refer to the comment at
    ++		 * get_size_from_delta() to see how this is done.
    ++		 */
     +		obj_read_unlock();
      		st = git_inflate(&stream, Z_FINISH);
     +		obj_read_lock();
 6:  b439bdeed3 =  6:  fc1200bb07 grep: replace grep_read_mutex by internal obj read lock
 7:  d759f1e8c7 !  7:  d39d2ce9c4 submodule-config: add skip_if_read option to repo_read_gitmodules()
    @@ Metadata
      ## Commit message ##
         submodule-config: add skip_if_read option to repo_read_gitmodules()
     
    -    Currently, submodule-config.c doesn't have an externally acessible
    +    Currently, submodule-config.c doesn't have an externally accessible
         function to read gitmodules only if it wasn't already read. But this
    -    exactly behavior is internally implemented by gitmodules_read_check(),
    -    to perform a lazy load. Let's merge this function with
    -    repo_read_gitmodules() adding an 'skip_if_read' which allow both
    +    exact behavior is internally implemented by gitmodules_read_check(), to
    +    perform a lazy load. Let's merge this function with
    +    repo_read_gitmodules() adding a 'skip_if_read' which allows both
         internal and external callers to access this functionality. This
    -    simplifies a little the code. The added option will also be used in the
    -    following patch.
    +    simplifies a little the code. The added option will also be used in
    +    the following patch.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
    @@ submodule-config.h: int option_fetch_parse_recurse_submodules(const struct optio
     -void repo_read_gitmodules(struct repository *repo);
     +void repo_read_gitmodules(struct repository *repo, int skip_if_read);
      void gitmodules_config_oid(const struct object_id *commit_oid);
    - const struct submodule *submodule_from_name(struct repository *r,
    - 					    const struct object_id *commit_or_tree,
    + 
    + /**
     
      ## unpack-trees.c ##
     @@ unpack-trees.c: static void load_gitmodules_file(struct index_state *index,
 8:  2e76847ec9 !  8:  af8ad95d41 grep: allow submodule functions to run in parallel
    @@ Commit message
           race condition with object reading (for example parse_object() and
           is_promisor_remote()). However, they only call repo_read_gitmodules()
           if it wasn't read before. So let's pre-read it before firing the
    -      threads and allow these two functions to safelly be called in
    +      threads and allow these two functions to safely be called in
           parallel.
     
         - repo_submodule_init() is already thread-safe, so remove it from the
    @@ Commit message
           protected for now. Let's add a "NEEDSWORK" to it, informing why it
           cannot be removed from the critical section yet.
     
    -    - Finally, add_to_alternates_memory() must be kept protected by the same
    -      reason of the above item.
    +    - Finally, add_to_alternates_memory() must be kept protected for the
    +      same reason as the item above.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
 9:  134114b001 !  9:  0ccf79ba86 grep: protect packed_git [re-]initialization
    @@ Commit message
         happen often, it could be hard to detect. So to prevent future
         headaches, let's force eager initialization of packed_git when setting
         git-grep up. There'll be a small overhead in the cases where we didn't
    -    really needed to prepare packed_git during execution but this shouldn't
    -    be very noticeable.
    +    really need to prepare packed_git during execution but this shouldn't be
    +    very noticeable.
     
         Also, packed_git may be re-initialized by
         packfile.c:reprepare_packed_git(). Again, all paths to it in git-grep
10:  04a4d81ff5 ! 10:  6c09e9169d grep: re-enable threads in non-worktree case
    @@ Commit message
             8**  |   6.2886s  |   6.9843s
     
         These are all means of 30 executions after 2 warmup runs. All tests were
    -    executed on an i7-7700HQ (quad core w/ hyper-threading), 16GB of RAM and
    +    executed on an i7-7700HQ (quad-core w/ hyper-threading), 16GB of RAM and
         SSD, running Manjaro Linux. But to make sure the optimization also
         performs well on HDD, the tests were repeated on another machine with an
    -    i5-4210U (dual core w/ hyper-threading), 8GB of RAM and HDD (SATA III,
    +    i5-4210U (dual-core w/ hyper-threading), 8GB of RAM and HDD (SATA III,
         5400 rpm), also running Manjaro Linux:
     
          Threads |   Regex 1  |  Regex 2
    @@ Commit message
         case when --textconv is used and there're too many text conversions.
         Probably the reason for this is that the object read lock is used to
         protect fill_textconv() and therefore there is a mutual exclusion
    -    between textconv execution and object reading. Because both are time
    -    consuming operations, not being able to perform them in parallel can
    -    cause performance drops. To inform the users about this (and other
    -    threading detais), let's also add a "NOTES ON THREADS" section to
    +    between textconv execution and object reading. Because both are
    +    time-consuming operations, not being able to perform them in parallel
    +    can cause performance drops. To inform the users about this (and other
    +    threading details), let's also add a "NOTES ON THREADS" section to
         Documentation/git-grep.txt.
     
         [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
11:  4f6f8b611c = 11:  2f72f30341 grep: move driver pre-load out of critical section
 -:  ---------- > 12:  a5891176d7 grep: use no. of cores as the default no. of threads
-- 
2.24.1


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v3 01/12] grep: fix race conditions on userdiff calls
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
                       ` (10 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Ramkumar Ramachandra, Eric Wong

git-grep uses an internal grep_read_mutex to protect object reading
operations. Similarly, there's a grep_attr_mutex to protect calls to the
gitattributes machinery. However, two of the three functions protected
by the last mutex may also perform object reading, as seen below:

- userdiff_get_textconv() > notes_cache_init() >
  notes_cache_match_validity() > lookup_commit_reference_gently() >
  parse_object() > repo_has_object_file() >
  repo_has_object_file_with_flags() > oid_object_info_extended()

- userdiff_find_by_path() > git_check_attr() > collect_some_attrs() >
  prepare_attr_stack() > read_attr() > read_attr_from_index() >
  read_blob_data_from_index() > read_object_file()

As these calls are not protected by grep_read_mutex, there might be race
conditions with other threads performing object reading (e.g. threads
calling fill_textconv() at grep.c:fill_textconv_grep()). To prevent
that, let's make sure to acquire the lock before both of these calls.

Note: this patch might slow down the threaded grep in worktree, for the
sake of thread-safeness. However, in the following patches, we should
regain performance by replacing grep_read_mutex for an internal object
reading lock and allowing parallel inflation during object reading.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 grep.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 0552b127c1..c028f70aba 100644
--- a/grep.c
+++ b/grep.c
@@ -1816,7 +1816,9 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		 * is not thread-safe.
 		 */
 		grep_attr_lock();
+		grep_read_lock();
 		textconv = userdiff_get_textconv(opt->repo, gs->driver);
+		grep_read_unlock();
 		grep_attr_unlock();
 	}
 
@@ -2184,8 +2186,11 @@ void grep_source_load_driver(struct grep_source *gs,
 		return;
 
 	grep_attr_lock();
-	if (gs->path)
+	if (gs->path) {
+		grep_read_lock();
 		gs->driver = userdiff_find_by_path(istate, gs->path);
+		grep_read_unlock();
+	}
 	if (!gs->driver)
 		gs->driver = userdiff_find_by_name("default");
 	grep_attr_unlock();
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 02/12] grep: fix race conditions at grep_submodule()
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
                       ` (9 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Antonio Ospite,
	Stefan Beller

There're currently two function calls in builtin/grep.c:grep_submodule()
which might result in race conditions:

- submodule_from_path(): it has config_with_options() in its call stack
  which, in turn, may have read_object_file() in its own. Therefore,
  calling the first function without acquiring grep_read_mutex may end
  up causing a race condition with other object read operations
  performed by worker threads (for example, at the fill_textconv()
  call in grep.c:fill_textconv_grep()).
- parse_object_or_die(): it falls into the same problem, having
  repo_has_object_file(the_repository, ...) in its call stack. Besides
  that, parse_object(), which is also called by parse_object_or_die(),
  is thread-unsafe and also called by object reading functions.

It's unlikely to really fall into a data race with these operations as
the volume of calls to them is usually very low. But we better protect
ourselves against this possibility, anyway. So, to solve these issues,
move both of these function calls into the critical section of
grep_read_mutex.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 50ce8d9461..896e7effce 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -407,8 +407,7 @@ static int grep_submodule(struct grep_opt *opt,
 {
 	struct repository subrepo;
 	struct repository *superproject = opt->repo;
-	const struct submodule *sub = submodule_from_path(superproject,
-							  &null_oid, path);
+	const struct submodule *sub;
 	struct grep_opt subopt;
 	int hit;
 
@@ -419,6 +418,7 @@ static int grep_submodule(struct grep_opt *opt,
 	 * object.
 	 */
 	grep_read_lock();
+	sub = submodule_from_path(superproject, &null_oid, path);
 
 	if (!is_submodule_active(superproject, path)) {
 		grep_read_unlock();
@@ -455,9 +455,8 @@ static int grep_submodule(struct grep_opt *opt,
 		unsigned long size;
 		struct strbuf base = STRBUF_INIT;
 
-		object = parse_object_or_die(oid, oid_to_hex(oid));
-
 		grep_read_lock();
+		object = parse_object_or_die(oid, oid_to_hex(oid));
 		data = read_object_with_reference(&subrepo,
 						  &object->oid, tree_type,
 						  &size, NULL);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 03/12] grep: fix racy calls in grep_objects()
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 04/12] replace-object: make replace operations thread-safe Matheus Tavares
                       ` (8 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Matthieu Moy, brian m. carlson, Alex Henrie,
	Brandon Williams, Stefan Beller

deref_tag() calls is_promisor_object() and parse_object(), both of which
perform lazy initializations and other thread-unsafe operations. If it
was only called by grep_objects() this wouldn't be a problem as the
latter is only executed by the main thread. However, deref_tag() is also
present in read_object_file()'s call stack. So calling deref_tag() in
grep_objects() without acquiring the grep_read_mutex may incur in a race
condition with object reading operations (such as the ones internally
performed by fill_textconv(), called at fill_textconv_grep()). The same
problem happens with the call to gitmodules_config_oid() which also has
parse_object() in its call stack. Fix that protecting both calls with
the said grep_read_mutex.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 896e7effce..91fc032a32 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -658,13 +658,18 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 
 	for (i = 0; i < nr; i++) {
 		struct object *real_obj;
+
+		grep_read_lock();
 		real_obj = deref_tag(opt->repo, list->objects[i].item,
 				     NULL, 0);
+		grep_read_unlock();
 
 		/* load the gitmodules file for this rev */
 		if (recurse_submodules) {
 			submodule_free(opt->repo);
+			grep_read_lock();
 			gitmodules_config_oid(&real_obj->oid);
+			grep_read_unlock();
 		}
 		if (grep_object(opt, pathspec, real_obj, list->objects[i].name,
 				list->objects[i].path)) {
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 04/12] replace-object: make replace operations thread-safe
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
                       ` (2 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
                       ` (7 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Stefan Beller

replace-object functions are very close to being thread-safe: the only
current racy section is the lazy initialization at
prepare_replace_object(). The following patches will protect some object
reading operations to be called threaded, but before that, replace
functions must be protected. To do so, add a mutex to struct
raw_object_store and acquire it before lazy initializing the
replace_map. This won't cause any noticeable performance drop as the
mutex will no longer be used after the replace_map is initialized.

Later, when the replace functions are called in parallel, thread
debuggers might point our use of the added replace_map_initialized flag
as a data race. However, as this boolean variable is initialized as
false and it's only updated once, there's no real harm. It's perfectly
fine if the value is updated right after a thread read it in
replace-map.h:lookup_replace_object() (there'll only be a performance
penalty for the affected threads at that moment). We could cease the
debugger warning protecting the variable reading at the said function.
However, this would negatively affect performance for all threads
calling it, at any time, so it's not really worthy since the warning
doesn't represent a real problem. Instead, to make sure we don't get
false positives (at ThreadSanitizer, at least) an entry for the
respective function is added to .tsan-suppressions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 .tsan-suppressions |  6 ++++++
 object-store.h     |  2 ++
 object.c           |  2 ++
 replace-object.c   | 11 ++++++++++-
 replace-object.h   |  7 ++++++-
 5 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/.tsan-suppressions b/.tsan-suppressions
index 8c85014a0a..5ba86d6845 100644
--- a/.tsan-suppressions
+++ b/.tsan-suppressions
@@ -8,3 +8,9 @@
 # in practice it (hopefully!) doesn't matter.
 race:^want_color$
 race:^transfer_debug$
+
+# A boolean value, which tells whether the replace_map has been initialized or
+# not, is read racily with an update. As this variable is written to only once,
+# and it's OK if the value change right after reading it, this shouldn't be a
+# problem.
+race:^lookup_replace_object$
diff --git a/object-store.h b/object-store.h
index 55ee639350..33739c9dee 100644
--- a/object-store.h
+++ b/object-store.h
@@ -125,6 +125,8 @@ struct raw_object_store {
 	 * (see git-replace(1)).
 	 */
 	struct oidmap *replace_map;
+	unsigned replace_map_initialized : 1;
+	pthread_mutex_t replace_mutex; /* protect object replace functions */
 
 	struct commit_graph *commit_graph;
 	unsigned commit_graph_attempted : 1; /* if loading has been attempted */
diff --git a/object.c b/object.c
index 142ef69399..b4e1d3db3c 100644
--- a/object.c
+++ b/object.c
@@ -480,6 +480,7 @@ struct raw_object_store *raw_object_store_new(void)
 	memset(o, 0, sizeof(*o));
 	INIT_LIST_HEAD(&o->packed_git_mru);
 	hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
+	pthread_mutex_init(&o->replace_mutex, NULL);
 	return o;
 }
 
@@ -507,6 +508,7 @@ void raw_object_store_clear(struct raw_object_store *o)
 
 	oidmap_free(o->replace_map, 1);
 	FREE_AND_NULL(o->replace_map);
+	pthread_mutex_destroy(&o->replace_mutex);
 
 	free_commit_graph(o->commit_graph);
 	o->commit_graph = NULL;
diff --git a/replace-object.c b/replace-object.c
index e295e87943..7bd9aba6ee 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -34,14 +34,23 @@ static int register_replace_ref(struct repository *r,
 
 void prepare_replace_object(struct repository *r)
 {
-	if (r->objects->replace_map)
+	if (r->objects->replace_map_initialized)
 		return;
 
+	pthread_mutex_lock(&r->objects->replace_mutex);
+	if (r->objects->replace_map_initialized) {
+		pthread_mutex_unlock(&r->objects->replace_mutex);
+		return;
+	}
+
 	r->objects->replace_map =
 		xmalloc(sizeof(*r->objects->replace_map));
 	oidmap_init(r->objects->replace_map, 0);
 
 	for_each_replace_ref(r, register_replace_ref, NULL);
+	r->objects->replace_map_initialized = 1;
+
+	pthread_mutex_unlock(&r->objects->replace_mutex);
 }
 
 /* We allow "recursive" replacement. Only within reason, though */
diff --git a/replace-object.h b/replace-object.h
index 04ed7a85a2..3fbc32eb7b 100644
--- a/replace-object.h
+++ b/replace-object.h
@@ -24,12 +24,17 @@ const struct object_id *do_lookup_replace_object(struct repository *r,
  * name (replaced recursively, if necessary).  The return value is
  * either sha1 or a pointer to a permanently-allocated value.  When
  * object replacement is suppressed, always return sha1.
+ *
+ * Note: some thread debuggers might point a data race on the
+ * replace_map_initialized reading in this function. However, we know there's no
+ * problem in the value being updated by one thread right after another one read
+ * it here (and it should be written to only once, anyway).
  */
 static inline const struct object_id *lookup_replace_object(struct repository *r,
 							    const struct object_id *oid)
 {
 	if (!read_replace_refs ||
-	    (r->objects->replace_map &&
+	    (r->objects->replace_map_initialized &&
 	     r->objects->replace_map->map.tablesize == 0))
 		return oid;
 	return do_lookup_replace_object(r, oid);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 05/12] object-store: allow threaded access to object reading
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
                       ` (3 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 04/12] replace-object: make replace operations thread-safe Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
                       ` (6 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Stefan Beller

Allow object reading to be performed by multiple threads protecting it
with an internal lock, the obj_read_mutex. The lock usage can be toggled
with enable_obj_read_lock() and disable_obj_read_lock(). Currently, the
functions which can be safely called in parallel are:
read_object_file_extended(), repo_read_object_file(),
read_object_file(), read_object_with_reference(), read_object(),
oid_object_info() and oid_object_info_extended(). It's also possible
to use obj_read_lock() and obj_read_unlock() to protect other sections
that cannot execute in parallel with object reading.

Probably there are many spots in the functions listed above that could
be executed unlocked (and thus, in parallel). But, for now, we are most
interested in allowing parallel access to zlib inflation. This is one of
the sections where object reading spends most of the time in (e.g. up to
one-third of git-grep's execution time in the chromium repo corresponds
to inflation) and it's already thread-safe. So, to take advantage of
that, the obj_read_mutex is released when calling git_inflate() and
re-acquired right after, for every calling spot in
oid_object_info_extended()'s call chain. We may refine this lock to also
exploit other possible parallel spots in the future, but for now,
threaded zlib inflation should already give great speedups for threaded
object reading callers.

Note that add_delta_base_cache() was also modified to skip adding
already present entries to the cache. This wasn't possible before, but
it would be now, with the parallel inflation. Take for example the
following situation, where two threads - A and B - are executing the
code at unpack_entry():

1. Thread A is performing the decompression of a base O (which is not
   yet in the cache) at PHASE II. Thread B is simultaneously trying to
   unpack O, but just starting at PHASE I.
2. Since O is not yet in the cache, B will go to PHASE II to also
   perform the decompression.
3. When they finish decompressing, one of them will get the object
   reading mutex and go to PHASE III while the other waits for the
   mutex. Let’s say A got the mutex first.
4. Thread A will add O to the cache, go throughout the rest of PHASE III
   and return.
5. Thread B gets the mutex, also add O to the cache (if the check wasn't
   there) and returns.

Finally, it is also important to highlight that the object reading lock
can only ensure thread-safety in the mentioned functions thanks to two
complementary mechanisms: the use of 'struct raw_object_store's
replace_mutex, which guards sections in the object reading machinery
that would otherwise be thread-unsafe; and the 'struct pack_window's
inuse_cnt, which protects window reading operations (such as the one
performed during the inflation of a packed object), allowing them to
execute without the acquisition of the obj_read_mutex.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 object-store.h | 35 +++++++++++++++++++++++++++++++
 packfile.c     | 32 ++++++++++++++++++++++++++++
 sha1-file.c    | 57 +++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 119 insertions(+), 5 deletions(-)

diff --git a/object-store.h b/object-store.h
index 33739c9dee..7c80e0d64c 100644
--- a/object-store.h
+++ b/object-store.h
@@ -6,6 +6,7 @@
 #include "list.h"
 #include "sha1-array.h"
 #include "strbuf.h"
+#include "thread-utils.h"
 
 struct object_directory {
 	struct object_directory *next;
@@ -251,6 +252,40 @@ int has_loose_object_nonlocal(const struct object_id *);
 
 void assert_oid_type(const struct object_id *oid, enum object_type expect);
 
+/*
+ * Enabling the object read lock allows multiple threads to safely call the
+ * following functions in parallel: repo_read_object_file(), read_object_file(),
+ * read_object_file_extended(), read_object_with_reference(), read_object(),
+ * oid_object_info() and oid_object_info_extended().
+ *
+ * obj_read_lock() and obj_read_unlock() may also be used to protect other
+ * section which cannot execute in parallel with object reading. Since the used
+ * lock is a recursive mutex, these sections can even contain calls to object
+ * reading functions. However, beware that in these cases zlib inflation won't
+ * be performed in parallel, losing performance.
+ *
+ * TODO: oid_object_info_extended()'s call stack has a recursive behavior. If
+ * any of its callees end up calling it, this recursive call won't benefit from
+ * parallel inflation.
+ */
+void enable_obj_read_lock(void);
+void disable_obj_read_lock(void);
+
+extern int obj_read_use_lock;
+extern pthread_mutex_t obj_read_mutex;
+
+static inline void obj_read_lock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_lock(&obj_read_mutex);
+}
+
+static inline void obj_read_unlock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_unlock(&obj_read_mutex);
+}
+
 struct object_info {
 	/* Request */
 	enum object_type *typep;
diff --git a/packfile.c b/packfile.c
index 7e7c04e4d8..24a73fc33a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1086,7 +1086,23 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		/*
+		 * Note: the window section returned by use_pack() must be
+		 * available throughout git_inflate()'s unlocked execution. To
+		 * ensure no other thread will modify the window in the
+		 * meantime, we rely on the packed_window.inuse_cnt. This
+		 * counter is incremented before window reading and checked
+		 * before window disposal.
+		 *
+		 * Other worrying sections could be the call to close_pack_fd(),
+		 * which can close packs even with in-use windows, and to
+		 * reprepare_packed_git(). Regarding the former, mmap doc says:
+		 * "closing the file descriptor does not unmap the region". And
+		 * for the latter, it won't re-open already available packs.
+		 */
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1445,6 +1461,14 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
 	struct list_head *lru, *tmp;
 
+	/*
+	 * Check required to avoid redundant entries when more than one thread
+	 * is unpacking the same object, in unpack_entry() (since its phases I
+	 * and III might run concurrently across multiple threads).
+	 */
+	if (in_delta_base_cache(p, base_offset))
+		return;
+
 	delta_base_cached += base_size;
 
 	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
@@ -1574,7 +1598,15 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		/*
+		 * Note: we must ensure the window section returned by
+		 * use_pack() will be available throughout git_inflate()'s
+		 * unlocked execution. Please refer to the comment at
+		 * get_size_from_delta() to see how this is done.
+		 */
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		if (!stream.avail_out)
 			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
diff --git a/sha1-file.c b/sha1-file.c
index 188de57634..9dc0649748 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1147,6 +1147,8 @@ static int unpack_loose_short_header(git_zstream *stream,
 				     unsigned char *map, unsigned long mapsize,
 				     void *buffer, unsigned long bufsiz)
 {
+	int ret;
+
 	/* Get the data stream */
 	memset(stream, 0, sizeof(*stream));
 	stream->next_in = map;
@@ -1155,7 +1157,11 @@ static int unpack_loose_short_header(git_zstream *stream,
 	stream->avail_out = bufsiz;
 
 	git_inflate_init(stream);
-	return git_inflate(stream, 0);
+	obj_read_unlock();
+	ret = git_inflate(stream, 0);
+	obj_read_lock();
+
+	return ret;
 }
 
 int unpack_loose_header(git_zstream *stream,
@@ -1200,7 +1206,9 @@ static int unpack_loose_header_to_strbuf(git_zstream *stream, unsigned char *map
 	stream->avail_out = bufsiz;
 
 	do {
+		obj_read_unlock();
 		status = git_inflate(stream, 0);
+		obj_read_lock();
 		strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
 		if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
 			return 0;
@@ -1240,8 +1248,11 @@ static void *unpack_loose_rest(git_zstream *stream,
 		 */
 		stream->next_out = buf + bytes;
 		stream->avail_out = size - bytes;
-		while (status == Z_OK)
+		while (status == Z_OK) {
+			obj_read_unlock();
 			status = git_inflate(stream, Z_FINISH);
+			obj_read_lock();
+		}
 	}
 	if (status == Z_STREAM_END && !stream->avail_in) {
 		git_inflate_end(stream);
@@ -1411,10 +1422,32 @@ static int loose_object_info(struct repository *r,
 	return (status < 0) ? status : 0;
 }
 
+int obj_read_use_lock = 0;
+pthread_mutex_t obj_read_mutex;
+
+void enable_obj_read_lock(void)
+{
+	if (obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 1;
+	init_recursive_mutex(&obj_read_mutex);
+}
+
+void disable_obj_read_lock(void)
+{
+	if (!obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 0;
+	pthread_mutex_destroy(&obj_read_mutex);
+}
+
 int fetch_if_missing = 1;
 
-int oid_object_info_extended(struct repository *r, const struct object_id *oid,
-			     struct object_info *oi, unsigned flags)
+static int do_oid_object_info_extended(struct repository *r,
+				       const struct object_id *oid,
+				       struct object_info *oi, unsigned flags)
 {
 	static struct object_info blank_oi = OBJECT_INFO_INIT;
 	struct pack_entry e;
@@ -1422,6 +1455,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 	const struct object_id *real = oid;
 	int already_retried = 0;
 
+
 	if (flags & OBJECT_INFO_LOOKUP_REPLACE)
 		real = lookup_replace_object(r, oid);
 
@@ -1497,7 +1531,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 	rtype = packed_object_info(r, e.p, e.offset, oi);
 	if (rtype < 0) {
 		mark_bad_packed_object(e.p, real->hash);
-		return oid_object_info_extended(r, real, oi, 0);
+		return do_oid_object_info_extended(r, real, oi, 0);
 	} else if (oi->whence == OI_PACKED) {
 		oi->u.packed.offset = e.offset;
 		oi->u.packed.pack = e.p;
@@ -1508,6 +1542,17 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 	return 0;
 }
 
+int oid_object_info_extended(struct repository *r, const struct object_id *oid,
+			     struct object_info *oi, unsigned flags)
+{
+	int ret;
+	obj_read_lock();
+	ret = do_oid_object_info_extended(r, oid, oi, flags);
+	obj_read_unlock();
+	return ret;
+}
+
+
 /* returns enum object_type or negative */
 int oid_object_info(struct repository *r,
 		    const struct object_id *oid,
@@ -1580,6 +1625,7 @@ void *read_object_file_extended(struct repository *r,
 	if (data)
 		return data;
 
+	obj_read_lock();
 	if (errno && errno != ENOENT)
 		die_errno(_("failed to read object %s"), oid_to_hex(oid));
 
@@ -1595,6 +1641,7 @@ void *read_object_file_extended(struct repository *r,
 	if ((p = has_packed_and_bad(r, repl->hash)) != NULL)
 		die(_("packed object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), p->pack_name);
+	obj_read_unlock();
 
 	return NULL;
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
                       ` (4 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
                       ` (5 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, brian m. carlson, Stefan Beller,
	Brandon Williams

git-grep uses 'grep_read_mutex' to protect its calls to object reading
operations. But these have their own internal lock now, which ensures a
better performance (allowing parallel access to more regions). So, let's
remove the former and, instead, activate the latter with
enable_obj_read_lock().

Sections that are currently protected by 'grep_read_mutex' but are not
internally protected by the object reading lock should be surrounded by
obj_read_lock() and obj_read_unlock(). These guarantee mutual exclusion
with object reading operations, keeping the current behavior and
avoiding race conditions. Namely, these places are:

  In grep.c:

  - fill_textconv() at fill_textconv_grep().
  - userdiff_get_textconv() at grep_source_1().

  In builtin/grep.c:

  - parse_object_or_die() and the submodule functions at
    grep_submodule().
  - deref_tag() and gitmodules_config_oid() at grep_objects().

If these functions become thread-safe, in the future, we might remove
the locking and probably get some speedup.

Note that some of the submodule functions will already be thread-safe
(or close to being thread-safe) with the internal object reading lock.
However, as some of them will require additional modifications to be
removed from the critical section, this will be done in its own patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 46 ++++++++++++++++------------------------------
 grep.c         | 39 +++++++++++++++++++--------------------
 grep.h         | 13 -------------
 3 files changed, 35 insertions(+), 63 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 91fc032a32..4a436d6c99 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -200,12 +200,12 @@ static void start_threads(struct grep_opt *opt)
 	int i;
 
 	pthread_mutex_init(&grep_mutex, NULL);
-	pthread_mutex_init(&grep_read_mutex, NULL);
 	pthread_mutex_init(&grep_attr_mutex, NULL);
 	pthread_cond_init(&cond_add, NULL);
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
 	grep_use_locks = 1;
+	enable_obj_read_lock();
 
 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
@@ -257,12 +257,12 @@ static int wait_all(void)
 	free(threads);
 
 	pthread_mutex_destroy(&grep_mutex);
-	pthread_mutex_destroy(&grep_read_mutex);
 	pthread_mutex_destroy(&grep_attr_mutex);
 	pthread_cond_destroy(&cond_add);
 	pthread_cond_destroy(&cond_write);
 	pthread_cond_destroy(&cond_result);
 	grep_use_locks = 0;
+	disable_obj_read_lock();
 
 	return hit;
 }
@@ -295,16 +295,6 @@ static int grep_cmd_config(const char *var, const char *value, void *cb)
 	return st;
 }
 
-static void *lock_and_read_oid_file(const struct object_id *oid, enum object_type *type, unsigned long *size)
-{
-	void *data;
-
-	grep_read_lock();
-	data = read_object_file(oid, type, size);
-	grep_read_unlock();
-	return data;
-}
-
 static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
 		     const char *filename, int tree_name_len,
 		     const char *path)
@@ -413,20 +403,20 @@ static int grep_submodule(struct grep_opt *opt,
 
 	/*
 	 * NEEDSWORK: submodules functions need to be protected because they
-	 * access the object store via config_from_gitmodules(): the latter
-	 * uses get_oid() which, for now, relies on the global the_repository
-	 * object.
+	 * call config_from_gitmodules(): the latter contains in its call stack
+	 * many thread-unsafe operations that are racy with object reading, such
+	 * as parse_object() and is_promisor_object().
 	 */
-	grep_read_lock();
+	obj_read_lock();
 	sub = submodule_from_path(superproject, &null_oid, path);
 
 	if (!is_submodule_active(superproject, path)) {
-		grep_read_unlock();
+		obj_read_unlock();
 		return 0;
 	}
 
 	if (repo_submodule_init(&subrepo, superproject, sub)) {
-		grep_read_unlock();
+		obj_read_unlock();
 		return 0;
 	}
 
@@ -443,7 +433,7 @@ static int grep_submodule(struct grep_opt *opt,
 	 * object.
 	 */
 	add_to_alternates_memory(subrepo.objects->odb->path);
-	grep_read_unlock();
+	obj_read_unlock();
 
 	memcpy(&subopt, opt, sizeof(subopt));
 	subopt.repo = &subrepo;
@@ -455,13 +445,12 @@ static int grep_submodule(struct grep_opt *opt,
 		unsigned long size;
 		struct strbuf base = STRBUF_INIT;
 
-		grep_read_lock();
+		obj_read_lock();
 		object = parse_object_or_die(oid, oid_to_hex(oid));
+		obj_read_unlock();
 		data = read_object_with_reference(&subrepo,
 						  &object->oid, tree_type,
 						  &size, NULL);
-		grep_read_unlock();
-
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&object->oid));
 
@@ -586,7 +575,7 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 			void *data;
 			unsigned long size;
 
-			data = lock_and_read_oid_file(&entry.oid, &type, &size);
+			data = read_object_file(&entry.oid, &type, &size);
 			if (!data)
 				die(_("unable to read tree (%s)"),
 				    oid_to_hex(&entry.oid));
@@ -624,12 +613,9 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		struct strbuf base;
 		int hit, len;
 
-		grep_read_lock();
 		data = read_object_with_reference(opt->repo,
 						  &obj->oid, tree_type,
 						  &size, NULL);
-		grep_read_unlock();
-
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
 
@@ -659,17 +645,17 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 	for (i = 0; i < nr; i++) {
 		struct object *real_obj;
 
-		grep_read_lock();
+		obj_read_lock();
 		real_obj = deref_tag(opt->repo, list->objects[i].item,
 				     NULL, 0);
-		grep_read_unlock();
+		obj_read_unlock();
 
 		/* load the gitmodules file for this rev */
 		if (recurse_submodules) {
 			submodule_free(opt->repo);
-			grep_read_lock();
+			obj_read_lock();
 			gitmodules_config_oid(&real_obj->oid);
-			grep_read_unlock();
+			obj_read_unlock();
 		}
 		if (grep_object(opt, pathspec, real_obj, list->objects[i].name,
 				list->objects[i].path)) {
diff --git a/grep.c b/grep.c
index c028f70aba..13232a904a 100644
--- a/grep.c
+++ b/grep.c
@@ -1540,11 +1540,6 @@ static inline void grep_attr_unlock(void)
 		pthread_mutex_unlock(&grep_attr_mutex);
 }
 
-/*
- * Same as git_attr_mutex, but protecting the thread-unsafe object db access.
- */
-pthread_mutex_t grep_read_mutex;
-
 static int match_funcname(struct grep_opt *opt, struct grep_source *gs, char *bol, char *eol)
 {
 	xdemitconf_t *xecfg = opt->priv;
@@ -1741,13 +1736,20 @@ static int fill_textconv_grep(struct repository *r,
 	}
 
 	/*
-	 * fill_textconv is not remotely thread-safe; it may load objects
-	 * behind the scenes, and it modifies the global diff tempfile
-	 * structure.
+	 * fill_textconv is not remotely thread-safe; it modifies the global
+	 * diff tempfile structure, writes to the_repo's odb and might
+	 * internally call thread-unsafe functions such as the
+	 * prepare_packed_git() lazy-initializator. Because of the last two, we
+	 * must ensure mutual exclusion between this call and the object reading
+	 * API, thus we use obj_read_lock() here.
+	 *
+	 * TODO: allowing text conversion to run in parallel with object
+	 * reading operations might increase performance in the multithreaded
+	 * non-worktreee git-grep with --textconv.
 	 */
-	grep_read_lock();
+	obj_read_lock();
 	size = fill_textconv(r, driver, df, &buf);
-	grep_read_unlock();
+	obj_read_unlock();
 	free_filespec(df);
 
 	/*
@@ -1813,12 +1815,15 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		grep_source_load_driver(gs, opt->repo->index);
 		/*
 		 * We might set up the shared textconv cache data here, which
-		 * is not thread-safe.
+		 * is not thread-safe. Also, get_oid_with_context() and
+		 * parse_object() might be internally called. As they are not
+		 * currenty thread-safe and might be racy with object reading,
+		 * obj_read_lock() must be called.
 		 */
 		grep_attr_lock();
-		grep_read_lock();
+		obj_read_lock();
 		textconv = userdiff_get_textconv(opt->repo, gs->driver);
-		grep_read_unlock();
+		obj_read_unlock();
 		grep_attr_unlock();
 	}
 
@@ -2118,10 +2123,7 @@ static int grep_source_load_oid(struct grep_source *gs)
 {
 	enum object_type type;
 
-	grep_read_lock();
 	gs->buf = read_object_file(gs->identifier, &type, &gs->size);
-	grep_read_unlock();
-
 	if (!gs->buf)
 		return error(_("'%s': unable to read %s"),
 			     gs->name,
@@ -2186,11 +2188,8 @@ void grep_source_load_driver(struct grep_source *gs,
 		return;
 
 	grep_attr_lock();
-	if (gs->path) {
-		grep_read_lock();
+	if (gs->path)
 		gs->driver = userdiff_find_by_path(istate, gs->path);
-		grep_read_unlock();
-	}
 	if (!gs->driver)
 		gs->driver = userdiff_find_by_name("default");
 	grep_attr_unlock();
diff --git a/grep.h b/grep.h
index 811fd274c9..9115db8515 100644
--- a/grep.h
+++ b/grep.h
@@ -220,18 +220,5 @@ int grep_threads_ok(const struct grep_opt *opt);
  */
 extern int grep_use_locks;
 extern pthread_mutex_t grep_attr_mutex;
-extern pthread_mutex_t grep_read_mutex;
-
-static inline void grep_read_lock(void)
-{
-	if (grep_use_locks)
-		pthread_mutex_lock(&grep_read_mutex);
-}
-
-static inline void grep_read_unlock(void)
-{
-	if (grep_use_locks)
-		pthread_mutex_unlock(&grep_read_mutex);
-}
 
 #endif
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules()
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
                       ` (5 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
                       ` (4 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Stefan Beller

Currently, submodule-config.c doesn't have an externally accessible
function to read gitmodules only if it wasn't already read. But this
exact behavior is internally implemented by gitmodules_read_check(), to
perform a lazy load. Let's merge this function with
repo_read_gitmodules() adding a 'skip_if_read' which allows both
internal and external callers to access this functionality. This
simplifies a little the code. The added option will also be used in
the following patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c     |  2 +-
 submodule-config.c | 18 ++++++------------
 submodule-config.h |  2 +-
 unpack-trees.c     |  4 ++--
 4 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 4a436d6c99..d3ed05c1da 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -420,7 +420,7 @@ static int grep_submodule(struct grep_opt *opt,
 		return 0;
 	}
 
-	repo_read_gitmodules(&subrepo);
+	repo_read_gitmodules(&subrepo, 0);
 
 	/*
 	 * NEEDSWORK: This adds the submodule's object directory to the list of
diff --git a/submodule-config.c b/submodule-config.c
index 85064810b2..bd5e14ab20 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -674,10 +674,13 @@ static int gitmodules_cb(const char *var, const char *value, void *data)
 	return parse_config(var, value, &parameter);
 }
 
-void repo_read_gitmodules(struct repository *repo)
+void repo_read_gitmodules(struct repository *repo, int skip_if_read)
 {
 	submodule_cache_check_init(repo);
 
+	if (repo->submodule_cache->gitmodules_read && skip_if_read)
+		return;
+
 	if (repo_read_index(repo) < 0)
 		return;
 
@@ -703,20 +706,11 @@ void gitmodules_config_oid(const struct object_id *commit_oid)
 	the_repository->submodule_cache->gitmodules_read = 1;
 }
 
-static void gitmodules_read_check(struct repository *repo)
-{
-	submodule_cache_check_init(repo);
-
-	/* read the repo's .gitmodules file if it hasn't been already */
-	if (!repo->submodule_cache->gitmodules_read)
-		repo_read_gitmodules(repo);
-}
-
 const struct submodule *submodule_from_name(struct repository *r,
 					    const struct object_id *treeish_name,
 		const char *name)
 {
-	gitmodules_read_check(r);
+	repo_read_gitmodules(r, 1);
 	return config_from(r->submodule_cache, treeish_name, name, lookup_name);
 }
 
@@ -724,7 +718,7 @@ const struct submodule *submodule_from_path(struct repository *r,
 					    const struct object_id *treeish_name,
 		const char *path)
 {
-	gitmodules_read_check(r);
+	repo_read_gitmodules(r, 1);
 	return config_from(r->submodule_cache, treeish_name, path, lookup_path);
 }
 
diff --git a/submodule-config.h b/submodule-config.h
index 42918b55e8..c11e22cf50 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -61,7 +61,7 @@ int option_fetch_parse_recurse_submodules(const struct option *opt,
 					  const char *arg, int unset);
 int parse_update_recurse_submodules_arg(const char *opt, const char *arg);
 int parse_push_recurse_submodules_arg(const char *opt, const char *arg);
-void repo_read_gitmodules(struct repository *repo);
+void repo_read_gitmodules(struct repository *repo, int skip_if_read);
 void gitmodules_config_oid(const struct object_id *commit_oid);
 
 /**
diff --git a/unpack-trees.c b/unpack-trees.c
index 2399b6818b..f5a8051803 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -291,11 +291,11 @@ static void load_gitmodules_file(struct index_state *index,
 	if (pos >= 0) {
 		struct cache_entry *ce = index->cache[pos];
 		if (!state && ce->ce_flags & CE_WT_REMOVE) {
-			repo_read_gitmodules(the_repository);
+			repo_read_gitmodules(the_repository, 0);
 		} else if (state && (ce->ce_flags & CE_UPDATE)) {
 			submodule_free(the_repository);
 			checkout_entry(ce, state, NULL, NULL);
-			repo_read_gitmodules(the_repository);
+			repo_read_gitmodules(the_repository, 0);
 		}
 	}
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
                       ` (6 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-29 11:26       ` SZEDER Gábor
  2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
                       ` (3 subsequent siblings)
  11 siblings, 1 reply; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Stefan Beller

Now that object reading operations are internally protected, the
submodule initialization functions at builtin/grep.c:grep_submodule()
are very close to being thread-safe. Let's take a look at each call and
remove from the critical section what we can, for better performance:

- submodule_from_path() and is_submodule_active() cannot be called in
  parallel yet only because they call repo_read_gitmodules() which
  contains, in its call stack, operations that would otherwise be in
  race condition with object reading (for example parse_object() and
  is_promisor_remote()). However, they only call repo_read_gitmodules()
  if it wasn't read before. So let's pre-read it before firing the
  threads and allow these two functions to safely be called in
  parallel.

- repo_submodule_init() is already thread-safe, so remove it from the
  critical section without other necessary changes.

- The repo_read_gitmodules(&subrepo) call at grep_submodule() is safe as
  no other thread is performing object reading operations in the subrepo
  yet. However, threads might be working in the superproject, and this
  function calls add_to_alternates_memory() internally, which is racy
  with object readings in the superproject. So it must be kept
  protected for now. Let's add a "NEEDSWORK" to it, informing why it
  cannot be removed from the critical section yet.

- Finally, add_to_alternates_memory() must be kept protected for the
  same reason as the item above.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d3ed05c1da..ac3d86c2e5 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -401,25 +401,23 @@ static int grep_submodule(struct grep_opt *opt,
 	struct grep_opt subopt;
 	int hit;
 
-	/*
-	 * NEEDSWORK: submodules functions need to be protected because they
-	 * call config_from_gitmodules(): the latter contains in its call stack
-	 * many thread-unsafe operations that are racy with object reading, such
-	 * as parse_object() and is_promisor_object().
-	 */
-	obj_read_lock();
 	sub = submodule_from_path(superproject, &null_oid, path);
 
-	if (!is_submodule_active(superproject, path)) {
-		obj_read_unlock();
+	if (!is_submodule_active(superproject, path))
 		return 0;
-	}
 
-	if (repo_submodule_init(&subrepo, superproject, sub)) {
-		obj_read_unlock();
+	if (repo_submodule_init(&subrepo, superproject, sub))
 		return 0;
-	}
 
+	/*
+	 * NEEDSWORK: repo_read_gitmodules() might call
+	 * add_to_alternates_memory() via config_from_gitmodules(). This
+	 * operation causes a race condition with concurrent object readings
+	 * performed by the worker threads. That's why we need obj_read_lock()
+	 * here. It should be removed once it's no longer necessary to add the
+	 * subrepo's odbs to the in-memory alternates list.
+	 */
+	obj_read_lock();
 	repo_read_gitmodules(&subrepo, 0);
 
 	/*
@@ -1052,6 +1050,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	pathspec.recursive = 1;
 	pathspec.recurse_submodules = !!recurse_submodules;
 
+	if (recurse_submodules && (!use_index || untracked))
+		die(_("option not supported with --recurse-submodules"));
+
 	if (list.nr || cached || show_in_pager) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
@@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		    && (opt.pre_context || opt.post_context ||
 			opt.file_break || opt.funcbody))
 			skip_first_line = 1;
+
+		/*
+		 * Pre-read gitmodules (if not read already) to prevent racy
+		 * lazy reading in worker threads.
+		 */
+		if (recurse_submodules)
+			repo_read_gitmodules(the_repository, 1);
+
 		start_threads(&opt);
 	} else {
 		/*
@@ -1105,9 +1114,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (recurse_submodules && (!use_index || untracked))
-		die(_("option not supported with --recurse-submodules"));
-
 	if (!show_in_pager && !opt.status_only)
 		setup_pager();
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 09/12] grep: protect packed_git [re-]initialization
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
                       ` (7 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
                       ` (2 subsequent siblings)
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, René Scharfe, Stefan Beller,
	Brandon Williams

Some fields in struct raw_object_store are lazy initialized by the
thread-unsafe packfile.c:prepare_packed_git(). Although this function is
present in the call stack of git-grep threads, all paths to it are
currently protected by obj_read_lock() (and the main thread usually
indirectly calls it before firing the worker threads, anyway). However,
it's possible that future modifications add new unprotected paths to it,
introducing a race condition. Because errors derived from it wouldn't
happen often, it could be hard to detect. So to prevent future
headaches, let's force eager initialization of packed_git when setting
git-grep up. There'll be a small overhead in the cases where we didn't
really need to prepare packed_git during execution but this shouldn't be
very noticeable.

Also, packed_git may be re-initialized by
packfile.c:reprepare_packed_git(). Again, all paths to it in git-grep
are already protected by obj_read_lock() but it may suffer from the same
problem in the future. So let's also internally protect it with
obj_read_lock() (which is a recursive mutex).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 8 ++++++--
 packfile.c     | 2 ++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index ac3d86c2e5..1535fd50f8 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -24,6 +24,7 @@
 #include "submodule.h"
 #include "submodule-config.h"
 #include "object-store.h"
+#include "packfile.h"
 
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
@@ -1074,11 +1075,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			skip_first_line = 1;
 
 		/*
-		 * Pre-read gitmodules (if not read already) to prevent racy
-		 * lazy reading in worker threads.
+		 * Pre-read gitmodules (if not read already) and force eager
+		 * initialization of packed_git to prevent racy lazy
+		 * reading/initialization once worker threads are started.
 		 */
 		if (recurse_submodules)
 			repo_read_gitmodules(the_repository, 1);
+		if (startup_info->have_repository)
+			(void)get_packed_git(the_repository);
 
 		start_threads(&opt);
 	} else {
diff --git a/packfile.c b/packfile.c
index 24a73fc33a..946ca83e7a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1004,12 +1004,14 @@ void reprepare_packed_git(struct repository *r)
 {
 	struct object_directory *odb;
 
+	obj_read_lock();
 	for (odb = r->objects->odb; odb; odb = odb->next)
 		odb_clear_loose_cache(odb);
 
 	r->objects->approximate_object_count_valid = 0;
 	r->objects->packed_git_initialized = 0;
 	prepare_packed_git(r);
+	obj_read_unlock();
 }
 
 struct packed_git *get_packed_git(struct repository *r)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 10/12] grep: re-enable threads in non-worktree case
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
                       ` (8 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
  2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Eric Wong,
	Ramkumar Ramachandra, Manav Rathi

They were disabled at 53b8d93 ("grep: disable threading in non-worktree
case", 12-12-2011), due to observable performance drops (to the point
that using a single thread would be faster than multiple threads). But
now that zlib inflation can be performed in parallel we can regain the
speedup, so let's re-enable threads in non-worktree grep.

Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  17.2920s  |  20.9624s
    2    |   9.6512s  |  11.3184s
    4    |   6.7723s  |   7.6268s
    8**  |   6.2886s  |   6.9843s

These are all means of 30 executions after 2 warmup runs. All tests were
executed on an i7-7700HQ (quad-core w/ hyper-threading), 16GB of RAM and
SSD, running Manjaro Linux. But to make sure the optimization also
performs well on HDD, the tests were repeated on another machine with an
i5-4210U (dual-core w/ hyper-threading), 8GB of RAM and HDD (SATA III,
5400 rpm), also running Manjaro Linux:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  18.4035s  |  22.5368s
    2    |  12.5063s  |  14.6409s
    4**  |  10.9136s  |  12.7106s

** Note that in these cases we relied on hyper-threading, and that's
   probably why we don't see a big difference in time.

Unfortunately, multithreaded git-grep might be slow in the non-worktree
case when --textconv is used and there're too many text conversions.
Probably the reason for this is that the object read lock is used to
protect fill_textconv() and therefore there is a mutual exclusion
between textconv execution and object reading. Because both are
time-consuming operations, not being able to perform them in parallel
can cause performance drops. To inform the users about this (and other
threading details), let's also add a "NOTES ON THREADS" section to
Documentation/git-grep.txt.

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 11 +++++++++++
 builtin/grep.c             |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index c89fb569e3..de628741fa 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -347,6 +347,17 @@ EXAMPLES
 `git grep solution -- :^Documentation`::
 	Looks for `solution`, excluding files in `Documentation`.
 
+NOTES ON THREADS
+----------------
+
+The `--threads` option (and the grep.threads configuration) will be ignored when
+`--open-files-in-pager` is used, forcing a single-threaded execution.
+
+When grepping the object store (with `--cached` or giving tree objects), running
+with multiple threads might perform slower than single threaded if `--textconv`
+is given and there're too many text conversions. So if you experience low
+performance in this case, it might be desirable to use `--threads=1`.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/grep.c b/builtin/grep.c
index 1535fd50f8..6aaa8d4406 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1054,7 +1054,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	if (recurse_submodules && (!use_index || untracked))
 		die(_("option not supported with --recurse-submodules"));
 
-	if (list.nr || cached || show_in_pager) {
+	if (show_in_pager) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
 		num_threads = 1;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 11/12] grep: move driver pre-load out of critical section
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
                       ` (9 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
  11 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Alex Henrie, Rasmus Villemoes, Matthieu Moy

In builtin/grep.c:add_work() we pre-load the userdiff drivers before
adding the grep_source in the todo list. This operation is currently
being performed after acquiring the grep_mutex, but as it's already
thread-safe, we don't need to protect it here. So let's move it out of
the critical section which should avoid thread contention and improve
performance.

Running[1] `git grep --threads=8 abcd[02] HEAD` on chromium's
repository[2], I got the following mean times for 30 executions after 2
warmups:

        Original         |  6.2886s
-------------------------|-----------
 Out of critical section |  5.7852s

[1]: Tests performed on an i7-7700HQ with 16GB of RAM and SSD, running
     Manjaro Linux.
[2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
         04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 6aaa8d4406..a85b710b48 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -92,8 +92,11 @@ static pthread_cond_t cond_result;
 
 static int skip_first_line;
 
-static void add_work(struct grep_opt *opt, const struct grep_source *gs)
+static void add_work(struct grep_opt *opt, struct grep_source *gs)
 {
+	if (opt->binary != GREP_BINARY_TEXT)
+		grep_source_load_driver(gs, opt->repo->index);
+
 	grep_lock();
 
 	while ((todo_end+1) % ARRAY_SIZE(todo) == todo_done) {
@@ -101,9 +104,6 @@ static void add_work(struct grep_opt *opt, const struct grep_source *gs)
 	}
 
 	todo[todo_end].source = *gs;
-	if (opt->binary != GREP_BINARY_TEXT)
-		grep_source_load_driver(&todo[todo_end].source,
-					opt->repo->index);
 	todo[todo_end].done = 0;
 	strbuf_reset(&todo[todo_end].out);
 	todo_end = (todo_end + 1) % ARRAY_SIZE(todo);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v3 12/12] grep: use no. of cores as the default no. of threads
  2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
                       ` (10 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
@ 2020-01-16  2:40     ` Matheus Tavares
  2020-01-16 13:11       ` Victor Leschuk
  11 siblings, 1 reply; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:40 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Alex Henrie,
	Victor Leschuk, Ramkumar Ramachandra, Victor Leschuk, Eric Wong,
	Matthieu Moy

When --threads is not specified, git-grep will use 8 threads by default.
This fixed number may be too many for machines with fewer cores and too
little for machines with more cores. So, instead, use the number of
logical cores available in the machine, which seems to result in the
best overall performance: The following measurements correspond to the
mean elapsed times for 30 git-grep executions in chromium's
repository[1] with a 95% confidence interval (each set of 30 were
performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
'(static|extern) (int|double) \*'.

      |          Working tree         |           Object Store
------|-------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
------|---------------|---------------|----------------|---------------
  32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
  16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
>  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
   4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
   2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
   1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01

The above tests were performed in a desktop running Debian 10.0 with
Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
RAM and a 7200 rpm, SATA 3.1 HDD.

Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:

      |          Working tree          |           Object Store
------|--------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
------|---------------|----------------|----------------|---------------
  32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
  16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
>  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
   4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
   2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
   1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 4 ++--
 builtin/grep.c             | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index de628741fa..eb5412724f 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -59,8 +59,8 @@ grep.extendedRegexp::
 	other than 'default'.
 
 grep.threads::
-	Number of grep worker threads to use.  If unset (or set to 0),
-	8 threads are used by default (for now).
+	Number of grep worker threads to use. If unset (or set to 0), Git will
+	use as many threads as the number of logical cores available.
 
 grep.fullName::
 	If set to true, enable `--full-name` option by default.
diff --git a/builtin/grep.c b/builtin/grep.c
index a85b710b48..629eaf5dbc 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -33,7 +33,6 @@ static char const * const grep_usage[] = {
 
 static int recurse_submodules;
 
-#define GREP_NUM_THREADS_DEFAULT 8
 static int num_threads;
 
 static pthread_t *threads;
@@ -1064,7 +1063,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	} else if (num_threads < 0)
 		die(_("invalid number of threads specified (%d)"), num_threads);
 	else if (num_threads == 0)
-		num_threads = HAVE_THREADS ? GREP_NUM_THREADS_DEFAULT : 1;
+		num_threads = HAVE_THREADS ? online_cpus() : 1;
 
 	if (num_threads > 1) {
 		if (!HAVE_THREADS)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 12/12] grep: use no. of cores as the default no. of threads
  2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
@ 2020-01-16 13:11       ` Victor Leschuk
  2020-01-16 14:47         ` [PATCH] " Matheus Tavares
  0 siblings, 1 reply; 47+ messages in thread
From: Victor Leschuk @ 2020-01-16 13:11 UTC (permalink / raw)
  To: Matheus Tavares
  Cc: Git List, christian.couder, Junio C Hamano, jrnieder,
	olyatelezhnaya, Nguyễn Thái Ngọc Duy,
	Jonathan Tan, Jeff King, Brandon Williams, Alex Henrie,
	Ramkumar Ramachandra, Victor Leschuk, Eric Wong, Matthieu Moy

Grepping bottleneck is not cpu, but IO. Maybe it is more reasonable to
use not online_cpus() but online_cpus()*2?

--
Victor


On Thu, Jan 16, 2020 at 5:41 AM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> When --threads is not specified, git-grep will use 8 threads by default.
> This fixed number may be too many for machines with fewer cores and too
> little for machines with more cores. So, instead, use the number of
> logical cores available in the machine, which seems to result in the
> best overall performance: The following measurements correspond to the
> mean elapsed times for 30 git-grep executions in chromium's
> repository[1] with a 95% confidence interval (each set of 30 were
> performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
> '(static|extern) (int|double) \*'.
>
>       |          Working tree         |           Object Store
> ------|-------------------------------|--------------------------------
>  #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
> ------|---------------|---------------|----------------|---------------
>   32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
>   16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
> >  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
>    4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
>    2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
>    1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01
>
> The above tests were performed in a desktop running Debian 10.0 with
> Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
> RAM and a 7200 rpm, SATA 3.1 HDD.
>
> Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
> with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:
>
>       |          Working tree          |           Object Store
> ------|--------------------------------|--------------------------------
>  #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
> ------|---------------|----------------|----------------|---------------
>   32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
>   16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
> >  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
>    4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
>    2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
>    1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08
>
> [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
>      04-06-2019), after a 'git gc' execution.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  Documentation/git-grep.txt | 4 ++--
>  builtin/grep.c             | 3 +--
>  2 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index de628741fa..eb5412724f 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -59,8 +59,8 @@ grep.extendedRegexp::
>         other than 'default'.
>
>  grep.threads::
> -       Number of grep worker threads to use.  If unset (or set to 0),
> -       8 threads are used by default (for now).
> +       Number of grep worker threads to use. If unset (or set to 0), Git will
> +       use as many threads as the number of logical cores available.
>
>  grep.fullName::
>         If set to true, enable `--full-name` option by default.
> diff --git a/builtin/grep.c b/builtin/grep.c
> index a85b710b48..629eaf5dbc 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -33,7 +33,6 @@ static char const * const grep_usage[] = {
>
>  static int recurse_submodules;
>
> -#define GREP_NUM_THREADS_DEFAULT 8
>  static int num_threads;
>
>  static pthread_t *threads;
> @@ -1064,7 +1063,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>         } else if (num_threads < 0)
>                 die(_("invalid number of threads specified (%d)"), num_threads);
>         else if (num_threads == 0)
> -               num_threads = HAVE_THREADS ? GREP_NUM_THREADS_DEFAULT : 1;
> +               num_threads = HAVE_THREADS ? online_cpus() : 1;
>
>         if (num_threads > 1) {
>                 if (!HAVE_THREADS)
> --
> 2.24.1
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] grep: use no. of cores as the default no. of threads
  2020-01-16 13:11       ` Victor Leschuk
@ 2020-01-16 14:47         ` Matheus Tavares
  0 siblings, 0 replies; 47+ messages in thread
From: Matheus Tavares @ 2020-01-16 14:47 UTC (permalink / raw)
  To: vleschuk
  Cc: alexhenrie24, artagnon, bwilliams.eng, christian.couder, e, git,
	git, gitster, jonathantanmy, jrnieder, matheus.bernardino,
	olyatelezhnaya, pclouds, peff, vleschuk

Hi, Victor

On Thu, Jan 16, 2020 at 10:11 AM Victor Leschuk <vleschuk@gmail.com> wrote:
>
> Grepping bottleneck is not cpu, but IO. Maybe it is more reasonable to
> use not online_cpus() but online_cpus()*2?

I also tried this approach, but the tests I ran with online_cpus() * 2
only showed slowdowns. The results can be seen in the commit message:

> On Thu, Jan 16, 2020 at 5:41 AM Matheus Tavares
> <matheus.bernardino@usp.br> wrote:
> >
[...]
> > The following measurements correspond to the
> > mean elapsed times for 30 git-grep executions in chromium's
> > repository[1] with a 95% confidence interval (each set of 30 were
> > performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
> > '(static|extern) (int|double) \*'.
> >
> >       |          Working tree         |           Object Store
> > ------|-------------------------------|--------------------------------
> >  #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
> > ------|---------------|---------------|----------------|---------------
> >   32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
> >   16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
> > >  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
> >    4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
> >    2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
> >    1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01
> >
> > The above tests were performed in a desktop running Debian 10.0 with
> > Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
> > RAM and a 7200 rpm, SATA 3.1 HDD.
> >
> > Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
> > with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:
> >
> >       |          Working tree          |           Object Store
> > ------|--------------------------------|--------------------------------
> >  #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
> > ------|---------------|----------------|----------------|---------------
> >   32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
> >   16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
> > >  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
> >    4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
> >    2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
> >    1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08

I deliberately used somehow complex regexes for these tests. So I
decided to do one more test with a very simple fixed string ("abc"),
allowing git-grep to spend less time in the cpu-bound regex searching.
The results can be seen bellow (the metodology is the same as described
above and the machine is the Manjaro laptop, for which online_cpus()
returns 8):

  #ths |  Working Three |  Object Store
 ------|----------------|---------------
    16 |  3.22s ± 0.20  |  5.96s ± 0.06
     8 |  2.92s ± 0.01  |  5.73s ± 0.02


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
@ 2020-01-29 11:26       ` SZEDER Gábor
  2020-01-29 18:49         ` Junio C Hamano
  2020-01-29 18:57         ` Junio C Hamano
  0 siblings, 2 replies; 47+ messages in thread
From: SZEDER Gábor @ 2020-01-29 11:26 UTC (permalink / raw)
  To: Matheus Tavares, gitster, Philippe Blain
  Cc: git, christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Stefan Beller

Junio, Matheus, Philippe,

this patch below and a7f3240877 (grep: ignore --recurse-submodules if
--no-index is given, 2020-01-26) on topic
'pb/do-not-recurse-grep-no-index' don't work well together, and cause
failure of the test 'grep --recurse-submodules --no-index ignores
--recurse-submodules' in 't7814-grep-recurse-submodules.sh', i.e. in
the new test added in a7f3240877.

More below.

On Wed, Jan 15, 2020 at 11:39:56PM -0300, Matheus Tavares wrote:
> Now that object reading operations are internally protected, the
> submodule initialization functions at builtin/grep.c:grep_submodule()
> are very close to being thread-safe. Let's take a look at each call and
> remove from the critical section what we can, for better performance:
> 
> - submodule_from_path() and is_submodule_active() cannot be called in
>   parallel yet only because they call repo_read_gitmodules() which
>   contains, in its call stack, operations that would otherwise be in
>   race condition with object reading (for example parse_object() and
>   is_promisor_remote()). However, they only call repo_read_gitmodules()
>   if it wasn't read before. So let's pre-read it before firing the
>   threads and allow these two functions to safely be called in
>   parallel.
> 
> - repo_submodule_init() is already thread-safe, so remove it from the
>   critical section without other necessary changes.
> 
> - The repo_read_gitmodules(&subrepo) call at grep_submodule() is safe as
>   no other thread is performing object reading operations in the subrepo
>   yet. However, threads might be working in the superproject, and this
>   function calls add_to_alternates_memory() internally, which is racy
>   with object readings in the superproject. So it must be kept
>   protected for now. Let's add a "NEEDSWORK" to it, informing why it
>   cannot be removed from the critical section yet.
> 
> - Finally, add_to_alternates_memory() must be kept protected for the
>   same reason as the item above.
> 
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  builtin/grep.c | 38 ++++++++++++++++++++++----------------
>  1 file changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/builtin/grep.c b/builtin/grep.c
> index d3ed05c1da..ac3d86c2e5 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -401,25 +401,23 @@ static int grep_submodule(struct grep_opt *opt,
>  	struct grep_opt subopt;
>  	int hit;
>  
> -	/*
> -	 * NEEDSWORK: submodules functions need to be protected because they
> -	 * call config_from_gitmodules(): the latter contains in its call stack
> -	 * many thread-unsafe operations that are racy with object reading, such
> -	 * as parse_object() and is_promisor_object().
> -	 */
> -	obj_read_lock();
>  	sub = submodule_from_path(superproject, &null_oid, path);
>  
> -	if (!is_submodule_active(superproject, path)) {
> -		obj_read_unlock();
> +	if (!is_submodule_active(superproject, path))
>  		return 0;
> -	}
>  
> -	if (repo_submodule_init(&subrepo, superproject, sub)) {
> -		obj_read_unlock();
> +	if (repo_submodule_init(&subrepo, superproject, sub))
>  		return 0;
> -	}
>  
> +	/*
> +	 * NEEDSWORK: repo_read_gitmodules() might call
> +	 * add_to_alternates_memory() via config_from_gitmodules(). This
> +	 * operation causes a race condition with concurrent object readings
> +	 * performed by the worker threads. That's why we need obj_read_lock()
> +	 * here. It should be removed once it's no longer necessary to add the
> +	 * subrepo's odbs to the in-memory alternates list.
> +	 */
> +	obj_read_lock();
>  	repo_read_gitmodules(&subrepo, 0);
>  
>  	/*
> @@ -1052,6 +1050,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  	pathspec.recursive = 1;
>  	pathspec.recurse_submodules = !!recurse_submodules;
>  
> +	if (recurse_submodules && (!use_index || untracked))
> +		die(_("option not supported with --recurse-submodules"));

So this patch moves this condition here, expecting git to die with 
'--recurse-submodules --no-index'.  However, a7f3240877 removes the
'!use_index' part of the condition, so we won't die here ...

>  	if (list.nr || cached || show_in_pager) {
>  		if (num_threads > 1)
>  			warning(_("invalid option combination, ignoring --threads"));
> @@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  		    && (opt.pre_context || opt.post_context ||
>  			opt.file_break || opt.funcbody))
>  			skip_first_line = 1;
> +
> +		/*
> +		 * Pre-read gitmodules (if not read already) to prevent racy
> +		 * lazy reading in worker threads.
> +		 */
> +		if (recurse_submodules)
> +			repo_read_gitmodules(the_repository, 1);

... and eventually reach this condition, which then reads the
submodules even with '--no-index', which is just what a7f3240877 tried
to avoid, thus triggering the test failure.

It might be that all we need is changing this condition to:

  if (recurse_submodules && use_index)

Or maybe not, but this change on top of 'pu' makes t7814 succeed
again.

However, I'm not familiar with the intricacies of either threaded grep
or submodules, much less the combination of the two...  so just an
idea.

>  		start_threads(&opt);
>  	} else {
>  		/*
> @@ -1105,9 +1114,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  		}
>  	}
>  
> -	if (recurse_submodules && (!use_index || untracked))
> -		die(_("option not supported with --recurse-submodules"));
> -
>  	if (!show_in_pager && !opt.status_only)
>  		setup_pager();
>  
> -- 
> 2.24.1
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-29 11:26       ` SZEDER Gábor
@ 2020-01-29 18:49         ` Junio C Hamano
  2020-01-29 18:57         ` Junio C Hamano
  1 sibling, 0 replies; 47+ messages in thread
From: Junio C Hamano @ 2020-01-29 18:49 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Matheus Tavares, Philippe Blain, git

SZEDER Gábor <szeder.dev@gmail.com> writes:

> this patch below and a7f3240877 (grep: ignore --recurse-submodules if
> --no-index is given, 2020-01-26) on topic
> 'pb/do-not-recurse-grep-no-index' don't work well together, and cause
> failure of the test 'grep --recurse-submodules --no-index ignores
> --recurse-submodules' in 't7814-grep-recurse-submodules.sh', i.e. in
> the new test added in a7f3240877.

Yup, I was looking at it and trying to see what was going on.

>> +	if (recurse_submodules && (!use_index || untracked))
>> +		die(_("option not supported with --recurse-submodules"));
>
> So this patch moves this condition here, expecting git to die with 
> '--recurse-submodules --no-index'.  However, a7f3240877 removes the
> '!use_index' part of the condition, so we won't die here ...
>
>>  	if (list.nr || cached || show_in_pager) {
>>  		if (num_threads > 1)
>>  			warning(_("invalid option combination, ignoring --threads"));
>> @@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>  		    && (opt.pre_context || opt.post_context ||
>>  			opt.file_break || opt.funcbody))
>>  			skip_first_line = 1;
>> +
>> +		/*
>> +		 * Pre-read gitmodules (if not read already) to prevent racy
>> +		 * lazy reading in worker threads.
>> +		 */
>> +		if (recurse_submodules)
>> +			repo_read_gitmodules(the_repository, 1);
>
> ... and eventually reach this condition, which then reads the
> submodules even with '--no-index', which is just what a7f3240877 tried
> to avoid, thus triggering the test failure.
>
> It might be that all we need is changing this condition to:
>
>   if (recurse_submodules && use_index)
>
> Or maybe not, but this change on top of 'pu' makes t7814 succeed
> again.

Sounds like a sensible idea.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-29 11:26       ` SZEDER Gábor
  2020-01-29 18:49         ` Junio C Hamano
@ 2020-01-29 18:57         ` Junio C Hamano
  2020-01-29 20:42           ` Matheus Tavares Bernardino
  1 sibling, 1 reply; 47+ messages in thread
From: Junio C Hamano @ 2020-01-29 18:57 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Matheus Tavares, Philippe Blain, git, christian.couder, jrnieder,
	olyatelezhnaya, pclouds, jonathantanmy, peff, Brandon Williams,
	Stefan Beller

SZEDER Gábor <szeder.dev@gmail.com> writes:

> Junio, Matheus, Philippe,
>
> this patch below and a7f3240877 (grep: ignore --recurse-submodules if
> --no-index is given, 2020-01-26) on topic
> 'pb/do-not-recurse-grep-no-index' don't work well together, and cause
> failure of the test 'grep --recurse-submodules --no-index ignores
> --recurse-submodules' in 't7814-grep-recurse-submodules.sh', i.e. in
> the new test added in a7f3240877.

Hmph, I wonder if "ignore --recurse-submodules if --no-index" should
have been done as a single liner patch, something along the lines of
"after parse_options() returns, drop recurse_submodules if no-index
was given", i.e.

@@ -958,6 +946,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			/* die the same way as if we did it at the beginning */
 			setup_git_directory();
 	}
+	if (!use_index)
+		recurse_submodules = 0; /* ignore */
 
 	/*
 	 * skip a -- separator; we know it cannot be

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-29 18:57         ` Junio C Hamano
@ 2020-01-29 20:42           ` Matheus Tavares Bernardino
  2020-01-30 13:28             ` Philippe Blain
  0 siblings, 1 reply; 47+ messages in thread
From: Matheus Tavares Bernardino @ 2020-01-29 20:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Philippe Blain, git, Christian Couder,
	Jonathan Nieder,
	Оля Тележная,
	Nguyễn Thái Ngọc Duy, Jonathan Tan, Jeff King,
	Brandon Williams, Stefan Beller

On Wed, Jan 29, 2020 at 3:57 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> SZEDER Gábor <szeder.dev@gmail.com> writes:
> >
[...]
> > @@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> >                   && (opt.pre_context || opt.post_context ||
> >                       opt.file_break || opt.funcbody))
> >                       skip_first_line = 1;
> > +
> > +             /*
> > +              * Pre-read gitmodules (if not read already) to prevent racy
> > +              * lazy reading in worker threads.
> > +              */
> > +             if (recurse_submodules)
> > +                     repo_read_gitmodules(the_repository, 1);
> >
> > ... and eventually reach this condition, which then reads the
> > submodules even with '--no-index', which is just what a7f3240877 tried
> > to avoid, thus triggering the test failure.
> >
> > It might be that all we need is changing this condition to:
> >
> >  if (recurse_submodules && use_index)

Yes, I think that would work. I was only worried that, in case of
!use_index, the path taken could somehow lead to an unprotected call
to repo_read_gitmodules() (with threads spawned).Then, since the file
would not have been pre-loaded by the sequential code, we could
encounter a race condition. But by what I've inspected, when use_index
is false, grep_directory() will be called to traverse the files, and
it does not have repo_read_gitmodules() in its call graph[1]. So the
solution should be fine in the point of view of thread-safeness.

> Hmph, I wonder if "ignore --recurse-submodules if --no-index" should
> have been done as a single liner patch, something along the lines of
> "after parse_options() returns, drop recurse_submodules if no-index
> was given", i.e.
>
> @@ -958,6 +946,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>                         /* die the same way as if we did it at the beginning */
>                         setup_git_directory();
>         }
> +       if (!use_index)
> +               recurse_submodules = 0; /* ignore */
>
>         /*
>          * skip a -- separator; we know it cannot be

Yeah, this seems more meaningful, IMHO, as we can easily see that the
recurse_submodules option was dropped in favor of using --no-index.

[1]: Well, in fact repo_read_gitmodules() *is* in grep_directory()'s
call graph, but the only path to it is through the
fill_textconv_grep() > fill_textconv() call, which is already guarded
by the obj_read_mutex. So there is no problem here.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-29 20:42           ` Matheus Tavares Bernardino
@ 2020-01-30 13:28             ` Philippe Blain
  0 siblings, 0 replies; 47+ messages in thread
From: Philippe Blain @ 2020-01-30 13:28 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: Junio C Hamano, SZEDER Gábor, git, Christian Couder,
	Jonathan Nieder,
	Оля Тележная,
	Nguyễn Thái Ngọc Duy, Jonathan Tan, Jeff King,
	Brandon Williams, Stefan Beller

Hi everyone,
>> 
>> @@ -958,6 +946,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>                        /* die the same way as if we did it at the beginning */
>>                        setup_git_directory();
>>        }
>> +       if (!use_index)
>> +               recurse_submodules = 0; /* ignore */
>> 
>>        /*
>>         * skip a -- separator; we know it cannot be
> 
> Yeah, this seems more meaningful, IMHO, as we can easily see that the
> recurse_submodules option was dropped in favor of using --no-index.
> 
I agree. I’ll send a v2 of my patch with this added.

Philippe.


^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2020-01-30 13:28 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 2/4] grep: allow locks to be enabled individually Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases Matheus Tavares
2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 01/11] grep: fix race conditions on userdiff calls Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 02/11] grep: fix race conditions at grep_submodule() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 03/11] grep: fix racy calls in grep_objects() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 04/11] replace-object: make replace operations thread-safe Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 05/11] object-store: allow threaded access to object reading Matheus Tavares
2019-11-12  2:54     ` Jonathan Tan
2019-11-13  5:20       ` Jeff King
2019-11-14  5:57         ` Matheus Tavares Bernardino
2019-11-14  6:01           ` Jeff King
2019-11-14 18:15             ` Jonathan Tan
2019-11-15  4:12               ` Jeff King
2019-12-19 22:27                 ` Matheus Tavares Bernardino
2020-01-09 22:02                   ` Matheus Tavares Bernardino
2020-01-10 19:07                     ` Christian Couder
2019-09-30  1:50   ` [PATCH v2 06/11] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2019-10-01 19:23     ` [PATCH] squash! " Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 07/11] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 08/11] grep: allow submodule functions to run in parallel Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 09/11] grep: protect packed_git [re-]initialization Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 10/11] grep: re-enable threads in non-worktree case Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 11/11] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 04/12] replace-object: make replace operations thread-safe Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
2020-01-29 11:26       ` SZEDER Gábor
2020-01-29 18:49         ` Junio C Hamano
2020-01-29 18:57         ` Junio C Hamano
2020-01-29 20:42           ` Matheus Tavares Bernardino
2020-01-30 13:28             ` Philippe Blain
2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
2020-01-16 13:11       ` Victor Leschuk
2020-01-16 14:47         ` [PATCH] " Matheus Tavares

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).