git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
* [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation
@ 2019-08-10 20:27 Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya, Nguyễn Thái Ngọc Duy

This series focuses on allowing parallel access to zlib inflation and
using that to perform a faster git-grep in the non-worktree case.

Threads were disabled for this case at 53b8d93 ("grep: disable
threading in non-worktree case", 12-12-2011), due to performance drops.

However, by allowing threads to perform inflation in parallel, we can
regain the speedup. This is a good hotspot for parallelism as some test
cases[1] showed that it can account for up to 48% of execution time.
And besides that, inflation tasks are already independent of each other.

As a result, grepping 'abcd[02]' ("Regex 1") and
'(static|extern) (int|double) \*' ("Regex 2") at chromium's
repository[2], I got (means of 30 executions):

     Threads |   Regex 1  |  Regex 2
    ---------|------------|-----------
        1    |  17.3557s  | 20.8410s
        2    |   9.7170s  | 11.2415s
        8    |   6.1723s  |  6.9378s

As a reference, just enabling threads in the non-worktree case,
without parallel inflation, I got:

     Threads |   Regex 1  |  Regex 2
    ---------|------------|-----------
        1    |  17.1359s  | 20.8306s
        2    |  14.5036s  | 15.4172s
        8    |  13.6304s  | 13.8659s

For now, the optimization is not supported when --textconv or
--recurse-submodules are used, but I hope to send another patchset for
that still during GSoC. We may also try to allow even more parallelism,
refining the added 'obj_read_lock'.

[1]: https://matheustavares.gitlab.io/posts/week-6-working-at-zlib-inflation#multithreading-zlib-inflation
[2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

travis build: https://travis-ci.org/matheustavares/git/builds/570255029

Matheus Tavares (4):
  object-store: add lock to read_object_file_extended()
  grep: allow locks to be enabled individually
  grep: disable grep_read_mutex when possible
  grep: re-enable threads in some non-worktree cases

 Documentation/git-grep.txt | 12 ++++++++
 builtin/grep.c             | 22 +++++++++++---
 grep.c                     |  4 +--
 grep.h                     |  8 +++--
 object-store.h             |  4 +++
 packfile.c                 |  7 +++++
 sha1-file.c                | 61 ++++++++++++++++++++++++++++++++++----
 7 files changed, 105 insertions(+), 13 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended()
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
@ 2019-08-10 20:27 ` Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 2/4] grep: allow locks to be enabled individually Matheus Tavares
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, Stefan Beller, Denton Liu,
	Jeff King, Junio C Hamano

Allow read_object_file_extended() to be called by multiple threads
protecting it with a lock. The lock usage can be toggled with
enable_obj_read_lock() and disable_obj_read_lock().

Probably there are many spots in read_object_file_extended()'s call
chain that could be executed unlocked (and thus, in parallel). But, for
now, we are only interested in allowing parallel access to zlib
inflation. This is one of the sections where object reading spends most
of the time and it's already thread-safe. So, to take advantage of that,
the lock is released when entering it and re-acquired right after. We
may refine the lock to also exploit other possible parallel spots in the
future, but threaded zlib inflation should already give great speedups.

Note that add_delta_base_cache() was also modified to skip adding
already present entries to the cache. This wasn't possible before, but
now it is since phase I and phase III of unpack_entry() may execute
concurrently.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 object-store.h |  4 ++++
 packfile.c     |  7 ++++++
 sha1-file.c    | 61 +++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/object-store.h b/object-store.h
index 7f7b3cdd80..cfc9484995 100644
--- a/object-store.h
+++ b/object-store.h
@@ -159,6 +159,10 @@ const char *loose_object_path(struct repository *r, struct strbuf *buf,
 void *map_loose_object(struct repository *r, const struct object_id *oid,
 		       unsigned long *size);
 
+void enable_obj_read_lock(void);
+void disable_obj_read_lock(void);
+void obj_read_lock(void);
+void obj_read_unlock(void);
 void *read_object_file_extended(struct repository *r,
 				const struct object_id *oid,
 				enum object_type *type,
diff --git a/packfile.c b/packfile.c
index fc43a6c52c..de93dc50e2 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1115,7 +1115,9 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1468,6 +1470,9 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
 	struct list_head *lru, *tmp;
 
+	if (get_delta_base_cache_entry(p, base_offset))
+		return;
+
 	delta_base_cached += base_size;
 
 	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
@@ -1597,7 +1602,9 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		if (!stream.avail_out)
 			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
diff --git a/sha1-file.c b/sha1-file.c
index 84fd02f107..f5ff51aedb 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1560,16 +1560,54 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type,
 	return 0;
 }
 
+static pthread_mutex_t obj_read_mutex;
+static int obj_read_use_lock = 0;
+
+/*
+ * Enabling the object read lock allows multiple threads to safely call the
+ * following functions in parallel: repo_read_object_file(), read_object_file()
+ * and read_object_file_extended().
+ */
+void enable_obj_read_lock(void)
+{
+	if (obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 1;
+	pthread_mutex_init(&obj_read_mutex, NULL);
+}
+
+void disable_obj_read_lock(void)
+{
+	if (!obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 0;
+	pthread_mutex_destroy(&obj_read_mutex);
+}
+
+void obj_read_lock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_lock(&obj_read_mutex);
+}
+
+void obj_read_unlock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_unlock(&obj_read_mutex);
+}
+
 /*
  * This function dies on corrupt objects; the callers who want to
  * deal with them should arrange to call read_object() and give error
  * messages themselves.
  */
-void *read_object_file_extended(struct repository *r,
-				const struct object_id *oid,
-				enum object_type *type,
-				unsigned long *size,
-				int lookup_replace)
+static void *do_read_object_file_extended(struct repository *r,
+					  const struct object_id *oid,
+					  enum object_type *type,
+					  unsigned long *size,
+					  int lookup_replace)
 {
 	void *data;
 	const struct packed_git *p;
@@ -1602,6 +1640,19 @@ void *read_object_file_extended(struct repository *r,
 	return NULL;
 }
 
+void *read_object_file_extended(struct repository *r,
+				const struct object_id *oid,
+				enum object_type *type,
+				unsigned long *size,
+				int lookup_replace)
+{
+	void *data;
+	obj_read_lock();
+	data = do_read_object_file_extended(r, oid, type, size, lookup_replace);
+	obj_read_unlock();
+	return data;
+}
+
 void *read_object_with_reference(struct repository *r,
 				 const struct object_id *oid,
 				 const char *required_type_name,
-- 
2.22.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [GSoC][PATCH 2/4] grep: allow locks to be enabled individually
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
@ 2019-08-10 20:27 ` Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases Matheus Tavares
  3 siblings, 0 replies; 5+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, Junio C Hamano,
	Denton Liu, brian m. carlson

git-grep has some internal locks to protect thread-unsafe operations
when running with threads. The usage of these locks can be toggled
through the variable 'grep_use_locks'. However, it's not currently
possible to enable each lock individually. And since object reading has
its own locks now, it is desirable to disable the respective grep lock
(and only that) in cases where we can do so. To do that, transform
'grep_use_locks' from a binary variable to a bitmask, which controls
each lock individually.

The actual disabling of grep_read_lock, when possible, will be done in
the following patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 2 +-
 grep.c         | 4 ++--
 grep.h         | 8 ++++++--
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 560051784e..a871bad8ad 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -205,7 +205,7 @@ static void start_threads(struct grep_opt *opt)
 	pthread_cond_init(&cond_add, NULL);
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
-	grep_use_locks = 1;
+	grep_use_locks = GREP_USE_ALL_LOCKS;
 
 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
diff --git a/grep.c b/grep.c
index cd952ef5d3..3aca0db435 100644
--- a/grep.c
+++ b/grep.c
@@ -1523,13 +1523,13 @@ pthread_mutex_t grep_attr_mutex;
 
 static inline void grep_attr_lock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_ATTR_LOCK)
 		pthread_mutex_lock(&grep_attr_mutex);
 }
 
 static inline void grep_attr_unlock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_ATTR_LOCK)
 		pthread_mutex_unlock(&grep_attr_mutex);
 }
 
diff --git a/grep.h b/grep.h
index 1875880f37..02bffacfa2 100644
--- a/grep.h
+++ b/grep.h
@@ -229,6 +229,10 @@ int grep_source(struct grep_opt *opt, struct grep_source *gs);
 struct grep_opt *grep_opt_dup(const struct grep_opt *opt);
 int grep_threads_ok(const struct grep_opt *opt);
 
+#define GREP_USE_READ_LOCK (1 << 0)
+#define GREP_USE_ATTR_LOCK (1 << 1)
+#define GREP_USE_ALL_LOCKS (~0)
+
 /*
  * Mutex used around access to the attributes machinery if
  * opt->use_threads.  Must be initialized/destroyed by callers!
@@ -239,13 +243,13 @@ extern pthread_mutex_t grep_read_mutex;
 
 static inline void grep_read_lock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_READ_LOCK)
 		pthread_mutex_lock(&grep_read_mutex);
 }
 
 static inline void grep_read_unlock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_READ_LOCK)
 		pthread_mutex_unlock(&grep_read_mutex);
 }
 
-- 
2.22.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 2/4] grep: allow locks to be enabled individually Matheus Tavares
@ 2019-08-10 20:27 ` Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases Matheus Tavares
  3 siblings, 0 replies; 5+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, Brandon Williams,
	Ævar Arnfjörð Bjarmason, Junio C Hamano

git-grep uses 'grep_read_mutex' to protect some object reading
operations. But these have their own internal lock now, which ensure a
better performance (with more parallel regions). So, disable the former
when it's possible to use the latter, with enable_obj_read_lock().

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index a871bad8ad..fa51392222 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -205,7 +205,17 @@ static void start_threads(struct grep_opt *opt)
 	pthread_cond_init(&cond_add, NULL);
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
-	grep_use_locks = GREP_USE_ALL_LOCKS;
+	if (recurse_submodules || opt->allow_textconv) {
+		/*
+		 * textconv and submodules' operations are not thread-safe yet
+		 * so we must use grep_read_lock when grepping multithreaded
+		 * with these options.
+		 */
+		grep_use_locks = GREP_USE_ALL_LOCKS;
+	} else {
+		grep_use_locks = GREP_USE_ATTR_LOCK;
+		enable_obj_read_lock();
+	}
 
 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
@@ -227,7 +237,7 @@ static void start_threads(struct grep_opt *opt)
 	}
 }
 
-static int wait_all(void)
+static int wait_all(struct grep_opt *opt)
 {
 	int hit = 0;
 	int i;
@@ -263,6 +273,9 @@ static int wait_all(void)
 	pthread_cond_destroy(&cond_write);
 	pthread_cond_destroy(&cond_result);
 	grep_use_locks = 0;
+	if (!recurse_submodules && !opt->allow_textconv) {
+		disable_obj_read_lock();
+	}
 
 	return hit;
 }
@@ -1140,7 +1153,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	}
 
 	if (num_threads > 1)
-		hit |= wait_all();
+		hit |= wait_all(&opt);
 	if (hit && show_in_pager)
 		run_pager(&opt, prefix);
 	clear_pathspec(&pathspec);
-- 
2.22.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
                   ` (2 preceding siblings ...)
  2019-08-10 20:27 ` [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible Matheus Tavares
@ 2019-08-10 20:27 ` Matheus Tavares
  3 siblings, 0 replies; 5+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, Junio C Hamano, Jeff King,
	Brandon Williams, Manav Rathi

They were disabled at 53b8d93 ("grep: disable threading in non-worktree
case", 12-12-2011), due to observable performance drops. But now that
zlib inflation can be performed in parallel, for some of git-grep's
options, we can regain the speedup.

Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  17.3557s  | 20.8410s
    2    |   9.7170s  | 11.2415s
    8    |   6.1723s  |  6.9378s

These are all means of 30 executions after 2 warmup runs. All tests were
executed on an i7-7700HQ with 16GB of RAM and SSD. But to make sure the
optimization also performs well on HDD, the tests were repeated on an
AMD Turion 64 X2 TL-62 (dual-core) with 4GB of RAM and HDD (SATA-150,
5400 rpm):

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  40.3347s  |  47.6173s
    2    |  27.6547s  |  35.1797s

Unfortunately, textconv and submodules' operations remain thread-unsafe,
needing locks to be safely executed when threaded. Because of that, it's
not currently worthy to grep in parallel with them. So, when --textconv
or --recurse-submodules are given for a non-worktree case, threads are
kept disabled. In order to clarify this behavior, let's also add a
"NOTES" section to Documentation/git-grep.txt explaining the thread
usage details.

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 12 ++++++++++++
 builtin/grep.c             |  3 ++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 2d27969057..9686875fbc 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -330,6 +330,18 @@ EXAMPLES
 `git grep solution -- :^Documentation`::
 	Looks for `solution`, excluding files in `Documentation`.
 
+NOTES
+-----
+
+The --threads option (and grep.threads configuration) will be ignored when
+--open-files-in-pager is used, forcing a single-threaded execution.
+
+When grepping the index file (with --cached or giving tree objects), the
+following options will also suppress thread creation:
+
+	--recurse_submodules
+	--textconv
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/grep.c b/builtin/grep.c
index fa51392222..e5a9da471a 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1073,7 +1073,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	pathspec.recursive = 1;
 	pathspec.recurse_submodules = !!recurse_submodules;
 
-	if (list.nr || cached || show_in_pager) {
+	if (show_in_pager ||
+	   ((list.nr || cached) && (recurse_submodules || opt.allow_textconv))) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
 		num_threads = 1;
-- 
2.22.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, back to index

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 2/4] grep: allow locks to be enabled individually Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases Matheus Tavares

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox