git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [GSoC][RFC WIP PATCH 0/3] grep: allow parallelism in zlib inflation
@ 2019-07-19 23:08 Matheus Tavares
  2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 1/3] object-store: make read_object_file_extended() thread-safe Matheus Tavares
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Matheus Tavares @ 2019-07-19 23:08 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, kernel-usp

Threads were disabled in git-grep for non-worktree case at 53b8d93
("grep: disable threading in non-worktree case", 12-12-2011), due to
perfomance drops. 

To regain performance, this series work on the object reading code so
that zlib inflation may be performed in parallel. This is a good hotspot
for parallelism as, in some test cases[1], it accounts for up to 48% of
execution time. And besides that, inflation tasks are already
independent from one another.

Grep'ing 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[2], I got:

     Threads |   Regex 1  |  Regex 2
    ---------|------------|-----------
        1    |  17.5815s  | 21.7217s
        2    |   9.7983s  | 11.3965s
        8    |   6.3097s  |  6.9667s
    
These are all means of 30 executions after 2 warmup runs. All tests were
executed on a i7-7700HQ with 16GB of RAM and SSD. The output was also
validated against current git-grep.

I still want to repeat the test in an HDD machine and in a repo with
mainly loose objects.

There're still some open issues thought:

1) To allow parallel inflation, the `obj_read_mutex` is released at
`unpack_compressed_entry()` and `get_size_from_delta()` right before
calling `git_inflate()` (and re-acquired right after). For git-grep it
seems to be OK. But, besides git-grep, if it's wanted to read objects
and perform other operations in parallel, we probably cannot guarantee
that the lock will be held by the current thread when it gets to those
two functions. Also, as the lock does only protect this call tree, if
those other operations access the same global states through other
paths, we may also have race conditions.

2) git-grep operations on textconv and submodules are still
unprotected and should result in race conditions. I'm not sure if
having two separate mutexes for them would fix it as the latter also has
to access the object store.

Any comments on the series and open issues will be highly appreciated.

[1]: https://matheustavares.gitlab.io/posts/week-6-working-at-zlib-inflation#multithreading-zlib-inflation 
[2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

travis build: https://travis-ci.org/matheustavares/git/builds/561200398


Matheus Tavares (3):
  object-store: make read_object_file_extended() thread-safe
  grep: remove locks on object reading operations
  grep: re-enable threads in non-worktree case

 builtin/grep.c | 20 ++++--------------
 grep.c         |  3 ---
 object-store.h |  4 ++++
 packfile.c     |  7 +++++++
 sha1-file.c    | 56 +++++++++++++++++++++++++++++++++++++++++++++-----
 5 files changed, 66 insertions(+), 24 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [GSoC][RFC WIP PATCH 1/3] object-store: make read_object_file_extended() thread-safe
  2019-07-19 23:08 [GSoC][RFC WIP PATCH 0/3] grep: allow parallelism in zlib inflation Matheus Tavares
@ 2019-07-19 23:08 ` Matheus Tavares
  2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 2/3] grep: remove locks on object reading operations Matheus Tavares
  2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 3/3] grep: re-enable threads in non-worktree case Matheus Tavares
  2 siblings, 0 replies; 4+ messages in thread
From: Matheus Tavares @ 2019-07-19 23:08 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, kernel-usp, Stefan Beller,
	Jeff King, Denton Liu, brian m. carlson, Junio C Hamano

Allow read_object_file_extended() to be called by multiple threads
protecting it with a lock. The lock usage can be toggled with
enable_obj_read_lock() and disable_obj_read_lock().

Probably there are many spots in read_object_file_extended()'s call
chain that could be executed unlocked (and thus, in parallel). But, just
to make sure, let's protect everthing for now, and go refining the lock
step-by-step in the future.

The only (and perhaps most important) exception is git_inflate. Since it
is already thread-safe and takes a significant amount of time, the lock
is released when entering this function, so that it can be performed in
parallel. This should already bring good performance, because of
inflation's time cost.

Note that add_delta_base_cache() was also modified to skip adding
already present cache entries. This wouldn't happen in the past, but now
it's possible as phase I and phase III of unpack_entry() may execute
concurrently.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 object-store.h |  4 ++++
 packfile.c     |  7 +++++++
 sha1-file.c    | 56 +++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/object-store.h b/object-store.h
index 49f56ab8d9..8330ff3988 100644
--- a/object-store.h
+++ b/object-store.h
@@ -157,6 +157,10 @@ const char *loose_object_path(struct repository *r, struct strbuf *buf,
 void *map_loose_object(struct repository *r, const struct object_id *oid,
 		       unsigned long *size);
 
+void enable_obj_read_lock(void);
+void disable_obj_read_lock(void);
+void obj_read_lock(void);
+void obj_read_unlock(void);
 void *read_object_file_extended(struct repository *r,
 				const struct object_id *oid,
 				enum object_type *type,
diff --git a/packfile.c b/packfile.c
index c0d83fdfed..a50be4e5e5 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1087,7 +1087,9 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1440,6 +1442,9 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
 	struct list_head *lru, *tmp;
 
+	if (get_delta_base_cache_entry(p, base_offset))
+		return;
+
 	delta_base_cached += base_size;
 
 	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
@@ -1569,7 +1574,9 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		if (!stream.avail_out)
 			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
diff --git a/sha1-file.c b/sha1-file.c
index 888b6024d5..37cde4a494 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1463,16 +1463,49 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type,
 	return 0;
 }
 
+static pthread_mutex_t obj_read_mutex;
+static int obj_read_use_lock = 0;
+
+void enable_obj_read_lock(void)
+{
+	if (obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 1;
+	pthread_mutex_init(&obj_read_mutex, NULL);
+}
+
+void disable_obj_read_lock(void)
+{
+	if (!obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 0;
+	pthread_mutex_destroy(&obj_read_mutex);
+}
+
+void obj_read_lock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_lock(&obj_read_mutex);
+}
+
+void obj_read_unlock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_unlock(&obj_read_mutex);
+}
+
 /*
  * This function dies on corrupt objects; the callers who want to
  * deal with them should arrange to call read_object() and give error
  * messages themselves.
  */
-void *read_object_file_extended(struct repository *r,
-				const struct object_id *oid,
-				enum object_type *type,
-				unsigned long *size,
-				int lookup_replace)
+static void *do_read_object_file_extended(struct repository *r,
+					  const struct object_id *oid,
+					  enum object_type *type,
+					  unsigned long *size,
+					  int lookup_replace)
 {
 	void *data;
 	const struct packed_git *p;
@@ -1505,6 +1538,19 @@ void *read_object_file_extended(struct repository *r,
 	return NULL;
 }
 
+void *read_object_file_extended(struct repository *r,
+				const struct object_id *oid,
+				enum object_type *type,
+				unsigned long *size,
+				int lookup_replace)
+{	
+	void *data;
+	obj_read_lock();
+	data = do_read_object_file_extended(r, oid, type, size, lookup_replace);
+	obj_read_unlock();
+	return data;
+}
+
 void *read_object_with_reference(const struct object_id *oid,
 				 const char *required_type_name,
 				 unsigned long *size,
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [GSoC][RFC WIP PATCH 2/3] grep: remove locks on object reading operations
  2019-07-19 23:08 [GSoC][RFC WIP PATCH 0/3] grep: allow parallelism in zlib inflation Matheus Tavares
  2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 1/3] object-store: make read_object_file_extended() thread-safe Matheus Tavares
@ 2019-07-19 23:08 ` Matheus Tavares
  2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 3/3] grep: re-enable threads in non-worktree case Matheus Tavares
  2 siblings, 0 replies; 4+ messages in thread
From: Matheus Tavares @ 2019-07-19 23:08 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, kernel-usp, Junio C Hamano,
	Stefan Beller, brian m. carlson, Brandon Williams, Jeff King

git-grep has locks around the object reading operations it performs. But
these have an internal lock now, which can be enabled through
enable_obj_read_lock(). So let's use it and drop git-grep's ones.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 18 +++---------------
 grep.c         |  3 ---
 2 files changed, 3 insertions(+), 18 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 580fd38f41..682e2461d0 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -206,6 +206,7 @@ static void start_threads(struct grep_opt *opt)
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
 	grep_use_locks = 1;
+	enable_obj_read_lock();
 
 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
@@ -263,6 +264,7 @@ static int wait_all(void)
 	pthread_cond_destroy(&cond_write);
 	pthread_cond_destroy(&cond_result);
 	grep_use_locks = 0;
+	disable_obj_read_lock();
 
 	return hit;
 }
@@ -295,16 +297,6 @@ static int grep_cmd_config(const char *var, const char *value, void *cb)
 	return st;
 }
 
-static void *lock_and_read_oid_file(const struct object_id *oid, enum object_type *type, unsigned long *size)
-{
-	void *data;
-
-	grep_read_lock();
-	data = read_object_file(oid, type, size);
-	grep_read_unlock();
-	return data;
-}
-
 static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
 		     const char *filename, int tree_name_len,
 		     const char *path)
@@ -457,10 +449,8 @@ static int grep_submodule(struct grep_opt *opt,
 
 		object = parse_object_or_die(oid, oid_to_hex(oid));
 
-		grep_read_lock();
 		data = read_object_with_reference(&object->oid, tree_type,
 						  &size, NULL);
-		grep_read_unlock();
 
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&object->oid));
@@ -585,7 +575,7 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 			void *data;
 			unsigned long size;
 
-			data = lock_and_read_oid_file(&entry.oid, &type, &size);
+			data = read_object_file(&entry.oid, &type, &size);
 			if (!data)
 				die(_("unable to read tree (%s)"),
 				    oid_to_hex(&entry.oid));
@@ -622,10 +612,8 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		struct strbuf base;
 		int hit, len;
 
-		grep_read_lock();
 		data = read_object_with_reference(&obj->oid, tree_type,
 						  &size, NULL);
-		grep_read_unlock();
 
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
diff --git a/grep.c b/grep.c
index f7c3a5803e..984a9e99cf 100644
--- a/grep.c
+++ b/grep.c
@@ -2109,10 +2109,7 @@ static int grep_source_load_oid(struct grep_source *gs)
 {
 	enum object_type type;
 
-	grep_read_lock();
 	gs->buf = read_object_file(gs->identifier, &type, &gs->size);
-	grep_read_unlock();
-
 	if (!gs->buf)
 		return error(_("'%s': unable to read %s"),
 			     gs->name,
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [GSoC][RFC WIP PATCH 3/3] grep: re-enable threads in non-worktree case
  2019-07-19 23:08 [GSoC][RFC WIP PATCH 0/3] grep: allow parallelism in zlib inflation Matheus Tavares
  2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 1/3] object-store: make read_object_file_extended() thread-safe Matheus Tavares
  2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 2/3] grep: remove locks on object reading operations Matheus Tavares
@ 2019-07-19 23:08 ` Matheus Tavares
  2 siblings, 0 replies; 4+ messages in thread
From: Matheus Tavares @ 2019-07-19 23:08 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya,
	Nguyễn Thái Ngọc Duy, kernel-usp, Jeff King,
	Junio C Hamano, Brandon Williams

They were disabled at 53b8d93 ("grep: disable threading in non-worktree
case", 12-12-2011), due to observable perfomance drops. But now that
zlib inflation can be performed in parallel, we can regain the speedup.

Grep'ing 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  17.5815s  | 21.7217s
    2    |   9.7983s  | 11.3965s
    8    |   6.3097s  |  6.9667s

These are all means of 30 executions after 2 warmup runs. All tests were
executed on a i7-7700HQ with 16GB of RAM and SSD.

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 682e2461d0..9309dea833 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1046,7 +1046,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	pathspec.recursive = 1;
 	pathspec.recurse_submodules = !!recurse_submodules;
 
-	if (list.nr || cached || show_in_pager) {
+	if (show_in_pager) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
 		num_threads = 1;
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-07-19 23:09 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-19 23:08 [GSoC][RFC WIP PATCH 0/3] grep: allow parallelism in zlib inflation Matheus Tavares
2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 1/3] object-store: make read_object_file_extended() thread-safe Matheus Tavares
2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 2/3] grep: remove locks on object reading operations Matheus Tavares
2019-07-19 23:08 ` [GSoC][RFC WIP PATCH 3/3] grep: re-enable threads in non-worktree case Matheus Tavares

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).