git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update"
@ 2015-09-17  1:38 Stefan Beller
  2015-09-17  1:38 ` [PATCH 01/10] strbuf: Add strbuf_read_noblock Stefan Beller
                   ` (10 more replies)
  0 siblings, 11 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:38 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

> I didn't say this in the previous round because it smelled like an
> RFC, but for a real submission, 2/2 may be doing too many things at
> once.  I suspect this is more or less "taste" thing, so I won't mind
> too much as long as the reviewers are OK with it.

Patch 2/2 is now broken up into the first five patches.

There are only 2 minor changes:
* If a number of tasks <= 0 is specified, use the number of cpus instead.
* Renamed the command line option in test-run-command.c to "run-command-parallel-4"
  as the 4 is hardcoded there.
  
Patches 6 and 7 are small cleanups (patch 6 should add a test in a reroll).

Patches 8, 9, and 10 are a preview of how I want to proceed in the near future:
after these functions are split out, we can add another patch on top which
rewrites the short main loop of cmd_update in C inside submodule--helper,
running the submodule updates in parallel.
It took me a while to see how to realize the parallelism with the parallel
run-command structure (as opposed to the thread pool I proposed earlier),
but I think it will be straightforward from here.

Stefan

Stefan Beller (10):
  strbuf: Add strbuf_read_noblock
  run-command: factor out return value computation
  run-command: add an asynchronous parallel child processor
  fetch_populated_submodules: use new parallel job processing
  submodules: Allow parallel fetching, add tests and documentation
  git submodule update: Redirect any output to stderr
  git submodule update: pass --prefix only with a non empty prefix
  git submodule update: cmd_update_recursive
  git submodule update: cmd_update_recursive
  git submodule update: cmd_update_fetch

 Documentation/fetch-options.txt |   7 +
 builtin/fetch.c                 |   6 +-
 builtin/pull.c                  |   6 +
 git-submodule.sh                | 242 ++++++++++++++++++----------------
 run-command.c                   | 281 ++++++++++++++++++++++++++++++++++++----
 run-command.h                   |  36 +++++
 strbuf.c                        |  25 +++-
 strbuf.h                        |   6 +
 submodule.c                     | 119 ++++++++++++-----
 submodule.h                     |   2 +-
 t/t0061-run-command.sh          |  20 +++
 t/t5526-fetch-submodules.sh     |  19 +++
 test-run-command.c              |  24 ++++
 13 files changed, 620 insertions(+), 173 deletions(-)

-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
@ 2015-09-17  1:38 ` Stefan Beller
  2015-09-17 16:13   ` Junio C Hamano
  2015-09-17  1:39 ` [PATCH 02/10] run-command: factor out return value computation Stefan Beller
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:38 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

A later patch will need to read from pipes without blocking.
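
For illustration, a caller would look roughly like this (the helper
function, the fd and its setup here are made up; the real caller only
arrives with the parallel child processor later in this series):

#include "git-compat-util.h"
#include "strbuf.h"

/* Hypothetical: grab whatever is currently available from fd. */
static void drain_available(struct strbuf *out, int fd)
{
	int flags = fcntl(fd, F_GETFL);

	/* strbuf_read_noblock() requires a non-blocking fd */
	if (flags >= 0)
		fcntl(fd, F_SETFL, flags | O_NONBLOCK);

	/*
	 * Returns as soon as read() would block, instead of
	 * spinning like strbuf_read() does on EAGAIN.
	 */
	strbuf_read_noblock(out, fd, 0);
}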

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 strbuf.c | 25 +++++++++++++++++++++++--
 strbuf.h |  6 ++++++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index cce5eed..4130ee2 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -357,7 +357,10 @@ size_t strbuf_fread(struct strbuf *sb, size_t size, FILE *f)
 	return res;
 }
 
-ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
+#define IGNORE_EAGAIN (1)
+
+static ssize_t strbuf_read_internal(struct strbuf *sb, int fd,
+				    size_t hint, int flags)
 {
 	size_t oldlen = sb->len;
 	size_t oldalloc = sb->alloc;
@@ -366,8 +369,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
 	for (;;) {
 		ssize_t cnt;
 
-		cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
+		cnt = read(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
 		if (cnt < 0) {
+			if (errno == EINTR)
+				continue;
+			if (errno == EAGAIN) {
+				if (flags & IGNORE_EAGAIN)
+					break;
+				else
+					continue;
+			}
 			if (oldalloc == 0)
 				strbuf_release(sb);
 			else
@@ -384,6 +395,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
 	return sb->len - oldlen;
 }
 
+ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
+{
+	return strbuf_read_internal(sb, fd, hint, 0);
+}
+
+ssize_t strbuf_read_noblock(struct strbuf *sb, int fd, size_t hint)
+{
+	return strbuf_read_internal(sb, fd, hint, IGNORE_EAGAIN);
+}
+
 #define STRBUF_MAXLINK (2*PATH_MAX)
 
 int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
diff --git a/strbuf.h b/strbuf.h
index aef2794..23ca7aa 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -367,6 +367,12 @@ extern size_t strbuf_fread(struct strbuf *, size_t, FILE *);
 extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
 
 /**
+ * Same as strbuf_read, just returns non-blockingly by ignoring EAGAIN.
+ * The fd must have set O_NONBLOCK.
+ */
+extern ssize_t strbuf_read_noblock(struct strbuf *, int fd, size_t hint);
+
+/**
  * Read the contents of a file, specified by its path. The third argument
  * can be used to give a hint about the file size, to avoid reallocs.
  */
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 02/10] run-command: factor out return value computation
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
  2015-09-17  1:38 ` [PATCH 01/10] strbuf: Add strbuf_read_noblock Stefan Beller
@ 2015-09-17  1:39 ` Stefan Beller
  2015-09-17 10:33   ` Jeff King
  2015-09-17  1:39 ` [PATCH 03/10] run-command: add an asynchronous parallel child processor Stefan Beller
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:39 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

A later patch will need to compute the return value without doing the wait.
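
For context, the later call site will be shaped roughly like this sketch
(written as if it sat next to determine_return_value() in run-command.c;
the "child" label is only a stand-in):

/* Sketch: reap one already-exited child without blocking. */
static void reap_one_child(void)
{
	int status, code = -1, err = 0;
	pid_t pid = waitpid(-1, &status, WNOHANG);

	if (pid <= 0)
		return;	/* nothing has exited yet */

	/* a non-zero return means the wait status made no sense */
	if (determine_return_value(status, &code, &err, "child"))
		error("waitpid is confused (%s)", "child");
}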

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 run-command.c | 54 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/run-command.c b/run-command.c
index 28e1d55..c892e9a 100644
--- a/run-command.c
+++ b/run-command.c
@@ -232,6 +232,35 @@ static inline void set_cloexec(int fd)
 		fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
 }
 
+static int determine_return_value(int wait_status,
+				  int *result,
+				  int *error_code,
+				  const char *argv0)
+{
+	if (WIFSIGNALED(wait_status)) {
+		*result = WTERMSIG(wait_status);
+		if (*result != SIGINT && *result != SIGQUIT)
+			error("%s died of signal %d", argv0, *result);
+		/*
+		 * This return value is chosen so that code & 0xff
+		 * mimics the exit code that a POSIX shell would report for
+		 * a program that died from this signal.
+		 */
+		*result += 128;
+	} else if (WIFEXITED(wait_status)) {
+		*result = WEXITSTATUS(wait_status);
+		/*
+		 * Convert special exit code when execvp failed.
+		 */
+		if (*result == 127) {
+			*result = -1;
+			*error_code = ENOENT;
+		}
+	} else
+		return 1;
+	return 0;
+}
+
 static int wait_or_whine(pid_t pid, const char *argv0)
 {
 	int status, code = -1;
@@ -244,29 +273,10 @@ static int wait_or_whine(pid_t pid, const char *argv0)
 	if (waiting < 0) {
 		failed_errno = errno;
 		error("waitpid for %s failed: %s", argv0, strerror(errno));
-	} else if (waiting != pid) {
-		error("waitpid is confused (%s)", argv0);
-	} else if (WIFSIGNALED(status)) {
-		code = WTERMSIG(status);
-		if (code != SIGINT && code != SIGQUIT)
-			error("%s died of signal %d", argv0, code);
-		/*
-		 * This return value is chosen so that code & 0xff
-		 * mimics the exit code that a POSIX shell would report for
-		 * a program that died from this signal.
-		 */
-		code += 128;
-	} else if (WIFEXITED(status)) {
-		code = WEXITSTATUS(status);
-		/*
-		 * Convert special exit code when execvp failed.
-		 */
-		if (code == 127) {
-			code = -1;
-			failed_errno = ENOENT;
-		}
 	} else {
-		error("waitpid is confused (%s)", argv0);
+		if (waiting != pid
+		   || (determine_return_value(status, &code, &failed_errno, argv0) < 0))
+			error("waitpid is confused (%s)", argv0);
 	}
 
 	clear_child_for_cleanup(pid);
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 03/10] run-command: add an asynchronous parallel child processor
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
  2015-09-17  1:38 ` [PATCH 01/10] strbuf: Add strbuf_read_noblock Stefan Beller
  2015-09-17  1:39 ` [PATCH 02/10] run-command: factor out return value computation Stefan Beller
@ 2015-09-17  1:39 ` Stefan Beller
  2015-09-17 21:44   ` Junio C Hamano
  2015-09-17  1:39 ` [PATCH 04/10] fetch_populated_submodules: use new parallel job processing Stefan Beller
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:39 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

This allows running external commands in parallel with ordered output
on stderr.

If we run external commands in parallel we cannot pipe their output directly
to our stdout/stderr, as it would get mixed up. So each process's output
flows through a pipe, which we buffer. One subprocess at a time can be piped
directly to our stdout/stderr for low-latency feedback to the user.

Example:
Let's assume we have 5 submodules A, B, C, D and E, and each fetch takes a
different amount of time because the submodules vary in size. Then the
output of sequential fetches might look like this:

 time -->
 output: |---A---|   |-B-|   |----C-----------|   |-D-|   |-E-|

When we schedule these submodules into maximal two parallel processes,
a schedule and sample output over time may look like this:

thread 1: |---A---|   |-D-|   |-E-|

thread 2: |-B-|   |----C-----------|

output:   |---A---|B|------C-------|DE

So A will be perceived as if it ran normally in the single-child version.
As B has already finished by the time A is done, we can dump its whole
progress buffer to stderr, so it looks as if it finished in no time. Once
that is done, C is determined to be the visible child and its progress
is reported in real time.

This way of displaying output works well for human consumption, as it
only changes the timing, not the actual output.

For machine consumption the output needs to be prepared by the tasks
themselves, e.g. by adding a prefix per line or per block to indicate
which task's output is being shown.
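
To make the interface concrete, a caller would look roughly like the
following sketch (the task structure, callbacks and command are made up
for illustration; the real users are test-run-command.c below and the
submodule code later in this series):

struct example_tasks {
	int next, total;
};

static int example_next_task(void *data, struct child_process *cp,
			     struct strbuf *err)
{
	struct example_tasks *t = data;

	if (t->next >= t->total)
		return 1;	/* no more tasks to start */

	argv_array_pushl(&cp->args, "echo", "hello from a child", NULL);
	cp->no_stdin = 1;
	cp->stdout_to_stderr = 1;	/* required by the API */
	cp->err = -1;			/* pipe stderr so it can be buffered */
	strbuf_addf(err, "Starting task %d\n", t->next++);
	return 0;
}

static void example_start_failed(void *data, struct child_process *cp,
				 struct strbuf *err)
{
	/* e.g. record the failure in the task data */
}

static void example_task_finished(void *data, struct child_process *cp,
				  int result)
{
	/* e.g. remember a non-zero result for the final exit code */
}

static int example_run(void)
{
	struct example_tasks tasks = { 0, 5 };

	return run_processes_parallel(2, &tasks, example_next_task,
				      example_start_failed,
				      example_task_finished);
}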

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 run-command.c          | 228 +++++++++++++++++++++++++++++++++++++++++++++++++
 run-command.h          |  36 ++++++++
 t/t0061-run-command.sh |  20 +++++
 test-run-command.c     |  24 ++++++
 4 files changed, 308 insertions(+)

diff --git a/run-command.c b/run-command.c
index c892e9a..3af97ab 100644
--- a/run-command.c
+++ b/run-command.c
@@ -3,6 +3,7 @@
 #include "exec_cmd.h"
 #include "sigchain.h"
 #include "argv-array.h"
+#include "thread-utils.h"
 
 void child_process_init(struct child_process *child)
 {
@@ -862,3 +863,230 @@ int capture_command(struct child_process *cmd, struct strbuf *buf, size_t hint)
 	close(cmd->out);
 	return finish_command(cmd);
 }
+
+struct parallel_processes {
+	int max_number_processes;
+	void *data;
+	get_next_task fn;
+	handle_child_starting_failure fn_err;
+	handle_child_return_value fn_exit;
+
+	int nr_processes;
+	int all_tasks_started;
+	int foreground_child;
+	char *slots;
+	struct child_process *children;
+	struct pollfd *pfd;
+	struct strbuf *err;
+	struct strbuf finished_children;
+};
+
+static void run_processes_parallel_init(struct parallel_processes *pp,
+					int n, void *data,
+					get_next_task fn,
+					handle_child_starting_failure fn_err,
+					handle_child_return_value fn_exit)
+{
+	int i;
+
+	if (n < 1)
+		n = online_cpus();
+
+	pp->max_number_processes = n;
+	pp->data = data;
+	pp->fn = fn;
+	pp->fn_err = fn_err;
+	pp->fn_exit = fn_exit;
+
+	pp->nr_processes = 0;
+	pp->all_tasks_started = 0;
+	pp->foreground_child = 0;
+	pp->slots = xcalloc(n, sizeof(*pp->slots));
+	pp->children = xcalloc(n, sizeof(*pp->children));
+	pp->pfd = xcalloc(n, sizeof(*pp->pfd));
+	pp->err = xcalloc(n, sizeof(*pp->err));
+	strbuf_init(&pp->finished_children, 0);
+
+	for (i = 0; i < n; i++) {
+		strbuf_init(&pp->err[i], 0);
+		pp->pfd[i].events = POLLIN;
+		pp->pfd[i].fd = -1;
+	}
+}
+
+static void run_processes_parallel_cleanup(struct parallel_processes *pp)
+{
+	int i;
+	for (i = 0; i < pp->max_number_processes; i++)
+		strbuf_release(&pp->err[i]);
+
+	free(pp->children);
+	free(pp->slots);
+	free(pp->pfd);
+	free(pp->err);
+	strbuf_release(&pp->finished_children);
+}
+
+static void unblock_fd(int fd)
+{
+	int flags = fcntl(fd, F_GETFL);
+	if (flags < 0) {
+		warning("Could not get file status flags, "
+			"output will be degraded");
+		return;
+	}
+	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK)) {
+			warning("Could not set file status flags, "
+			"output will be degraded");
+		return;
+	}
+}
+
+static void run_processes_parallel_start_new(struct parallel_processes *pp)
+{
+	int i;
+	/* Start new processes. */
+	while (!pp->all_tasks_started
+	       && pp->nr_processes < pp->max_number_processes) {
+		for (i = 0; i < pp->max_number_processes; i++)
+			if (!pp->slots[i])
+				break; /* found an empty slot */
+		if (i == pp->max_number_processes)
+			die("BUG: bookkeeping is hard");
+
+		if (pp->fn(pp->data, &pp->children[i], &pp->err[i])) {
+			pp->all_tasks_started = 1;
+			break;
+		}
+		if (start_command(&pp->children[i]))
+			pp->fn_err(pp->data, &pp->children[i], &pp->err[i]);
+
+		unblock_fd(pp->children[i].err);
+
+		pp->nr_processes++;
+		pp->slots[i] = 1;
+		pp->pfd[i].fd = pp->children[i].err;
+	}
+}
+
+static int run_processes_parallel_buffer_stderr(struct parallel_processes *pp)
+{
+	int i;
+	i = poll(pp->pfd, pp->max_number_processes, 100);
+	if (i < 0) {
+		if (errno == EINTR)
+			/* A signal was caught; try again */
+			return -1;
+		else {
+			run_processes_parallel_cleanup(pp);
+			die_errno("poll");
+		}
+	}
+
+	/* Buffer output from all pipes. */
+	for (i = 0; i < pp->max_number_processes; i++) {
+		if (!pp->slots[i])
+			continue;
+		if (pp->pfd[i].revents & POLLIN)
+			strbuf_read_noblock(&pp->err[i], pp->children[i].err, 0);
+		if (pp->foreground_child == i) {
+			fputs(pp->err[i].buf, stderr);
+			strbuf_reset(&pp->err[i]);
+		}
+	}
+	return 0;
+}
+
+
+static void run_processes_parallel_collect_finished(struct parallel_processes *pp)
+{
+	int i = 0;
+	pid_t pid;
+	int wait_status, code;
+	int n = pp->max_number_processes;
+	/* Collect finished child processes. */
+	while (pp->nr_processes > 0) {
+		pid = waitpid(-1, &wait_status, WNOHANG);
+		if (pid == 0)
+			return; /* no child finished */
+
+		if (pid < 0) {
+			if (errno == EINTR)
+				return; /* just try again  next time */
+			if (errno == EINVAL || errno == ECHILD)
+				die_errno("wait");
+		} else {
+			/* Find the finished child. */
+			for (i = 0; i < pp->max_number_processes; i++)
+				if (pp->slots[i] && pid == pp->children[i].pid)
+					break;
+			if (i == pp->max_number_processes)
+				/*
+				 * waitpid returned another process id
+				 * which we are not waiting for.
+				 */
+				return;
+		}
+		strbuf_read_noblock(&pp->err[i], pp->children[i].err, 0);
+
+		if (determine_return_value(wait_status, &code, &errno,
+					   pp->children[i].argv[0]) < 0)
+			error("waitpid is confused (%s)",
+			      pp->children[i].argv[0]);
+
+		pp->fn_exit(pp->data, &pp->children[i], code);
+
+		argv_array_clear(&pp->children[i].args);
+		argv_array_clear(&pp->children[i].env_array);
+
+		pp->nr_processes--;
+		pp->slots[i] = 0;
+		pp->pfd[i].fd = -1;
+
+		if (i != pp->foreground_child) {
+			strbuf_addbuf(&pp->finished_children, &pp->err[i]);
+			strbuf_reset(&pp->err[i]);
+		} else {
+			fputs(pp->err[i].buf, stderr);
+			strbuf_reset(&pp->err[i]);
+
+			/* Output all other finished child processes */
+			fputs(pp->finished_children.buf, stderr);
+			strbuf_reset(&pp->finished_children);
+
+			/*
+			 * Pick next process to output live.
+			 * NEEDSWORK:
+			 * For now we pick it randomly by doing a round
+			 * robin. Later we may want to pick the one with
+			 * the most output or the longest or shortest
+			 * running process time.
+			 */
+			for (i = 0; i < n; i++)
+				if (pp->slots[(pp->foreground_child + i) % n])
+					break;
+			pp->foreground_child = (pp->foreground_child + i) % n;
+			fputs(pp->err[pp->foreground_child].buf, stderr);
+			strbuf_reset(&pp->err[pp->foreground_child]);
+		}
+	}
+}
+
+int run_processes_parallel(int n, void *data,
+			   get_next_task fn,
+			   handle_child_starting_failure fn_err,
+			   handle_child_return_value fn_exit)
+{
+	struct parallel_processes pp;
+	run_processes_parallel_init(&pp, n, data, fn, fn_err, fn_exit);
+
+	while (!pp.all_tasks_started || pp.nr_processes > 0) {
+		run_processes_parallel_start_new(&pp);
+		if (run_processes_parallel_buffer_stderr(&pp))
+			continue;
+		run_processes_parallel_collect_finished(&pp);
+	}
+	run_processes_parallel_cleanup(&pp);
+
+	return 0;
+}
diff --git a/run-command.h b/run-command.h
index 5b4425a..0487f71 100644
--- a/run-command.h
+++ b/run-command.h
@@ -119,4 +119,40 @@ struct async {
 int start_async(struct async *async);
 int finish_async(struct async *async);
 
+/**
+ * This callback should initialize the child process and preload the
+ * error channel. The preloading is useful if you want to have a message
+ * printed directly before the output of the child process.
+ * You MUST set stdout_to_stderr.
+ *
+ * Return 0 if the next child is ready to run.
+ * Return != 0 if there are no more tasks to be processed.
+ */
+typedef int (*get_next_task)(void *data,
+			     struct child_process *cp,
+			     struct strbuf *err);
+
+typedef void (*handle_child_starting_failure)(void *data,
+					      struct child_process *cp,
+					      struct strbuf *err);
+
+typedef void (*handle_child_return_value)(void *data,
+					  struct child_process *cp,
+					  int result);
+
+/**
+ * Runs up to n processes at the same time. Whenever a process can be
+ * started, the callback `get_next_task` is called to obtain the data
+ * fed to the child process.
+ *
+ * The children started via this function run in parallel and their output
+ * to stderr is buffered, while one of the children will directly output
+ * to stderr.
+ */
+
+int run_processes_parallel(int n, void *data,
+			   get_next_task fn,
+			   handle_child_starting_failure,
+			   handle_child_return_value);
+
 #endif
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 9acf628..49aa3db 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -47,4 +47,24 @@ test_expect_success POSIXPERM,SANITY 'unreadable directory in PATH' '
 	test_cmp expect actual
 '
 
+cat >expect <<-EOF
+preloaded output of a child
+Hello
+World
+preloaded output of a child
+Hello
+World
+preloaded output of a child
+Hello
+World
+preloaded output of a child
+Hello
+World
+EOF
+
+test_expect_success 'run_command runs in parallel' '
+	test-run-command run-command-parallel-4 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/test-run-command.c b/test-run-command.c
index 89c7de2..70b6c7a 100644
--- a/test-run-command.c
+++ b/test-run-command.c
@@ -10,9 +10,29 @@
 
 #include "git-compat-util.h"
 #include "run-command.h"
+#include "argv-array.h"
+#include "strbuf.h"
 #include <string.h>
 #include <errno.h>
 
+static int number_callbacks;
+int parallel_next(void *data,
+		  struct child_process *cp,
+		  struct strbuf *err)
+{
+	struct child_process *d = data;
+	if (number_callbacks >= 4)
+		return 1;
+
+	argv_array_pushv(&cp->args, d->argv);
+	cp->stdout_to_stderr = 1;
+	cp->no_stdin = 1;
+	cp->err = -1;
+	strbuf_addf(err, "preloaded output of a child\n");
+	number_callbacks++;
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	struct child_process proc = CHILD_PROCESS_INIT;
@@ -30,6 +50,10 @@ int main(int argc, char **argv)
 	if (!strcmp(argv[1], "run-command"))
 		exit(run_command(&proc));
 
+	if (!strcmp(argv[1], "run-command-parallel-4"))
+		exit(run_processes_parallel(4, &proc, parallel_next,
+					 NULL, NULL));
+
 	fprintf(stderr, "check usage\n");
 	return 1;
 }
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 04/10] fetch_populated_submodules: use new parallel job processing
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
                   ` (2 preceding siblings ...)
  2015-09-17  1:39 ` [PATCH 03/10] run-command: add an asynchronous parallel child processor Stefan Beller
@ 2015-09-17  1:39 ` Stefan Beller
  2015-09-17  1:39 ` [PATCH 05/10] submodules: Allow parallel fetching, add tests and documentation Stefan Beller
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:39 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

A later patch enables parallel processing of submodules; this one only adds
the machinery for it, so it should not change any user-facing behavior.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/fetch.c |   3 +-
 submodule.c     | 119 +++++++++++++++++++++++++++++++++++++++-----------------
 submodule.h     |   2 +-
 3 files changed, 87 insertions(+), 37 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ee1f1a9..6620ed0 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1217,7 +1217,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		result = fetch_populated_submodules(&options,
 						    submodule_prefix,
 						    recurse_submodules,
-						    verbosity < 0);
+						    verbosity < 0,
+						    0);
 		argv_array_clear(&options);
 	}
 
diff --git a/submodule.c b/submodule.c
index 1d64e57..a0e06e8 100644
--- a/submodule.c
+++ b/submodule.c
@@ -12,6 +12,7 @@
 #include "sha1-array.h"
 #include "argv-array.h"
 #include "blob.h"
+#include "thread-utils.h"
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static struct string_list changed_submodule_paths;
@@ -615,37 +616,79 @@ static void calculate_changed_submodule_paths(void)
 	initialized_fetch_ref_tips = 0;
 }
 
+struct submodule_parallel_fetch {
+	int count;
+	struct argv_array args;
+	const char *work_tree;
+	const char *prefix;
+	int command_line_option;
+	int quiet;
+	int result;
+};
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0}
+
+int get_next_submodule(void *data, struct child_process *cp,
+		       struct strbuf *err);
+
+void handle_submodule_fetch_start_err(void *data, struct child_process *cp, struct strbuf *err)
+{
+	struct submodule_parallel_fetch *spf = data;
+	spf->result = 1;
+}
+
+void handle_submodule_fetch_finish( void *data, struct child_process *cp, int retvalue)
+{
+	struct submodule_parallel_fetch *spf = data;
+
+	if (retvalue)
+		spf->result = 1;
+}
+
 int fetch_populated_submodules(const struct argv_array *options,
 			       const char *prefix, int command_line_option,
-			       int quiet)
+			       int quiet, int max_parallel_jobs)
 {
-	int i, result = 0;
-	struct child_process cp = CHILD_PROCESS_INIT;
-	struct argv_array argv = ARGV_ARRAY_INIT;
-	const char *work_tree = get_git_work_tree();
-	if (!work_tree)
+	int i;
+	struct submodule_parallel_fetch spf = SPF_INIT;
+	spf.work_tree = get_git_work_tree();
+	spf.command_line_option = command_line_option;
+	spf.quiet = quiet;
+	spf.prefix = prefix;
+	if (!spf.work_tree)
 		goto out;
 
 	if (read_cache() < 0)
 		die("index file corrupt");
 
-	argv_array_push(&argv, "fetch");
+	argv_array_push(&spf.args, "fetch");
 	for (i = 0; i < options->argc; i++)
-		argv_array_push(&argv, options->argv[i]);
-	argv_array_push(&argv, "--recurse-submodules-default");
+		argv_array_push(&spf.args, options->argv[i]);
+	argv_array_push(&spf.args, "--recurse-submodules-default");
 	/* default value, "--submodule-prefix" and its value are added later */
 
-	cp.env = local_repo_env;
-	cp.git_cmd = 1;
-	cp.no_stdin = 1;
-
 	calculate_changed_submodule_paths();
+	run_processes_parallel(max_parallel_jobs, &spf,
+			       get_next_submodule,
+			       handle_submodule_fetch_start_err,
+			       handle_submodule_fetch_finish);
+
+	argv_array_clear(&spf.args);
+out:
+	string_list_clear(&changed_submodule_paths, 1);
+	return spf.result;
+}
+
+int get_next_submodule(void *data, struct child_process *cp,
+		       struct strbuf *err)
+{
+	int ret = 0;
+	struct submodule_parallel_fetch *spf = data;
 
-	for (i = 0; i < active_nr; i++) {
+	for ( ; spf->count < active_nr; spf->count++) {
 		struct strbuf submodule_path = STRBUF_INIT;
 		struct strbuf submodule_git_dir = STRBUF_INIT;
 		struct strbuf submodule_prefix = STRBUF_INIT;
-		const struct cache_entry *ce = active_cache[i];
+		const struct cache_entry *ce = active_cache[spf->count];
 		const char *git_dir, *default_argv;
 		const struct submodule *submodule;
 
@@ -657,7 +700,7 @@ int fetch_populated_submodules(const struct argv_array *options,
 			submodule = submodule_from_name(null_sha1, ce->name);
 
 		default_argv = "yes";
-		if (command_line_option == RECURSE_SUBMODULES_DEFAULT) {
+		if (spf->command_line_option == RECURSE_SUBMODULES_DEFAULT) {
 			if (submodule &&
 			    submodule->fetch_recurse !=
 						RECURSE_SUBMODULES_NONE) {
@@ -680,40 +723,46 @@ int fetch_populated_submodules(const struct argv_array *options,
 					default_argv = "on-demand";
 				}
 			}
-		} else if (command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
+		} else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
 			if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
 				continue;
 			default_argv = "on-demand";
 		}
 
-		strbuf_addf(&submodule_path, "%s/%s", work_tree, ce->name);
+		strbuf_addf(&submodule_path, "%s/%s", spf->work_tree, ce->name);
 		strbuf_addf(&submodule_git_dir, "%s/.git", submodule_path.buf);
-		strbuf_addf(&submodule_prefix, "%s%s/", prefix, ce->name);
+		strbuf_addf(&submodule_prefix, "%s%s/", spf->prefix, ce->name);
 		git_dir = read_gitfile(submodule_git_dir.buf);
 		if (!git_dir)
 			git_dir = submodule_git_dir.buf;
 		if (is_directory(git_dir)) {
-			if (!quiet)
-				fprintf(stderr, "Fetching submodule %s%s\n", prefix, ce->name);
-			cp.dir = submodule_path.buf;
-			argv_array_push(&argv, default_argv);
-			argv_array_push(&argv, "--submodule-prefix");
-			argv_array_push(&argv, submodule_prefix.buf);
-			cp.argv = argv.argv;
-			if (run_command(&cp))
-				result = 1;
-			argv_array_pop(&argv);
-			argv_array_pop(&argv);
-			argv_array_pop(&argv);
+			child_process_init(cp);
+			cp->dir = strbuf_detach(&submodule_path, NULL);
+			cp->git_cmd = 1;
+			cp->no_stdout = 1;
+			cp->no_stdin = 1;
+			cp->stdout_to_stderr = 1;
+			cp->err = -1;
+			cp->env = local_repo_env;
+			if (!spf->quiet)
+				strbuf_addf(err, "Fetching submodule %s%s\n",
+					    spf->prefix, ce->name);
+			argv_array_init(&cp->args);
+			argv_array_pushv(&cp->args, spf->args.argv);
+			argv_array_push(&cp->args, default_argv);
+			argv_array_push(&cp->args, "--submodule-prefix");
+			argv_array_push(&cp->args, submodule_prefix.buf);
+			ret = 1;
 		}
 		strbuf_release(&submodule_path);
 		strbuf_release(&submodule_git_dir);
 		strbuf_release(&submodule_prefix);
+		if (ret) {
+			spf->count++;
+			return 0;
+		}
 	}
-	argv_array_clear(&argv);
-out:
-	string_list_clear(&changed_submodule_paths, 1);
-	return result;
+	return 1;
 }
 
 unsigned is_submodule_modified(const char *path, int ignore_untracked)
diff --git a/submodule.h b/submodule.h
index 5507c3d..cbc0003 100644
--- a/submodule.h
+++ b/submodule.h
@@ -31,7 +31,7 @@ void set_config_fetch_recurse_submodules(int value);
 void check_for_new_submodule_commits(unsigned char new_sha1[20]);
 int fetch_populated_submodules(const struct argv_array *options,
 			       const char *prefix, int command_line_option,
-			       int quiet);
+			       int quiet, int max_parallel_jobs);
 unsigned is_submodule_modified(const char *path, int ignore_untracked);
 int submodule_uses_gitfile(const char *path);
 int ok_to_remove_submodule(const char *path);
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 05/10] submodules: Allow parallel fetching, add tests and documentation
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
                   ` (3 preceding siblings ...)
  2015-09-17  1:39 ` [PATCH 04/10] fetch_populated_submodules: use new parallel job processing Stefan Beller
@ 2015-09-17  1:39 ` Stefan Beller
  2015-09-17  1:39 ` [PATCH 06/10] git submodule update: Redirect any output to stderr Stefan Beller
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:39 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/fetch-options.txt |  7 +++++++
 builtin/fetch.c                 |  5 ++++-
 builtin/pull.c                  |  6 ++++++
 t/t5526-fetch-submodules.sh     | 19 +++++++++++++++++++
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 45583d8..6b109f6 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -100,6 +100,13 @@ ifndef::git-pull[]
 	reference to a commit that isn't already in the local submodule
 	clone.
 
+-j::
+--jobs=<n>::
+	Number of parallel children to be used for fetching submodules.
+	Each will fetch from different submodules, such that fetching many
+	submodules will be faster. By default submodules will be fetched
+	one at a time.
+
 --no-recurse-submodules::
 	Disable recursive fetching of submodules (this has the same effect as
 	using the '--recurse-submodules=no' option).
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 6620ed0..f28eac6 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -37,6 +37,7 @@ static int prune = -1; /* unspecified */
 static int all, append, dry_run, force, keep, multiple, update_head_ok, verbosity;
 static int progress = -1, recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
 static int tags = TAGS_DEFAULT, unshallow, update_shallow;
+static int max_children = 1;
 static const char *depth;
 static const char *upload_pack;
 static struct strbuf default_rla = STRBUF_INIT;
@@ -99,6 +100,8 @@ static struct option builtin_fetch_options[] = {
 		    N_("fetch all tags and associated objects"), TAGS_SET),
 	OPT_SET_INT('n', NULL, &tags,
 		    N_("do not fetch all tags (--no-tags)"), TAGS_UNSET),
+	OPT_INTEGER('j', "jobs", &max_children,
+		    N_("number of submodules fetched in parallel")),
 	OPT_BOOL('p', "prune", &prune,
 		 N_("prune remote-tracking branches no longer on remote")),
 	{ OPTION_CALLBACK, 0, "recurse-submodules", NULL, N_("on-demand"),
@@ -1218,7 +1221,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 						    submodule_prefix,
 						    recurse_submodules,
 						    verbosity < 0,
-						    0);
+						    max_children);
 		argv_array_clear(&options);
 	}
 
diff --git a/builtin/pull.c b/builtin/pull.c
index 722a83c..f0af196 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -94,6 +94,7 @@ static int opt_force;
 static char *opt_tags;
 static char *opt_prune;
 static char *opt_recurse_submodules;
+static char *max_children;
 static int opt_dry_run;
 static char *opt_keep;
 static char *opt_depth;
@@ -177,6 +178,9 @@ static struct option pull_options[] = {
 		N_("on-demand"),
 		N_("control recursive fetching of submodules"),
 		PARSE_OPT_OPTARG),
+	OPT_PASSTHRU('j', "jobs", &max_children, N_("n"),
+		N_("number of submodules pulled in parallel"),
+		PARSE_OPT_OPTARG),
 	OPT_BOOL(0, "dry-run", &opt_dry_run,
 		N_("dry run")),
 	OPT_PASSTHRU('k', "keep", &opt_keep, NULL,
@@ -524,6 +528,8 @@ static int run_fetch(const char *repo, const char **refspecs)
 		argv_array_push(&args, opt_prune);
 	if (opt_recurse_submodules)
 		argv_array_push(&args, opt_recurse_submodules);
+	if (max_children)
+		argv_array_push(&args, max_children);
 	if (opt_dry_run)
 		argv_array_push(&args, "--dry-run");
 	if (opt_keep)
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 17759b1..1b4ce69 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -71,6 +71,16 @@ test_expect_success "fetch --recurse-submodules recurses into submodules" '
 	test_i18ncmp expect.err actual.err
 '
 
+test_expect_success "fetch --recurse-submodules -j2 has the same output behaviour" '
+	add_upstream_commit &&
+	(
+		cd downstream &&
+		git fetch --recurse-submodules -j2 2>../actual.err
+	) &&
+	test_must_be_empty actual.out &&
+	test_i18ncmp expect.err actual.err
+'
+
 test_expect_success "fetch alone only fetches superproject" '
 	add_upstream_commit &&
 	(
@@ -140,6 +150,15 @@ test_expect_success "--quiet propagates to submodules" '
 	! test -s actual.err
 '
 
+test_expect_success "--quiet propagates to parallel submodules" '
+	(
+		cd downstream &&
+		git fetch --recurse-submodules -j 2 --quiet  >../actual.out 2>../actual.err
+	) &&
+	! test -s actual.out &&
+	! test -s actual.err
+'
+
 test_expect_success "--dry-run propagates to submodules" '
 	add_upstream_commit &&
 	(
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 06/10] git submodule update: Redirect any output to stderr
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
                   ` (4 preceding siblings ...)
  2015-09-17  1:39 ` [PATCH 05/10] submodules: Allow parallel fetching, add tests and documentation Stefan Beller
@ 2015-09-17  1:39 ` Stefan Beller
  2015-09-17 20:31   ` Eric Sunshine
  2015-09-17  1:39 ` [PATCH 07/10] git submodule update: pass --prefix only with a non empty prefix Stefan Beller
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:39 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

No existing tests fail because of this change.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 git-submodule.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 8b0eb9a..7ef3247 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -663,7 +663,7 @@ cmd_update()
 		die_if_unmatched "$mode"
 		if test "$stage" = U
 		then
-			echo >&2 "Skipping unmerged submodule $prefix$sm_path"
+			say >&2 "Skipping unmerged submodule $prefix$sm_path"
 			continue
 		fi
 		name=$(git submodule--helper name "$sm_path") || exit
@@ -684,7 +684,7 @@ cmd_update()
 
 		if test "$update_module" = "none"
 		then
-			echo "Skipping submodule '$displaypath'"
+			say >&2 "Skipping submodule '$displaypath'"
 			continue
 		fi
 
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 07/10] git submodule update: pass --prefix only with a non empty prefix
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
                   ` (5 preceding siblings ...)
  2015-09-17  1:39 ` [PATCH 06/10] git submodule update: Redirect any output to stderr Stefan Beller
@ 2015-09-17  1:39 ` Stefan Beller
  2015-09-17 20:33   ` Eric Sunshine
  2015-09-17  1:39 ` [PATCH 08/10] git submodule update: cmd_update_recursive Stefan Beller
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:39 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

We should not pass --prefix NULL into the helper. Although the helper
can deal with it, it's just messy.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 git-submodule.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 7ef3247..3ccb0b6 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -700,7 +700,7 @@ Maybe you want to use 'update --init'?")"
 
 		if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
 		then
-			git submodule--helper clone ${GIT_QUIET:+--quiet} --prefix "$prefix" --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
+			git submodule--helper clone ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
 			cloned_modules="$cloned_modules;$name"
 			subsha1=
 		else
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 08/10] git submodule update: cmd_update_recursive
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
                   ` (6 preceding siblings ...)
  2015-09-17  1:39 ` [PATCH 07/10] git submodule update: pass --prefix only with a non empty prefix Stefan Beller
@ 2015-09-17  1:39 ` Stefan Beller
  2015-09-17  1:39 ` [PATCH 09/10] " Stefan Beller
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:39 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

Split the recursion handling out into its own function.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 git-submodule.sh | 47 ++++++++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 3ccb0b6..52c2967 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -582,6 +582,31 @@ cmd_deinit()
 	done
 }
 
+
+cmd_update_recursive()
+{
+	if test -n "$recursive"
+	then
+		(
+			prefix="$prefix$sm_path/"
+			clear_local_git_env
+			cd "$sm_path" &&
+			eval cmd_update
+		)
+		res=$?
+		if test $res -gt 0
+		then
+			die_msg="$(eval_gettext "Failed to recurse into submodule path '\$displaypath'")"
+			if test $res -eq 1
+			then
+				err="${err};$die_msg"
+			else
+				die_with_status $res "$die_msg"
+			fi
+		fi
+	fi
+}
+
 #
 # Update each submodule path to correct revision, using clone and checkout as needed
 #
@@ -790,27 +815,7 @@ Maybe you want to use 'update --init'?")"
 			fi
 		fi
 
-		if test -n "$recursive"
-		then
-			(
-				prefix="$prefix$sm_path/"
-				clear_local_git_env
-				cd "$sm_path" &&
-				eval cmd_update
-			)
-			res=$?
-			if test $res -gt 0
-			then
-				die_msg="$(eval_gettext "Failed to recurse into submodule path '\$displaypath'")"
-				if test $res -eq 1
-				then
-					err="${err};$die_msg"
-					continue
-				else
-					die_with_status $res "$die_msg"
-				fi
-			fi
-		fi
+		cmd_update_recursive
 	done
 
 	if test -n "$err"
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 09/10] git submodule update: cmd_update_recursive
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
                   ` (7 preceding siblings ...)
  2015-09-17  1:39 ` [PATCH 08/10] git submodule update: cmd_update_recursive Stefan Beller
@ 2015-09-17  1:39 ` Stefan Beller
  2015-09-17 20:37   ` Eric Sunshine
  2015-09-17  1:39 ` [PATCH 10/10] git submodule update: cmd_update_fetch Stefan Beller
  2015-09-17 17:06 ` [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Jacob Keller
  10 siblings, 1 reply; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:39 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 git-submodule.sh | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 52c2967..c40d60f 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -607,6 +607,24 @@ cmd_update_recursive()
 	fi
 }
 
+cmd_update_clone()
+{
+	command="git checkout $subforce -q"
+	die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")"
+	say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")"
+
+	git submodule--helper clone ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
+
+	if (clear_local_git_env; cd "$sm_path" && $command "$sha1")
+	then
+		say "$say_msg"
+	else
+		err="${err};$die_msg"
+		return
+	fi
+	cmd_update_recursive
+}
+
 #
 # Update each submodule path to correct revision, using clone and checkout as needed
 #
@@ -680,7 +698,6 @@ cmd_update()
 		cmd_init "--" "$@" || return
 	fi
 
-	cloned_modules=
 	git submodule--helper list --prefix "$wt_prefix" "$@" | {
 	err=
 	while read mode sha1 stage sm_path
@@ -725,9 +742,8 @@ Maybe you want to use 'update --init'?")"
 
 		if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
 		then
-			git submodule--helper clone ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
-			cloned_modules="$cloned_modules;$name"
-			subsha1=
+			cmd_update_clone
+			continue
 		else
 			subsha1=$(clear_local_git_env; cd "$sm_path" &&
 				git rev-parse --verify HEAD) ||
@@ -767,13 +783,6 @@ Maybe you want to use 'update --init'?")"
 				die "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")"
 			fi
 
-			# Is this something we just cloned?
-			case ";$cloned_modules;" in
-			*";$name;"*)
-				# then there is no local change to integrate
-				update_module=checkout ;;
-			esac
-
 			must_die_on_failure=
 			case "$update_module" in
 			checkout)
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 10/10] git submodule update: cmd_update_fetch
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
                   ` (8 preceding siblings ...)
  2015-09-17  1:39 ` [PATCH 09/10] " Stefan Beller
@ 2015-09-17  1:39 ` Stefan Beller
  2015-09-17 17:06 ` [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Jacob Keller
  10 siblings, 0 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-17  1:39 UTC (permalink / raw)
  To: git
  Cc: peff, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 git-submodule.sh | 164 ++++++++++++++++++++++++++++---------------------------
 1 file changed, 84 insertions(+), 80 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index c40d60f..2c9f1f2 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -625,6 +625,89 @@ cmd_update_clone()
 	cmd_update_recursive
 }
 
+cmd_update_fetch()
+{
+	subsha1=$(clear_local_git_env; cd "$sm_path" &&
+		git rev-parse --verify HEAD) ||
+	die "$(eval_gettext "Unable to find current revision in submodule path '\$displaypath'")"
+
+	if test -n "$remote"
+	then
+		if test -z "$nofetch"
+		then
+			# Fetch remote before determining tracking $sha1
+			(clear_local_git_env; cd "$sm_path" && git-fetch) ||
+			die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")"
+		fi
+		remote_name=$(clear_local_git_env; cd "$sm_path" && get_default_remote)
+		sha1=$(clear_local_git_env; cd "$sm_path" &&
+			git rev-parse --verify "${remote_name}/${branch}") ||
+		die "$(eval_gettext "Unable to find current ${remote_name}/${branch} revision in submodule path '\$sm_path'")"
+	fi
+
+	if test "$subsha1" != "$sha1" || test -n "$force"
+	then
+		subforce=$force
+		# If we don't already have a -f flag and the submodule has never been checked out
+		if test -z "$subsha1" && test -z "$force"
+		then
+			subforce="-f"
+		fi
+
+		if test -z "$nofetch"
+		then
+			# Run fetch only if $sha1 isn't present or it
+			# is not reachable from a ref.
+			(clear_local_git_env; cd "$sm_path" &&
+				( (rev=$(git rev-list -n 1 $sha1 --not --all 2>/dev/null) &&
+				 test -z "$rev") || git-fetch)) ||
+			die "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")"
+		fi
+
+		must_die_on_failure=
+		case "$update_module" in
+		checkout)
+			command="git checkout $subforce -q"
+			die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")"
+			say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")"
+			;;
+		rebase)
+			command="git rebase"
+			die_msg="$(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")"
+			say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")"
+			must_die_on_failure=yes
+			;;
+		merge)
+			command="git merge"
+			die_msg="$(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")"
+			say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")"
+			must_die_on_failure=yes
+			;;
+		!*)
+			command="${update_module#!}"
+			die_msg="$(eval_gettext "Execution of '\$command \$sha1' failed in submodule  path '\$prefix\$sm_path'")"
+			say_msg="$(eval_gettext "Submodule path '\$prefix\$sm_path': '\$command \$sha1'")"
+			must_die_on_failure=yes
+			;;
+		*)
+			die "$(eval_gettext "Invalid update mode '$update_module' for submodule '$name'")"
+		esac
+
+		if (clear_local_git_env; cd "$sm_path" && $command "$sha1")
+		then
+			say "$say_msg"
+		elif test -n "$must_die_on_failure"
+		then
+			die_with_status 2 "$die_msg"
+		else
+			err="${err};$die_msg"
+			return
+		fi
+	fi
+
+	cmd_update_recursive
+}
+
 #
 # Update each submodule path to correct revision, using clone and checkout as needed
 #
@@ -743,88 +826,9 @@ Maybe you want to use 'update --init'?")"
 		if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
 		then
 			cmd_update_clone
-			continue
 		else
-			subsha1=$(clear_local_git_env; cd "$sm_path" &&
-				git rev-parse --verify HEAD) ||
-			die "$(eval_gettext "Unable to find current revision in submodule path '\$displaypath'")"
+			cmd_update_fetch
 		fi
-
-		if test -n "$remote"
-		then
-			if test -z "$nofetch"
-			then
-				# Fetch remote before determining tracking $sha1
-				(clear_local_git_env; cd "$sm_path" && git-fetch) ||
-				die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")"
-			fi
-			remote_name=$(clear_local_git_env; cd "$sm_path" && get_default_remote)
-			sha1=$(clear_local_git_env; cd "$sm_path" &&
-				git rev-parse --verify "${remote_name}/${branch}") ||
-			die "$(eval_gettext "Unable to find current ${remote_name}/${branch} revision in submodule path '\$sm_path'")"
-		fi
-
-		if test "$subsha1" != "$sha1" || test -n "$force"
-		then
-			subforce=$force
-			# If we don't already have a -f flag and the submodule has never been checked out
-			if test -z "$subsha1" && test -z "$force"
-			then
-				subforce="-f"
-			fi
-
-			if test -z "$nofetch"
-			then
-				# Run fetch only if $sha1 isn't present or it
-				# is not reachable from a ref.
-				(clear_local_git_env; cd "$sm_path" &&
-					( (rev=$(git rev-list -n 1 $sha1 --not --all 2>/dev/null) &&
-					 test -z "$rev") || git-fetch)) ||
-				die "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")"
-			fi
-
-			must_die_on_failure=
-			case "$update_module" in
-			checkout)
-				command="git checkout $subforce -q"
-				die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")"
-				say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")"
-				;;
-			rebase)
-				command="git rebase"
-				die_msg="$(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")"
-				say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")"
-				must_die_on_failure=yes
-				;;
-			merge)
-				command="git merge"
-				die_msg="$(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")"
-				say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")"
-				must_die_on_failure=yes
-				;;
-			!*)
-				command="${update_module#!}"
-				die_msg="$(eval_gettext "Execution of '\$command \$sha1' failed in submodule  path '\$prefix\$sm_path'")"
-				say_msg="$(eval_gettext "Submodule path '\$prefix\$sm_path': '\$command \$sha1'")"
-				must_die_on_failure=yes
-				;;
-			*)
-				die "$(eval_gettext "Invalid update mode '$update_module' for submodule '$name'")"
-			esac
-
-			if (clear_local_git_env; cd "$sm_path" && $command "$sha1")
-			then
-				say "$say_msg"
-			elif test -n "$must_die_on_failure"
-			then
-				die_with_status 2 "$die_msg"
-			else
-				err="${err};$die_msg"
-				continue
-			fi
-		fi
-
-		cmd_update_recursive
 	done
 
 	if test -n "$err"
-- 
2.6.0.rc0.131.gf624c3d

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 02/10] run-command: factor out return value computation
  2015-09-17  1:39 ` [PATCH 02/10] run-command: factor out return value computation Stefan Beller
@ 2015-09-17 10:33   ` Jeff King
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2015-09-17 10:33 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, gitster, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich

On Wed, Sep 16, 2015 at 06:39:00PM -0700, Stefan Beller wrote:

> +static int determine_return_value(int wait_status,
> +				  int *result,
> +				  int *error_code,
> +				  const char *argv0)
> +{
> +	if (WIFSIGNALED(wait_status)) {
> +		*result = WTERMSIG(wait_status);
> +		if (*result != SIGINT && *result != SIGQUIT)
> +			error("%s died of signal %d", argv0, *result);
> +		/*
> +		 * This return value is chosen so that code & 0xff
> +		 * mimics the exit code that a POSIX shell would report for
> +		 * a program that died from this signal.
> +		 */
> +		*result += 128;
> +	} else if (WIFEXITED(wait_status)) {
> +		*result = WEXITSTATUS(wait_status);
> +		/*
> +		 * Convert special exit code when execvp failed.
> +		 */
> +		if (*result == 127) {
> +			*result = -1;
> +			*error_code = ENOENT;
> +		}
> +	} else
> +		return 1;
> +	return 0;
> +}

Looks like we can return "0" or "1" here, and the exit code goes into
"result". But our caller:

>  static int wait_or_whine(pid_t pid, const char *argv0)
>  {
>  	int status, code = -1;
> @@ -244,29 +273,10 @@ static int wait_or_whine(pid_t pid, const char *argv0)
>  	if (waiting < 0) {
>  		failed_errno = errno;
>  		error("waitpid for %s failed: %s", argv0, strerror(errno));
> -	} else if (waiting != pid) {
> -		error("waitpid is confused (%s)", argv0);
> -	} else if (WIFSIGNALED(status)) {
> -		code = WTERMSIG(status);
> -		if (code != SIGINT && code != SIGQUIT)
> -			error("%s died of signal %d", argv0, code);
> -		/*
> -		 * This return value is chosen so that code & 0xff
> -		 * mimics the exit code that a POSIX shell would report for
> -		 * a program that died from this signal.
> -		 */
> -		code += 128;
> -	} else if (WIFEXITED(status)) {
> -		code = WEXITSTATUS(status);
> -		/*
> -		 * Convert special exit code when execvp failed.
> -		 */
> -		if (code == 127) {
> -			code = -1;
> -			failed_errno = ENOENT;
> -		}
>  	} else {
> -		error("waitpid is confused (%s)", argv0);
> +		if (waiting != pid
> +		   || (determine_return_value(status, &code, &failed_errno, argv0) < 0))
> +			error("waitpid is confused (%s)", argv0);
>  	}

...is looking for "< 0", which will never happen. Should the "1" above
have been "-1"?

I also wondered what happened to "code" and "failed_errno" in that case.
They are OK to access because wait_or_whine() has set them to defaults,
but I wonder if determine_return_value should do so in every branch (so
it is clear that the values are always defined when it returns).
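
I.e., something like this (totally untested):

static int determine_return_value(int wait_status,
				  int *result,
				  int *error_code,
				  const char *argv0)
{
	/* make sure both outputs are defined on every path */
	*result = -1;
	*error_code = 0;

	if (WIFSIGNALED(wait_status)) {
		*result = WTERMSIG(wait_status);
		if (*result != SIGINT && *result != SIGQUIT)
			error("%s died of signal %d", argv0, *result);
		/* code & 0xff mimics the POSIX shell exit code */
		*result += 128;
	} else if (WIFEXITED(wait_status)) {
		*result = WEXITSTATUS(wait_status);
		/* convert the special exit code when execvp failed */
		if (*result == 127) {
			*result = -1;
			*error_code = ENOENT;
		}
	} else {
		return -1;	/* so that "< 0" at the caller works */
	}
	return 0;
}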

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17  1:38 ` [PATCH 01/10] strbuf: Add strbuf_read_noblock Stefan Beller
@ 2015-09-17 16:13   ` Junio C Hamano
  2015-09-17 16:30     ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2015-09-17 16:13 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, peff, jrnieder, johannes.schindelin, Jens.Lehmann, vlovich

Stefan Beller <sbeller@google.com> writes:

> Subject: Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock

s/Add/add/;

> We need to read from pipes without blocking in a later patch.

I am hoping that you are at least not spinning---i.e. do a poll 
first to make sure there is at least some progress to be made
before calling this.

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  strbuf.c | 25 +++++++++++++++++++++++--
>  strbuf.h |  6 ++++++
>  2 files changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/strbuf.c b/strbuf.c
> index cce5eed..4130ee2 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -357,7 +357,10 @@ size_t strbuf_fread(struct strbuf *sb, size_t size, FILE *f)
>  	return res;
>  }
>  
> -ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
> +#define IGNORE_EAGAIN (1)
> +
> +static ssize_t strbuf_read_internal(struct strbuf *sb, int fd,
> +				    size_t hint, int flags)
>  {
>  	size_t oldlen = sb->len;
>  	size_t oldalloc = sb->alloc;
> @@ -366,8 +369,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>  	for (;;) {
>  		ssize_t cnt;
>  
> -		cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
> +		cnt = read(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
>  		if (cnt < 0) {
> +			if (errno == EINTR)
> +				continue;
> +			if (errno == EAGAIN) {
> +				if (flags & IGNORE_EAGAIN)
> +					break;
> +				else
> +					continue;
> +			}

In order to ensure that this is not breaking the normal case, I had
to look at the implementation of xread() to see that it behaves identically
when the flag is not passed.  That one-time review burden implies
that this is adding a maintenance burden to keep this copied function
in sync.

We should also handle EWOULDBLOCK not just EAGAIN.

More importantly, I am not sure if this helper is even necessary.

Looking at xread (reproduced in its full glory):

/*
 * xread() is the same a read(), but it automatically restarts read()
 * operations with a recoverable error (EAGAIN and EINTR). xread()
 * DOES NOT GUARANTEE that "len" bytes is read even if the data is available.
 */
ssize_t xread(int fd, void *buf, size_t len)
{
	ssize_t nr;
	if (len > MAX_IO_SIZE)
	    len = MAX_IO_SIZE;
	while (1) {
		nr = read(fd, buf, len);
		if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
			continue;
		return nr;
	}
}

Are we doing the right thing for EAGAIN here?

Now, man read(2) says this:

       EAGAIN 
		The file descriptor fd refers to a file other than a
                socket and has been marked nonblocking (O_NONBLOCK),
                and the read would block.

       EAGAIN or EWOULDBLOCK
		The file descriptor fd refers to a socket and has
                been marked nonblocking (O_NONBLOCK), and the read
                would block.  POSIX.1-2001 allows either error to be
                returned for this case, and does not require these
                constants to have the same value, so a portable
                application should check for both possibilities.

If the fd is not marked nonblocking, then we will not get EAGAIN (or
EWOULDBLOCK).

When fd is set to nonblocking, the current xread() spins if read()
says that the operation would block.  What would we achieve by
spinning ourselves here, though?  The caller must have set the fd to
be nonblocking for a reason, and that reason cannot be "if the data
is not ready, we do not have anything better than just spinning for
new data to arrive"---if so, the caller wouldn't have set the fd to
be nonblocking in the first place.

Even if there is such a stupid caller that only wants us to loop,
because we explicitly allow xread() to return a short read, all of
our callers must call it in a loop if they know how much they want
to read.  We can just return from here and let them loop around us.

And your new caller that does O_NONBLOCK wants to do more than
looping upon EWOULDBLOCK.  It certainly would not want us to loop
here.

So I wonder if you can just O_NONBLOCK the fd and use the usual
strbuf_read(), i.e. without any change in this patch, and update
xread() to _unconditionally_ return when read(2) says EAGAIN or
EWOULDBLOCK.

What would that break?
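
An illustrative sketch of that change (not a patch, just to show what
I mean; it keeps the MAX_IO_SIZE clamping and the EINTR restart):

ssize_t xread(int fd, void *buf, size_t len)
{
	ssize_t nr;
	if (len > MAX_IO_SIZE)
		len = MAX_IO_SIZE;
	while (1) {
		nr = read(fd, buf, len);
		/* restart only on EINTR; report EAGAIN/EWOULDBLOCK to the caller */
		if (nr < 0 && errno == EINTR)
			continue;
		return nr;
	}
}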

>  			if (oldalloc == 0)
>  				strbuf_release(sb);
>  			else
> @@ -384,6 +395,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>  	return sb->len - oldlen;
>  }
>  
> +ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
> +{
> +	return strbuf_read_internal(sb, fd, hint, 0);
> +}
> +
> +ssize_t strbuf_read_noblock(struct strbuf *sb, int fd, size_t hint)
> +{
> +	return strbuf_read_internal(sb, fd, hint, IGNORE_EAGAIN);
> +}
> +
>  #define STRBUF_MAXLINK (2*PATH_MAX)
>  
>  int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> diff --git a/strbuf.h b/strbuf.h
> index aef2794..23ca7aa 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -367,6 +367,12 @@ extern size_t strbuf_fread(struct strbuf *, size_t, FILE *);
>  extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
>  
>  /**
> + * Same as strbuf_read, just returns non-blockingly by ignoring EAGAIN.
> + * The fd must have set O_NONBLOCK.
> + */
> +extern ssize_t strbuf_read_noblock(struct strbuf *, int fd, size_t hint);
> +
> +/**
>   * Read the contents of a file, specified by its path. The third argument
>   * can be used to give a hint about the file size, to avoid reallocs.
>   */

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 16:13   ` Junio C Hamano
@ 2015-09-17 16:30     ` Jeff King
  2015-09-17 16:44       ` Junio C Hamano
  2015-09-17 16:58       ` Junio C Hamano
  0 siblings, 2 replies; 37+ messages in thread
From: Jeff King @ 2015-09-17 16:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, git, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich

On Thu, Sep 17, 2015 at 09:13:40AM -0700, Junio C Hamano wrote:

> And your new caller that does O_NONBLOCK wants to do more than
> looping upon EWOULDBLOCK.  It certainly would not want us to loop
> here.
> 
> So I wonder if you can just O_NONBLOCK the fd and use the usual
> strbuf_read(), i.e. without any change in this patch, and update
> xread() to _unconditionally_ return when read(2) says EAGAIN or
> EWOULDBLOCK.
> 
> What would that break?

Certainly anybody who does not realize their descriptor is O_NONBLOCK
and is using the spinning for correctness. I tend to think that such
sites are wrong, though, and would benefit from us realizing they are
spinning.

But I think you can't quite get away with leaving strbuf_read untouched
in this case. On error, it wants to restore the original value of the
strbuf before the strbuf_read call. Which means that we throw away
anything read into the strbuf before we get EAGAIN, and the caller never
gets to see it.

So I think we would probably want to treat EAGAIN specially: return -1
to signal to the caller but _don't_ truncate the strbuf.

Arguably we should actually return the number of bytes we _did_ read,
but then caller cannot easily tell the difference between EOF and
EAGAIN.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 16:30     ` Jeff King
@ 2015-09-17 16:44       ` Junio C Hamano
  2015-09-17 16:51         ` Stefan Beller
  2015-09-17 16:57         ` Jeff King
  2015-09-17 16:58       ` Junio C Hamano
  1 sibling, 2 replies; 37+ messages in thread
From: Junio C Hamano @ 2015-09-17 16:44 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Git Mailing List, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 9:30 AM, Jeff King <peff@peff.net> wrote:
>
> So I think we would probably want to treat EAGAIN specially: return -1
> to signal to the caller but _don't_ truncate the strbuf.

Yeah, "don't truncate" is needed.

> Arguably we should actually return the number of bytes we _did_ read,
> but then caller cannot easily tell the difference between EOF and
> EAGAIN.

Why can't it check errno==EAGAIN/EWOULDBLOCK?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 16:44       ` Junio C Hamano
@ 2015-09-17 16:51         ` Stefan Beller
  2015-09-17 16:57         ` Jeff King
  1 sibling, 0 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-17 16:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Git Mailing List, Jonathan Nieder, Johannes Schindelin,
	Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 9:44 AM, Junio C Hamano <gitster@pobox.com> wrote:
> On Thu, Sep 17, 2015 at 9:30 AM, Jeff King <peff@peff.net> wrote:
>>
>> So I think we would probably want to treat EAGAIN specially: return -1
>> to signal to the caller but _don't_ truncate the strbuf.
>
> Yeah, "don't truncate" is needed.
>
>> Arguably we should actually return the number of bytes we _did_ read,
>> but then caller cannot easily tell the difference between EOF and
>> EAGAIN.
>
> Why can't it check errno==EAGAIN/EWOULDBLOCK?

Grepping through Git's sources, there are no occurrences of
explicitly setting  O_NONBLOCK except for the newly introduced
spot in the followup patch in run-command (yes we do poll there).

So how would I find out whether an fd is blocking or not in the 82
call sites of strbuf_read? (For now I naively assume they would all
block.)

We could also expose the flags in the API. I think IGNORE_EAGAIN
might not be the best name now; a NO_TRUNCATE flag would do better.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 16:44       ` Junio C Hamano
  2015-09-17 16:51         ` Stefan Beller
@ 2015-09-17 16:57         ` Jeff King
  1 sibling, 0 replies; 37+ messages in thread
From: Jeff King @ 2015-09-17 16:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, Git Mailing List, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 09:44:19AM -0700, Junio C Hamano wrote:

> > Arguably we should actually return the number of bytes we _did_ read,
> > but then caller cannot easily tell the difference between EOF and
> > EAGAIN.
> 
> Why can't it check errno==EAGAIN/EWOULDBLOCK?

Is it trustworthy to check errno if read() has not actually ever
returned -1? E.g., consider this sequence:

  1. somebody (maybe even us calling strbuf_read) calls read() and it
     returns -1, setting errno to EAGAIN

  2. we call strbuf_read()

    2a. it calls read(), which returns N bytes

    2b. it calls read() again, which returns 0 for EOF

    2c. it returns N, because that's how many bytes it read

  3. we wonder if we hit EOF, or if we simply need to read again. We
     check errno == EAGAIN

I don't think the read calls in step 2 are guaranteed to clear errno,
and we might read the cruft from step 1.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 16:30     ` Jeff King
  2015-09-17 16:44       ` Junio C Hamano
@ 2015-09-17 16:58       ` Junio C Hamano
  2015-09-17 17:13         ` Jeff King
  2015-09-17 17:20         ` Stefan Beller
  1 sibling, 2 replies; 37+ messages in thread
From: Junio C Hamano @ 2015-09-17 16:58 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, git, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich

Jeff King <peff@peff.net> writes:

> On Thu, Sep 17, 2015 at 09:13:40AM -0700, Junio C Hamano wrote:
>
>> And your new caller that does O_NONBLOCK wants to do more than
>> looping upon EWOULDBLOCK.  It certainly would not want us to loop
>> here.
>> 
>> So I wonder if you can just O_NONBLOCK the fd and use the usual
>> strbuf_read(), i.e. without any change in this patch, and update
>> xread() to _unconditionally_ return when read(2) says EAGAIN or
>> EWOULDBLOCK.
>> 
>> What would that break?
>
> Certainly anybody who does not realize their descriptor is O_NONBLOCK
> and is using the spinning for correctness. I tend to think that such
> sites are wrong, though, and would benefit from us realizing they are
> spinning.

With or without O_NONBLOCK, not looping around xread() _and_ relying
on the spinning for "correctness" means the caller is not getting
correctness anyway, I think, because xread() does return a short
read, and we deliberately and explicitly do so since 0b6806b9
(xread, xwrite: limit size of IO to 8MB, 2013-08-20).

> But I think you can't quite get away with leaving strbuf_read untouched
> in this case. On error, it wants to restore the original value of the
> strbuf before the strbuf_read call. Which means that we throw away
> anything read into the strbuf before we get EAGAIN, and the caller never
> gets to see it.

I agree we need to teach strbuf_read() that xread() is now nicer on
O_NONBLOCK; perhaps like this?

 strbuf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index cce5eed..49104d7 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -368,6 +368,8 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
 
 		cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
 		if (cnt < 0) {
+			if (errno == EAGAIN || errno == EWOULDBLOCK)
+				break;
 			if (oldalloc == 0)
 				strbuf_release(sb);
 			else

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update"
  2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
                   ` (9 preceding siblings ...)
  2015-09-17  1:39 ` [PATCH 10/10] git submodule update: cmd_update_fetch Stefan Beller
@ 2015-09-17 17:06 ` Jacob Keller
  10 siblings, 0 replies; 37+ messages in thread
From: Jacob Keller @ 2015-09-17 17:06 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Jeff King, Junio C Hamano, jrnieder,
	johannes.schindelin, Jens.Lehmann, vlovich

On Wed, Sep 16, 2015 at 6:38 PM, Stefan Beller <sbeller@google.com> wrote:
> It took me a while to get the idea how to realize parallelism with the
> parallel run command structure now as opposed to the thread pool I proposed
> earlier, but I think it will be straightforward from here.
>

Yeah, at least from a cursory review this seems significantly simpler.
I'm still working on a deeper dive into the code, so hopefully
I will have some more thoughts soon.

Regards,
Jake

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 16:58       ` Junio C Hamano
@ 2015-09-17 17:13         ` Jeff King
  2015-09-17 17:26           ` Stefan Beller
  2015-09-17 17:54           ` Junio C Hamano
  2015-09-17 17:20         ` Stefan Beller
  1 sibling, 2 replies; 37+ messages in thread
From: Jeff King @ 2015-09-17 17:13 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, git, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich

On Thu, Sep 17, 2015 at 09:58:00AM -0700, Junio C Hamano wrote:

> > Certainly anybody who does not realize their descriptor is O_NONBLOCK
> > and is using the spinning for correctness. I tend to think that such
> > sites are wrong, though, and would benefit from us realizing they are
> > spinning.
> 
> With or without O_NONBLOCK, not looping around xread() _and_ relying
> on the spinning for "correctness" means the caller is not getting
> correctness anyway, I think, because xread() does return a short
> read, and we deliberately and explicitly do so since 0b6806b9
> (xread, xwrite: limit size of IO to 8MB, 2013-08-20).

I think they have to loop for correctness, but they may do this:

  if (xread(fd, buf, len) < 0)
	die_errno("OMG, an error!");

which is not correct if "fd" is unknowingly non-blocking. As Stefan
mentioned, we do not set O_NONBLOCK ourselves very much, but I wonder if
we could inherit it from the environment in some cases.

The spinning behavior is not great, but does mean that we spin and
continue rather than bailing with an error.

> > But I think you can't quite get away with leaving strbuf_read untouched
> > in this case. On error, it wants to restore the original value of the
> > strbuf before the strbuf_read call. Which means that we throw away
> > anything read into the strbuf before we get EAGAIN, and the caller never
> > gets to see it.
> 
> I agree we need to teach strbuf_read() that xread() is now nicer on
> O_NONBLOCK; perhaps like this?
> 
>  strbuf.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/strbuf.c b/strbuf.c
> index cce5eed..49104d7 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -368,6 +368,8 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>  
>  		cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
>  		if (cnt < 0) {
> +			if (errno == EAGAIN || errno == EWOULDBLOCK)
> +				break;
>  			if (oldalloc == 0)
>  				strbuf_release(sb);
>  			else

If we get EAGAIN on the first read, this will return "0", and I think we
end up in the "was it EOF, or EAGAIN?" situation I mentioned earlier.
If we reset errno to "0" at the top of the function, we could get around
one problem, but it still makes an annoying interface: the caller has to
check errno for any 0-return to figure out if it was really EOF, or just
EAGAIN.

If we return -1, though, we have a similar annoyance. If the caller
notices a -1 return value and finds EAGAIN, they still may need to check
sb->len to see if they made forward progress and have data they should
be dealing with.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 16:58       ` Junio C Hamano
  2015-09-17 17:13         ` Jeff King
@ 2015-09-17 17:20         ` Stefan Beller
  1 sibling, 0 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-17 17:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, git@vger.kernel.org, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 9:58 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> On Thu, Sep 17, 2015 at 09:13:40AM -0700, Junio C Hamano wrote:
>>
>>> And your new caller that does O_NONBLOCK wants to do more than
>>> looping upon EWOULDBLOCK.  It certainly would not want us to loop
>>> here.
>>>
>>> So I wonder if you can just O_NONBLOCK the fd and use the usual
>>> strbuf_read(), i.e. without any change in this patch, and update
>>> xread() to _unconditionally_ return when read(2) says EAGAIN or
>>> EWOULDBLOCK.
>>>
>>> What would that break?
>>
>> Certainly anybody who does not realize their descriptor is O_NONBLOCK
>> and is using the spinning for correctness. I tend to think that such
>> sites are wrong, though, and would benefit from us realizing they are
>> spinning.
>
> With or without O_NONBLOCK, not looping around xread() _and_ relying
> on the spinning for "correctness" means the caller is not getting
> correctness anyway, I think, because xread() does return a short
> read, and we deliberately and explicitly do so since 0b6806b9
> (xread, xwrite: limit size of IO to 8MB, 2013-08-20).
>
>> But I think you can't quite get away with leaving strbuf_read untouched
>> in this case. On error, it wants to restore the original value of the
>> strbuf before the strbuf_read call. Which means that we throw away
>> anything read into the strbuf before we get EAGAIN, and the caller never
>> gets to see it.
>
> I agree we need to teach strbuf_read() that xread() is now nicer on
> O_NONBLOCK; perhaps like this?
>
>  strbuf.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/strbuf.c b/strbuf.c
> index cce5eed..49104d7 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -368,6 +368,8 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>
>                 cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
>                 if (cnt < 0) {
> +                       if (errno == EAGAIN || errno == EWOULDBLOCK)
> +                               break;

This would need xread to behave differently too (i.e. return on EAGAIN).

Looking at xread to answer "Why did we have a spinning loop in case of
EAGAIN in the first place?", I ended up at 1c15afb9343b (xread/xwrite:
do not worry about EINTR at calling sites, 2005-12-19).

There we had lots of EAGAIN checks sprinkled through the code base, so
did we have non-blocking IO back then?

I think I chose not to use xread because it contradicts everything we
want in the non-blocking case: we want to give up on an operation that
hits a recoverable error (EAGAIN and EINTR) and keep going with the
rest of the program. In the blocking case (as it is used currently) we
can just move the checks from xread into strbuf_read.

>                         if (oldalloc == 0)
>                                 strbuf_release(sb);
>                         else

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 17:13         ` Jeff King
@ 2015-09-17 17:26           ` Stefan Beller
  2015-09-17 17:35             ` Jeff King
  2015-09-17 17:54           ` Junio C Hamano
  1 sibling, 1 reply; 37+ messages in thread
From: Stefan Beller @ 2015-09-17 17:26 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, git@vger.kernel.org, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 10:13 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Sep 17, 2015 at 09:58:00AM -0700, Junio C Hamano wrote:
>
>> > Certainly anybody who does not realize their descriptor is O_NONBLOCK
>> > and is using the spinning for correctness. I tend to think that such
>> > sites are wrong, though, and would benefit from us realizing they are
>> > spinning.
>>
>> With or without O_NONBLOCK, not looping around xread() _and_ relying
>> on the spinning for "correctness" means the caller is not getting
>> correctness anyway, I think, because xread() does return a short
>> read, and we deliberately and explicitly do so since 0b6806b9
>> (xread, xwrite: limit size of IO to 8MB, 2013-08-20).
>
> I think they have to loop for correctness, but they may do this:
>
>   if (xread(fd, buf, len) < 0)
>         die_errno("OMG, an error!");
>
> which is not correct if "fd" is unknowingly non-blocking. As Stefan
> mentioned, we do not set O_NONBLOCK ourselves very much, but I wonder if
> we could inherit it from the environment in some cases.
>
> The spinning behavior is not great, but does mean that we spin and
> continue rather than bailing with an error.
>
>> > But I think you can't quite get away with leaving strbuf_read untouched
>> > in this case. On error, it wants to restore the original value of the
>> > strbuf before the strbuf_read call. Which means that we throw away
>> > anything read into the strbuf before we get EAGAIN, and the caller never
>> > gets to see it.
>>
>> I agree we need to teach strbuf_read() that xread() is now nicer on
>> O_NONBLOCK; perhaps like this?
>>
>>  strbuf.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/strbuf.c b/strbuf.c
>> index cce5eed..49104d7 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -368,6 +368,8 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>>
>>               cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
>>               if (cnt < 0) {
>> +                     if (errno == EAGAIN || errno == EWOULDBLOCK)
>> +                             break;
>>                       if (oldalloc == 0)
>>                               strbuf_release(sb);
>>                       else
>
> If we get EAGAIN on the first read, this will return "0", and I think we
> end up in the "was it EOF, or EAGAIN?" situation I mentioned earlier.
> If we reset errno to "0" at the top of the function, we could get around
> one problem, but it still makes an annoying interface: the caller has to
> check errno for any 0-return to figure out if it was really EOF, or just
> EAGAIN.
>
> If we return -1, though, we have a similar annoyance. If the caller
> notices a -1 return value and finds EAGAIN, they still may need to check
> sb->len to see if they made forward progress and have data they should
> be dealing with.

If errno == EAGAIN, we know it is a non-blocking fd, so we could encode
the length read as (-2 - length), such that

  <= -2   the length read, encoded, if EAGAIN was hit
  -1      an error; check errno!
  0       EOF
  >= +1   the length if we just read it or restarted due to EINTR.

The call site should know whether the fd is non-blocking (i.e. whether
the <= -2 case can happen) and handle it appropriately.

Any call site of today that is unaware of the fd being non-blocking
would be broken by this (but we want to fix the spinning eventually
anyway).
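
Inside the caller's poll loop, decoding would then look roughly like
this (a sketch only; the surrounding loop and the strbuf are assumed):

	ssize_t ret = strbuf_read_noblock(&sb, fd, 0);

	if (ret <= -2)
		/* -2 - ret bytes were appended before EAGAIN; poll() again */
		continue;
	else if (ret == -1)
		die_errno("read error");
	else if (ret == 0)
		break;		/* EOF */
	/* otherwise ret bytes were appended; keep polling for more */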




>
> -Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 17:26           ` Stefan Beller
@ 2015-09-17 17:35             ` Jeff King
  2015-09-17 17:45               ` Stefan Beller
  2015-09-17 17:57               ` Junio C Hamano
  0 siblings, 2 replies; 37+ messages in thread
From: Jeff King @ 2015-09-17 17:35 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git@vger.kernel.org, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 10:26:19AM -0700, Stefan Beller wrote:

> > If we return -1, though, we have a similar annoyance. If the caller
> > notices a -1 return value and finds EAGAIN, they still may need to check
> > sb->len to see if they made forward progress and have data they should
> > be dealing with.
> 
> If errno == EAGAIN, we know it is a non blocking fd, so we could encode
> the length read as (- 2 - length), such that
> 
> ...-2 the length read if EAGAIN was ignored
> -1 for error, check errno!
> 0 for EOF
> +1... length if we just read it or restarted it due to EINTR.
> 
> The call site should know if it is non blocking (i.e. if the <-2 case can
> happen) and handle it appropriately.

I actually wonder if callers who are _expecting_ non-blocking want to
loop in strbuf_read() at all.

strbuf_read() is really about reading to EOF, and growing the buffer to
fit all of the input. But that's not at all what you want to do with
non-blocking. There you believe for some reason that data may be
available (probably due to poll), and you want to read one chunk of it,
maybe act, and then go back to polling.

You _can_ loop on read until you hit EAGAIN, but I think in general you
shouldn't; if you get a lot of input on this fd, you'll starve all of
the other descriptors you're polling.  You're better off to read a
finite amount from each descriptor, and then check again who is ready to
read.

And then the return value becomes a no-brainer, because it's just the
return value of read(). Either you got some data, you got EOF, or you
get an error, which might be EAGAIN. You never have the case where you
got some data, but then you also got EOF and EAGAIN, and the caller has
to figure out which.

So I think you really want something like:

  ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint)
  {
	ssize_t cnt;

	strbuf_grow(sb, hint ? hint : 8192);
	cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
	if (cnt > 0)
		strbuf_setlen(sb, sb->len + cnt);
	return cnt;
  }

(where I'm assuming xread passes us back EAGAIN; we could also replace
it with read and loop on EINTR ourselves).
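
And the caller, after poll(), would use it along these lines (an
illustrative sketch; "out" and "pfd" are made-up per-child arrays kept
by the caller):

  static void read_available_output(struct strbuf *out, struct pollfd *pfd, int nr)
  {
	int i;

	for (i = 0; i < nr; i++) {
		ssize_t cnt;

		if (pfd[i].fd < 0 || !(pfd[i].revents & POLLIN))
			continue;
		cnt = strbuf_read_once(&out[i], pfd[i].fd, 0);
		if (cnt == 0)
			pfd[i].fd = -1;	/* EOF; stop polling this child */
		else if (cnt < 0 && errno != EAGAIN)
			die_errno("read from child failed");
	}
  }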

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 17:35             ` Jeff King
@ 2015-09-17 17:45               ` Stefan Beller
  2015-09-17 17:50                 ` Jeff King
  2015-09-17 17:57               ` Junio C Hamano
  1 sibling, 1 reply; 37+ messages in thread
From: Stefan Beller @ 2015-09-17 17:45 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, git@vger.kernel.org, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 10:35 AM, Jeff King <peff@peff.net> wrote:
>

>
> You _can_ loop on read until you hit EAGAIN, but I think in general you
> shouldn't; if you get a lot of input on this fd, you'll starve all of
> the other descriptors you're polling.  You're better off to read a
> finite amount from each descriptor, and then check again who is ready to
> read.

That's what I do with the current implementation. Except it's not as clear and
concise as I patched it into the strbuf_read.

>
> And then the return value becomes a no-brainer, because it's just the
> return value of read(). Either you got some data, you got EOF, or you
> get an error, which might be EAGAIN. You never have the case where you
> got some data, but then you also got EOF and EAGAIN, and the caller has
> to figure out which.
>
> So I think you really want something like:
>
>   ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint)
>   {
>         ssize_t cnt;
>
>         strbuf_grow(sb, hint ? hint : 8192);
>         cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
>         if (cnt > 0)
>                 strbuf_setlen(sb, sb->len + cnt);
>         return cnt;
>   }
>
> (where I'm assuming xread passes us back EAGAIN; we could also replace
> it with read and loop on EINTR ourselves).

Yeah, that's exactly what I am looking for (the hint may even be
over-engineered here, as I have no clue how much data comes back).

So I guess I could just use that new method now.


> I actually wonder if callers who are _expecting_ non-blocking want to
> loop in strbuf_read() at all.
>
> strbuf_read() is really about reading to EOF, and growing the buffer to
> fit all of the input. But that's not at all what you want to do with
> non-blocking. There you believe for some reason that data may be
> available (probably due to poll), and you want to read one chunk of it,
> maybe act, and then go back to polling.

As a micro project (a leftover bit, maybe?):
When strbuf_read is reading data out of a non-blocking pipe, we're
currently spinning (on EAGAIN/EWOULDBLOCK). Introduce a call to poll
to reduce the spinning.
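
Roughly, inside strbuf_read's read loop, that would be (error
unwinding omitted; just to illustrate the idea):

	if (cnt < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
		struct pollfd pfd;

		pfd.fd = fd;
		pfd.events = POLLIN;
		if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
			return -1;
		continue;	/* data (or EOF) should be available now */
	}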

>
> -Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 17:45               ` Stefan Beller
@ 2015-09-17 17:50                 ` Jeff King
  2015-09-17 17:53                   ` Stefan Beller
  0 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2015-09-17 17:50 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git@vger.kernel.org, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 10:45:40AM -0700, Stefan Beller wrote:

> > You _can_ loop on read until you hit EAGAIN, but I think in general you
> > shouldn't; if you get a lot of input on this fd, you'll starve all of
> > the other descriptors you're polling.  You're better off to read a
> > finite amount from each descriptor, and then check again who is ready to
> > read.
> 
> That's what I do with the current implementation. Except it's not as clear and
> concise as I patched it into the strbuf_read.

Is it? I thought the implementation you posted turned the existing
strbuf_read into strbuf_read_internal, including the loop. So as long
as we are not getting EAGAIN, it will keep reading forever. Actually
that is not quite true, as any read shorter than 8192 bytes will cause
us to jump out of the loop, too, but if we assume that the other end is
feeding us data faster than we can read it, we'll never exit
strbuf_read_noblock() and serve any of the other descriptors.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 17:50                 ` Jeff King
@ 2015-09-17 17:53                   ` Stefan Beller
  0 siblings, 0 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-17 17:53 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, git@vger.kernel.org, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 10:50 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Sep 17, 2015 at 10:45:40AM -0700, Stefan Beller wrote:
>
>> > You _can_ loop on read until you hit EAGAIN, but I think in general you
>> > shouldn't; if you get a lot of input on this fd, you'll starve all of
>> > the other descriptors you're polling.  You're better off to read a
>> > finite amount from each descriptor, and then check again who is ready to
>> > read.
>>
>> That's what I do with the current implementation. Except it's not as clear and
>> concise as I patched it into the strbuf_read.
>
> Is it? I thought the implementation you posted bumped the existing
> strbuf_read to strbuf_buf_internal, including the loop. So as long as we
> are not getting EAGAIN, it will keep reading forever.

You'll get EAGAIN pretty fast though, as all you do is read as fast
as you can.

> Actually not quite
> true, as any read shorter than 8192 bytes will cause us to jump out of
> the loop, too, but if we assume that the caller is feeding us data
> faster than we can read it, we'll never exit strbuf_read_nonblock() and
> serve any of the other descriptors.

I see the difference now. That makes sense.

>
> -Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 17:13         ` Jeff King
  2015-09-17 17:26           ` Stefan Beller
@ 2015-09-17 17:54           ` Junio C Hamano
  2015-09-17 18:02             ` Jeff King
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2015-09-17 17:54 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, git, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich

Jeff King <peff@peff.net> writes:

> I think they have to loop for correctness, but they may do this:
>
>   if (xread(fd, buf, len) < 0)
> 	die_errno("OMG, an error!");
>
> which is not correct if "fd" is unknowingly non-blocking.
> ...
> The spinning behavior is not great, but does mean that we spin and
> continue rather than bailing with an error.

OK, that is a problem (I just read through "git grep xread -- \*.c").
There aren't that many codepaths that read from an fd that we didn't
open ourselves, but there indeed are some.

So it seems that we would want a xread() that spins to help such
codepaths, and xread_nonblock() that gives a short-read upon
EWOULDBLOCK.  They can share a helper function, of course.
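
Something along these lines, as a rough sketch (read_limited() is just
a made-up name for the shared part):

static ssize_t read_limited(int fd, void *buf, size_t len)
{
	if (len > MAX_IO_SIZE)
		len = MAX_IO_SIZE;
	return read(fd, buf, len);
}

ssize_t xread(int fd, void *buf, size_t len)
{
	ssize_t nr;
	while (1) {
		nr = read_limited(fd, buf, len);
		if (nr < 0 && (errno == EAGAIN || errno == EINTR))
			continue;	/* keep the spinning behaviour */
		return nr;
	}
}

ssize_t xread_nonblock(int fd, void *buf, size_t len)
{
	ssize_t nr;
	while (1) {
		nr = read_limited(fd, buf, len);
		if (nr < 0 && errno == EINTR)
			continue;
		if (nr < 0 && errno == EWOULDBLOCK)
			errno = EAGAIN;	/* one error code for the callers */
		return nr;
	}
}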

> If we reset errno to "0" at the top of the function, we could get around
> one problem, but it still makes an annoying interface: the caller has to
> check errno for any 0-return to figure out if it was really EOF, or just
> EAGAIN.
>
> If we return -1, though, we have a similar annoyance. If the caller
> notices a -1 return value and finds EAGAIN, they still may need to check
> sb->len to see if they made forward progress and have data they should
> be dealing with.

OK.  Trying to repurpose strbuf_read() for non-blocking FD was a
silly idea to begin with, as it wants to read thru to the EOF, and
setting FD explicitly to O_NONBLOCK is a sign that the caller wants
to grab as much as possible and does not want to wait for the EOF.

So assuming we want strbuf_read_nonblock(), what interface do we
want from it?  We could:

 * Have it grab as much as possible into sb as long as it does not
   block?

 * Have it grab a reasonably large amount into sb, and not blocking is
   an absolute requirement, but the function is not required to read
   everything that is available on the FD (i.e. the caller is
   expected to loop)?

If we choose the latter, then your "EAGAIN? EOF?" becomes easier,
no?  We only have to do a single call to xread(), and then we:

 - get EAGAIN or EWOULDBLOCK; leave sb as-is, set errno==EAGAIN and
   return -1.

 - get something (in which case that is not an EOF yet); append to
   sb, return the number of bytes.

 - get EOF; leave sb as-is, return 0.


ssize_t strbuf_read_nonblock(struct strbuf *sb, int fd, size_t hint)
{
	strbuf_grow(sb, hint ? hint : 8192);
	ssize_t want = sb->alloc - sb->len - 1;
	ssize_t got = xread_nonblock(fd, sb->buf + sb->len, want);
	if (got < 0) {
		if (errno == EWOULDBLOCK)
			errno = EAGAIN; /* make life easier for the caller */
		return got;
	}
	sb->len += got;
	sb->buf[sb->len] = '\0';
	return got;
}

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 17:35             ` Jeff King
  2015-09-17 17:45               ` Stefan Beller
@ 2015-09-17 17:57               ` Junio C Hamano
  1 sibling, 0 replies; 37+ messages in thread
From: Junio C Hamano @ 2015-09-17 17:57 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, git@vger.kernel.org, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

Jeff King <peff@peff.net> writes:

> I actually wonder if callers who are _expecting_ non-blocking want to
> loop in strbuf_read() at all.
>
> strbuf_read() is really about reading to EOF, and growing the buffer to
> fit all of the input. But that's not at all what you want to do with
> non-blocking. There you believe for some reason that data may be
> available (probably due to poll), and you want to read one chunk of it,
> maybe act, and then go back to polling.

I think I could have gone to grab a cup of coffee instead of typing
another message that essentially said the same thing ;-)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
  2015-09-17 17:54           ` Junio C Hamano
@ 2015-09-17 18:02             ` Jeff King
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2015-09-17 18:02 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, git, jrnieder, johannes.schindelin, Jens.Lehmann,
	vlovich

On Thu, Sep 17, 2015 at 10:54:39AM -0700, Junio C Hamano wrote:

> OK.  Trying to repurpose strbuf_read() for non-blocking FD was a
> silly idea to begin with, as it wants to read thru to the EOF, and
> setting FD explicitly to O_NONBLOCK is a sign that the caller wants
> to grab as much as possible and does not want to wait for the EOF.
> 
> So assuming we want strbuf_read_nonblock(), what interface do we
> want from it?  We could:
> 
>  * Have it grab as much as possible into sb as long as it does not
>    block?
> 
>  * Have it grab reasonably large amount into sb, and not blocking is
>    an absolute requirement, but the function is not required to read
>    everything that is available on the FD (i.e. the caller is
>    expected to loop)?

I think we are crossing emails, but I would definitely argue for the
latter.

> If we choose the latter, then your "EAGAIN? EOF?" becomes easier,
> no?  We only have to do a single call to xread(), and then we:
> 
>  - get EAGAIN or EWOULDBLOCK; leave sb as-is, set errno==EAGAIN and
>    return -1.
> 
>  - get something (in which case that is not an EOF yet); append to
>    sb, return the number of bytes.
> 
>  - get EOF; leave sb as-is, return 0.

Yes, exactly.

> ssize_t strbuf_read_nonblock(struct strbuf *sb, int fd, size_t hint)
> {
> 	strbuf_grow(sb, hint ? hint : 8192);
> 	ssize_t want = sb->alloc - sb->len - 1;
> 	ssize_t got = xread_nonblock(fd, sb->buf + sb->len, want);
> 	if (got < 0) {
> 		if (errno == EWOULDBLOCK)
> 			errno = EAGAIN; /* make life easier for the caller */
> 		return got;
> 	}

I like the normalizing of errno, but that should probably be part of
xread_nonblock(), I would think.

> 	sb->len += got;
> 	sb->buf[sb->len] = '\0';

Use strbuf_setlen() here?

> 	return got;

If "got == 0", we naturally do not change sb->len at all, and the strbuf
is left unchanged. But do we want to de-allocate what we allocated in
strbuf_grow() above? That is what strbuf_read() does, but I think it is
even less likely to help anybody here.  With the original strbuf_read(),
you might do:

  if (strbuf_read(&buf, fd, hint) <= 0)
	return; /* got nothing */

but because the nature of strbuf_read_nonblock() is to call it from a
loop, you'd want to strbuf_release() when you leave the loop anyway.
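
In other words, the caller shape would be something like this (a
sketch only, single fd, made-up function name):

  static void copy_child_stderr(int fd)
  {
	struct strbuf buf = STRBUF_INIT;
	struct pollfd pfd;

	pfd.fd = fd;
	pfd.events = POLLIN;
	for (;;) {
		ssize_t got;

		if (poll(&pfd, 1, -1) < 0) {
			if (errno == EINTR)
				continue;
			die_errno("poll");
		}
		got = strbuf_read_nonblock(&buf, fd, 0);
		if (got < 0 && errno == EAGAIN)
			continue;
		if (got <= 0)
			break;	/* EOF or a real error */
		fputs(buf.buf, stderr);
		strbuf_reset(&buf);
	}
	strbuf_release(&buf);
  }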

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 06/10] git submodule update: Redirect any output to stderr
  2015-09-17  1:39 ` [PATCH 06/10] git submodule update: Redirect any output to stderr Stefan Beller
@ 2015-09-17 20:31   ` Eric Sunshine
  2015-09-17 20:38     ` Stefan Beller
  0 siblings, 1 reply; 37+ messages in thread
From: Eric Sunshine @ 2015-09-17 20:31 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Jeff King, Junio C Hamano, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, vlovich

On Wed, Sep 16, 2015 at 9:39 PM, Stefan Beller <sbeller@google.com> wrote:
> git submodule update: Redirect any output to stderr

This commit message seems to be lacking an explanation of why this is
being done.

> There are no tests, which fail by this.

Not sure what this means. I suppose you're trying to say that this
patch doesn't break any existing tests, but isn't that an implied goal
of all patches posted to this list?

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  git-submodule.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 8b0eb9a..7ef3247 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -663,7 +663,7 @@ cmd_update()
>                 die_if_unmatched "$mode"
>                 if test "$stage" = U
>                 then
> -                       echo >&2 "Skipping unmerged submodule $prefix$sm_path"
> +                       say >&2 "Skipping unmerged submodule $prefix$sm_path"
>                         continue
>                 fi
>                 name=$(git submodule--helper name "$sm_path") || exit
> @@ -684,7 +684,7 @@ cmd_update()
>
>                 if test "$update_module" = "none"
>                 then
> -                       echo "Skipping submodule '$displaypath'"
> +                       say >&2 "Skipping submodule '$displaypath'"

These changes seem to be doing more than what the commit message
claims. The changed code isn't just redirecting to stderr, but is also
now respecting $GIT_QUIET.

>                         continue
>                 fi
>
> --
> 2.6.0.rc0.131.gf624c3d

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 07/10] git submodule update: pass --prefix only with a non empty prefix
  2015-09-17  1:39 ` [PATCH 07/10] git submodule update: pass --prefix only with a non empty prefix Stefan Beller
@ 2015-09-17 20:33   ` Eric Sunshine
  0 siblings, 0 replies; 37+ messages in thread
From: Eric Sunshine @ 2015-09-17 20:33 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Jeff King, Junio C Hamano, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, vlovich

On Wed, Sep 16, 2015 at 9:39 PM, Stefan Beller <sbeller@google.com> wrote:
> We should not pass --prefix NULL into the helper. Although the helper
> can deal with it, it's just messy.

Perhaps the commit message can explain under what conditions $prefix
will be null...

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 7ef3247..3ccb0b6 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -700,7 +700,7 @@ Maybe you want to use 'update --init'?")"
>
>                 if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
>                 then
> -                       git submodule--helper clone ${GIT_QUIET:+--quiet} --prefix "$prefix" --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
> +                       git submodule--helper clone ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
>                         cloned_modules="$cloned_modules;$name"
>                         subsha1=
>                 else
> --
> 2.6.0.rc0.131.gf624c3d

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 09/10] git submodule update: cmd_update_recursive
  2015-09-17  1:39 ` [PATCH 09/10] " Stefan Beller
@ 2015-09-17 20:37   ` Eric Sunshine
  0 siblings, 0 replies; 37+ messages in thread
From: Eric Sunshine @ 2015-09-17 20:37 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Jeff King, Junio C Hamano, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, vlovich

On Wed, Sep 16, 2015 at 9:39 PM, Stefan Beller <sbeller@google.com> wrote:
> git submodule update: cmd_update_recursive

Commit message doesn't seem to match the patch...

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  git-submodule.sh | 31 ++++++++++++++++++++-----------
>  1 file changed, 20 insertions(+), 11 deletions(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 52c2967..c40d60f 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -607,6 +607,24 @@ cmd_update_recursive()
>         fi
>  }
>
> +cmd_update_clone()
> +{
> +       command="git checkout $subforce -q"
> +       die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")"
> +       say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")"
> +
> +       git submodule--helper clone ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
> +
> +       if (clear_local_git_env; cd "$sm_path" && $command "$sha1")
> +       then
> +               say "$say_msg"
> +       else
> +               err="${err};$die_msg"
> +               return
> +       fi
> +       cmd_update_recursive
> +}
> +
>  #
>  # Update each submodule path to correct revision, using clone and checkout as needed
>  #
> @@ -680,7 +698,6 @@ cmd_update()
>                 cmd_init "--" "$@" || return
>         fi
>
> -       cloned_modules=
>         git submodule--helper list --prefix "$wt_prefix" "$@" | {
>         err=
>         while read mode sha1 stage sm_path
> @@ -725,9 +742,8 @@ Maybe you want to use 'update --init'?")"
>
>                 if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
>                 then
> -                       git submodule--helper clone ${GIT_QUIET:+--quiet} ${prefix:+--prefix "$prefix"} --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
> -                       cloned_modules="$cloned_modules;$name"
> -                       subsha1=
> +                       cmd_update_clone
> +                       continue
>                 else
>                         subsha1=$(clear_local_git_env; cd "$sm_path" &&
>                                 git rev-parse --verify HEAD) ||
> @@ -767,13 +783,6 @@ Maybe you want to use 'update --init'?")"
>                                 die "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")"
>                         fi
>
> -                       # Is this something we just cloned?
> -                       case ";$cloned_modules;" in
> -                       *";$name;"*)
> -                               # then there is no local change to integrate
> -                               update_module=checkout ;;
> -                       esac
> -
>                         must_die_on_failure=
>                         case "$update_module" in
>                         checkout)
> --
> 2.6.0.rc0.131.gf624c3d

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 06/10] git submodule update: Redirect any output to stderr
  2015-09-17 20:31   ` Eric Sunshine
@ 2015-09-17 20:38     ` Stefan Beller
  0 siblings, 0 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-17 20:38 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Git List, Jeff King, Junio C Hamano, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 1:31 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Wed, Sep 16, 2015 at 9:39 PM, Stefan Beller <sbeller@google.com> wrote:
>> git submodule update: Redirect any output to stderr
>
> This commit message seems to be lacking an explanation of why this is
> being done.
>
>> There are no tests, which fail by this.
>
> Not sure what this means. I suppose you're trying to say that this
> patch doesn't break any existing tests, but isn't that an implied goal
> of all patches posted to this list?

Yes, they should (but they don't yet).

What I was trying to say:

In a reroll I want to add tests for this, as I was surprised that this
commit did not break any test (so this behavior is untested, which is
bad).

>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>>  git-submodule.sh | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> index 8b0eb9a..7ef3247 100755
>> --- a/git-submodule.sh
>> +++ b/git-submodule.sh
>> @@ -663,7 +663,7 @@ cmd_update()
>>                 die_if_unmatched "$mode"
>>                 if test "$stage" = U
>>                 then
>> -                       echo >&2 "Skipping unmerged submodule $prefix$sm_path"
>> +                       say >&2 "Skipping unmerged submodule $prefix$sm_path"
>>                         continue
>>                 fi
>>                 name=$(git submodule--helper name "$sm_path") || exit
>> @@ -684,7 +684,7 @@ cmd_update()
>>
>>                 if test "$update_module" = "none"
>>                 then
>> -                       echo "Skipping submodule '$displaypath'"
>> +                       say >&2 "Skipping submodule '$displaypath'"
>
> These changes seem to be doing more than what the commit message
> claims. The changed code isn't just redirecting to stderr, but is also
> now respecting $GIT_QUIET.

Right, I need to redo the commit message anyways, so I'll mention that.

>
>>                         continue
>>                 fi
>>
>> --
>> 2.6.0.rc0.131.gf624c3d

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 03/10] run-command: add an asynchronous parallel child processor
  2015-09-17  1:39 ` [PATCH 03/10] run-command: add an asynchronous parallel child processor Stefan Beller
@ 2015-09-17 21:44   ` Junio C Hamano
  2015-09-17 23:19     ` Stefan Beller
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2015-09-17 21:44 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, peff, jrnieder, johannes.schindelin, Jens.Lehmann, vlovich

Stefan Beller <sbeller@google.com> writes:

>  time -->
>  output: |---A---|   |-B-|   |----C-----------|   |-D-|   |-E-|

Be nice and distribute the line evenly around "C".  Same for thread
2 below.

> diff --git a/run-command.c b/run-command.c
> index c892e9a..3af97ab 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -3,6 +3,7 @@
>  #include "exec_cmd.h"
>  #include "sigchain.h"
>  #include "argv-array.h"
> +#include "thread-utils.h"
>  
>  void child_process_init(struct child_process *child)
>  {
> @@ -862,3 +863,230 @@ int capture_command(struct child_process *cmd, struct strbuf *buf, size_t hint)
>  	close(cmd->out);
>  	return finish_command(cmd);
>  }
> +
> +struct parallel_processes {
> +	int max_number_processes;
> +	void *data;

> +	get_next_task fn;
> +	handle_child_starting_failure fn_err;
> +	handle_child_return_value fn_exit;

The 'fn' feels really misnamed, especially when compared with the
other fields.  fn_task or something, perhaps.

Also I think we call a function type we define with a name that ends
with _fn, e.g.

    typedef void (*show_commit_fn)(struct commit *, void *);
    void traverse_commit_list(struct rev_info *revs,
                              show_commit_fn show_commit,
                              show_object_fn show_object,
                              void *data)

So perhaps

	get_next_task_fn get_next_task;
        start_failure_fn start_failure;
        return_value_fn return_value;

or something like that.

> +	int nr_processes;
> +	int all_tasks_started;
> +	int foreground_child;
> +	char *slots;

What does slots[i] mean?  Whatever explanation you would use as an
answer to that question, I'd name the field after the key words used
in the explanation.  For example, if it means "children[i] is in use
with a process", then the code would be a lot happier if the field
is called in_use[] or something.

But do not just rename the field yet...

> +	struct child_process *children;
> +	struct pollfd *pfd;
> +	struct strbuf *err;

struct pollfd needs to be a contiguous array of nr_processes
elements because that is what poll(2) takes, but other per-child
fields would be easier to grasp if you did it like so:

	struct parallel_processes {
        	...
                struct {
                	int in_use;
                        struct child_process child;
                        struct strbuf err;
                        ... /* maybe other per-child field later */
		} *children;
		...
	}

> +	struct strbuf finished_children;

A strbuf that holds "finished_children"?  Does it hold "what
finished_children said"?

We care about good naming because clearly named variables and fields
make the code easier to read.

> +static void unblock_fd(int fd)

I would probably have called this "set_nonblocking()".

> +{
> +	int flags = fcntl(fd, F_GETFL);
> +	if (flags < 0) {
> +		warning("Could not get file status flags, "
> +			"output will be degraded");
> +		return;
> +	}
> +	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK)) {
> +			warning("Could not set file status flags, "
> +			"output will be degraded");

The first line of "warning(" is indented one level too deep.

> +		return;
> +	}

Wouldn't it be easier to follow if you did

	fn(...)
	{
                if (flags < 0)
                        warn();
                else if (fcntrl() < 0)
                        warn();
	}

> +static void run_processes_parallel_start_new(struct parallel_processes *pp)
> +{
> +	int i;
> +	/* Start new processes. */

Drop this comment and replace it with a blank line.

> +	while (!pp->all_tasks_started
> +	       && pp->nr_processes < pp->max_number_processes) {

Remove "&& " and add " &&" at the end of the previous line.

> +		for (i = 0; i < pp->max_number_processes; i++)
> +			if (!pp->slots[i])
> +				break; /* found an empty slot */

The comment does not help us at all.  "if (...) break;" tells us as
much, and it does not explain what it means for slot[] to be empty in
the first place.

		for (...)
			if (!pp->children[i].in_use)
				break;

would be a lot easier to follow without any comment.

> +		if (i == pp->max_number_processes)
> +			die("BUG: bookkeeping is hard");
> +
> +		if (pp->fn(pp->data, &pp->children[i], &pp->err[i])) {

This use pattern (and the explanation in run-command.h) suggests
that the name of the field should be s/fn/more_task/; or something
along that line (and flip the return value, i.e. more_task() returns
yes if it did grab another task, no if there is no more task).

> +			pp->all_tasks_started = 1;
> +			break;
> +		}
> +		if (start_command(&pp->children[i]))
> +			pp->fn_err(pp->data, &pp->children[i], &pp->err[i]);

Can fn_err be NULL here?  Shouldn't it (to give a default behaviour
to lazy or bog-standard callers)?
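
For example (using the field names suggested above; the default
behaviour shown is only one possibility):

	if (start_command(&pp->children[i])) {
		if (pp->start_failure)
			pp->start_failure(pp->data, &pp->children[i],
					  &pp->err[i]);
		else
			strbuf_addf(&pp->err[i], "failed to start '%s'\n",
				    pp->children[i].argv[0]);
	}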

> +		unblock_fd(pp->children[i].err);
> +
> +		pp->nr_processes++;
> +		pp->slots[i] = 1;
> +		pp->pfd[i].fd = pp->children[i].err;
> +	}
> +}
> +
> +static int run_processes_parallel_buffer_stderr(struct parallel_processes *pp)
> +{
> +	int i;

Have a blank line here between decls and the first statement.

> +	i = poll(pp->pfd, pp->max_number_processes, 100);

Give a symbolic constant for this 100ms, e.g. OUTPUT_POLL_INTERVAL
or something.
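
That is:

	#define OUTPUT_POLL_INTERVAL 100	/* milliseconds */

	i = poll(pp->pfd, pp->max_number_processes, OUTPUT_POLL_INTERVAL);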

> +	if (i < 0) {
> +		if (errno == EINTR)
> +			/* A signal was caught; try again */
> +			return -1;
> +		else {
> +			run_processes_parallel_cleanup(pp);
> +			die_errno("poll");
> +		}
> +	}

Shouldn't this be more like

	while ((i = poll()) < 0)  {
        	if (errno == EINTR)
                	continue;
		cleanup;
		die;
	}

The caller after all was willing to wait for some time, and we were
interrupted before that time came.

> +	/* Buffer output from all pipes. */
> +	for (i = 0; i < pp->max_number_processes; i++) {
> +		if (!pp->slots[i])
> +			continue;
> +		if (pp->pfd[i].revents & POLLIN)
> +			strbuf_read_noblock(&pp->err[i], pp->children[i].err, 0);
> +		if (pp->foreground_child == i) {
> +			fputs(pp->err[i].buf, stderr);
> +			strbuf_reset(&pp->err[i]);
> +		}

Even if we own the output channel, we may not have read anything
yet---poll() may have said that pfd[i] is not ready, or
read_nonblock() may have returned EWOULDBLOCK.  Perhaps check not
just that i owns the output but err[i].len is not zero?

I think output should be done outside the loop and make the comment
before the loop match what the loop actually does.

	/* Buffer output from all pipes. */
	for (i = 0; ...) {
		if (pp->children[i].in_use && (pp->pfd[i].revents & POLLIN))
			strbuf_read_nonblock();
	}

	/* Drain the output from the owner of the output channel */
	if (pp->children[pp->output_owner].in_use &&
	    pp->children[pp->output_owner].err.len) {
	    	fputs(...);
                strbuf_reset(...);
	}

> +static void run_processes_parallel_collect_finished(struct parallel_processes *pp)
> +{
> +	int i = 0;
> +	pid_t pid;
> +	int wait_status, code;
> +	int n = pp->max_number_processes;
> +	/* Collect finished child processes. */

Drop this comment and replace it with a blank line.

> +	while (pp->nr_processes > 0) {
> +		pid = waitpid(-1, &wait_status, WNOHANG);
> +		if (pid == 0)
> +			return; /* no child finished */

Do we need that comment?

> +		if (pid < 0) {
> +			if (errno == EINTR)
> +				return; /* just try again  next time */

Can we get EINTR here (we are passing WNOHANG above)?

> +			if (errno == EINVAL || errno == ECHILD)
> +				die_errno("wait");

What should happen when we get an error not listed here?  'i' is
left as initialized to 0 and we do strbuf_read_noblock() for the
first child (which may not even be running)?

You can sweep this bug under the rug by returning here, but I
suspect that you would just want

	if (pid < 0)
		die_errno();

And that would allow you to dedent the else clause below.

> +		} else {
> +			/* Find the finished child. */
> +			for (i = 0; i < pp->max_number_processes; i++)
> +				if (pp->slots[i] && pid == pp->children[i].pid)
> +					break;

Hmm, you are relying on the fact that a valid pid can never be 0, so
you can just use pp->children[i].child.pid to see if a "slot" is
occupied without even using pp->slots[] (or pp->children[i].in_use).

> +			if (i == pp->max_number_processes)
> +				/*
> +				 * waitpid returned another process id
> +				 * which we are not waiting for.
> +				 */
> +				return;
> +		}
> +		strbuf_read_noblock(&pp->err[i], pp->children[i].err, 0);

This is to read leftover output?

As discussed elsewhere, read_nonblock() will have to have "read
some, not necessarily to the end" semantics to serve the caller in
run_processes_parallel_buffer_stderr(), so you'd need a loop around
it here to read until you see EOF.
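
That is, roughly (assuming the child has exited and closed its end of
the pipe, so we eventually see EOF):

	while (strbuf_read_noblock(&pp->err[i], pp->children[i].err, 0) > 0)
		; /* drain the leftover output */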

Or you may be able to just call strbuf_read() and the function may
do the right thing to read things through to the EOF.  It depends on
how you redo the patch [2/10].

> +		if (determine_return_value(wait_status, &code, &errno,
> +					   pp->children[i].argv[0]) < 0)
> +			error("waitpid is confused (%s)",
> +			      pp->children[i].argv[0]);
> +
> +		pp->fn_exit(pp->data, &pp->children[i], code);

You are clobbering errno by calling determine_return_value() but you
do not use the returned value anywhere.  Intended?  Or should that
be given to fn_exit() for error reporting?

> +		argv_array_clear(&pp->children[i].args);
> +		argv_array_clear(&pp->children[i].env_array);
> +
> +		pp->nr_processes--;
> +		pp->slots[i] = 0;
> +		pp->pfd[i].fd = -1;

Mental note: here the "slot" is cleared for the child 'i'.

> +		if (i != pp->foreground_child) {
> +			strbuf_addbuf(&pp->finished_children, &pp->err[i]);
> +			strbuf_reset(&pp->err[i]);

OK, so the idea is that pp->child[i].err holds the entire output for
any process that does not own the output channel until it dies, and
they are appended to pp->finished_children.  That suggests that the
name of the "finished" field should have "output" somewhere in it.

> +		} else {

Mental note: this side of if/else is what happens to the process
that used to own the output channel.

> +			fputs(pp->err[i].buf, stderr);
> +			strbuf_reset(&pp->err[i]);

... and it just flushes the final part of the output.

> +			/* Output all other finished child processes */
> +			fputs(pp->finished_children.buf, stderr);
> +			strbuf_reset(&pp->finished_children);

If there is any, that is.

> +			/*
> +			 * Pick next process to output live.
> +			 * NEEDSWORK:
> +			 * For now we pick it randomly by doing a round
> +			 * robin. Later we may want to pick the one with
> +			 * the most output or the longest or shortest
> +			 * running process time.
> +			 */
> +			for (i = 0; i < n; i++)
> +				if (pp->slots[(pp->foreground_child + i) % n])
> +					break;
> +			pp->foreground_child = (pp->foreground_child + i) % n;

... and then picks a new owner of the output channel.

Up to this point it looks sensible.

> +			fputs(pp->err[pp->foreground_child].buf, stderr);
> +			strbuf_reset(&pp->err[pp->foreground_child]);

I do not think these two lines need to be here, especially if you
follow the above advice of separating buffering and draining.

> +int run_processes_parallel(int n, void *data,
> +			   get_next_task fn,
> +			   handle_child_starting_failure fn_err,
> +			   handle_child_return_value fn_exit)
> +{
> +	struct parallel_processes pp;
> +	run_processes_parallel_init(&pp, n, data, fn, fn_err, fn_exit);
> +
> +	while (!pp.all_tasks_started || pp.nr_processes > 0) {

The former is true as long as more_task() says there may be more.
The latter is true as long as we have something already running.

In either case, we should keep collecting and spawning as needed.

> +		run_processes_parallel_start_new(&pp);

But calling start_new() unconditionally feels sloppy.  It should at
least be something like

	if (pp.nr_processes < pp.max_processes &&
	    !pp.all_task_started)
		start_new_process()

no?
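
That is, the main loop would make its intent explicit, something like
(a sketch; I am assuming it also calls the buffering and collecting
helpers mentioned above):

	while (!pp.all_tasks_started || pp.nr_processes > 0) {
		if (!pp.all_tasks_started &&
		    pp.nr_processes < pp.max_number_processes)
			run_processes_parallel_start_new(&pp);
		run_processes_parallel_buffer_stderr(&pp);
		run_processes_parallel_collect_finished(&pp);
	}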

> diff --git a/test-run-command.c b/test-run-command.c
> index 89c7de2..70b6c7a 100644
> --- a/test-run-command.c
> +++ b/test-run-command.c
> @@ -30,6 +50,10 @@ int main(int argc, char **argv)
>  	if (!strcmp(argv[1], "run-command"))
>  		exit(run_command(&proc));
>  
> +	if (!strcmp(argv[1], "run-command-parallel-4"))
> +		exit(run_processes_parallel(4, &proc, parallel_next,
> +					 NULL, NULL));

So not only fn_err, but fn_exit is optional.  You'd need to update
the code above to allow these.
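
E.g. for the exit callback that would be something like (a sketch;
fn_err would get the same treatment where it is called):

	if (pp->fn_exit)
		pp->fn_exit(pp->data, &pp->children[i], code);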

> +
>  	fprintf(stderr, "check usage\n");
>  	return 1;
>  }

It was a fun read.  There were tons of niggles, but I didn't see
anything fundamentally unsalvageable.

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 03/10] run-command: add an asynchronous parallel child processor
  2015-09-17 21:44   ` Junio C Hamano
@ 2015-09-17 23:19     ` Stefan Beller
  2015-09-18  1:05       ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Stefan Beller @ 2015-09-17 23:19 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, Jeff King, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 2:44 PM, Junio C Hamano <gitster@pobox.com> wrote:

> Hmm, you are relying on the fact that a valid pid can never be 0, so
> you can just use pp->children[i].child.pid to see if a "slot" is
> occupied without even using pp->slots[] (or pp->children[i].in_use).

We could either use the pid as accessed via children[i].process.pid
or we could rely on the children[i].process.err != -1 as start_process
will set it to an actual fd, and when it's done we reset it to -1.

I am unsure if this makes it less readable though (all your other
suggestions improve readability a lot!)
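
Just to illustrate (a sketch, assuming the child_process sits in a
.process member as written above):

	/* Find the finished child. */
	for (i = 0; i < pp->max_number_processes; i++)
		if (pp->children[i].process.err != -1 &&
		    pid == pp->children[i].process.pid)
			break;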

>
>> +                     if (i == pp->max_number_processes)
>> +                             /*
>> +                              * waitpid returned another process id
>> +                              * which we are not waiting for.
>> +                              */
>> +                             return;
>> +             }
>> +             strbuf_read_noblock(&pp->err[i], pp->children[i].err, 0);
>
> This is to read leftover output?
>
> As discussed elsewhere, read_nonblock() will have to have "read
> some, not necessarily to the end" semantics to serve the caller in
> run_processes_parallel_buffer_stderr(), so you'd need a loop around
> it here to read until you see EOF.
>
> Or you may be able to just call strbuf_read() and the function may
> do the right thing to read things through to the EOF.  It depends on
> how you redo the patch [2/10].

strbuf_read sounds like the logical choice here (and the redo of 2/10
supports that).


>> +                      * NEEDSWORK:
>> +                      * For now we pick it randomly by doing a round
>> +                      * robin. Later we may want to pick the one with
>> +                      * the most output or the longest or shortest
>> +                      * running process time.
>> +                      */
>> +                     for (i = 0; i < n; i++)
>> +                             if (pp->slots[(pp->foreground_child + i) % n])
>> +                                     break;
>> +                     pp->foreground_child = (pp->foreground_child + i) % n;
>
> ... and then picks a new owner of the output channel.
>
> Up to this point it looks sensible.
>
>> +                     fputs(pp->err[pp->foreground_child].buf, stderr);
>> +                     strbuf_reset(&pp->err[pp->foreground_child]);
>
> I do not think these two lines need to be here, especially if you
> follow the above advice of separating buffering and draining.

They are outputting the buffer, if the next selected child is still running.
I mean it would output eventually anyway, but it would only output after
the next start of processes and poll() is done. Yeah maybe that's what we
want (starting new processes earlier is better than outputting earlier as we're
talking microseconds here).

>
>> +int run_processes_parallel(int n, void *data,
>> +                        get_next_task fn,
>> +                        handle_child_starting_failure fn_err,
>> +                        handle_child_return_value fn_exit)
>> +{
>> +     struct parallel_processes pp;
>> +     run_processes_parallel_init(&pp, n, data, fn, fn_err, fn_exit);
>> +
>> +     while (!pp.all_tasks_started || pp.nr_processes > 0) {
>
> The former is true as long as more_task() says there may be more.
> The latter is true as long as we have something already running.
>
> In either case, we should keep collecting and spawning as needed.
>
>> +             run_processes_parallel_start_new(&pp);
>
> But calling start_new() unconditionally feels sloppy.  It should at
> least be something like
>
>         if (pp.nr_processes < pp.max_processes &&
>             !pp.all_task_started)
>                 start_new_process()
>
> no?

That's the exact condition we have in start_new_process
so I don't want to repeat it here? We could move the while
loop outside of it though. That looks better, done.
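
Roughly (a sketch; assuming the helper is reduced to starting a single
child per call):

	while (!pp.all_tasks_started &&
	       pp.nr_processes < pp.max_number_processes)
		run_processes_parallel_start_new(&pp);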

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 03/10] run-command: add an asynchronous parallel child processor
  2015-09-17 23:19     ` Stefan Beller
@ 2015-09-18  1:05       ` Junio C Hamano
  2015-09-18 16:36         ` Stefan Beller
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2015-09-18  1:05 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git@vger.kernel.org, Jeff King, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

Stefan Beller <sbeller@google.com> writes:

> On Thu, Sep 17, 2015 at 2:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
>> Hmm, you are relying on the fact that a valid pid can never be 0, so
>> you can just use pp->children[i].child.pid to see if a "slot" is
>> occupied without even using pp->slots[] (or pp->children[i].in_use).
>
> We could either use the pid as accessed via children[i].process.pid
> or we could rely on the children[i].process.err != -1 as start_process
> will set it to an actual fd, and when it's done we reset it to -1.
>
> I am unsure if this makes it less readable though (all your other
> suggestions improve readability a lot!)

Sorry, the above was not a suggestion but merely an observation with
a bit of thinking aloud mixed in.  I should have marked it as such
more clearly.

>> ... and then picks a new owner of the output channel.
>>
>> Up to this point it looks sensible.
>>
>>> +                     fputs(pp->err[pp->foreground_child].buf, stderr);
>>> +                     strbuf_reset(&pp->err[pp->foreground_child]);
>>
>> I do not think these two lines need to be here, especially if you
>> follow the above advice of separating buffering and draining.
>
> They are outputting the buffer, if the next selected child is still running.
> I mean it would output eventually anyway, but it would only output after
> the next start of processes and poll() is done. Yeah maybe that's what we
> want (starting new processes earlier is better than outputting earlier as we're
> talking microseconds here).

I am not talking about microseconds but about refraining from doing
the same thing in multiple places.

>> But calling start_new() unconditionally feels sloppy.  It should at
>> least be something like
>>
>>         if (pp.nr_processes < pp.max_processes &&
>>             !pp.all_task_started)
>>                 start_new_process()
>>
>> no?
>
> That's the exact condition we have in start_new_process
> so I don't want to repeat it here?

You are advocating having a function whose definition is "this may
or may not start a new process and the caller should not care", and
then making the caller keep calling it, knowing/hoping that it does
not matter whether we spawn a new thing or not.

I somehow find it a highly questionable design taste to base the
decision on "don't want to repeat it here".

Stepping back and thinking about it, what is suggested is more
explicit in the caller.  "If we know we need a new worker and we can
have a new worker, then start it." is the logic in the caller in the
suggested structure, and we would have a well defined helper whose
sole task is to start a new worker to be called from the caller.
The helper may want to check if the request to start a new one makes
sense (e.g. if slots[] are all full, it may even want to return an
error), but the checks are primarily for error checking (i.e. "we
can have max N processes, so make sure we do not exceed that limit"),
not a semantic one (i.e. the caller _could_ choose to spawn less
than that max when there is a good reason to do so).
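
To illustrate, the error checking inside the helper could be as simple
as (a rough sketch; the helper name is only for illustration):

	static void run_processes_parallel_start_one(struct parallel_processes *pp)
	{
		int i;

		for (i = 0; i < pp->max_number_processes; i++)
			if (!pp->slots[i])
				break;
		if (i == pp->max_number_processes)
			die("BUG: tried to start a child with all slots in use");

		/* ... ask get_next_task for the next task and start it in slot i ... */
	}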

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 03/10] run-command: add an asynchronous parallel child processor
  2015-09-18  1:05       ` Junio C Hamano
@ 2015-09-18 16:36         ` Stefan Beller
  0 siblings, 0 replies; 37+ messages in thread
From: Stefan Beller @ 2015-09-18 16:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, Jeff King, Jonathan Nieder,
	Johannes Schindelin, Jens Lehmann, Vitali Lovich

On Thu, Sep 17, 2015 at 6:05 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Thu, Sep 17, 2015 at 2:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>> Hmm, you are relying on the fact that a valid pid can never be 0, so
>>> you can just use pp->children[i].child.pid to see if a "slot" is
>>> occupied without even using pp->slots[] (or pp->children[i].in_use).
>>
>> We could either use the pid as accessed via children[i].process.pid
>> or we could rely on the children[i].process.err != -1 as start_process
>> will set it to an actual fd, and when it's done we reset it to -1.
>>
>> I am unsure if this makes it less readable though (all your other
>> suggestions improve readability a lot!)
>
> Sorry, the above was not a suggestion but merely an observation with
> a bit of thinking aloud mixed in.  I should have marked it as such
> more clearly.

Ok, I see. no harm done, I did not take that as a suggestion.

>
>>> ... and then picks a new owner of the output channel.
>>>
>>> Up to this point it looks sensible.
>>>
>>>> +                     fputs(pp->err[pp->foreground_child].buf, stderr);
>>>> +                     strbuf_reset(&pp->err[pp->foreground_child]);
>>>
>>> I do not think these two lines need to be here, especially if you
>>> follow the above advice of separating buffering and draining.
>>
>> They are outputting the buffer, if the next selected child is still running.
>> I mean it would output eventually anyway, but it would only output after
>> the next start of processes and poll() is done. Yeah maybe that's what we
>> want (starting new processes earlier is better than outputting earlier as we're
>> talking microseconds here).
>
> I am not talking about microseconds but about refraining from doing
> the same thing in multiple places.

ok

>
>>> But calling start_new() unconditionally feels sloppy.  It should at
>>> least be something like
>>>
>>>         if (pp.nr_processes < pp.max_processes &&
>>>             !pp.all_task_started)
>>>                 start_new_process()
>>>
>>> no?
>>
>> That's the exact condition we have in start_new_process
>> so I don't want to repeat it here?
>
> You are advocating having a function whose definition is "this may
> or may not start a new process and the caller should not care", and
> then making the caller keep calling it, knowing/hoping that it does
> not matter whether we spawn a new thing or not.
>
> I somehow find it a highly questionable design taste to base the
> decision on "don't want to repeat it here".
>
> Stepping back and thinking about it, what is suggested is more
> explicit in the caller.  "If we know we need a new worker and we can
> have a new worker, then start it." is the logic in the caller in the
> suggested structure, and we would have a well defined helper whose
> sole task is to start a new worker to be called from the caller.
> The helper may want to check if the request to start a new one makes
> sense (e.g. if slots[] are all full, it may even want to return an
> error), but the checks are primarily for error checking (i.e. "we
> can have max N processes, so make sure we do not exceed that limit"),
> not a semantic one (i.e. the caller _could_ choose to spawn less
> than that max when there is a good reason to do so).

I would not put the decision to spawn fewer processes in the caller either.
My understanding is to have one high-level function which outlines the algorithm
like:

    loop:
        spawn_children_as_necessary
        make_sure_pipes_don't_overflow
        offer_early_output
        take_care_of_finished_children

such that the main function reads more like a bullet point list explaining
how it roughly works. Each helper function should come up with its own strategy,
so spawn_children_as_necessary could be

    spawn_children_as_necessary:
        while less than n children and there are more tasks:
            spawn_one_child

for now. Later we can add more logic if necessary there. But I'd prefer to have
these decisions put into the helpers.
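
In C the top level could then be little more than (a sketch; the helper
names are only illustrative):

    while (!pp.all_tasks_started || pp.nr_processes > 0) {
        run_processes_parallel_start_as_needed(&pp);
        run_processes_parallel_buffer_stderr(&pp);
        run_processes_parallel_output(&pp);
        run_processes_parallel_collect_finished(&pp);
    }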

Having written this, I think I'll split the function that drains the pipes and
does the early output into two, and also split the helper that starts children
into two: one to make the decision and one to actually start just one child.



^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2015-09-18 16:36 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
2015-09-17  1:38 ` [PATCH 01/10] strbuf: Add strbuf_read_noblock Stefan Beller
2015-09-17 16:13   ` Junio C Hamano
2015-09-17 16:30     ` Jeff King
2015-09-17 16:44       ` Junio C Hamano
2015-09-17 16:51         ` Stefan Beller
2015-09-17 16:57         ` Jeff King
2015-09-17 16:58       ` Junio C Hamano
2015-09-17 17:13         ` Jeff King
2015-09-17 17:26           ` Stefan Beller
2015-09-17 17:35             ` Jeff King
2015-09-17 17:45               ` Stefan Beller
2015-09-17 17:50                 ` Jeff King
2015-09-17 17:53                   ` Stefan Beller
2015-09-17 17:57               ` Junio C Hamano
2015-09-17 17:54           ` Junio C Hamano
2015-09-17 18:02             ` Jeff King
2015-09-17 17:20         ` Stefan Beller
2015-09-17  1:39 ` [PATCH 02/10] run-command: factor out return value computation Stefan Beller
2015-09-17 10:33   ` Jeff King
2015-09-17  1:39 ` [PATCH 03/10] run-command: add an asynchronous parallel child processor Stefan Beller
2015-09-17 21:44   ` Junio C Hamano
2015-09-17 23:19     ` Stefan Beller
2015-09-18  1:05       ` Junio C Hamano
2015-09-18 16:36         ` Stefan Beller
2015-09-17  1:39 ` [PATCH 04/10] fetch_populated_submodules: use new parallel job processing Stefan Beller
2015-09-17  1:39 ` [PATCH 05/10] submodules: Allow parallel fetching, add tests and documentation Stefan Beller
2015-09-17  1:39 ` [PATCH 06/10] git submodule update: Redirect any output to stderr Stefan Beller
2015-09-17 20:31   ` Eric Sunshine
2015-09-17 20:38     ` Stefan Beller
2015-09-17  1:39 ` [PATCH 07/10] git submodule update: pass --prefix only with a non empty prefix Stefan Beller
2015-09-17 20:33   ` Eric Sunshine
2015-09-17  1:39 ` [PATCH 08/10] git submodule update: cmd_update_recursive Stefan Beller
2015-09-17  1:39 ` [PATCH 09/10] " Stefan Beller
2015-09-17 20:37   ` Eric Sunshine
2015-09-17  1:39 ` [PATCH 10/10] git submodule update: cmd_update_fetch Stefan Beller
2015-09-17 17:06 ` [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Jacob Keller
