git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCHv15 0/5] Expose submodule parallelism to the user
@ 2016-02-24  3:20 Stefan Beller
  2016-02-24  3:20 ` [PATCHv15 1/5] run-command: expose default_{start_failure, task_finished} Stefan Beller
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Stefan Beller @ 2016-02-24  3:20 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, Jens.Lehmann, peff, sunshine, Stefan Beller

This build on top of 163b9b1f919c762a4bfb693b3aa05ef1aa627fee
(origin/sb/submodule-parallel-update~3) and replaces the commits 
origin/sb/submodule-parallel-update~2.. 

* Renamed inspect_clone_next_submodule to prepare_to_clone_next_submodule
  and reordered the arguments thereof
* Comments for the struct submodule_update_clone which is passed around
* Better handling around LFs.
* Renamed struct child_process *cp to *child
* Print #unmatched in git-submodule.sh instead of the helper

> 
> 	if (pp->update.type == SM_UPDATE_NONE
> 	    || (pp->update.type == SM_UPDATE_UNSPECIFIED
> 	        && sub->update_strategy.type == SM_UPDATE_NONE)) {
> 
> What does pp stand for?

I think I took it as parallel_process when starting off from the parallel
processing machinery. I'll rename it (probably to suc as short for struct
submodule_update_clone).

> > +	if (pp->recursive_prefix)
> > +		displaypath = relative_path(pp->recursive_prefix,
> > +					    ce->name, &displaypath_sb);
> 
> Nit: could use braces.

Why? I would understand a few lines above where we have an if nested in an
if with braces. But here we have a pretty straighforward one statement per case
condition.


> > +	sub = submodule_from_path(null_sha1, ce->name);
> 
> It's common to call submodule_from_path with null_sha1 as a parameter
> but I have trouble continuing to remember what that means.  Maybe
> there should be a separate function that handles that?  As a
> side-effect, the name and docstring of that function could explain
> what it means, which I still am not sure about. :)

I'll do that as a followup cleanup patch as it affects more than just the
new code.

>> +		OPT_STRING(0, "reference", &pp.reference, "<repository>",
>> +			   N_("Use the local reference repository "
>> +			      "instead of a full clone")),

> Is this allowed to be relative?  If so, what is it relative to?

It is passing on the argument to clone, so I assume the same rules apply as for
git-clone.

Thanks,
Stefan

Stefan Beller (5):
  run-command: expose default_{start_failure, task_finished}
  run_processes_parallel: add LF when caller is sloppy
  git submodule update: have a dedicated helper for cloning
  submodule update: expose parallelism to the user
  clone: allow an explicit argument for parallel submodule clones

 Documentation/git-clone.txt     |   6 +-
 Documentation/git-submodule.txt |   7 +-
 builtin/clone.c                 |  19 ++-
 builtin/submodule--helper.c     | 255 +++++++++++++++++++++++++++++++++++++++-
 git-submodule.sh                |  56 ++++-----
 run-command.c                   |  35 ++++--
 run-command.h                   |  19 +++
 t/t0061-run-command.sh          |  26 ++++
 t/t7406-submodule-update.sh     |  27 +++++
 test-run-command.c              |  18 +++
 10 files changed, 412 insertions(+), 56 deletions(-)

-- 
2.7.0.rc0.34.ga06e0b3.dirty

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCHv15 1/5] run-command: expose default_{start_failure, task_finished}
  2016-02-24  3:20 [PATCHv15 0/5] Expose submodule parallelism to the user Stefan Beller
@ 2016-02-24  3:20 ` Stefan Beller
  2016-02-24  3:20 ` [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy Stefan Beller
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Stefan Beller @ 2016-02-24  3:20 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, Jens.Lehmann, peff, sunshine, Stefan Beller

We want to reuse the error reporting facilities in a later patch.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 run-command.c | 18 +++++++++---------
 run-command.h | 19 +++++++++++++++++++
 2 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/run-command.c b/run-command.c
index 51fd72c..d03ecaa 100644
--- a/run-command.c
+++ b/run-command.c
@@ -902,10 +902,10 @@ struct parallel_processes {
 	struct strbuf buffered_output; /* of finished children */
 };
 
-static int default_start_failure(struct child_process *cp,
-				 struct strbuf *err,
-				 void *pp_cb,
-				 void *pp_task_cb)
+int default_start_failure(struct child_process *cp,
+			  struct strbuf *err,
+			  void *pp_cb,
+			  void *pp_task_cb)
 {
 	int i;
 
@@ -916,11 +916,11 @@ static int default_start_failure(struct child_process *cp,
 	return 0;
 }
 
-static int default_task_finished(int result,
-				 struct child_process *cp,
-				 struct strbuf *err,
-				 void *pp_cb,
-				 void *pp_task_cb)
+int default_task_finished(int result,
+			  struct child_process *cp,
+			  struct strbuf *err,
+			  void *pp_cb,
+			  void *pp_task_cb)
 {
 	int i;
 
diff --git a/run-command.h b/run-command.h
index d5a57f9..a054fa6 100644
--- a/run-command.h
+++ b/run-command.h
@@ -164,6 +164,15 @@ typedef int (*start_failure_fn)(struct child_process *cp,
 				void *pp_task_cb);
 
 /**
+ * If a command fails to start, then print an error message stating the
+ * exact command which failed.
+ */
+int default_start_failure(struct child_process *cp,
+			  struct strbuf *err,
+			  void *pp_cb,
+			  void *pp_task_cb);
+
+/**
  * This callback is called on every child process that finished processing.
  *
  * You must not write to stdout or stderr in this function. Add your
@@ -184,6 +193,16 @@ typedef int (*task_finished_fn)(int result,
 				void *pp_task_cb);
 
 /**
+ * If the child process returns with a non zero error code, print
+ * an error message of the exact command which failed.
+ */
+int default_task_finished(int result,
+			  struct child_process *cp,
+			  struct strbuf *err,
+			  void *pp_cb,
+			  void *pp_task_cb);
+
+/**
  * Runs up to n processes at the same time. Whenever a process can be
  * started, the callback get_next_task_fn is called to obtain the data
  * required to start another child process.
-- 
2.7.0.rc0.34.ga06e0b3.dirty

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy
  2016-02-24  3:20 [PATCHv15 0/5] Expose submodule parallelism to the user Stefan Beller
  2016-02-24  3:20 ` [PATCHv15 1/5] run-command: expose default_{start_failure, task_finished} Stefan Beller
@ 2016-02-24  3:20 ` Stefan Beller
  2016-02-24 20:07   ` Junio C Hamano
  2016-02-24 21:19   ` Jonathan Nieder
  2016-02-24  3:20 ` [PATCHv15 3/5] git submodule update: have a dedicated helper for cloning Stefan Beller
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Stefan Beller @ 2016-02-24  3:20 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, Jens.Lehmann, peff, sunshine, Stefan Beller

When the callers of parallel processing machine are sloppy with their
messages, make sure the output is terminated with LF after one child
process is handled. This will not mess with the internal state of the
output, i.e. after all messages for one child process are process
including the callbacks as well as the actual output of the child
we may add an LF if the output is not ended with an LF.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

>From a discussion on a later patch, Jonathan said:
> It looks like pp_start_one takes the content of err without adding
> a \n.  That's a bug in pp_start_one and pp_collect_finished IMHO.

 run-command.c          | 17 +++++++++++++++--
 t/t0061-run-command.sh | 26 ++++++++++++++++++++++++++
 test-run-command.c     | 18 ++++++++++++++++++
 3 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/run-command.c b/run-command.c
index d03ecaa..8abaae4 100644
--- a/run-command.c
+++ b/run-command.c
@@ -897,6 +897,7 @@ struct parallel_processes {
 	struct pollfd *pfd;
 
 	unsigned shutdown : 1;
+	unsigned ended_with_newline: 1;
 
 	int output_owner;
 	struct strbuf buffered_output; /* of finished children */
@@ -979,6 +980,7 @@ static void pp_init(struct parallel_processes *pp,
 	pp->nr_processes = 0;
 	pp->output_owner = 0;
 	pp->shutdown = 0;
+	pp->ended_with_newline = 1;
 	pp->children = xcalloc(n, sizeof(*pp->children));
 	pp->pfd = xcalloc(n, sizeof(*pp->pfd));
 	strbuf_init(&pp->buffered_output, 0);
@@ -1053,6 +1055,7 @@ static int pp_start_one(struct parallel_processes *pp)
 					 pp->data,
 					 &pp->children[i].data);
 		strbuf_addbuf(&pp->buffered_output, &pp->children[i].err);
+		strbuf_addch(&pp->buffered_output, '\n');
 		strbuf_reset(&pp->children[i].err);
 		if (code)
 			pp->shutdown = 1;
@@ -1095,9 +1098,11 @@ static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout)
 static void pp_output(struct parallel_processes *pp)
 {
 	int i = pp->output_owner;
-	if (pp->children[i].state == GIT_CP_WORKING &&
-	    pp->children[i].err.len) {
+	size_t len = pp->children[i].err.len;
+	if (pp->children[i].state == GIT_CP_WORKING && len) {
 		fputs(pp->children[i].err.buf, stderr);
+		pp->ended_with_newline =
+			(pp->children[i].err.buf[len - 1] == '\n');
 		strbuf_reset(&pp->children[i].err);
 	}
 }
@@ -1107,6 +1112,7 @@ static int pp_collect_finished(struct parallel_processes *pp)
 	int i, code;
 	int n = pp->max_processes;
 	int result = 0;
+	ssize_t len;
 
 	while (pp->nr_processes > 0) {
 		for (i = 0; i < pp->max_processes; i++)
@@ -1131,12 +1137,19 @@ static int pp_collect_finished(struct parallel_processes *pp)
 		pp->pfd[i].fd = -1;
 		child_process_init(&pp->children[i].process);
 
+		len = pp->children[i].err.len - 1;
+		if (len >= 0 && pp->children[i].err.buf[len] != '\n')
+			strbuf_addch(&pp->children[i].err, '\n');
+
 		if (i != pp->output_owner) {
 			strbuf_addbuf(&pp->buffered_output, &pp->children[i].err);
 			strbuf_reset(&pp->children[i].err);
 		} else {
+			if (len == -1 && !pp->ended_with_newline)
+				strbuf_addch(&pp->children[i].err, '\n');
 			fputs(pp->children[i].err.buf, stderr);
 			strbuf_reset(&pp->children[i].err);
+			pp->ended_with_newline = 1;
 
 			/* Output all other finished child processes */
 			fputs(pp->buffered_output.buf, stderr);
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 12228b4..5c6822c 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -77,6 +77,32 @@ test_expect_success 'run_command runs in parallel with more tasks than jobs avai
 	test_cmp expect actual
 '
 
+test_expect_success 'run_command ensures each command ends in LF' '
+	test-run-command run-command-parallel 3 sh -c "printf \"%s\n%s\" Hello World" 2>actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<-EOF
+preloaded output of a child
+preloaded output of a child
+preloaded output of a child
+preloaded output of a child
+EOF
+
+test_expect_success 'run_command ensures each command ends in LF when output is only in starting cb' '
+	test-run-command run-command-parallel 3 sh -c true  2>actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<-EOF
+EOF
+
+test_expect_success 'run_command ensures each command ends in LF except when there is no output' '
+	test-run-command run-command-parallel-silent 3 sh -c true  2>actual &&
+	test_cmp expect actual
+'
+
+
 cat >expect <<-EOF
 preloaded output of a child
 asking for a quick stop
diff --git a/test-run-command.c b/test-run-command.c
index fbe0a27..ca1ee97 100644
--- a/test-run-command.c
+++ b/test-run-command.c
@@ -31,6 +31,20 @@ static int parallel_next(struct child_process *cp,
 	return 1;
 }
 
+static int parallel_next_silent(struct child_process *cp,
+			 struct strbuf *err,
+			 void *cb,
+			 void **task_cb)
+{
+	struct child_process *d = cb;
+	if (number_callbacks >= 4)
+		return 0;
+
+	argv_array_pushv(&cp->args, d->argv);
+	number_callbacks++;
+	return 1;
+}
+
 static int no_job(struct child_process *cp,
 		  struct strbuf *err,
 		  void *cb,
@@ -71,6 +85,10 @@ int main(int argc, char **argv)
 	jobs = atoi(argv[2]);
 	proc.argv = (const char **)argv + 3;
 
+	if (!strcmp(argv[1], "run-command-parallel-silent"))
+		exit(run_processes_parallel(jobs, parallel_next_silent,
+					    NULL, NULL, &proc));
+
 	if (!strcmp(argv[1], "run-command-parallel"))
 		exit(run_processes_parallel(jobs, parallel_next,
 					    NULL, NULL, &proc));
-- 
2.7.0.rc0.34.ga06e0b3.dirty

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv15 3/5] git submodule update: have a dedicated helper for cloning
  2016-02-24  3:20 [PATCHv15 0/5] Expose submodule parallelism to the user Stefan Beller
  2016-02-24  3:20 ` [PATCHv15 1/5] run-command: expose default_{start_failure, task_finished} Stefan Beller
  2016-02-24  3:20 ` [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy Stefan Beller
@ 2016-02-24  3:20 ` Stefan Beller
  2016-02-24  3:20 ` [PATCHv15 4/5] submodule update: expose parallelism to the user Stefan Beller
  2016-02-24  3:20 ` [PATCHv15 5/5] clone: allow an explicit argument for parallel submodule clones Stefan Beller
  4 siblings, 0 replies; 13+ messages in thread
From: Stefan Beller @ 2016-02-24  3:20 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, Jens.Lehmann, peff, sunshine, Stefan Beller

This introduces a new helper function in git submodule--helper
which takes care of cloning all submodules, which we want to
parallelize eventually.

Some tests (such as empty URL, update_mode=none) are required in the
helper to make the decision for cloning. These checks have been
moved into the C function as well (no need to repeat them in the
shell script).

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/submodule--helper.c | 243 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  47 +++------
 2 files changed, 256 insertions(+), 34 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f4c3eff..85fb702 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -255,6 +255,248 @@ static int module_clone(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+struct submodule_update_clone {
+	/* index into 'list', the list of submodules to look into for cloning */
+	int current;
+	struct module_list list;
+	int warn_if_uninitialized : 1;
+	/* update parameter passed via commandline*/
+	struct submodule_update_strategy update;
+	/* configuration parameters which are passed on to the children */
+	int quiet;
+	const char *reference;
+	const char *depth;
+	const char *recursive_prefix;
+	const char *prefix;
+	/* lines to be output */
+	struct string_list projectlines;
+	/* If we want to stop as fast as possible and return an error */
+	int quickstop : 1;
+};
+#define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
+	SUBMODULE_UPDATE_STRATEGY_INIT, 0, NULL, NULL, NULL, NULL, \
+	STRING_LIST_INIT_DUP, 0}
+
+/**
+ * Inspect if 'ce' needs to be cloned. If so, prepare the 'child' to be running
+ * the clone and return non zero.
+ */
+static int prepare_to_clone_next_submodule(const struct cache_entry *ce,
+					   struct child_process *child,
+					   struct submodule_update_clone *suc,
+					   struct strbuf *err)
+{
+	const struct submodule *sub = NULL;
+	struct strbuf displaypath_sb = STRBUF_INIT;
+	struct strbuf sb = STRBUF_INIT;
+	const char *displaypath = NULL;
+	char *url = NULL;
+	int needs_cloning = 0;
+
+	if (ce_stage(ce)) {
+		if (suc->recursive_prefix) {
+			strbuf_addf(err, "Skipping unmerged submodule %s/%s\n",
+				    suc->recursive_prefix, ce->name);
+		} else {
+			strbuf_addf(err, "Skipping unmerged submodule %s\n",
+				    ce->name);
+		}
+		goto cleanup;
+	}
+
+	sub = submodule_from_path(null_sha1, ce->name);
+
+	if (suc->recursive_prefix)
+		displaypath = relative_path(suc->recursive_prefix,
+					    ce->name, &displaypath_sb);
+	else
+		displaypath = ce->name;
+
+	if (suc->update.type == SM_UPDATE_NONE
+	    || (suc->update.type == SM_UPDATE_UNSPECIFIED
+		&& sub->update_strategy.type == SM_UPDATE_NONE)) {
+		strbuf_addf(err, "Skipping submodule '%s'\n",
+			    displaypath);
+		goto cleanup;
+	}
+
+	/*
+	 * Looking up the url in .git/config.
+	 * We must not fall back to .gitmodules as we only want
+	 * to process configured submodules.
+	 */
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "submodule.%s.url", sub->name);
+	git_config_get_string(sb.buf, &url);
+	if (!url) {
+		/*
+		 * Only mention uninitialized submodules when its
+		 * path have been specified
+		 */
+		if (suc->warn_if_uninitialized)
+			strbuf_addf(err, _("Submodule path '%s' not initialized\n"
+				    "Maybe you want to use 'update --init'?\n"),
+				    displaypath);
+		goto cleanup;
+	}
+
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%s/.git", ce->name);
+	needs_cloning = !file_exists(sb.buf);
+
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%06o %s %d %d\t%s\n", ce->ce_mode,
+			sha1_to_hex(ce->sha1), ce_stage(ce),
+			needs_cloning, ce->name);
+	string_list_append(&suc->projectlines, sb.buf);
+
+	if (!needs_cloning)
+		goto cleanup;
+
+	child->git_cmd = 1;
+	child->no_stdin = 1;
+	child->stdout_to_stderr = 1;
+	child->err = -1;
+	argv_array_push(&child->args, "submodule--helper");
+	argv_array_push(&child->args, "clone");
+	if (suc->quiet)
+		argv_array_push(&child->args, "--quiet");
+	if (suc->prefix)
+		argv_array_pushl(&child->args, "--prefix", suc->prefix, NULL);
+	argv_array_pushl(&child->args, "--path", sub->path, NULL);
+	argv_array_pushl(&child->args, "--name", sub->name, NULL);
+	argv_array_pushl(&child->args, "--url", url, NULL);
+	if (suc->reference)
+		argv_array_push(&child->args, suc->reference);
+	if (suc->depth)
+		argv_array_push(&child->args, suc->depth);
+
+cleanup:
+	free(url);
+	strbuf_reset(&displaypath_sb);
+	strbuf_reset(&sb);
+
+	return needs_cloning;
+}
+
+static int update_clone_get_next_task(struct child_process *child,
+				      struct strbuf *err,
+				      void *suc_cb,
+				      void **void_task_cb)
+{
+	struct submodule_update_clone *suc = suc_cb;
+
+	for (; suc->current < suc->list.nr; suc->current++) {
+		const struct cache_entry *ce = suc->list.entries[suc->current];
+		if (prepare_to_clone_next_submodule(ce, child, suc, err)) {
+			suc->current++;
+			return 1;
+		}
+	}
+	return 0;
+}
+
+static int update_clone_start_failure(struct child_process *child,
+				      struct strbuf *err,
+				      void *suc_cb,
+				      void *void_task_cb)
+{
+	struct submodule_update_clone *suc = suc_cb;
+
+	default_start_failure(child, err, suc_cb, void_task_cb);
+	suc->quickstop = 1;
+
+	return 1;
+}
+
+static int update_clone_task_finished(int result,
+				      struct child_process *child,
+				      struct strbuf *err,
+				      void *suc_cb,
+				      void *void_task_cb)
+{
+	struct submodule_update_clone *suc = suc_cb;
+
+	if (!result)
+		return 0;
+
+	default_task_finished(result, child, err, suc_cb, void_task_cb);
+	suc->quickstop = 1;
+
+	return 1;
+}
+
+static int update_clone(int argc, const char **argv, const char *prefix)
+{
+	const char *update = NULL;
+	struct string_list_item *item;
+	struct pathspec pathspec;
+	struct submodule_update_clone suc = SUBMODULE_UPDATE_CLONE_INIT;
+
+	struct option module_update_clone_options[] = {
+		OPT_STRING(0, "prefix", &prefix,
+			   N_("path"),
+			   N_("path into the working tree")),
+		OPT_STRING(0, "recursive-prefix", &suc.recursive_prefix,
+			   N_("path"),
+			   N_("path into the working tree, across nested "
+			      "submodule boundaries")),
+		OPT_STRING(0, "update", &update,
+			   N_("string"),
+			   N_("rebase, merge, checkout or none")),
+		OPT_STRING(0, "reference", &suc.reference, N_("repo"),
+			   N_("reference repository")),
+		OPT_STRING(0, "depth", &suc.depth, "<depth>",
+			   N_("Create a shallow clone truncated to the "
+			      "specified number of revisions")),
+		OPT__QUIET(&suc.quiet, N_("don't print cloning progress")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule--helper update_clone [--prefix=<path>] [<path>...]"),
+		NULL
+	};
+	suc.prefix = prefix;
+
+	argc = parse_options(argc, argv, prefix, module_update_clone_options,
+			     git_submodule_helper_usage, 0);
+
+	if (update)
+		if (parse_submodule_update_strategy(update, &suc.update) < 0)
+			die(_("bad value for update parameter"));
+
+	if (module_list_compute(argc, argv, prefix, &pathspec, &suc.list) < 0)
+		return 1;
+
+	if (pathspec.nr)
+		suc.warn_if_uninitialized = 1;
+
+	gitmodules_config();
+	/* Overlay the parsed .gitmodules file with .git/config */
+	git_config(submodule_config, NULL);
+	run_processes_parallel(1, update_clone_get_next_task,
+				  update_clone_start_failure,
+				  update_clone_task_finished,
+				  &suc);
+
+	/*
+	 * We saved the output and put it out all at once now.
+	 * That means:
+	 * - the listener does not have to interleave their (checkout)
+	 *   work with our fetching.  The writes involved in a
+	 *   checkout involve more straightforward sequential I/O.
+	 * - the listener can avoid doing any work if fetching failed.
+	 */
+	if (suc.quickstop)
+		return 1;
+
+	for_each_string_list_item(item, &suc.projectlines)
+		utf8_fprintf(stdout, "%s", item->string);
+
+	return 0;
+}
+
 struct cmd_struct {
 	const char *cmd;
 	int (*fn)(int, const char **, const char *);
@@ -264,6 +506,7 @@ static struct cmd_struct commands[] = {
 	{"list", module_list},
 	{"name", module_name},
 	{"clone", module_clone},
+	{"update-clone", update_clone}
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index 9ee86d4..a6a82d2 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -664,17 +664,20 @@ cmd_update()
 		cmd_init "--" "$@" || return
 	fi
 
-	cloned_modules=
-	git submodule--helper list --prefix "$wt_prefix" "$@" | {
+	{
+	git submodule--helper update-clone ${GIT_QUIET:+--quiet} \
+		${wt_prefix:+--prefix "$wt_prefix"} \
+		${prefix:+--recursive-prefix "$prefix"} \
+		${update:+--update "$update"} \
+		${reference:+--reference "$reference"} \
+		${depth:+--depth "$depth"} \
+		"$@" || echo "#unmatched"
+	} | {
 	err=
-	while read mode sha1 stage sm_path
+	while read mode sha1 stage just_cloned sm_path
 	do
 		die_if_unmatched "$mode"
-		if test "$stage" = U
-		then
-			echo >&2 "Skipping unmerged submodule $prefix$sm_path"
-			continue
-		fi
+
 		name=$(git submodule--helper name "$sm_path") || exit
 		url=$(git config submodule."$name".url)
 		branch=$(get_submodule_config "$name" branch master)
@@ -691,27 +694,10 @@ cmd_update()
 
 		displaypath=$(relative_path "$prefix$sm_path")
 
-		if test "$update_module" = "none"
-		then
-			echo >&2 "Skipping submodule '$displaypath'"
-			continue
-		fi
-
-		if test -z "$url"
-		then
-			# Only mention uninitialized submodules when its
-			# path have been specified
-			test "$#" != "0" &&
-			say >&2 "$(eval_gettext "Submodule path '\$displaypath' not initialized
-Maybe you want to use 'update --init'?")"
-			continue
-		fi
-
-		if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
+		if test $just_cloned -eq 1
 		then
-			git submodule--helper clone ${GIT_QUIET:+--quiet} --prefix "$prefix" --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
-			cloned_modules="$cloned_modules;$name"
 			subsha1=
+			update_module=checkout
 		else
 			subsha1=$(clear_local_git_env; cd "$sm_path" &&
 				git rev-parse --verify HEAD) ||
@@ -751,13 +737,6 @@ Maybe you want to use 'update --init'?")"
 				die "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")"
 			fi
 
-			# Is this something we just cloned?
-			case ";$cloned_modules;" in
-			*";$name;"*)
-				# then there is no local change to integrate
-				update_module=checkout ;;
-			esac
-
 			must_die_on_failure=
 			case "$update_module" in
 			checkout)
-- 
2.7.0.rc0.34.ga06e0b3.dirty

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv15 4/5] submodule update: expose parallelism to the user
  2016-02-24  3:20 [PATCHv15 0/5] Expose submodule parallelism to the user Stefan Beller
                   ` (2 preceding siblings ...)
  2016-02-24  3:20 ` [PATCHv15 3/5] git submodule update: have a dedicated helper for cloning Stefan Beller
@ 2016-02-24  3:20 ` Stefan Beller
  2016-02-24  3:20 ` [PATCHv15 5/5] clone: allow an explicit argument for parallel submodule clones Stefan Beller
  4 siblings, 0 replies; 13+ messages in thread
From: Stefan Beller @ 2016-02-24  3:20 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, Jens.Lehmann, peff, sunshine, Stefan Beller

Expose possible parallelism either via the "--jobs" CLI parameter or
the "submodule.fetchJobs" setting.

By having the variable initialized to -1, we make sure 0 can be passed
into the parallel processing machine, which will then pick as many parallel
workers as there are CPUs.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-submodule.txt |  7 ++++++-
 builtin/submodule--helper.c     | 16 ++++++++++++----
 git-submodule.sh                |  9 +++++++++
 t/t7406-submodule-update.sh     | 12 ++++++++++++
 4 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 1572f05..13adebf 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -16,7 +16,7 @@ SYNOPSIS
 'git submodule' [--quiet] deinit [-f|--force] [--] <path>...
 'git submodule' [--quiet] update [--init] [--remote] [-N|--no-fetch]
 	      [-f|--force] [--rebase|--merge] [--reference <repository>]
-	      [--depth <depth>] [--recursive] [--] [<path>...]
+	      [--depth <depth>] [--recursive] [--jobs <n>] [--] [<path>...]
 'git submodule' [--quiet] summary [--cached|--files] [(-n|--summary-limit) <n>]
 	      [commit] [--] [<path>...]
 'git submodule' [--quiet] foreach [--recursive] <command>
@@ -377,6 +377,11 @@ for linkgit:git-clone[1]'s `--reference` and `--shared` options carefully.
 	clone with a history truncated to the specified number of revisions.
 	See linkgit:git-clone[1]
 
+-j <n>::
+--jobs <n>::
+	This option is only valid for the update command.
+	Clone new submodules in parallel with as many jobs.
+	Defaults to the `submodule.fetchJobs` option.
 
 <path>...::
 	Paths to submodule(s). When specified this will restrict the command
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 85fb702..32254cd 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -429,6 +429,7 @@ static int update_clone_task_finished(int result,
 static int update_clone(int argc, const char **argv, const char *prefix)
 {
 	const char *update = NULL;
+	int max_jobs = -1;
 	struct string_list_item *item;
 	struct pathspec pathspec;
 	struct submodule_update_clone suc = SUBMODULE_UPDATE_CLONE_INIT;
@@ -449,6 +450,8 @@ static int update_clone(int argc, const char **argv, const char *prefix)
 		OPT_STRING(0, "depth", &suc.depth, "<depth>",
 			   N_("Create a shallow clone truncated to the "
 			      "specified number of revisions")),
+		OPT_INTEGER('j', "jobs", &max_jobs,
+			    N_("parallel jobs")),
 		OPT__QUIET(&suc.quiet, N_("don't print cloning progress")),
 		OPT_END()
 	};
@@ -475,10 +478,15 @@ static int update_clone(int argc, const char **argv, const char *prefix)
 	gitmodules_config();
 	/* Overlay the parsed .gitmodules file with .git/config */
 	git_config(submodule_config, NULL);
-	run_processes_parallel(1, update_clone_get_next_task,
-				  update_clone_start_failure,
-				  update_clone_task_finished,
-				  &suc);
+
+	if (max_jobs < 0)
+		max_jobs = parallel_submodules();
+
+	run_processes_parallel(max_jobs,
+			       update_clone_get_next_task,
+			       update_clone_start_failure,
+			       update_clone_task_finished,
+			       &suc);
 
 	/*
 	 * We saved the output and put it out all at once now.
diff --git a/git-submodule.sh b/git-submodule.sh
index a6a82d2..86018ee 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -645,6 +645,14 @@ cmd_update()
 		--depth=*)
 			depth=$1
 			;;
+		-j|--jobs)
+			case "$2" in '') usage ;; esac
+			jobs="--jobs=$2"
+			shift
+			;;
+		--jobs=*)
+			jobs=$1
+			;;
 		--)
 			shift
 			break
@@ -671,6 +679,7 @@ cmd_update()
 		${update:+--update "$update"} \
 		${reference:+--reference "$reference"} \
 		${depth:+--depth "$depth"} \
+		${jobs:+$jobs} \
 		"$@" || echo "#unmatched"
 	} | {
 	err=
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index dda3929..7fd5142 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -774,4 +774,16 @@ test_expect_success 'submodule update --recursive drops module name before recur
 	 test_i18ngrep "Submodule path .deeper/submodule/subsubmodule.: checked out" actual
 	)
 '
+
+test_expect_success 'submodule update can be run in parallel' '
+	(cd super2 &&
+	 GIT_TRACE=$(pwd)/trace.out git submodule update --jobs 7 &&
+	 grep "7 tasks" trace.out &&
+	 git config submodule.fetchJobs 8 &&
+	 GIT_TRACE=$(pwd)/trace.out git submodule update &&
+	 grep "8 tasks" trace.out &&
+	 GIT_TRACE=$(pwd)/trace.out git submodule update --jobs 9 &&
+	 grep "9 tasks" trace.out
+	)
+'
 test_done
-- 
2.7.0.rc0.34.ga06e0b3.dirty

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCHv15 5/5] clone: allow an explicit argument for parallel submodule clones
  2016-02-24  3:20 [PATCHv15 0/5] Expose submodule parallelism to the user Stefan Beller
                   ` (3 preceding siblings ...)
  2016-02-24  3:20 ` [PATCHv15 4/5] submodule update: expose parallelism to the user Stefan Beller
@ 2016-02-24  3:20 ` Stefan Beller
  4 siblings, 0 replies; 13+ messages in thread
From: Stefan Beller @ 2016-02-24  3:20 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, Jens.Lehmann, peff, sunshine, Stefan Beller

Just pass it along to "git submodule update", which may pick reasonable
defaults if you don't specify an explicit number.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-clone.txt |  6 +++++-
 builtin/clone.c             | 19 +++++++++++++------
 t/t7406-submodule-update.sh | 15 +++++++++++++++
 3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 6bf000d..6db7b6d 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -14,7 +14,7 @@ SYNOPSIS
 	  [-o <name>] [-b <name>] [-u <upload-pack>] [--reference <repository>]
 	  [--dissociate] [--separate-git-dir <git dir>]
 	  [--depth <depth>] [--[no-]single-branch]
-	  [--recursive | --recurse-submodules] [--] <repository>
+	  [--recursive | --recurse-submodules] [--jobs <n>] [--] <repository>
 	  [<directory>]
 
 DESCRIPTION
@@ -221,6 +221,10 @@ objects from the source repository into a pack in the cloned repository.
 	The result is Git repository can be separated from working
 	tree.
 
+-j <n>::
+--jobs <n>::
+	The number of submodules fetched at the same time.
+	Defaults to the `submodule.fetchJobs` option.
 
 <repository>::
 	The (possibly remote) repository to clone from.  See the
diff --git a/builtin/clone.c b/builtin/clone.c
index a0b3cd9..b004fb4 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -50,6 +50,7 @@ static int option_progress = -1;
 static struct string_list option_config;
 static struct string_list option_reference;
 static int option_dissociate;
+static int max_jobs = -1;
 
 static struct option builtin_clone_options[] = {
 	OPT__VERBOSITY(&option_verbosity),
@@ -72,6 +73,8 @@ static struct option builtin_clone_options[] = {
 		    N_("initialize submodules in the clone")),
 	OPT_BOOL(0, "recurse-submodules", &option_recursive,
 		    N_("initialize submodules in the clone")),
+	OPT_INTEGER('j', "jobs", &max_jobs,
+		    N_("number of submodules cloned in parallel")),
 	OPT_STRING(0, "template", &option_template, N_("template-directory"),
 		   N_("directory from which templates will be used")),
 	OPT_STRING_LIST(0, "reference", &option_reference, N_("repo"),
@@ -95,10 +98,6 @@ static struct option builtin_clone_options[] = {
 	OPT_END()
 };
 
-static const char *argv_submodule[] = {
-	"submodule", "update", "--init", "--recursive", NULL
-};
-
 static const char *get_repo_path_1(struct strbuf *path, int *is_bundle)
 {
 	static char *suffix[] = { "/.git", "", ".git/.git", ".git" };
@@ -724,8 +723,16 @@ static int checkout(void)
 	err |= run_hook_le(NULL, "post-checkout", sha1_to_hex(null_sha1),
 			   sha1_to_hex(sha1), "1", NULL);
 
-	if (!err && option_recursive)
-		err = run_command_v_opt(argv_submodule, RUN_GIT_CMD);
+	if (!err && option_recursive) {
+		struct argv_array args = ARGV_ARRAY_INIT;
+		argv_array_pushl(&args, "submodule", "update", "--init", "--recursive", NULL);
+
+		if (max_jobs != -1)
+			argv_array_pushf(&args, "--jobs=%d", max_jobs);
+
+		err = run_command_v_opt(args.argv, RUN_GIT_CMD);
+		argv_array_clear(&args);
+	}
 
 	return err;
 }
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 7fd5142..090891e 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -786,4 +786,19 @@ test_expect_success 'submodule update can be run in parallel' '
 	 grep "9 tasks" trace.out
 	)
 '
+
+test_expect_success 'git clone passes the parallel jobs config on to submodules' '
+	test_when_finished "rm -rf super4" &&
+	GIT_TRACE=$(pwd)/trace.out git clone --recurse-submodules --jobs 7 . super4 &&
+	grep "7 tasks" trace.out &&
+	rm -rf super4 &&
+	git config --global submodule.fetchJobs 8 &&
+	GIT_TRACE=$(pwd)/trace.out git clone --recurse-submodules . super4 &&
+	grep "8 tasks" trace.out &&
+	rm -rf super4 &&
+	GIT_TRACE=$(pwd)/trace.out git clone --recurse-submodules --jobs 9 . super4 &&
+	grep "9 tasks" trace.out &&
+	rm -rf super4
+'
+
 test_done
-- 
2.7.0.rc0.34.ga06e0b3.dirty

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy
  2016-02-24  3:20 ` [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy Stefan Beller
@ 2016-02-24 20:07   ` Junio C Hamano
  2016-02-24 21:19     ` Stefan Beller
  2016-02-24 21:19   ` Jonathan Nieder
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2016-02-24 20:07 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, jrnieder, Jens.Lehmann, peff, sunshine

Stefan Beller <sbeller@google.com> writes:

> @@ -1095,9 +1098,11 @@ static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout)
>  static void pp_output(struct parallel_processes *pp)
>  {
>  	int i = pp->output_owner;
> -	if (pp->children[i].state == GIT_CP_WORKING &&
> -	    pp->children[i].err.len) {
> +	size_t len = pp->children[i].err.len;
> +	if (pp->children[i].state == GIT_CP_WORKING && len) {
>  		fputs(pp->children[i].err.buf, stderr);
> +		pp->ended_with_newline =
> +			(pp->children[i].err.buf[len - 1] == '\n');
>  		strbuf_reset(&pp->children[i].err);
>  	}
>  }

The child->err thing is treated as a counted byte array when the
code determines where the end of the buffer is, but is treated as a
nul-terminated string when it is output with fputs().

The inconsistency may not hurt as long as (1) the producers of the
message will never stuff a NUL in the middle, and (2) strbuf always
has the guard NUL after its contents.  Even though we know that the
latter will hold true for the foreseeable future, it also is easy to
do the right thing here, too, so why not?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy
  2016-02-24  3:20 ` [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy Stefan Beller
  2016-02-24 20:07   ` Junio C Hamano
@ 2016-02-24 21:19   ` Jonathan Nieder
  2016-02-24 21:59     ` Stefan Beller
  1 sibling, 1 reply; 13+ messages in thread
From: Jonathan Nieder @ 2016-02-24 21:19 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, git, Jens.Lehmann, peff, sunshine

Stefan Beller wrote:

> When the callers of parallel processing machine are sloppy with their
> messages, make sure the output is terminated with LF after one child
> process is handled.

Why not always add \n here?

That would make callers simpler and would make it easier for callers to
know what to do.  It also makes it possible to end with \n\n if the
caller wants.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy
  2016-02-24 20:07   ` Junio C Hamano
@ 2016-02-24 21:19     ` Stefan Beller
  2016-02-24 21:23       ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Beller @ 2016-02-24 21:19 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git@vger.kernel.org, Jonathan Nieder, Jens Lehmann, Jeff King,
	Eric Sunshine

On Wed, Feb 24, 2016 at 12:07 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> @@ -1095,9 +1098,11 @@ static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout)
>>  static void pp_output(struct parallel_processes *pp)
>>  {
>>       int i = pp->output_owner;
>> -     if (pp->children[i].state == GIT_CP_WORKING &&
>> -         pp->children[i].err.len) {
>> +     size_t len = pp->children[i].err.len;
>> +     if (pp->children[i].state == GIT_CP_WORKING && len) {
>>               fputs(pp->children[i].err.buf, stderr);
>> +             pp->ended_with_newline =
>> +                     (pp->children[i].err.buf[len - 1] == '\n');
>>               strbuf_reset(&pp->children[i].err);
>>       }
>>  }
>
> The child->err thing is treated as a counted byte array when the
> code determines where the end of the buffer is, but is treated as a
> nul-terminated string when it is output with fputs().
>
> The inconsistency may not hurt as long as (1) the producers of the
> message will never stuff a NUL in the middle, and (2) strbuf always
> has the guard NUL after its contents.  Even though we know that the
> latter will hold true for the foreseeable future, it also is easy to
> do the right thing here, too, so why not?

What is the right thing? I asked myself and obviously it is treating the
child->err the same in both cases of checking for a trailing LF and
when outputting.

But what is the right way to look at child->err? I would argue that
we should allow for children to have a NUL in its output stream and
replay their output as literal as possible.

i.e. my conclusion is to replace the fputs by fwrite as opposed to
using strlen to determine the length of string output.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy
  2016-02-24 21:19     ` Stefan Beller
@ 2016-02-24 21:23       ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2016-02-24 21:23 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git@vger.kernel.org, Jonathan Nieder, Jens Lehmann, Jeff King,
	Eric Sunshine

Stefan Beller <sbeller@google.com> writes:

> On Wed, Feb 24, 2016 at 12:07 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> The inconsistency may not hurt as long as (1) the producers of the
>> message will never stuff a NUL in the middle, and (2) strbuf always
>> has the guard NUL after its contents.  Even though we know that the
>> latter will hold true for the foreseeable future, it also is easy to
>> do the right thing here, too, so why not?
>
> What is the right thing? I asked myself and obviously it is treating the
> child->err the same in both cases of checking for a trailing LF and
> when outputting.
>
> But what is the right way to look at child->err? I would argue that
> we should allow for children to have a NUL in its output stream and
> replay their output as literal as possible.
>
> i.e. my conclusion is to replace the fputs by fwrite as opposed to
> using strlen to determine the length of string output.

Yup, that is what I meant; sorry if I were too oblique.

There are two fputs() that assumes there is no embedded NUL around
there, by the way.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy
  2016-02-24 21:19   ` Jonathan Nieder
@ 2016-02-24 21:59     ` Stefan Beller
  2016-02-25  0:55       ` Jonathan Nieder
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Beller @ 2016-02-24 21:59 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Junio C Hamano, git@vger.kernel.org, Jens Lehmann, Jeff King,
	Eric Sunshine

On Wed, Feb 24, 2016 at 1:19 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Stefan Beller wrote:
>
>> When the callers of parallel processing machine are sloppy with their
>> messages, make sure the output is terminated with LF after one child
>> process is handled.
>
> Why not always add \n here?

So you propose to always add a \n if the output length was > 0 ?

>
> That would make callers simpler and would make it easier for callers to
> know what to do.  It also makes it possible to end with \n\n if the
> caller wants.

If the caller wants a \n\n it can do so as well using this patch?

If we add an \n unconditionally we run the risk of having lots of empty lines
in there. Consider the tests which are already there:

    test-run-command run-command-parallel 3 sh -c "printf \"%s\n%s\n\"
Hello World" >actual

would produce

cat >expect <<-EOF
preloaded output of a child
Hello
World

preloaded output of a child
Hello
World

preloaded output of a child
Hello
World

preloaded output of a child
Hello
World
EOF

as both the child as well as we added a \n, so one empty line was born.
And as most child processes actually terminate with a reasonable \n
after their output,
I would not want to add another \n because it is simpler or easier to predict.

Thanks,
Stefan


>
> Thanks,
> Jonathan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy
  2016-02-24 21:59     ` Stefan Beller
@ 2016-02-25  0:55       ` Jonathan Nieder
  2016-02-25  2:56         ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Nieder @ 2016-02-25  0:55 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git@vger.kernel.org, Jens Lehmann, Jeff King,
	Eric Sunshine

Stefan Beller wrote:
> On Wed, Feb 24, 2016 at 1:19 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> > Stefan Beller wrote:

>>> When the callers of parallel processing machine are sloppy with their
>>> messages, make sure the output is terminated with LF after one child
>>> process is handled.
>>
>> Why not always add \n here?
>
> So you propose to always add a \n if the output length was > 0 ?

Ah, now I see where I was confused.

I was seeing an analogy to functions like ref_transaction_begin(),
which use a 'struct strbuf *err' argument to store the argument to
die() that describes why they failed.  They get used like this:

	struct strbuf err = STRBUF_INIT;
	if (ref_transaction_begin(..., &err))
		die(err.buf);

and die() appends a \n at the end.  They typically are implemented
like this:

	if (open(...)) {
		strbuf_addf(&err,
			    "cannot open '%s': %s", ..., strerror(errno));
		return -1;
	}

When the function doesn't fail, err doesn't need to be inspected at all.

get_next_task_fn et al looked similar to that pattern, but they are
doing something different.  The strbuf passed in is the same buffer
that is used to collect the child process's output.  Writing to that
strbuf is not a way to provide an error message for die() --- instead,
it is a way to provide additional output that should be combined with
the child process's output.

Renaming the parameter to something like 'struct strbuf *out' would
have avoided this clash of conventions and got me un-confused.  That
would make it clearer that the callback function should do

	strbuf_addf(out, "warning: foo\n");

including both its own 'warning: ' prefix and its own newline.  It is
providing output meant to be passed as-is to the terminal.

That is a convenient API since if you want to write multiple messages,
you can do

	if (foo)
		strbuf_addf(out, "warning: foo\n");

	... do some other things ...

	if (bar)
		strbuf_addf(out, "warning: bar\n");

The newlines avoid the messages running together.

The functions default_start_failure and default_task_finished are
buggy under that API, since they do not include newlines in their
output.

Once they're fixed, there wouldn't be any need to add \n, unless we
are worried about a child process that writes output that doesn't end
with a newline.  It can be convenient for child processes to do things
like

	printf '%s\t' "some information"

so I am not convinced this patch is helping.

Does that make sense?

Sorry for the confusion,
Jonathan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy
  2016-02-25  0:55       ` Jonathan Nieder
@ 2016-02-25  2:56         ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2016-02-25  2:56 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Stefan Beller, git@vger.kernel.org, Jens Lehmann, Jeff King,
	Eric Sunshine

Jonathan Nieder <jrnieder@gmail.com> writes:

> Stefan Beller wrote:
>> On Wed, Feb 24, 2016 at 1:19 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>> > Stefan Beller wrote:
>
>>>> When the callers of parallel processing machine are sloppy with their
>>>> messages, make sure the output is terminated with LF after one child
>>>> process is handled.
>>>
>>> Why not always add \n here?
>>
>> So you propose to always add a \n if the output length was > 0 ?
>
> Ah, now I see where I was confused.
> ...
> get_next_task_fn et al looked similar to that pattern, but they are
> doing something different.
> ...
> The functions default_start_failure and default_task_finished are
> buggy under that API, since they do not include newlines in their
> output.
>
> Once they're fixed, there wouldn't be any need to add \n, unless we
> are worried about a child process that writes output that doesn't end
> with a newline.  It can be convenient for child processes to do things
> like
>
> 	printf '%s\t' "some information"
>
> so I am not convinced this patch is helping.
>
> Does that make sense?

Sounds very sensible analysis.  Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-02-25  2:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-24  3:20 [PATCHv15 0/5] Expose submodule parallelism to the user Stefan Beller
2016-02-24  3:20 ` [PATCHv15 1/5] run-command: expose default_{start_failure, task_finished} Stefan Beller
2016-02-24  3:20 ` [PATCHv15 2/5] run_processes_parallel: add LF when caller is sloppy Stefan Beller
2016-02-24 20:07   ` Junio C Hamano
2016-02-24 21:19     ` Stefan Beller
2016-02-24 21:23       ` Junio C Hamano
2016-02-24 21:19   ` Jonathan Nieder
2016-02-24 21:59     ` Stefan Beller
2016-02-25  0:55       ` Jonathan Nieder
2016-02-25  2:56         ` Junio C Hamano
2016-02-24  3:20 ` [PATCHv15 3/5] git submodule update: have a dedicated helper for cloning Stefan Beller
2016-02-24  3:20 ` [PATCHv15 4/5] submodule update: expose parallelism to the user Stefan Beller
2016-02-24  3:20 ` [PATCHv15 5/5] clone: allow an explicit argument for parallel submodule clones Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).