git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC PATCHv1 00/12] git submodule update in C with parallel cloning
@ 2015-10-16  1:52 Stefan Beller
  2015-10-16  1:52 ` [PATCH 01/12] git submodule update: Announce skipping submodules on stderr Stefan Beller
                   ` (11 more replies)
  0 siblings, 12 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

So eventually we want to have projects with lots of submodules such as Android
(which would have O(1000) submodules).

The very first thing a user does is cloning a project, so we want to impress
with speed there as well. git clone however is lazy and just calls
`git submodule update --init --recursive`. So we need to make that fast.

This series rewrites parts of git submodule update in C and in the second-last
patch it separates cloning and doing the other actions(checkout/rebase/merge etc)
by doing the cloning first and then the rest.

The last patch (which is broken in the first version of the series), then 
proceeds to put the cloning of the submodules into the get_next_task callback
of the parallel process API.

That said, the first few patches introduce some churn in the behavior and tests
of Git, so maybe put your eyes there?

Thanks for any advice,
Stefan

Stefan Beller (12):
  git submodule update: Announce skipping submodules on stderr
  git submodule update: Announce uninitialized modules on stderr
  git submodule update: Move branch calculation to where it's needed
  git submodule update: Announce outcome of submodule operation to
    stderr
  git submodule update: Use its own list implementation.
  git submodule update: Handle unmerged submodules in C
  submodule config: keep update strategy around
  git submodule update: check for "none" in C
  git submodule update: Check url in C
  git submodule update: Clone projects from within C
  submodule--helper: Do not emit submodules to process directly.
  WIP/broken Clone all outstanding submodules in parallel

 builtin/submodule--helper.c | 221 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  38 +++-----
 submodule-config.c          |  11 +++
 submodule-config.h          |   1 +
 t/t7400-submodule-basic.sh  |  12 +--
 t/t7406-submodule-update.sh |  12 +--
 6 files changed, 256 insertions(+), 39 deletions(-)

-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 01/12] git submodule update: Announce skipping submodules on stderr
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16 20:37   ` Junio C Hamano
  2015-10-16  1:52 ` [PATCH 02/12] git submodule update: Announce uninitialized modules " Stefan Beller
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 git-submodule.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 8b0eb9a..578ec48 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -684,7 +684,7 @@ cmd_update()
 
 		if test "$update_module" = "none"
 		then
-			echo "Skipping submodule '$displaypath'"
+			echo >&2 "Skipping submodule '$displaypath'"
 			continue
 		fi
 
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 02/12] git submodule update: Announce uninitialized modules on stderr
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
  2015-10-16  1:52 ` [PATCH 01/12] git submodule update: Announce skipping submodules on stderr Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16 20:54   ` Junio C Hamano
  2015-10-16  1:52 ` [PATCH 03/12] git submodule update: Move branch calculation to where it's needed Stefan Beller
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 git-submodule.sh           |  2 +-
 t/t7400-submodule-basic.sh | 12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 578ec48..eea27f8 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -693,7 +693,7 @@ cmd_update()
 			# Only mention uninitialized submodules when its
 			# path have been specified
 			test "$#" != "0" &&
-			say "$(eval_gettext "Submodule path '\$displaypath' not initialized
+			say >&2 "$(eval_gettext "Submodule path '\$displaypath' not initialized
 Maybe you want to use 'update --init'?")"
 			continue
 		fi
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 540771c..32a89b8 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -462,9 +462,9 @@ test_expect_success 'update --init' '
 	git config --remove-section submodule.example &&
 	test_must_fail git config submodule.example.url &&
 
-	git submodule update init > update.out &&
-	cat update.out &&
-	test_i18ngrep "not initialized" update.out &&
+	git submodule update init 2> update.err &&
+	cat update.err &&
+	test_i18ngrep "not initialized" update.err &&
 	test_must_fail git rev-parse --resolve-git-dir init/.git &&
 
 	git submodule update --init init &&
@@ -480,9 +480,9 @@ test_expect_success 'update --init from subdirectory' '
 	mkdir -p sub &&
 	(
 		cd sub &&
-		git submodule update ../init >update.out &&
-		cat update.out &&
-		test_i18ngrep "not initialized" update.out &&
+		git submodule update ../init 2>update.err &&
+		cat update.err &&
+		test_i18ngrep "not initialized" update.err &&
 		test_must_fail git rev-parse --resolve-git-dir ../init/.git &&
 
 		git submodule update --init ../init
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 03/12] git submodule update: Move branch calculation to where it's needed
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
  2015-10-16  1:52 ` [PATCH 01/12] git submodule update: Announce skipping submodules on stderr Stefan Beller
  2015-10-16  1:52 ` [PATCH 02/12] git submodule update: Announce uninitialized modules " Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16 20:54   ` Junio C Hamano
  2015-10-16  1:52 ` [PATCH 04/12] git submodule update: Announce outcome of submodule operation to stderr Stefan Beller
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

The branch variable is used only once so calculate it only when needed.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 git-submodule.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index eea27f8..56a0524 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -668,7 +668,6 @@ cmd_update()
 		fi
 		name=$(git submodule--helper name "$sm_path") || exit
 		url=$(git config submodule."$name".url)
-		branch=$(get_submodule_config "$name" branch master)
 		if ! test -z "$update"
 		then
 			update_module=$update
@@ -718,6 +717,7 @@ Maybe you want to use 'update --init'?")"
 				die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")"
 			fi
 			remote_name=$(clear_local_git_env; cd "$sm_path" && get_default_remote)
+			branch=$(get_submodule_config "$name" branch master)
 			sha1=$(clear_local_git_env; cd "$sm_path" &&
 				git rev-parse --verify "${remote_name}/${branch}") ||
 			die "$(eval_gettext "Unable to find current ${remote_name}/${branch} revision in submodule path '\$sm_path'")"
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 04/12] git submodule update: Announce outcome of submodule operation to stderr
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
                   ` (2 preceding siblings ...)
  2015-10-16  1:52 ` [PATCH 03/12] git submodule update: Move branch calculation to where it's needed Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16  1:52 ` [PATCH 05/12] git submodule update: Use its own list implementation Stefan Beller
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 git-submodule.sh            |  2 +-
 t/t7406-submodule-update.sh | 12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 56a0524..bb8b2c7 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -780,7 +780,7 @@ Maybe you want to use 'update --init'?")"
 
 			if (clear_local_git_env; cd "$sm_path" && $command "$sha1")
 			then
-				say "$say_msg"
+				say >&2 "$say_msg"
 			elif test -n "$must_die_on_failure"
 			then
 				die_with_status 2 "$die_msg"
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index dda3929..f65b81c 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -111,8 +111,8 @@ test_expect_success 'submodule update does not fetch already present commits' '
 	(cd super &&
 	  git submodule update > ../actual 2> ../actual.err
 	) &&
-	test_i18ncmp expected actual &&
-	! test -s actual.err
+	test_i18ncmp expected actual.err &&
+	! test -s actual
 '
 
 test_expect_success 'submodule update should fail due to local changes' '
@@ -702,8 +702,8 @@ test_expect_success 'submodule update places git-dir in superprojects git-dir re
 	rm -rf super_update_r2 &&
 	git clone super_update_r super_update_r2 &&
 	(cd super_update_r2 &&
-	 git submodule update --init --recursive >actual &&
-	 test_i18ngrep "Submodule path .submodule/subsubmodule.: checked out" actual &&
+	 git submodule update --init --recursive 2>actual.err &&
+	 test_i18ngrep "Submodule path .submodule/subsubmodule.: checked out" actual.err &&
 	 (cd submodule/subsubmodule &&
 	  git log > ../../expected
 	 ) &&
@@ -770,8 +770,8 @@ test_expect_success 'submodule update --recursive drops module name before recur
 	 (cd deeper/submodule/subsubmodule &&
 	  git checkout HEAD^
 	 ) &&
-	 git submodule update --recursive deeper/submodule >actual &&
-	 test_i18ngrep "Submodule path .deeper/submodule/subsubmodule.: checked out" actual
+	 git submodule update --recursive deeper/submodule 2>actual.err &&
+	 test_i18ngrep "Submodule path .deeper/submodule/subsubmodule.: checked out" actual.err
 	)
 '
 test_done
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 05/12] git submodule update: Use its own list implementation.
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
                   ` (3 preceding siblings ...)
  2015-10-16  1:52 ` [PATCH 04/12] git submodule update: Announce outcome of submodule operation to stderr Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16 21:02   ` Junio C Hamano
  2015-10-16  1:52 ` [PATCH 06/12] git submodule update: Handle unmerged submodules in C Stefan Beller
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Discussions turned out that we cannot parallelize the whole loop below
`git submodule--helper list` in `git submodule update`, because some
changes should be done only one at a time, such as messing up a submodule
and leave it up to the user to cleanup the conflicted rebase or merge.

The submodules which are need to be cloned however do not expect to create
problems which require attention by the user one at a time, so we want to
parallelize that first.

To do so we will start with a literal copy of `git submodule--helper list`
and port over features gradually.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/submodule--helper.c | 40 ++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  2 +-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f4c3eff..47dc9cb 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -255,6 +255,45 @@ static int module_clone(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int module_list_or_clone(int argc, const char **argv, const char *prefix)
+{
+	int i;
+	struct pathspec pathspec;
+	struct module_list list = MODULE_LIST_INIT;
+
+	struct option module_list_options[] = {
+		OPT_STRING(0, "prefix", &prefix,
+			   N_("path"),
+			   N_("alternative anchor for relative paths")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule--helper list [--prefix=<path>] [<path>...]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_list_options,
+			     git_submodule_helper_usage, 0);
+
+	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0) {
+		printf("#unmatched\n");
+		return 1;
+	}
+
+	for (i = 0; i < list.nr; i++) {
+		const struct cache_entry *ce = list.entries[i];
+
+		if (ce_stage(ce))
+			printf("%06o %s U\t", ce->ce_mode, sha1_to_hex(null_sha1));
+		else
+			printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
+
+		utf8_fprintf(stdout, "%s\n", ce->name);
+	}
+	return 0;
+}
+
 struct cmd_struct {
 	const char *cmd;
 	int (*fn)(int, const char **, const char *);
@@ -264,6 +303,7 @@ static struct cmd_struct commands[] = {
 	{"list", module_list},
 	{"name", module_name},
 	{"clone", module_clone},
+	{"list-or-clone", module_list_or_clone}
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index bb8b2c7..d2d80e2 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -656,7 +656,7 @@ cmd_update()
 	fi
 
 	cloned_modules=
-	git submodule--helper list --prefix "$wt_prefix" "$@" | {
+	git submodule--helper list-or-clone --prefix "$wt_prefix" "$@" | {
 	err=
 	while read mode sha1 stage sm_path
 	do
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 06/12] git submodule update: Handle unmerged submodules in C
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
                   ` (4 preceding siblings ...)
  2015-10-16  1:52 ` [PATCH 05/12] git submodule update: Use its own list implementation Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-20 21:11   ` Junio C Hamano
  2015-10-16  1:52 ` [PATCH 07/12] submodule config: keep update strategy around Stefan Beller
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/submodule--helper.c | 15 +++++++++++----
 git-submodule.sh            |  6 +-----
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 47dc9cb..f81f37a 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -284,11 +284,18 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < list.nr; i++) {
 		const struct cache_entry *ce = list.entries[i];
 
-		if (ce_stage(ce))
-			printf("%06o %s U\t", ce->ce_mode, sha1_to_hex(null_sha1));
-		else
-			printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
+		char *env_prefix = getenv("prefix");
+		if (ce_stage(ce)) {
+			if (env_prefix)
+				fprintf(stderr, "Skipping unmerged submodule %s/%s",
+					env_prefix, ce->name);
+			else
+				fprintf(stderr, "Skipping unmerged submodule %s",
+					ce->name);
+			continue;
+		}
 
+		printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
 		utf8_fprintf(stdout, "%s\n", ce->name);
 	}
 	return 0;
diff --git a/git-submodule.sh b/git-submodule.sh
index d2d80e2..0754ecd 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -661,11 +661,7 @@ cmd_update()
 	while read mode sha1 stage sm_path
 	do
 		die_if_unmatched "$mode"
-		if test "$stage" = U
-		then
-			echo >&2 "Skipping unmerged submodule $prefix$sm_path"
-			continue
-		fi
+
 		name=$(git submodule--helper name "$sm_path") || exit
 		url=$(git config submodule."$name".url)
 		if ! test -z "$update"
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 07/12] submodule config: keep update strategy around
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
                   ` (5 preceding siblings ...)
  2015-10-16  1:52 ` [PATCH 06/12] git submodule update: Handle unmerged submodules in C Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16  1:52 ` [PATCH 08/12] git submodule update: check for "none" in C Stefan Beller
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

We need the submodule update strategies in a later patch.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule-config.c | 11 +++++++++++
 submodule-config.h |  1 +
 2 files changed, 12 insertions(+)

diff --git a/submodule-config.c b/submodule-config.c
index 393de53..175bcbb 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -194,6 +194,7 @@ static struct submodule *lookup_or_create_by_name(struct submodule_cache *cache,
 
 	submodule->path = NULL;
 	submodule->url = NULL;
+	submodule->update = NULL;
 	submodule->fetch_recurse = RECURSE_SUBMODULES_NONE;
 	submodule->ignore = NULL;
 
@@ -326,6 +327,16 @@ static int parse_config(const char *var, const char *value, void *data)
 		free((void *) submodule->url);
 		strbuf_addstr(&url, value);
 		submodule->url = strbuf_detach(&url, NULL);
+	} else if (!strcmp(item.buf, "update")) {
+		if (!value)
+			ret = config_error_nonbool(var);
+		else if (!me->overwrite && submodule->update != NULL)
+			warn_multiple_config(me->commit_sha1, submodule->name,
+					     "update");
+		else {
+			free((void *)submodule->update);
+			submodule->update = xstrdup(value);
+		}
 	}
 
 release_return:
diff --git a/submodule-config.h b/submodule-config.h
index 9061e4e..f9e2a29 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -14,6 +14,7 @@ struct submodule {
 	const char *url;
 	int fetch_recurse;
 	const char *ignore;
+	const char *update;
 	/* the sha1 blob id of the responsible .gitmodules file */
 	unsigned char gitmodules_sha1[20];
 };
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 08/12] git submodule update: check for "none" in C
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
                   ` (6 preceding siblings ...)
  2015-10-16  1:52 ` [PATCH 07/12] submodule config: keep update strategy around Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16  1:52 ` [PATCH 09/12] git submodule update: Check url " Stefan Beller
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/submodule--helper.c | 38 ++++++++++++++++++++++++++++++++++++--
 git-submodule.sh            |  8 +-------
 2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f81f37a..73954ac 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -255,9 +255,15 @@ static int module_clone(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int git_submodule_config(const char *var, const char *value, void *cb)
+{
+	return parse_submodule_config_option(var, value);
+}
+
 static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 {
 	int i;
+	char *update = NULL;
 	struct pathspec pathspec;
 	struct module_list list = MODULE_LIST_INIT;
 
@@ -265,6 +271,9 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 		OPT_STRING(0, "prefix", &prefix,
 			   N_("path"),
 			   N_("alternative anchor for relative paths")),
+		OPT_STRING(0, "update", &update,
+			   N_("string"),
+			   N_("update command for submodules")),
 		OPT_END()
 	};
 
@@ -281,20 +290,45 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 		return 1;
 	}
 
+	gitmodules_config();
+	/* Overlay the parsed .gitmodules file with .git/config */
+	git_config(git_submodule_config, NULL);
+
 	for (i = 0; i < list.nr; i++) {
+		const struct submodule *sub = NULL;
+		const char *displaypath = NULL;
 		const struct cache_entry *ce = list.entries[i];
+		struct strbuf sb = STRBUF_INIT;
+		const char *update_module = NULL;
 
 		char *env_prefix = getenv("prefix");
 		if (ce_stage(ce)) {
 			if (env_prefix)
-				fprintf(stderr, "Skipping unmerged submodule %s/%s",
+				fprintf(stderr, "Skipping unmerged submodule %s/%s\n",
 					env_prefix, ce->name);
 			else
-				fprintf(stderr, "Skipping unmerged submodule %s",
+				fprintf(stderr, "Skipping unmerged submodule %s\n",
 					ce->name);
 			continue;
 		}
 
+		sub = submodule_from_path(null_sha1, ce->name);
+		if (env_prefix)
+			displaypath = relative_path(env_prefix, ce->name, &sb);
+		else
+			displaypath = ce->name;
+
+		if (update)
+			update_module = update;
+		if (!update_module)
+			update_module = sub->update;
+		if (!update_module)
+			update_module = "checkout";
+		if (!strcmp(update_module, "none")) {
+			fprintf(stderr, "Skipping submodule '%s'\n", displaypath);
+			continue;
+		}
+
 		printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
 		utf8_fprintf(stdout, "%s\n", ce->name);
 	}
diff --git a/git-submodule.sh b/git-submodule.sh
index 0754ecd..227fed6 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -656,7 +656,7 @@ cmd_update()
 	fi
 
 	cloned_modules=
-	git submodule--helper list-or-clone --prefix "$wt_prefix" "$@" | {
+	git submodule--helper list-or-clone --prefix "$wt_prefix" ${update:+--update "$update"} "$@" | {
 	err=
 	while read mode sha1 stage sm_path
 	do
@@ -677,12 +677,6 @@ cmd_update()
 
 		displaypath=$(relative_path "$prefix$sm_path")
 
-		if test "$update_module" = "none"
-		then
-			echo >&2 "Skipping submodule '$displaypath'"
-			continue
-		fi
-
 		if test -z "$url"
 		then
 			# Only mention uninitialized submodules when its
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 09/12] git submodule update: Check url in C
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
                   ` (7 preceding siblings ...)
  2015-10-16  1:52 ` [PATCH 08/12] git submodule update: check for "none" in C Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16  1:52 ` [PATCH 10/12] git submodule update: Clone projects from within C Stefan Beller
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/submodule--helper.c | 21 +++++++++++++++++++++
 git-submodule.sh            | 10 ----------
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 73954ac..7a2fd4e 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -300,6 +300,7 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 		const struct cache_entry *ce = list.entries[i];
 		struct strbuf sb = STRBUF_INIT;
 		const char *update_module = NULL;
+		const char *url = NULL;
 
 		char *env_prefix = getenv("prefix");
 		if (ce_stage(ce)) {
@@ -329,6 +330,26 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 			continue;
 		}
 
+		/*
+		 * Looking up the url in .git/config.
+		 * We cannot fall back to .gitmodules as we only want to process
+		 * configured submodules. This renders the submodule lookup API
+		 * useless, as it cannot lookup without fallback.
+		 */
+		strbuf_reset(&sb);
+		strbuf_addf(&sb, "submodule.%s.url", sub->name);
+		git_config_get_string_const(sb.buf, &url);
+		if (!url) {
+			/*
+			 * Only mention uninitialized submodules when its
+			 * path have been specified
+			 */
+			if (pathspec.nr)
+				fprintf(stderr, _("Submodule path '%s' not initialized\n"
+					"Maybe you want to use 'update --init'?"), displaypath);
+			continue;
+		}
+
 		printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
 		utf8_fprintf(stdout, "%s\n", ce->name);
 	}
diff --git a/git-submodule.sh b/git-submodule.sh
index 227fed6..80f41b2 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -677,16 +677,6 @@ cmd_update()
 
 		displaypath=$(relative_path "$prefix$sm_path")
 
-		if test -z "$url"
-		then
-			# Only mention uninitialized submodules when its
-			# path have been specified
-			test "$#" != "0" &&
-			say >&2 "$(eval_gettext "Submodule path '\$displaypath' not initialized
-Maybe you want to use 'update --init'?")"
-			continue
-		fi
-
 		if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
 		then
 			git submodule--helper clone ${GIT_QUIET:+--quiet} --prefix "$prefix" --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 10/12] git submodule update: Clone projects from within C
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
                   ` (8 preceding siblings ...)
  2015-10-16  1:52 ` [PATCH 09/12] git submodule update: Check url " Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16  1:52 ` [PATCH 11/12] submodule--helper: Do not emit submodules to process directly Stefan Beller
  2015-10-16  1:52 ` [PATCH 12/12] WIP/broken Clone all outstanding submodules in parallel Stefan Beller
  11 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/submodule--helper.c | 56 ++++++++++++++++++++++++++++++++++++++++++++-
 git-submodule.sh            | 12 ++++++----
 2 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 7a2fd4e..d1684cf 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -260,9 +260,40 @@ static int git_submodule_config(const char *var, const char *value, void *cb)
 	return parse_submodule_config_option(var, value);
 }
 
+static void fill_clone_command(struct child_process *cp, int quiet,
+			       const char *prefix, const char *path,
+			       const char *name, const char *url,
+			       const char *reference, const char *depth)
+{
+	cp->git_cmd = 1;
+	argv_array_push(&cp->args, "submodule--helper");
+	argv_array_push(&cp->args, "clone");
+	if (quiet)
+		argv_array_push(&cp->args, "--quiet");
+
+	if (prefix) {
+		argv_array_push(&cp->args, "--prefix");
+		argv_array_push(&cp->args, prefix);
+	}
+	argv_array_push(&cp->args, "--path");
+	argv_array_push(&cp->args, path);
+
+	argv_array_push(&cp->args, "--name");
+	argv_array_push(&cp->args, name);
+
+	argv_array_push(&cp->args, "--url");
+	argv_array_push(&cp->args, url);
+	if (reference)
+		argv_array_push(&cp->args, reference);
+	if (depth)
+		argv_array_push(&cp->args, depth);
+}
+
 static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 {
 	int i;
+	int quiet;
+	char *reference = NULL, *depth = NULL;
 	char *update = NULL;
 	struct pathspec pathspec;
 	struct module_list list = MODULE_LIST_INIT;
@@ -274,6 +305,13 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 		OPT_STRING(0, "update", &update,
 			   N_("string"),
 			   N_("update command for submodules")),
+		OPT_STRING(0, "reference", &reference, "<repository>",
+			N_("Use the local reference repository "
+			   "instead of a full clone")),
+		OPT_STRING(0, "depth", &depth, "<depth>",
+			N_("Create a shallow clone truncated to the "
+			   "specified number of revisions")),
+		OPT__QUIET(&quiet, N_("do't print cloning progress")),
 		OPT_END()
 	};
 
@@ -301,6 +339,7 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 		struct strbuf sb = STRBUF_INIT;
 		const char *update_module = NULL;
 		const char *url = NULL;
+		int just_cloned = 0;
 
 		char *env_prefix = getenv("prefix");
 		if (ce_stage(ce)) {
@@ -350,7 +389,22 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 			continue;
 		}
 
-		printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
+		strbuf_reset(&sb);
+		strbuf_addf(&sb, "%s/.git", ce->name);
+		just_cloned = !file_exists(sb.buf);
+
+		if (just_cloned) {
+			struct child_process cp = CHILD_PROCESS_INIT;
+			fill_clone_command(&cp, quiet, prefix, ce->name,
+					   sub->name, url, reference, depth);
+
+			if (run_command(&cp)) {
+				printf("#unmatched\n");
+				return 1;
+			}
+		}
+
+		printf("%06o %s %d %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce), just_cloned);
 		utf8_fprintf(stdout, "%s\n", ce->name);
 	}
 	return 0;
diff --git a/git-submodule.sh b/git-submodule.sh
index 80f41b2..28f1757 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -656,9 +656,14 @@ cmd_update()
 	fi
 
 	cloned_modules=
-	git submodule--helper list-or-clone --prefix "$wt_prefix" ${update:+--update "$update"} "$@" | {
+	git submodule--helper list-or-clone ${GIT_QUIET:+--quiet} \
+		--prefix "$wt_prefix" \
+		${update:+--update "$update"} \
+		${reference:+--reference "$reference"} \
+		${depth:+--depth "$depth"} \
+		"$@" | {
 	err=
-	while read mode sha1 stage sm_path
+	while read mode sha1 stage just_cloned sm_path
 	do
 		die_if_unmatched "$mode"
 
@@ -677,9 +682,8 @@ cmd_update()
 
 		displaypath=$(relative_path "$prefix$sm_path")
 
-		if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
+		if test "$just_cloned" = 1
 		then
-			git submodule--helper clone ${GIT_QUIET:+--quiet} --prefix "$prefix" --path "$sm_path" --name "$name" --url "$url" "$reference" "$depth" || exit
 			cloned_modules="$cloned_modules;$name"
 			subsha1=
 		else
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 11/12] submodule--helper: Do not emit submodules to process directly.
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
                   ` (9 preceding siblings ...)
  2015-10-16  1:52 ` [PATCH 10/12] git submodule update: Clone projects from within C Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  2015-10-16  1:52 ` [PATCH 12/12] WIP/broken Clone all outstanding submodules in parallel Stefan Beller
  11 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

This will allow us to refactor the loop to use the parallel process
API.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/submodule--helper.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index d1684cf..fa8c008 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -297,6 +297,8 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 	char *update = NULL;
 	struct pathspec pathspec;
 	struct module_list list = MODULE_LIST_INIT;
+	struct string_list projectlines = STRING_LIST_INIT_DUP;
+	struct string_list_item *item;
 
 	struct option module_list_options[] = {
 		OPT_STRING(0, "prefix", &prefix,
@@ -403,9 +405,15 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 				return 1;
 			}
 		}
+		strbuf_reset(&sb);
+		strbuf_addf(&sb, "%06o %s %d %d\t%s\n", ce->ce_mode,
+				sha1_to_hex(ce->sha1), ce_stage(ce),
+				just_cloned, ce->name);
+		string_list_append(&projectlines, sb.buf);
+	}
 
-		printf("%06o %s %d %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce), just_cloned);
-		utf8_fprintf(stdout, "%s\n", ce->name);
+	for_each_string_list_item(item, &projectlines) {
+		utf8_fprintf(stdout, "%s", item->string);
 	}
 	return 0;
 }
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 12/12] WIP/broken Clone all outstanding submodules in parallel
  2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
                   ` (10 preceding siblings ...)
  2015-10-16  1:52 ` [PATCH 11/12] submodule--helper: Do not emit submodules to process directly Stefan Beller
@ 2015-10-16  1:52 ` Stefan Beller
  11 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16  1:52 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Our tests scream at this patch, it's just to show what I plan to do.
Essentially moving the content of the loop into the get_next_task
callback from the run_processes_parallel.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/submodule--helper.c | 181 +++++++++++++++++++++++++++++---------------
 1 file changed, 119 insertions(+), 62 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index fa8c008..c66aa53 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -289,65 +289,40 @@ static void fill_clone_command(struct child_process *cp, int quiet,
 		argv_array_push(&cp->args, depth);
 }
 
-static int module_list_or_clone(int argc, const char **argv, const char *prefix)
-{
-	int i;
-	int quiet;
-	char *reference = NULL, *depth = NULL;
-	char *update = NULL;
+struct submodule_list_or_clone {
 	struct pathspec pathspec;
-	struct module_list list = MODULE_LIST_INIT;
-	struct string_list projectlines = STRING_LIST_INIT_DUP;
-	struct string_list_item *item;
-
-	struct option module_list_options[] = {
-		OPT_STRING(0, "prefix", &prefix,
-			   N_("path"),
-			   N_("alternative anchor for relative paths")),
-		OPT_STRING(0, "update", &update,
-			   N_("string"),
-			   N_("update command for submodules")),
-		OPT_STRING(0, "reference", &reference, "<repository>",
-			N_("Use the local reference repository "
-			   "instead of a full clone")),
-		OPT_STRING(0, "depth", &depth, "<depth>",
-			N_("Create a shallow clone truncated to the "
-			   "specified number of revisions")),
-		OPT__QUIET(&quiet, N_("do't print cloning progress")),
-		OPT_END()
-	};
-
-	const char *const git_submodule_helper_usage[] = {
-		N_("git submodule--helper list [--prefix=<path>] [<path>...]"),
-		NULL
-	};
-
-	argc = parse_options(argc, argv, prefix, module_list_options,
-			     git_submodule_helper_usage, 0);
-
-	if (module_list_compute(argc, argv, prefix, &pathspec, &list) < 0) {
-		printf("#unmatched\n");
-		return 1;
-	}
+	struct module_list list;
+	struct string_list projectlines;
+	int count;
+	int quiet;
+	char *reference;
+	char *depth;
+	char *update;
+	char *env_prefix;
+	const char *prefix;
+	int print_unmatched;
+};
 
-	gitmodules_config();
-	/* Overlay the parsed .gitmodules file with .git/config */
-	git_config(git_submodule_config, NULL);
+static int get_next_task(void **pp_task_cb,
+			 struct child_process *cp,
+			 struct strbuf *err,
+			 void *pp_cb)
+{
+	struct submodule_list_or_clone *pp = pp_cb;
 
-	for (i = 0; i < list.nr; i++) {
+	for (; pp->count < pp->list.nr; pp->count++) {
 		const struct submodule *sub = NULL;
 		const char *displaypath = NULL;
-		const struct cache_entry *ce = list.entries[i];
+		const struct cache_entry *ce = pp->list.entries[pp->count];
 		struct strbuf sb = STRBUF_INIT;
 		const char *update_module = NULL;
 		const char *url = NULL;
 		int just_cloned = 0;
 
-		char *env_prefix = getenv("prefix");
 		if (ce_stage(ce)) {
-			if (env_prefix)
+			if (pp->env_prefix)
 				fprintf(stderr, "Skipping unmerged submodule %s/%s\n",
-					env_prefix, ce->name);
+					pp->env_prefix, ce->name);
 			else
 				fprintf(stderr, "Skipping unmerged submodule %s\n",
 					ce->name);
@@ -355,13 +330,13 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 		}
 
 		sub = submodule_from_path(null_sha1, ce->name);
-		if (env_prefix)
-			displaypath = relative_path(env_prefix, ce->name, &sb);
+		if (pp->env_prefix)
+			displaypath = relative_path(pp->env_prefix, ce->name, &sb);
 		else
 			displaypath = ce->name;
 
-		if (update)
-			update_module = update;
+		if (pp->update)
+			update_module = pp->update;
 		if (!update_module)
 			update_module = sub->update;
 		if (!update_module)
@@ -385,7 +360,7 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 			 * Only mention uninitialized submodules when its
 			 * path have been specified
 			 */
-			if (pathspec.nr)
+			if (pp->pathspec.nr)
 				fprintf(stderr, _("Submodule path '%s' not initialized\n"
 					"Maybe you want to use 'update --init'?"), displaypath);
 			continue;
@@ -396,23 +371,105 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
 		just_cloned = !file_exists(sb.buf);
 
 		if (just_cloned) {
-			struct child_process cp = CHILD_PROCESS_INIT;
-			fill_clone_command(&cp, quiet, prefix, ce->name,
-					   sub->name, url, reference, depth);
-
-			if (run_command(&cp)) {
-				printf("#unmatched\n");
-				return 1;
-			}
+			fill_clone_command(cp, pp->quiet, pp->prefix, ce->name,
+					   sub->name, url, pp->reference, pp->depth);
+			return 1;
 		}
 		strbuf_reset(&sb);
 		strbuf_addf(&sb, "%06o %s %d %d\t%s\n", ce->ce_mode,
 				sha1_to_hex(ce->sha1), ce_stage(ce),
 				just_cloned, ce->name);
-		string_list_append(&projectlines, sb.buf);
+		string_list_append(&pp->projectlines, sb.buf);
+	}
+	return 0;
+}
+
+static int start_failure(struct child_process *cp,
+			 struct strbuf *err,
+			 void *pp_cb,
+			 void *pp_task_cb)
+{
+	struct submodule_list_or_clone *pp = pp_cb;
+
+	pp->print_unmatched = 1;
+
+	return 1;
+}
+
+static int task_finished(int result,
+			 struct child_process *cp,
+			 struct strbuf *err,
+			 void *pp_cb,
+			 void *pp_task_cb)
+{
+	struct submodule_list_or_clone *pp = pp_cb;
+
+	if (!result)
+		return 0;
+	else {
+		pp->print_unmatched = 1;
+		return 1;
+	}
+}
+
+static int module_list_or_clone(int argc, const char **argv, const char *prefix)
+{
+	struct submodule_list_or_clone pp;
+	struct string_list_item *item;
+
+	struct option module_list_options[] = {
+		OPT_STRING(0, "prefix", &pp.prefix,
+			   N_("path"),
+			   N_("alternative anchor for relative paths")),
+		OPT_STRING(0, "update", &pp.update,
+			   N_("string"),
+			   N_("update command for submodules")),
+		OPT_STRING(0, "reference", &pp.reference, "<repository>",
+			N_("Use the local reference repository "
+			   "instead of a full clone")),
+		OPT_STRING(0, "depth", &pp.depth, "<depth>",
+			N_("Create a shallow clone truncated to the "
+			   "specified number of revisions")),
+		OPT__QUIET(&pp.quiet, N_("do't print cloning progress")),
+		OPT_END()
+	};
+
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule--helper list [--prefix=<path>] [<path>...]"),
+		NULL
+	};
+
+	pp.prefix = NULL;
+	pp.list.entries = NULL;
+	pp.list.alloc = 0;
+	pp.list.nr = 0;
+	string_list_init(&pp.projectlines, 1);
+	pp.count = 0;
+	pp.reference = NULL;
+	pp.depth = NULL;
+	pp.update = NULL;
+	pp.env_prefix = getenv("prefix");
+	pp.print_unmatched = 0;
+
+	argc = parse_options(argc, argv, prefix, module_list_options,
+			     git_submodule_helper_usage, 0);
+
+	if (module_list_compute(argc, argv, pp.prefix, &pp.pathspec, &pp.list) < 0) {
+		printf("#unmatched\n");
+		return 1;
+	}
+
+	gitmodules_config();
+	/* Overlay the parsed .gitmodules file with .git/config */
+	git_config(git_submodule_config, NULL);
+
+	run_processes_parallel(1, get_next_task, start_failure, task_finished, &pp);
+	if (pp.print_unmatched) {
+		printf("#unmatched\n");
+		return 1;
 	}
 
-	for_each_string_list_item(item, &projectlines) {
+	for_each_string_list_item(item, &pp.projectlines) {
 		utf8_fprintf(stdout, "%s", item->string);
 	}
 	return 0;
-- 
2.5.0.277.gfdc362b.dirty

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 01/12] git submodule update: Announce skipping submodules on stderr
  2015-10-16  1:52 ` [PATCH 01/12] git submodule update: Announce skipping submodules on stderr Stefan Beller
@ 2015-10-16 20:37   ` Junio C Hamano
  2015-10-16 20:47     ` Stefan Beller
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2015-10-16 20:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller <sbeller@google.com> writes:

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  git-submodule.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 8b0eb9a..578ec48 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -684,7 +684,7 @@ cmd_update()
>  
>  		if test "$update_module" = "none"
>  		then
> -			echo "Skipping submodule '$displaypath'"
> +			echo >&2 "Skipping submodule '$displaypath'"
>  			continue
>  		fi

Makes sense, but see 02/12.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 01/12] git submodule update: Announce skipping submodules on stderr
  2015-10-16 20:37   ` Junio C Hamano
@ 2015-10-16 20:47     ` Stefan Beller
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16 20:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Fri, Oct 16, 2015 at 1:37 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>>  git-submodule.sh | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> index 8b0eb9a..578ec48 100755
>> --- a/git-submodule.sh
>> +++ b/git-submodule.sh
>> @@ -684,7 +684,7 @@ cmd_update()
>>
>>               if test "$update_module" = "none"
>>               then
>> -                     echo "Skipping submodule '$displaypath'"
>> +                     echo >&2 "Skipping submodule '$displaypath'"
>>                       continue
>>               fi
>
> Makes sense, but see 02/12.

The patch (I can't see a reply there) ?

I split them on purpose. This one uses echo as opposed to say and has
no tests to fail.

So this patch documents, that there are no breaking tests. I can just change it
2/12 tells another story: We codified the behavior in tests and rely on it, so
we need to carefully decide if that's a breaking change.



>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 02/12] git submodule update: Announce uninitialized modules on stderr
  2015-10-16  1:52 ` [PATCH 02/12] git submodule update: Announce uninitialized modules " Stefan Beller
@ 2015-10-16 20:54   ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2015-10-16 20:54 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller <sbeller@google.com> writes:

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  git-submodule.sh           |  2 +-
>  t/t7400-submodule-basic.sh | 12 ++++++------
>  2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 578ec48..eea27f8 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -693,7 +693,7 @@ cmd_update()
>  			# Only mention uninitialized submodules when its
>  			# path have been specified
>  			test "$#" != "0" &&
> -			say "$(eval_gettext "Submodule path '\$displaypath' not initialized
> +			say >&2 "$(eval_gettext "Submodule path '\$displaypath' not initialized
>  Maybe you want to use 'update --init'?")"
>  			continue
>  		fi

There are quite a few other calls to "say" in this script, and you
are changing only this one to emit it to the standard error output.

My quick eyeballing of the script tells me that most of them, other
than the ones that are used in cmd_status to report the information
that the user asked to be shown on the standard output, are of "Now
I am doing this" kind fo output, which I feel are the same category
as this one that shouldn't be on the standard output.

Another thing (which relates to the one in 01/12) is that not all
output from this command comes from "say".

Perhaps the first thing to do before doing 01/12 is to sift these
messages into types and have them consistently use helpers designed
for different purposes, e.g.

 - a progress, like this one, the one in 01/12, and many other uses
   of "say"; which may want to become e.g. "say_progress".

 - an error or a warning, like "Could not remove working tree", "not
   initialized, maybe you want to do 'init' first?"; which may want
   to become something else e.g. "say_warning".

 - the real output from the program, e.g. output from cmd_status,
   would use yet another, e.g. "printf '%s\n'".

instead of converting each message that you happened to have noticed.

Note that "say" is squelched under GIT_QUIET (i.e. --quiet).  The
former two helpers we would want to make quiet (or for errors we may
not---I don't know offhand).  I do not think of any valid reason why
we want to squelch the output from cmd_status under --quiet; it is
not like the the while loop on the downstream of "list |" pipe tells
some status via its exit code.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 03/12] git submodule update: Move branch calculation to where it's needed
  2015-10-16  1:52 ` [PATCH 03/12] git submodule update: Move branch calculation to where it's needed Stefan Beller
@ 2015-10-16 20:54   ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2015-10-16 20:54 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller <sbeller@google.com> writes:

> The branch variable is used only once so calculate it only when needed.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---

Makes sense.

>  git-submodule.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index eea27f8..56a0524 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -668,7 +668,6 @@ cmd_update()
>  		fi
>  		name=$(git submodule--helper name "$sm_path") || exit
>  		url=$(git config submodule."$name".url)
> -		branch=$(get_submodule_config "$name" branch master)
>  		if ! test -z "$update"
>  		then
>  			update_module=$update
> @@ -718,6 +717,7 @@ Maybe you want to use 'update --init'?")"
>  				die "$(eval_gettext "Unable to fetch in submodule path '\$sm_path'")"
>  			fi
>  			remote_name=$(clear_local_git_env; cd "$sm_path" && get_default_remote)
> +			branch=$(get_submodule_config "$name" branch master)
>  			sha1=$(clear_local_git_env; cd "$sm_path" &&
>  				git rev-parse --verify "${remote_name}/${branch}") ||
>  			die "$(eval_gettext "Unable to find current ${remote_name}/${branch} revision in submodule path '\$sm_path'")"

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 05/12] git submodule update: Use its own list implementation.
  2015-10-16  1:52 ` [PATCH 05/12] git submodule update: Use its own list implementation Stefan Beller
@ 2015-10-16 21:02   ` Junio C Hamano
  2015-10-16 21:08     ` Stefan Beller
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2015-10-16 21:02 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller <sbeller@google.com> writes:

> Discussions turned out that we cannot parallelize the whole loop below
> `git submodule--helper list` in `git submodule update`, because some
> changes should be done only one at a time, such as messing up a submodule
> and leave it up to the user to cleanup the conflicted rebase or merge.
>
> The submodules which are need to be cloned however do not expect to create
> problems which require attention by the user one at a time, so we want to
> parallelize that first.
>
> To do so we will start with a literal copy of `git submodule--helper list`
> and port over features gradually.

I am not sure what you mean by this.

Surely, the current implementation of "update" does the fetching and
updating as a single unit of task and iterate over these tasks, and
we would rather want to instead have one iteration of submodules to
do the fetching part (without doing other things that can fail and
have to get attention of the end user), followed by another
iteration that does the "other things", in order to get closer to
the end goal of doing the fetch in parallel and then doing the
remainder one-module-at-a-time sequencially.

I would imagine that the logical first step towards the end goal, if
I understood you correctly, would be to split that single large loop
that does a fetch and other things for a single module in each
iteration into two, one that iterates and fetches all, followed by a
new one that does the checkout/merge.

What I do not understand is why that requires a different kind of
enumerator (unless this is a kind of premature optimization, knowing
that the set of modules iterated by these two loops are slightly
different or something).

>  int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
> diff --git a/git-submodule.sh b/git-submodule.sh
> index bb8b2c7..d2d80e2 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -656,7 +656,7 @@ cmd_update()
>  	fi
>  
>  	cloned_modules=
> -	git submodule--helper list --prefix "$wt_prefix" "$@" | {
> +	git submodule--helper list-or-clone --prefix "$wt_prefix" "$@" | {
>  	err=
>  	while read mode sha1 stage sm_path
>  	do

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 05/12] git submodule update: Use its own list implementation.
  2015-10-16 21:02   ` Junio C Hamano
@ 2015-10-16 21:08     ` Stefan Beller
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-16 21:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Fri, Oct 16, 2015 at 2:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> Discussions turned out that we cannot parallelize the whole loop below
>> `git submodule--helper list` in `git submodule update`, because some
>> changes should be done only one at a time, such as messing up a submodule
>> and leave it up to the user to cleanup the conflicted rebase or merge.
>>
>> The submodules which are need to be cloned however do not expect to create
>> problems which require attention by the user one at a time, so we want to
>> parallelize that first.
>>
>> To do so we will start with a literal copy of `git submodule--helper list`
>> and port over features gradually.
>
> I am not sure what you mean by this.
>
> Surely, the current implementation of "update" does the fetching and
> updating as a single unit of task and iterate over these tasks, and
> we would rather want to instead have one iteration of submodules to
> do the fetching part (without doing other things that can fail and
> have to get attention of the end user), followed by another
> iteration that does the "other things", in order to get closer to
> the end goal of doing the fetch in parallel and then doing the
> remainder one-module-at-a-time sequencially.

I differentiated a bit more and moved the clone parts only.
Fetching should also be no problem. I initially assumed that to be a
problem too.

>
> I would imagine that the logical first step towards the end goal, if
> I understood you correctly, would be to split that single large loop
> that does a fetch and other things for a single module in each
> iteration into two, one that iterates and fetches all, followed by a
> new one that does the checkout/merge.

That was also one of the patch series I wrote (not sent to list)
1) split up into 2 phases
2) rewrite first phase in C
3) parallelize the first phase.

This series merges 1 and 2, so you don't have to review
the same functionality two times.

>
> What I do not understand is why that requires a different kind of
> enumerator (unless this is a kind of premature optimization, knowing
> that the set of modules iterated by these two loops are slightly
> different or something).

It is just moving all code before the clone step into the C part, so
we can call the clone in C.

>
>>  int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> index bb8b2c7..d2d80e2 100755
>> --- a/git-submodule.sh
>> +++ b/git-submodule.sh
>> @@ -656,7 +656,7 @@ cmd_update()
>>       fi
>>
>>       cloned_modules=
>> -     git submodule--helper list --prefix "$wt_prefix" "$@" | {
>> +     git submodule--helper list-or-clone --prefix "$wt_prefix" "$@" | {
>>       err=
>>       while read mode sha1 stage sm_path
>>       do

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 06/12] git submodule update: Handle unmerged submodules in C
  2015-10-16  1:52 ` [PATCH 06/12] git submodule update: Handle unmerged submodules in C Stefan Beller
@ 2015-10-20 21:11   ` Junio C Hamano
  2015-10-20 21:21     ` Stefan Beller
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2015-10-20 21:11 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Stefan Beller <sbeller@google.com> writes:

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  builtin/submodule--helper.c | 15 +++++++++++----
>  git-submodule.sh            |  6 +-----
>  2 files changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 47dc9cb..f81f37a 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -284,11 +284,18 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
>  	for (i = 0; i < list.nr; i++) {
>  		const struct cache_entry *ce = list.entries[i];
>  
> -		if (ce_stage(ce))
> -			printf("%06o %s U\t", ce->ce_mode, sha1_to_hex(null_sha1));
> -		else
> -			printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
> +		char *env_prefix = getenv("prefix");

This somehow makes me feel dirty.  Do we really export such an
environment variable that is named overly generically to communicate
with our own helpers?

I can see why you need to be able to prefix leading paths (i.e. you
would need to prefix path to the enclosing submodule to a path to
obtain the "global view" from the very top-level superproject while
recursing into nested submodules), but still...

> +		if (ce_stage(ce)) {
> +			if (env_prefix)
> +				fprintf(stderr, "Skipping unmerged submodule %s/%s",
> +					env_prefix, ce->name);
> +			else
> +				fprintf(stderr, "Skipping unmerged submodule %s",
> +					ce->name);
> +			continue;
> +		}
>  
> +		printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
>  		utf8_fprintf(stdout, "%s\n", ce->name);
>  	}
>  	return 0;
> diff --git a/git-submodule.sh b/git-submodule.sh
> index d2d80e2..0754ecd 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -661,11 +661,7 @@ cmd_update()
>  	while read mode sha1 stage sm_path
>  	do
>  		die_if_unmatched "$mode"
> -		if test "$stage" = U
> -		then
> -			echo >&2 "Skipping unmerged submodule $prefix$sm_path"
> -			continue
> -		fi
> +
>  		name=$(git submodule--helper name "$sm_path") || exit
>  		url=$(git config submodule."$name".url)
>  		if ! test -z "$update"

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 06/12] git submodule update: Handle unmerged submodules in C
  2015-10-20 21:11   ` Junio C Hamano
@ 2015-10-20 21:21     ` Stefan Beller
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Beller @ 2015-10-20 21:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Tue, Oct 20, 2015 at 2:11 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>>  builtin/submodule--helper.c | 15 +++++++++++----
>>  git-submodule.sh            |  6 +-----
>>  2 files changed, 12 insertions(+), 9 deletions(-)
>>
>> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
>> index 47dc9cb..f81f37a 100644
>> --- a/builtin/submodule--helper.c
>> +++ b/builtin/submodule--helper.c
>> @@ -284,11 +284,18 @@ static int module_list_or_clone(int argc, const char **argv, const char *prefix)
>>       for (i = 0; i < list.nr; i++) {
>>               const struct cache_entry *ce = list.entries[i];
>>
>> -             if (ce_stage(ce))
>> -                     printf("%06o %s U\t", ce->ce_mode, sha1_to_hex(null_sha1));
>> -             else
>> -                     printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
>> +             char *env_prefix = getenv("prefix");
>

[Just checked the date, it's the old series. I am about to send out a new
series which collapses some patches in here, is on top of the fixes series and
off course fixes this issue ;) ]

> This somehow makes me feel dirty.  Do we really export such an
> environment variable that is named overly generically to communicate
> with our own helpers?

I agree that this is bad. It was the fastest way.
I should have taken the slower road. I think I'll replace this with
another argument.

>
> I can see why you need to be able to prefix leading paths (i.e. you
> would need to prefix path to the enclosing submodule to a path to
> obtain the "global view" from the very top-level superproject while
> recursing into nested submodules), but still...
>
>> +             if (ce_stage(ce)) {
>> +                     if (env_prefix)
>> +                             fprintf(stderr, "Skipping unmerged submodule %s/%s",
>> +                                     env_prefix, ce->name);
>> +                     else
>> +                             fprintf(stderr, "Skipping unmerged submodule %s",
>> +                                     ce->name);
>> +                     continue;
>> +             }
>>
>> +             printf("%06o %s %d\t", ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
>>               utf8_fprintf(stdout, "%s\n", ce->name);
>>       }
>>       return 0;
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> index d2d80e2..0754ecd 100755
>> --- a/git-submodule.sh
>> +++ b/git-submodule.sh
>> @@ -661,11 +661,7 @@ cmd_update()
>>       while read mode sha1 stage sm_path
>>       do
>>               die_if_unmatched "$mode"
>> -             if test "$stage" = U
>> -             then
>> -                     echo >&2 "Skipping unmerged submodule $prefix$sm_path"
>> -                     continue
>> -             fi
>> +
>>               name=$(git submodule--helper name "$sm_path") || exit
>>               url=$(git config submodule."$name".url)
>>               if ! test -z "$update"

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2015-10-20 21:21 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-16  1:52 [RFC PATCHv1 00/12] git submodule update in C with parallel cloning Stefan Beller
2015-10-16  1:52 ` [PATCH 01/12] git submodule update: Announce skipping submodules on stderr Stefan Beller
2015-10-16 20:37   ` Junio C Hamano
2015-10-16 20:47     ` Stefan Beller
2015-10-16  1:52 ` [PATCH 02/12] git submodule update: Announce uninitialized modules " Stefan Beller
2015-10-16 20:54   ` Junio C Hamano
2015-10-16  1:52 ` [PATCH 03/12] git submodule update: Move branch calculation to where it's needed Stefan Beller
2015-10-16 20:54   ` Junio C Hamano
2015-10-16  1:52 ` [PATCH 04/12] git submodule update: Announce outcome of submodule operation to stderr Stefan Beller
2015-10-16  1:52 ` [PATCH 05/12] git submodule update: Use its own list implementation Stefan Beller
2015-10-16 21:02   ` Junio C Hamano
2015-10-16 21:08     ` Stefan Beller
2015-10-16  1:52 ` [PATCH 06/12] git submodule update: Handle unmerged submodules in C Stefan Beller
2015-10-20 21:11   ` Junio C Hamano
2015-10-20 21:21     ` Stefan Beller
2015-10-16  1:52 ` [PATCH 07/12] submodule config: keep update strategy around Stefan Beller
2015-10-16  1:52 ` [PATCH 08/12] git submodule update: check for "none" in C Stefan Beller
2015-10-16  1:52 ` [PATCH 09/12] git submodule update: Check url " Stefan Beller
2015-10-16  1:52 ` [PATCH 10/12] git submodule update: Clone projects from within C Stefan Beller
2015-10-16  1:52 ` [PATCH 11/12] submodule--helper: Do not emit submodules to process directly Stefan Beller
2015-10-16  1:52 ` [PATCH 12/12] WIP/broken Clone all outstanding submodules in parallel Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).