git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC PATCH 1/2] implement fetching of moved submodules
@ 2017-08-17 10:53 Heiko Voigt
  2017-08-17 11:00 ` [RFC PATCH 2/2] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Heiko Voigt @ 2017-08-17 10:53 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, Jens.Lehmann, Brandon Williams

We store the changed submodules paths to calculate which submodule needs
fetching. This does not work for moved submodules since their paths do
not stay the same in case of a moved submodules. In case of new
submodules we do not have a path in the current checkout, since they
just appeared in this fetch.

It is more general to collect the submodule names for changes instead of
their paths to include the above cases.

With the change described above we implement 'on-demand' fetching of
changes in moved submodules.

Note: This does only work when repositories have a .gitmodules file. In
other words: It breaks if we do not get a name for a repository.
IIRC, consensus was that this is a requirement to get nice submodule
handling these days?

Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
---

I updated the leftover code from my series implementing recursive fetch
for moved submodules[1] to the current master.

This breaks t5531 and t5545 because they do not use a .gitmodules file.

I also have some code leftover that does fallback on paths in case no
submodule names can be found. But I do not really like it. The question
here is how far do we support not using .gitmodules. Is it e.g.
reasonable to say: "For --recurse-submodules=on-demand you need a
.gitmodules file?"

[1] https://public-inbox.org/git/f5baa2acc09531a16f4f693eebbe60706bb8ed1e.1361751905.git.hvoigt@hvoigt.net/

 submodule.c                 | 92 +++++++++++++++++++++++++--------------------
 t/t5526-fetch-submodules.sh | 35 +++++++++++++++++
 2 files changed, 86 insertions(+), 41 deletions(-)

diff --git a/submodule.c b/submodule.c
index 27de65a..3ed78ac 100644
--- a/submodule.c
+++ b/submodule.c
@@ -23,7 +23,7 @@
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
 static int parallel_jobs = 1;
-static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
+static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
 static struct oid_array ref_tips_before_fetch;
 static struct oid_array ref_tips_after_fetch;
@@ -742,11 +742,11 @@ const struct submodule *submodule_from_ce(const struct cache_entry *ce)
 }
 
 static struct oid_array *submodule_commits(struct string_list *submodules,
-					   const char *path)
+					   const char *name)
 {
 	struct string_list_item *item;
 
-	item = string_list_insert(submodules, path);
+	item = string_list_insert(submodules, name);
 	if (item->util)
 		return (struct oid_array *) item->util;
 
@@ -755,39 +755,34 @@ static struct oid_array *submodule_commits(struct string_list *submodules,
 	return (struct oid_array *) item->util;
 }
 
+struct collect_changed_submodules_cb_data {
+	struct string_list *changed;
+	const struct object_id *commit_oid;
+};
+
 static void collect_changed_submodules_cb(struct diff_queue_struct *q,
 					  struct diff_options *options,
 					  void *data)
 {
+	struct collect_changed_submodules_cb_data *me = data;
+	struct string_list *changed = me->changed;
+	const struct object_id *commit_oid = me->commit_oid;
 	int i;
-	struct string_list *changed = data;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		struct oid_array *commits;
+		const struct submodule *submodule;
+
 		if (!S_ISGITLINK(p->two->mode))
 			continue;
 
-		if (S_ISGITLINK(p->one->mode)) {
-			/*
-			 * NEEDSWORK: We should honor the name configured in
-			 * the .gitmodules file of the commit we are examining
-			 * here to be able to correctly follow submodules
-			 * being moved around.
-			 */
-			commits = submodule_commits(changed, p->two->path);
-			oid_array_append(commits, &p->two->oid);
-		} else {
-			/* Submodule is new or was moved here */
-			/*
-			 * NEEDSWORK: When the .git directories of submodules
-			 * live inside the superprojects .git directory some
-			 * day we should fetch new submodules directly into
-			 * that location too when config or options request
-			 * that so they can be checked out from there.
-			 */
+		submodule = submodule_from_path(commit_oid, p->two->path);
+		if (!submodule)
 			continue;
-		}
+
+		commits = submodule_commits(changed, submodule->name);
+		oid_array_append(commits, &p->two->oid);
 	}
 }
 
@@ -810,11 +805,14 @@ static void collect_changed_submodules(struct string_list *changed,
 
 	while ((commit = get_revision(&rev))) {
 		struct rev_info diff_rev;
+		struct collect_changed_submodules_cb_data data;
+		data.changed = changed;
+		data.commit_oid = &commit->object.oid;
 
 		init_revisions(&diff_rev, NULL);
 		diff_rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
 		diff_rev.diffopt.format_callback = collect_changed_submodules_cb;
-		diff_rev.diffopt.format_callback_data = changed;
+		diff_rev.diffopt.format_callback_data = &data;
 		diff_tree_combined_merge(commit, 1, &diff_rev);
 	}
 
@@ -871,6 +869,7 @@ static int submodule_has_commits(const char *path, struct oid_array *commits)
 	oid_array_for_each_unique(commits, check_has_commit, &has_commit);
 
 	if (has_commit) {
+
 		/*
 		 * Even if the submodule is checked out and the commit is
 		 * present, make sure it exists in the submodule's object store
@@ -945,7 +944,7 @@ int find_unpushed_submodules(struct oid_array *commits,
 		const char *remotes_name, struct string_list *needs_pushing)
 {
 	struct string_list submodules = STRING_LIST_INIT_DUP;
-	struct string_list_item *submodule;
+	struct string_list_item *name;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	/* argv.argv[0] will be ignored by setup_revisions */
@@ -956,12 +955,16 @@ int find_unpushed_submodules(struct oid_array *commits,
 
 	collect_changed_submodules(&submodules, &argv);
 
-	for_each_string_list_item(submodule, &submodules) {
-		struct oid_array *commits = submodule->util;
-		const char *path = submodule->string;
+	for_each_string_list_item(name, &submodules) {
+		struct oid_array *commits = name->util;
+		const struct submodule *submodule;
+
+		submodule = submodule_from_name(&null_oid, name->string);
+		if (!submodule)
+			continue;
 
-		if (submodule_needs_pushing(path, commits))
-			string_list_insert(needs_pushing, path);
+		if (submodule_needs_pushing(submodule->path, commits))
+			string_list_insert(needs_pushing, submodule->path);
 	}
 
 	free_submodules_oids(&submodules);
@@ -1104,7 +1107,7 @@ static void calculate_changed_submodule_paths(void)
 {
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct string_list changed_submodules = STRING_LIST_INIT_DUP;
-	const struct string_list_item *item;
+	const struct string_list_item *name;
 
 	/* No need to check if there are no submodules configured */
 	if (!submodule_from_path(NULL, NULL))
@@ -1119,16 +1122,20 @@ static void calculate_changed_submodule_paths(void)
 
 	/*
 	 * Collect all submodules (whether checked out or not) for which new
-	 * commits have been recorded upstream in "changed_submodule_paths".
+	 * commits have been recorded upstream in "changed_submodule_names".
 	 */
 	collect_changed_submodules(&changed_submodules, &argv);
 
-	for_each_string_list_item(item, &changed_submodules) {
-		struct oid_array *commits = item->util;
-		const char *path = item->string;
+	for_each_string_list_item(name, &changed_submodules) {
+		struct oid_array *commits = name->util;
+		const struct submodule *submodule;
+
+		submodule = submodule_from_name(&null_oid, name->string);
+		if (!submodule)
+			continue;
 
-		if (!submodule_has_commits(path, commits))
-			string_list_append(&changed_submodule_paths, path);
+		if (!submodule_has_commits(submodule->path, commits))
+			string_list_append(&changed_submodule_names, name->string);
 	}
 
 	free_submodules_oids(&changed_submodules);
@@ -1206,7 +1213,8 @@ static int get_next_submodule(struct child_process *cp,
 					continue;
 				if (submodule->fetch_recurse ==
 						RECURSE_SUBMODULES_ON_DEMAND) {
-					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
+					if (!unsorted_string_list_lookup(&changed_submodule_names,
+									 submodule->name))
 						continue;
 					default_argv = "on-demand";
 				}
@@ -1215,13 +1223,15 @@ static int get_next_submodule(struct child_process *cp,
 				    gitmodules_is_unmerged)
 					continue;
 				if (config_fetch_recurse_submodules == RECURSE_SUBMODULES_ON_DEMAND) {
-					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
+					if (!unsorted_string_list_lookup(&changed_submodule_names,
+									 submodule->name))
 						continue;
 					default_argv = "on-demand";
 				}
 			}
 		} else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
-			if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
+			if (!unsorted_string_list_lookup(&changed_submodule_names,
+							 submodule->name))
 				continue;
 			default_argv = "on-demand";
 		}
@@ -1315,7 +1325,7 @@ int fetch_populated_submodules(const struct argv_array *options,
 
 	argv_array_clear(&spf.args);
 out:
-	string_list_clear(&changed_submodule_paths, 1);
+	string_list_clear(&changed_submodule_names, 1);
 	return spf.result;
 }
 
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 162baf1..ce788e9 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -530,4 +530,39 @@ test_expect_success 'fetching submodule into a broken repository' '
 	test_must_fail git -C dst fetch --recurse-submodules
 '
 
+test_expect_success "fetch new commits when submodule got renamed" '
+	git clone . downstream_rename &&
+	(
+		cd downstream_rename &&
+		git submodule update --init &&
+# NEEDSWORK: we omitted --recursive for the submodule update here since
+# that does not work. See test 7001 for mv "moving nested submodules"
+# for details. Once that is fixed we should add the --recursive option
+# here.
+		git checkout -b rename &&
+		git mv submodule submodule_renamed &&
+		(
+			cd submodule_renamed &&
+			git checkout -b rename_sub &&
+			echo a >a &&
+			git add a &&
+			git commit -ma &&
+			git push origin rename_sub &&
+			git rev-parse HEAD >../../expect
+		) &&
+		git add submodule_renamed &&
+		git commit -m "update renamed submodule" &&
+		git push origin rename
+	) &&
+	(
+		cd downstream &&
+		git fetch --recurse-submodules=on-demand &&
+		(
+			cd submodule &&
+			git rev-parse origin/rename_sub >../../actual
+		)
+	) &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.0.0.274.g6b2cd91


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 2/2] submodule: simplify decision tree whether to or not to fetch
  2017-08-17 10:53 [RFC PATCH 1/2] implement fetching of moved submodules Heiko Voigt
@ 2017-08-17 11:00 ` Heiko Voigt
  2017-08-17 17:24   ` Stefan Beller
  2017-08-17 17:20 ` [RFC PATCH 1/2] implement fetching of moved submodules Stefan Beller
  2017-09-15 13:18 ` [RFC PATCH v2 " Heiko Voigt
  2 siblings, 1 reply; 11+ messages in thread
From: Heiko Voigt @ 2017-08-17 11:00 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, Jens.Lehmann, Brandon Williams

To make extending this logic later easier.

Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
---
I am quite sure I replicated the same logic but a few more eyes would be
appreciated.

Cheers Heiko

 submodule.c | 55 +++++++++++++++++++++++++++----------------------------
 1 file changed, 27 insertions(+), 28 deletions(-)

diff --git a/submodule.c b/submodule.c
index 3ed78ac..a1011f4 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1171,6 +1171,21 @@ int submodule_touches_in_range(struct object_id *excl_oid,
 	return ret;
 }
 
+static int get_fetch_recurse_config(const struct submodule *submodule, int command_line_option)
+{
+	if (command_line_option != RECURSE_SUBMODULES_DEFAULT)
+		return command_line_option;
+
+	if (submodule && submodule->fetch_recurse != RECURSE_SUBMODULES_NONE)
+		/* local config overrules everything except commandline */
+		return submodule->fetch_recurse;
+
+	if (gitmodules_is_unmerged)
+		return RECURSE_SUBMODULES_OFF;
+
+	return config_fetch_recurse_submodules;
+}
+
 struct submodule_parallel_fetch {
 	int count;
 	struct argv_array args;
@@ -1203,37 +1218,21 @@ static int get_next_submodule(struct child_process *cp,
 		if (!submodule)
 			submodule = submodule_from_name(&null_oid, ce->name);
 
-		default_argv = "yes";
-		if (spf->command_line_option == RECURSE_SUBMODULES_DEFAULT) {
-			if (submodule &&
-			    submodule->fetch_recurse !=
-						RECURSE_SUBMODULES_NONE) {
-				if (submodule->fetch_recurse ==
-						RECURSE_SUBMODULES_OFF)
-					continue;
-				if (submodule->fetch_recurse ==
-						RECURSE_SUBMODULES_ON_DEMAND) {
-					if (!unsorted_string_list_lookup(&changed_submodule_names,
-									 submodule->name))
-						continue;
-					default_argv = "on-demand";
-				}
-			} else {
-				if ((config_fetch_recurse_submodules == RECURSE_SUBMODULES_OFF) ||
-				    gitmodules_is_unmerged)
-					continue;
-				if (config_fetch_recurse_submodules == RECURSE_SUBMODULES_ON_DEMAND) {
-					if (!unsorted_string_list_lookup(&changed_submodule_names,
-									 submodule->name))
-						continue;
-					default_argv = "on-demand";
-				}
-			}
-		} else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
-			if (!unsorted_string_list_lookup(&changed_submodule_names,
+		switch (get_fetch_recurse_config(submodule, spf->command_line_option))
+		{
+		default:
+		case RECURSE_SUBMODULES_DEFAULT:
+		case RECURSE_SUBMODULES_ON_DEMAND:
+			if (!submodule || !unsorted_string_list_lookup(&changed_submodule_names,
 							 submodule->name))
 				continue;
 			default_argv = "on-demand";
+			break;
+		case RECURSE_SUBMODULES_ON:
+			default_argv = "yes";
+			break;
+		case RECURSE_SUBMODULES_OFF:
+			continue;
 		}
 
 		strbuf_addf(&submodule_path, "%s/%s", spf->work_tree, ce->name);
-- 
2.0.0.274.g6b2cd91


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] implement fetching of moved submodules
  2017-08-17 10:53 [RFC PATCH 1/2] implement fetching of moved submodules Heiko Voigt
  2017-08-17 11:00 ` [RFC PATCH 2/2] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
@ 2017-08-17 17:20 ` Stefan Beller
  2017-08-18 16:06   ` Heiko Voigt
  2017-09-15 13:18 ` [RFC PATCH v2 " Heiko Voigt
  2 siblings, 1 reply; 11+ messages in thread
From: Stefan Beller @ 2017-08-17 17:20 UTC (permalink / raw)
  To: Heiko Voigt
  Cc: git@vger.kernel.org, Jonathan Nieder, Jens Lehmann,
	Brandon Williams

On Thu, Aug 17, 2017 at 3:53 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> We store the changed submodules paths to calculate which submodule needs
> fetching. This does not work for moved submodules since their paths do
> not stay the same in case of a moved submodules. In case of new
> submodules we do not have a path in the current checkout, since they
> just appeared in this fetch.
>
> It is more general to collect the submodule names for changes instead of
> their paths to include the above cases.
>
> With the change described above we implement 'on-demand' fetching of
> changes in moved submodules.

This sounds as if this would also enable fetching new submodules
eventually?

> Note: This does only work when repositories have a .gitmodules file. In
> other words: It breaks if we do not get a name for a repository.
> IIRC, consensus was that this is a requirement to get nice submodule
> handling these days?

I think that should have been the consensus since ~1.7.8 (since the
submodules git dir can live inside the superprojects
<gitdir>/module/<name>).

A gitlink entry without corresponding .gitmodules entry is just a gitlink.
If we happen to have a repository at that path of the gitlink, we can
be nice and pretend like it is a functional submodule, but it really is
not. It's just another repo inside the superproject that happen to live
at the path of a gitlink.

> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> ---
>
> I updated the leftover code from my series implementing recursive fetch
> for moved submodules[1] to the current master.
>
> This breaks t5531 and t5545 because they do not use a .gitmodules file.
>
> I also have some code leftover that does fallback on paths in case no
> submodule names can be found. But I do not really like it. The question
> here is how far do we support not using .gitmodules. Is it e.g.
> reasonable to say: "For --recurse-submodules=on-demand you need a
> .gitmodules file?"

I would not intentionally break users here, but any new functionality can
safely assume (a) we have a proper .gitmodules entry or (b) it is not a
submodule, so do nothing/be extra careful.

For example in recursive diff sort of makes sense to also handle
non-submodule gitlinks, but fetch is harder to tell.

(just last night I was rereading
https://public-inbox.org/git/CAJo=hJvnAPNAdDcAAwAvU9C4RVeQdoS3Ev9WTguHx4fD0V_nOg@mail.gmail.com/
which I think is a super cute application of gitlinks. If you happen
to checkout such
a tree, you don't want to fetch all of the fake submodules)

>
> [1] https://public-inbox.org/git/f5baa2acc09531a16f4f693eebbe60706bb8ed1e.1361751905.git.hvoigt@hvoigt.net/

Oha, that is from way back in the time. :)

>  submodule.c                 | 92 +++++++++++++++++++++++++--------------------
>  t/t5526-fetch-submodules.sh | 35 +++++++++++++++++
>  2 files changed, 86 insertions(+), 41 deletions(-)
>
> diff --git a/submodule.c b/submodule.c
> index 27de65a..3ed78ac 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -23,7 +23,7 @@
>  static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
>  static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
>  static int parallel_jobs = 1;
> -static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
> +static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
>  static int initialized_fetch_ref_tips;
>  static struct oid_array ref_tips_before_fetch;
>  static struct oid_array ref_tips_after_fetch;
> @@ -742,11 +742,11 @@ const struct submodule *submodule_from_ce(const struct cache_entry *ce)
>  }
>
>  static struct oid_array *submodule_commits(struct string_list *submodules,
> -                                          const char *path)
> +                                          const char *name)
>  {
>         struct string_list_item *item;
>
> -       item = string_list_insert(submodules, path);
> +       item = string_list_insert(submodules, name);
>         if (item->util)
>                 return (struct oid_array *) item->util;
>
> @@ -755,39 +755,34 @@ static struct oid_array *submodule_commits(struct string_list *submodules,
>         return (struct oid_array *) item->util;
>  }
>
> +struct collect_changed_submodules_cb_data {
> +       struct string_list *changed;

Here a comment would be helpful or a more concise variable name.
(What is changed?)

> +       const struct object_id *commit_oid;
> +};
> +
>  static void collect_changed_submodules_cb(struct diff_queue_struct *q,
>                                           struct diff_options *options,
>                                           void *data)
>  {
> +       struct collect_changed_submodules_cb_data *me = data;
> +       struct string_list *changed = me->changed;
> +       const struct object_id *commit_oid = me->commit_oid;
>         int i;
> -       struct string_list *changed = data;
>
>         for (i = 0; i < q->nr; i++) {
>                 struct diff_filepair *p = q->queue[i];
>                 struct oid_array *commits;
> +               const struct submodule *submodule;

(optional style nit, personal opinion, feel free to ignore)
I personally prefer to not name variables exactly as their type.
Also most (all) of the struct submodule uses used 'sub' as
the variable name, maybe keep it consistent?

> +
>                 if (!S_ISGITLINK(p->two->mode))
>                         continue;
>
> -               if (S_ISGITLINK(p->one->mode)) {
> -                       /*
> -                        * NEEDSWORK: We should honor the name configured in
> -                        * the .gitmodules file of the commit we are examining
> -                        * here to be able to correctly follow submodules
> -                        * being moved around.
> -                        */
> -                       commits = submodule_commits(changed, p->two->path);
> -                       oid_array_append(commits, &p->two->oid);
> -               } else {
> -                       /* Submodule is new or was moved here */
> -                       /*
> -                        * NEEDSWORK: When the .git directories of submodules
> -                        * live inside the superprojects .git directory some
> -                        * day we should fetch new submodules directly into
> -                        * that location too when config or options request
> -                        * that so they can be checked out from there.
> -                        */
> +               submodule = submodule_from_path(commit_oid, p->two->path);
> +               if (!submodule)
>                         continue;
> -               }
> +
> +               commits = submodule_commits(changed, submodule->name);
> +               oid_array_append(commits, &p->two->oid);
>         }
>  }
>
> @@ -810,11 +805,14 @@ static void collect_changed_submodules(struct string_list *changed,
>
>         while ((commit = get_revision(&rev))) {
>                 struct rev_info diff_rev;
> +               struct collect_changed_submodules_cb_data data;
> +               data.changed = changed;
> +               data.commit_oid = &commit->object.oid;
>
>                 init_revisions(&diff_rev, NULL);
>                 diff_rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
>                 diff_rev.diffopt.format_callback = collect_changed_submodules_cb;
> -               diff_rev.diffopt.format_callback_data = changed;
> +               diff_rev.diffopt.format_callback_data = &data;
>                 diff_tree_combined_merge(commit, 1, &diff_rev);
>         }
>
> @@ -871,6 +869,7 @@ static int submodule_has_commits(const char *path, struct oid_array *commits)
>         oid_array_for_each_unique(commits, check_has_commit, &has_commit);
>
>         if (has_commit) {
> +

stray new line

>                 /*
>                  * Even if the submodule is checked out and the commit is
>                  * present, make sure it exists in the submodule's object store
> @@ -945,7 +944,7 @@ int find_unpushed_submodules(struct oid_array *commits,
>                 const char *remotes_name, struct string_list *needs_pushing)
>  {
>         struct string_list submodules = STRING_LIST_INIT_DUP;
> -       struct string_list_item *submodule;
> +       struct string_list_item *name;
>         struct argv_array argv = ARGV_ARRAY_INIT;
>
>         /* argv.argv[0] will be ignored by setup_revisions */
> @@ -956,12 +955,16 @@ int find_unpushed_submodules(struct oid_array *commits,
>
>         collect_changed_submodules(&submodules, &argv);
>
> -       for_each_string_list_item(submodule, &submodules) {
> -               struct oid_array *commits = submodule->util;
> -               const char *path = submodule->string;
> +       for_each_string_list_item(name, &submodules) {
> +               struct oid_array *commits = name->util;
> +               const struct submodule *submodule;
> +
> +               submodule = submodule_from_name(&null_oid, name->string);
> +               if (!submodule)
> +                       continue;
>
> -               if (submodule_needs_pushing(path, commits))
> -                       string_list_insert(needs_pushing, path);
> +               if (submodule_needs_pushing(submodule->path, commits))
> +                       string_list_insert(needs_pushing, submodule->path);

eventually we can also migrate to name here as well.
In a later patch.

>         }
>
>         free_submodules_oids(&submodules);
> @@ -1104,7 +1107,7 @@ static void calculate_changed_submodule_paths(void)
>  {
>         struct argv_array argv = ARGV_ARRAY_INIT;
>         struct string_list changed_submodules = STRING_LIST_INIT_DUP;
> -       const struct string_list_item *item;
> +       const struct string_list_item *name;
>
>         /* No need to check if there are no submodules configured */
>         if (!submodule_from_path(NULL, NULL))
> @@ -1119,16 +1122,20 @@ static void calculate_changed_submodule_paths(void)
>
>         /*
>          * Collect all submodules (whether checked out or not) for which new
> -        * commits have been recorded upstream in "changed_submodule_paths".
> +        * commits have been recorded upstream in "changed_submodule_names".
>          */
>         collect_changed_submodules(&changed_submodules, &argv);
>
> -       for_each_string_list_item(item, &changed_submodules) {
> -               struct oid_array *commits = item->util;
> -               const char *path = item->string;
> +       for_each_string_list_item(name, &changed_submodules) {
> +               struct oid_array *commits = name->util;
> +               const struct submodule *submodule;
> +
> +               submodule = submodule_from_name(&null_oid, name->string);
> +               if (!submodule)
> +                       continue;
>
> -               if (!submodule_has_commits(path, commits))
> -                       string_list_append(&changed_submodule_paths, path);
> +               if (!submodule_has_commits(submodule->path, commits))
> +                       string_list_append(&changed_submodule_names, name->string);
>         }
>
>         free_submodules_oids(&changed_submodules);
> @@ -1206,7 +1213,8 @@ static int get_next_submodule(struct child_process *cp,
>                                         continue;
>                                 if (submodule->fetch_recurse ==
>                                                 RECURSE_SUBMODULES_ON_DEMAND) {
> -                                       if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
> +                                       if (!unsorted_string_list_lookup(&changed_submodule_names,
> +                                                                        submodule->name))
>                                                 continue;
>                                         default_argv = "on-demand";
>                                 }
> @@ -1215,13 +1223,15 @@ static int get_next_submodule(struct child_process *cp,
>                                     gitmodules_is_unmerged)
>                                         continue;
>                                 if (config_fetch_recurse_submodules == RECURSE_SUBMODULES_ON_DEMAND) {
> -                                       if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
> +                                       if (!unsorted_string_list_lookup(&changed_submodule_names,
> +                                                                        submodule->name))
>                                                 continue;
>                                         default_argv = "on-demand";
>                                 }
>                         }
>                 } else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
> -                       if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
> +                       if (!unsorted_string_list_lookup(&changed_submodule_names,
> +                                                        submodule->name))
>                                 continue;
>                         default_argv = "on-demand";
>                 }
> @@ -1315,7 +1325,7 @@ int fetch_populated_submodules(const struct argv_array *options,
>
>         argv_array_clear(&spf.args);
>  out:
> -       string_list_clear(&changed_submodule_paths, 1);
> +       string_list_clear(&changed_submodule_names, 1);
>         return spf.result;
>  }
>
> diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
> index 162baf1..ce788e9 100755
> --- a/t/t5526-fetch-submodules.sh
> +++ b/t/t5526-fetch-submodules.sh
> @@ -530,4 +530,39 @@ test_expect_success 'fetching submodule into a broken repository' '
>         test_must_fail git -C dst fetch --recurse-submodules
>  '
>
> +test_expect_success "fetch new commits when submodule got renamed" '
> +       git clone . downstream_rename &&
> +       (
> +               cd downstream_rename &&
> +               git submodule update --init &&
> +# NEEDSWORK: we omitted --recursive for the submodule update here since
> +# that does not work. See test 7001 for mv "moving nested submodules"
> +# for details. Once that is fixed we should add the --recursive option
> +# here.
> +               git checkout -b rename &&
> +               git mv submodule submodule_renamed &&
> +               (
> +                       cd submodule_renamed &&
> +                       git checkout -b rename_sub &&
> +                       echo a >a &&
> +                       git add a &&
> +                       git commit -ma &&
> +                       git push origin rename_sub &&
> +                       git rev-parse HEAD >../../expect
> +               ) &&
> +               git add submodule_renamed &&
> +               git commit -m "update renamed submodule" &&
> +               git push origin rename
> +       ) &&
> +       (
> +               cd downstream &&
> +               git fetch --recurse-submodules=on-demand &&
> +               (
> +                       cd submodule &&
> +                       git rev-parse origin/rename_sub >../../actual
> +               )
> +       ) &&
> +       test_cmp expect actual
> +'
> +
>  test_done
> --
> 2.0.0.274.g6b2cd91
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] submodule: simplify decision tree whether to or not to fetch
  2017-08-17 11:00 ` [RFC PATCH 2/2] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
@ 2017-08-17 17:24   ` Stefan Beller
  2017-08-17 17:50     ` Brandon Williams
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Beller @ 2017-08-17 17:24 UTC (permalink / raw)
  To: Heiko Voigt
  Cc: git@vger.kernel.org, Jonathan Nieder, Jens Lehmann,
	Brandon Williams

On Thu, Aug 17, 2017 at 4:00 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> To make extending this logic later easier.
>
> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> ---
> I am quite sure I replicated the same logic but a few more eyes would be
> appreciated.

A code cleanup is appreciated!

I thought Brandon had a series in flight doing a very similar cleanup here,
but in master..pu there is nothing to be found.

> Cheers Heiko

The code looks good to me.

Cheers!
Stefan

>
>  submodule.c | 55 +++++++++++++++++++++++++++----------------------------
>  1 file changed, 27 insertions(+), 28 deletions(-)
>
> diff --git a/submodule.c b/submodule.c
> index 3ed78ac..a1011f4 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1171,6 +1171,21 @@ int submodule_touches_in_range(struct object_id *excl_oid,
>         return ret;
>  }
>
> +static int get_fetch_recurse_config(const struct submodule *submodule, int command_line_option)
> +{
> +       if (command_line_option != RECURSE_SUBMODULES_DEFAULT)
> +               return command_line_option;
> +
> +       if (submodule && submodule->fetch_recurse != RECURSE_SUBMODULES_NONE)
> +               /* local config overrules everything except commandline */
> +               return submodule->fetch_recurse;
> +
> +       if (gitmodules_is_unmerged)
> +               return RECURSE_SUBMODULES_OFF;
> +
> +       return config_fetch_recurse_submodules;
> +}
> +
>  struct submodule_parallel_fetch {
>         int count;
>         struct argv_array args;
> @@ -1203,37 +1218,21 @@ static int get_next_submodule(struct child_process *cp,
>                 if (!submodule)
>                         submodule = submodule_from_name(&null_oid, ce->name);
>
> -               default_argv = "yes";
> -               if (spf->command_line_option == RECURSE_SUBMODULES_DEFAULT) {
> -                       if (submodule &&
> -                           submodule->fetch_recurse !=
> -                                               RECURSE_SUBMODULES_NONE) {
> -                               if (submodule->fetch_recurse ==
> -                                               RECURSE_SUBMODULES_OFF)
> -                                       continue;
> -                               if (submodule->fetch_recurse ==
> -                                               RECURSE_SUBMODULES_ON_DEMAND) {
> -                                       if (!unsorted_string_list_lookup(&changed_submodule_names,
> -                                                                        submodule->name))
> -                                               continue;
> -                                       default_argv = "on-demand";
> -                               }
> -                       } else {
> -                               if ((config_fetch_recurse_submodules == RECURSE_SUBMODULES_OFF) ||
> -                                   gitmodules_is_unmerged)
> -                                       continue;
> -                               if (config_fetch_recurse_submodules == RECURSE_SUBMODULES_ON_DEMAND) {
> -                                       if (!unsorted_string_list_lookup(&changed_submodule_names,
> -                                                                        submodule->name))
> -                                               continue;
> -                                       default_argv = "on-demand";
> -                               }
> -                       }
> -               } else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
> -                       if (!unsorted_string_list_lookup(&changed_submodule_names,
> +               switch (get_fetch_recurse_config(submodule, spf->command_line_option))
> +               {
> +               default:
> +               case RECURSE_SUBMODULES_DEFAULT:
> +               case RECURSE_SUBMODULES_ON_DEMAND:
> +                       if (!submodule || !unsorted_string_list_lookup(&changed_submodule_names,
>                                                          submodule->name))
>                                 continue;
>                         default_argv = "on-demand";
> +                       break;
> +               case RECURSE_SUBMODULES_ON:
> +                       default_argv = "yes";
> +                       break;
> +               case RECURSE_SUBMODULES_OFF:
> +                       continue;
>                 }
>
>                 strbuf_addf(&submodule_path, "%s/%s", spf->work_tree, ce->name);
> --
> 2.0.0.274.g6b2cd91
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] submodule: simplify decision tree whether to or not to fetch
  2017-08-17 17:24   ` Stefan Beller
@ 2017-08-17 17:50     ` Brandon Williams
  2017-08-18 16:06       ` Heiko Voigt
  0 siblings, 1 reply; 11+ messages in thread
From: Brandon Williams @ 2017-08-17 17:50 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Heiko Voigt, git@vger.kernel.org, Jonathan Nieder, Jens Lehmann

On 08/17, Stefan Beller wrote:
> On Thu, Aug 17, 2017 at 4:00 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> > To make extending this logic later easier.
> >
> > Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> > ---
> > I am quite sure I replicated the same logic but a few more eyes would be
> > appreciated.
> 
> A code cleanup is appreciated!
> 
> I thought Brandon had a series in flight doing a very similar cleanup here,
> but in master..pu there is nothing to be found.

Yeah there are 2 series in flight which will probably conflict here.
bw/grep-recurse-submodules and bw/submodule-config-cleanup

> 
> > Cheers Heiko
> 
> The code looks good to me.
> 
> Cheers!
> Stefan
> 
> >
> >  submodule.c | 55 +++++++++++++++++++++++++++----------------------------
> >  1 file changed, 27 insertions(+), 28 deletions(-)
> >
> > diff --git a/submodule.c b/submodule.c
> > index 3ed78ac..a1011f4 100644
> > --- a/submodule.c
> > +++ b/submodule.c
> > @@ -1171,6 +1171,21 @@ int submodule_touches_in_range(struct object_id *excl_oid,
> >         return ret;
> >  }
> >
> > +static int get_fetch_recurse_config(const struct submodule *submodule, int command_line_option)
> > +{
> > +       if (command_line_option != RECURSE_SUBMODULES_DEFAULT)
> > +               return command_line_option;
> > +
> > +       if (submodule && submodule->fetch_recurse != RECURSE_SUBMODULES_NONE)
> > +               /* local config overrules everything except commandline */
> > +               return submodule->fetch_recurse;
> > +
> > +       if (gitmodules_is_unmerged)
> > +               return RECURSE_SUBMODULES_OFF;
> > +
> > +       return config_fetch_recurse_submodules;
> > +}
> > +
> >  struct submodule_parallel_fetch {
> >         int count;
> >         struct argv_array args;
> > @@ -1203,37 +1218,21 @@ static int get_next_submodule(struct child_process *cp,
> >                 if (!submodule)
> >                         submodule = submodule_from_name(&null_oid, ce->name);
> >
> > -               default_argv = "yes";
> > -               if (spf->command_line_option == RECURSE_SUBMODULES_DEFAULT) {
> > -                       if (submodule &&
> > -                           submodule->fetch_recurse !=
> > -                                               RECURSE_SUBMODULES_NONE) {
> > -                               if (submodule->fetch_recurse ==
> > -                                               RECURSE_SUBMODULES_OFF)
> > -                                       continue;
> > -                               if (submodule->fetch_recurse ==
> > -                                               RECURSE_SUBMODULES_ON_DEMAND) {
> > -                                       if (!unsorted_string_list_lookup(&changed_submodule_names,
> > -                                                                        submodule->name))
> > -                                               continue;
> > -                                       default_argv = "on-demand";
> > -                               }
> > -                       } else {
> > -                               if ((config_fetch_recurse_submodules == RECURSE_SUBMODULES_OFF) ||
> > -                                   gitmodules_is_unmerged)
> > -                                       continue;
> > -                               if (config_fetch_recurse_submodules == RECURSE_SUBMODULES_ON_DEMAND) {
> > -                                       if (!unsorted_string_list_lookup(&changed_submodule_names,
> > -                                                                        submodule->name))
> > -                                               continue;
> > -                                       default_argv = "on-demand";
> > -                               }
> > -                       }
> > -               } else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
> > -                       if (!unsorted_string_list_lookup(&changed_submodule_names,
> > +               switch (get_fetch_recurse_config(submodule, spf->command_line_option))
> > +               {
> > +               default:
> > +               case RECURSE_SUBMODULES_DEFAULT:
> > +               case RECURSE_SUBMODULES_ON_DEMAND:
> > +                       if (!submodule || !unsorted_string_list_lookup(&changed_submodule_names,
> >                                                          submodule->name))
> >                                 continue;
> >                         default_argv = "on-demand";
> > +                       break;
> > +               case RECURSE_SUBMODULES_ON:
> > +                       default_argv = "yes";
> > +                       break;
> > +               case RECURSE_SUBMODULES_OFF:
> > +                       continue;
> >                 }
> >
> >                 strbuf_addf(&submodule_path, "%s/%s", spf->work_tree, ce->name);
> > --
> > 2.0.0.274.g6b2cd91
> >

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/2] implement fetching of moved submodules
  2017-08-17 17:20 ` [RFC PATCH 1/2] implement fetching of moved submodules Stefan Beller
@ 2017-08-18 16:06   ` Heiko Voigt
  0 siblings, 0 replies; 11+ messages in thread
From: Heiko Voigt @ 2017-08-18 16:06 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git@vger.kernel.org, Jonathan Nieder, Jens Lehmann,
	Brandon Williams

On Thu, Aug 17, 2017 at 10:20:13AM -0700, Stefan Beller wrote:
> On Thu, Aug 17, 2017 at 3:53 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> > We store the changed submodules paths to calculate which submodule needs
> > fetching. This does not work for moved submodules since their paths do
> > not stay the same in case of a moved submodules. In case of new
> > submodules we do not have a path in the current checkout, since they
> > just appeared in this fetch.
> >
> > It is more general to collect the submodule names for changes instead of
> > their paths to include the above cases.
> >
> > With the change described above we implement 'on-demand' fetching of
> > changes in moved submodules.
> 
> This sounds as if this would also enable fetching new submodules
> eventually?

Yes that was the goal when starting with these changes back then. But it
took more time than I had back then. So instead of letting these changes
sit bitrot again lets see if we can get them integrated.

For new submodules we need to change the iteration somehow. Currently we
are iterating through the index. But new submodules obviously do not
have an index entry (otherwise they would not be new). So instead of the
index we will need to create another list that contains "all"
submodules. Maybe something like: all submodules from the index plus all
submodules that changed / are new? We could also go further and inspect
all submodules from all ref tips to handle submodules on other branches
configured to 'yes'. But I think we should leave that for later if need
arises.

Some merge of index and additional submodules is needed, because for
--recurse-submodules=yes or submodule.<name>.fetchRecurseSubmodules=yes
we always need to run fetch inside the submodule. That would break if we
only looked at submodules that are collected as changed.

> > Note: This does only work when repositories have a .gitmodules file. In
> > other words: It breaks if we do not get a name for a repository.
> > IIRC, consensus was that this is a requirement to get nice submodule
> > handling these days?
> 
> I think that should have been the consensus since ~1.7.8 (since the
> submodules git dir can live inside the superprojects
> <gitdir>/module/<name>).

I agree but since we started without it, we kind of have a mixed state.

> A gitlink entry without corresponding .gitmodules entry is just a gitlink.
> If we happen to have a repository at that path of the gitlink, we can
> be nice and pretend like it is a functional submodule, but it really is
> not. It's just another repo inside the superproject that happen to live
> at the path of a gitlink.

Yeah but at the moment we are handling 'on-demand' fetches and stuff for
such just gitlink submodules. If we were firm on that requirement we
would just skip those but that is not the case with the current
implementation.

> > Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> > ---
> >
> > I updated the leftover code from my series implementing recursive fetch
> > for moved submodules[1] to the current master.
> >
> > This breaks t5531 and t5545 because they do not use a .gitmodules file.
> >
> > I also have some code leftover that does fallback on paths in case no
> > submodule names can be found. But I do not really like it. The question
> > here is how far do we support not using .gitmodules. Is it e.g.
> > reasonable to say: "For --recurse-submodules=on-demand you need a
> > .gitmodules file?"
> 
> I would not intentionally break users here, but any new functionality can
> safely assume (a) we have a proper .gitmodules entry or (b) it is not a
> submodule, so do nothing/be extra careful.
> 
> For example in recursive diff sort of makes sense to also handle
> non-submodule gitlinks, but fetch is harder to tell.

Well we have a few different cases for gitlinks without .gitmodule
entry:

 1. New gitlink: We can not handle since we do not know where to clone
                 from.

 2. Removed gitlink: No need to do anything in fetch

 3. Changed (but same name) gitlink: We can / and currently do run fetch
                                     in it

 4. Renamed: We currently skip those. We could probably do something to
             track the rename and run fetch in case of gitlink changes.
             In my current approach only the ones with a name are
             handled.

So I guess I will add a fallback to paths for 3. so we do not
unnecessarily break users using the current implementation.

> (just last night I was rereading
> https://public-inbox.org/git/CAJo=hJvnAPNAdDcAAwAvU9C4RVeQdoS3Ev9WTguHx4fD0V_nOg@mail.gmail.com/
> which I think is a super cute application of gitlinks. If you happen
> to checkout such
> a tree, you don't want to fetch all of the fake submodules)
> 
> >
> > [1] https://public-inbox.org/git/f5baa2acc09531a16f4f693eebbe60706bb8ed1e.1361751905.git.hvoigt@hvoigt.net/
> 
> Oha, that is from way back in the time. :)

Yeah this code did go through some proper bitrotting :)

> >  submodule.c                 | 92 +++++++++++++++++++++++++--------------------
> >  t/t5526-fetch-submodules.sh | 35 +++++++++++++++++
> >  2 files changed, 86 insertions(+), 41 deletions(-)
> >
> > diff --git a/submodule.c b/submodule.c
> > index 27de65a..3ed78ac 100644
> > --- a/submodule.c
> > +++ b/submodule.c
> > @@ -23,7 +23,7 @@
> >  static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
> >  static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
> >  static int parallel_jobs = 1;
> > -static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
> > +static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
> >  static int initialized_fetch_ref_tips;
> >  static struct oid_array ref_tips_before_fetch;
> >  static struct oid_array ref_tips_after_fetch;
> > @@ -742,11 +742,11 @@ const struct submodule *submodule_from_ce(const struct cache_entry *ce)
> >  }
> >
> >  static struct oid_array *submodule_commits(struct string_list *submodules,
> > -                                          const char *path)
> > +                                          const char *name)
> >  {
> >         struct string_list_item *item;
> >
> > -       item = string_list_insert(submodules, path);
> > +       item = string_list_insert(submodules, name);
> >         if (item->util)
> >                 return (struct oid_array *) item->util;
> >
> > @@ -755,39 +755,34 @@ static struct oid_array *submodule_commits(struct string_list *submodules,
> >         return (struct oid_array *) item->util;
> >  }
> >
> > +struct collect_changed_submodules_cb_data {
> > +       struct string_list *changed;
> 
> Here a comment would be helpful or a more concise variable name.
> (What is changed?)

I'll change that to 'changed_submodules' the caller who is passing this
in called this changed. It is the list of changed submodules to be
filled.

> 
> > +       const struct object_id *commit_oid;

What about this name? It is the commit_oid in the superproject of the
current revision under investigation. IMO is easy to get confused what
commits are referenced superproject or submodule. Maybe
'super_commit_oid' would be more clear?

> > +};
> > +
> >  static void collect_changed_submodules_cb(struct diff_queue_struct *q,
> >                                           struct diff_options *options,
> >                                           void *data)
> >  {
> > +       struct collect_changed_submodules_cb_data *me = data;
> > +       struct string_list *changed = me->changed;
> > +       const struct object_id *commit_oid = me->commit_oid;
> >         int i;
> > -       struct string_list *changed = data;
> >
> >         for (i = 0; i < q->nr; i++) {
> >                 struct diff_filepair *p = q->queue[i];
> >                 struct oid_array *commits;
> > +               const struct submodule *submodule;
> 
> (optional style nit, personal opinion, feel free to ignore)
> I personally prefer to not name variables exactly as their type.
> Also most (all) of the struct submodule uses used 'sub' as
> the variable name, maybe keep it consistent?

Well I understand that and its similar for me but I personally I do not
like abbreviations for variable names since I like to be able to read
code natually. So that took precendence over naming submodule
differently than its type here :) Counting submodule vs. sub in
submodule.c I see 4 vs. 5 occurrences...  I'll think about it.

> > +
> >                 if (!S_ISGITLINK(p->two->mode))
> >                         continue;
> >
> > -               if (S_ISGITLINK(p->one->mode)) {
> > -                       /*
> > -                        * NEEDSWORK: We should honor the name configured in
> > -                        * the .gitmodules file of the commit we are examining
> > -                        * here to be able to correctly follow submodules
> > -                        * being moved around.
> > -                        */
> > -                       commits = submodule_commits(changed, p->two->path);
> > -                       oid_array_append(commits, &p->two->oid);
> > -               } else {
> > -                       /* Submodule is new or was moved here */
> > -                       /*
> > -                        * NEEDSWORK: When the .git directories of submodules
> > -                        * live inside the superprojects .git directory some
> > -                        * day we should fetch new submodules directly into
> > -                        * that location too when config or options request
> > -                        * that so they can be checked out from there.
> > -                        */
> > +               submodule = submodule_from_path(commit_oid, p->two->path);
> > +               if (!submodule)
> >                         continue;
> > -               }
> > +
> > +               commits = submodule_commits(changed, submodule->name);
> > +               oid_array_append(commits, &p->two->oid);
> >         }
> >  }
> >
> > @@ -810,11 +805,14 @@ static void collect_changed_submodules(struct string_list *changed,
> >
> >         while ((commit = get_revision(&rev))) {
> >                 struct rev_info diff_rev;
> > +               struct collect_changed_submodules_cb_data data;
> > +               data.changed = changed;
> > +               data.commit_oid = &commit->object.oid;
> >
> >                 init_revisions(&diff_rev, NULL);
> >                 diff_rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
> >                 diff_rev.diffopt.format_callback = collect_changed_submodules_cb;
> > -               diff_rev.diffopt.format_callback_data = changed;
> > +               diff_rev.diffopt.format_callback_data = &data;
> >                 diff_tree_combined_merge(commit, 1, &diff_rev);
> >         }
> >
> > @@ -871,6 +869,7 @@ static int submodule_has_commits(const char *path, struct oid_array *commits)
> >         oid_array_for_each_unique(commits, check_has_commit, &has_commit);
> >
> >         if (has_commit) {
> > +
> 
> stray new line

Yeah saw that too late. Will fix.

> >                 /*
> >                  * Even if the submodule is checked out and the commit is
> >                  * present, make sure it exists in the submodule's object store
> > @@ -945,7 +944,7 @@ int find_unpushed_submodules(struct oid_array *commits,
> >                 const char *remotes_name, struct string_list *needs_pushing)
> >  {
> >         struct string_list submodules = STRING_LIST_INIT_DUP;
> > -       struct string_list_item *submodule;
> > +       struct string_list_item *name;
> >         struct argv_array argv = ARGV_ARRAY_INIT;
> >
> >         /* argv.argv[0] will be ignored by setup_revisions */
> > @@ -956,12 +955,16 @@ int find_unpushed_submodules(struct oid_array *commits,
> >
> >         collect_changed_submodules(&submodules, &argv);
> >
> > -       for_each_string_list_item(submodule, &submodules) {
> > -               struct oid_array *commits = submodule->util;
> > -               const char *path = submodule->string;
> > +       for_each_string_list_item(name, &submodules) {
> > +               struct oid_array *commits = name->util;
> > +               const struct submodule *submodule;
> > +
> > +               submodule = submodule_from_name(&null_oid, name->string);
> > +               if (!submodule)
> > +                       continue;
> >
> > -               if (submodule_needs_pushing(path, commits))
> > -                       string_list_insert(needs_pushing, path);
> > +               if (submodule_needs_pushing(submodule->path, commits))
> > +                       string_list_insert(needs_pushing, submodule->path);
> 
> eventually we can also migrate to name here as well.
> In a later patch.

Yeah, then it would be possible to also push submodules without a
populated worktree.

There was this other thread about the is-populated-check in
push_submodules() where it was consensus that it does not make much
sense but for a maintainer integrating others work it might be useful to
not always have all submodules populated.

Cheers Heiko

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] submodule: simplify decision tree whether to or not to fetch
  2017-08-17 17:50     ` Brandon Williams
@ 2017-08-18 16:06       ` Heiko Voigt
  0 siblings, 0 replies; 11+ messages in thread
From: Heiko Voigt @ 2017-08-18 16:06 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Stefan Beller, git@vger.kernel.org, Jonathan Nieder, Jens Lehmann

On Thu, Aug 17, 2017 at 10:50:07AM -0700, Brandon Williams wrote:
> On 08/17, Stefan Beller wrote:
> > On Thu, Aug 17, 2017 at 4:00 AM, Heiko Voigt <hvoigt@hvoigt.net> wrote:
> > > To make extending this logic later easier.
> > >
> > > Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> > > ---
> > > I am quite sure I replicated the same logic but a few more eyes would be
> > > appreciated.
> > 
> > A code cleanup is appreciated!
> > 
> > I thought Brandon had a series in flight doing a very similar cleanup here,
> > but in master..pu there is nothing to be found.
> 
> Yeah there are 2 series in flight which will probably conflict here.
> bw/grep-recurse-submodules and bw/submodule-config-cleanup

Ok then I will wait until those are in and then see if I can base the
cleanup on top. I think it is only necessary as a preparation for the
fully fledged fetch configuration logic mess we will get into once we
get to the full recursive submodule fetch implementation. Not
necessarily needed for the moved submodules.

> > 
> > The code looks good to me.

Thanks.

Cheers Heiko

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH v2 1/2] implement fetching of moved submodules
  2017-08-17 10:53 [RFC PATCH 1/2] implement fetching of moved submodules Heiko Voigt
  2017-08-17 11:00 ` [RFC PATCH 2/2] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
  2017-08-17 17:20 ` [RFC PATCH 1/2] implement fetching of moved submodules Stefan Beller
@ 2017-09-15 13:18 ` Heiko Voigt
  2017-09-15 13:20   ` [RFC PATCH v2 2/2] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
  2017-09-18 16:49   ` [RFC PATCH v2 1/2] implement fetching of moved submodules Brandon Williams
  2 siblings, 2 replies; 11+ messages in thread
From: Heiko Voigt @ 2017-09-15 13:18 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, Jens.Lehmann, Brandon Williams

We store the changed submodules paths to calculate which submodule needs
fetching. This does not work for moved submodules since their paths do
not stay the same in case of a moved submodules. In case of new
submodules we do not have a path in the current checkout, since they
just appeared in this fetch.

It is more general to collect the submodule names for changes instead of
their paths to include the above cases.

With the change described above we implement 'on-demand' fetching of
changes in moved submodules.

Note: This does only work when repositories have a .gitmodules file. In
other words: It breaks if we do not get a name for a repository.
IIRC, consensus was that this is a requirement to get nice submodule
handling these days?

NEEDSWORK: This breaks t5531 and t5545 because they do not use a
.gitmodules file. I will add a fallback to paths to help such users.

Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
---
This an update of the previous series[1] to the current master. The
fallback is still missing but now it should not conflict with any topics
in flight anymore (hopefully).

Cheers Heiko

[1] https://public-inbox.org/git/20170817105349.GC52233@book.hvoigt.net/

 submodule.c                 | 91 +++++++++++++++++++++++++--------------------
 t/t5526-fetch-submodules.sh | 35 +++++++++++++++++
 2 files changed, 85 insertions(+), 41 deletions(-)

diff --git a/submodule.c b/submodule.c
index 3cea8221e0..38b9905e43 100644
--- a/submodule.c
+++ b/submodule.c
@@ -21,7 +21,7 @@
 #include "parse-options.h"
 
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
-static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
+static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
 static struct oid_array ref_tips_before_fetch;
 static struct oid_array ref_tips_after_fetch;
@@ -667,11 +667,11 @@ const struct submodule *submodule_from_ce(const struct cache_entry *ce)
 }
 
 static struct oid_array *submodule_commits(struct string_list *submodules,
-					   const char *path)
+					   const char *name)
 {
 	struct string_list_item *item;
 
-	item = string_list_insert(submodules, path);
+	item = string_list_insert(submodules, name);
 	if (item->util)
 		return (struct oid_array *) item->util;
 
@@ -680,39 +680,34 @@ static struct oid_array *submodule_commits(struct string_list *submodules,
 	return (struct oid_array *) item->util;
 }
 
+struct collect_changed_submodules_cb_data {
+	struct string_list *changed;
+	const struct object_id *commit_oid;
+};
+
 static void collect_changed_submodules_cb(struct diff_queue_struct *q,
 					  struct diff_options *options,
 					  void *data)
 {
+	struct collect_changed_submodules_cb_data *me = data;
+	struct string_list *changed = me->changed;
+	const struct object_id *commit_oid = me->commit_oid;
 	int i;
-	struct string_list *changed = data;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		struct oid_array *commits;
+		const struct submodule *submodule;
+
 		if (!S_ISGITLINK(p->two->mode))
 			continue;
 
-		if (S_ISGITLINK(p->one->mode)) {
-			/*
-			 * NEEDSWORK: We should honor the name configured in
-			 * the .gitmodules file of the commit we are examining
-			 * here to be able to correctly follow submodules
-			 * being moved around.
-			 */
-			commits = submodule_commits(changed, p->two->path);
-			oid_array_append(commits, &p->two->oid);
-		} else {
-			/* Submodule is new or was moved here */
-			/*
-			 * NEEDSWORK: When the .git directories of submodules
-			 * live inside the superprojects .git directory some
-			 * day we should fetch new submodules directly into
-			 * that location too when config or options request
-			 * that so they can be checked out from there.
-			 */
+		submodule = submodule_from_path(commit_oid, p->two->path);
+		if (!submodule)
 			continue;
-		}
+
+		commits = submodule_commits(changed, submodule->name);
+		oid_array_append(commits, &p->two->oid);
 	}
 }
 
@@ -735,11 +730,14 @@ static void collect_changed_submodules(struct string_list *changed,
 
 	while ((commit = get_revision(&rev))) {
 		struct rev_info diff_rev;
+		struct collect_changed_submodules_cb_data data;
+		data.changed = changed;
+		data.commit_oid = &commit->object.oid;
 
 		init_revisions(&diff_rev, NULL);
 		diff_rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
 		diff_rev.diffopt.format_callback = collect_changed_submodules_cb;
-		diff_rev.diffopt.format_callback_data = changed;
+		diff_rev.diffopt.format_callback_data = &data;
 		diff_tree_combined_merge(commit, 1, &diff_rev);
 	}
 
@@ -870,7 +868,7 @@ int find_unpushed_submodules(struct oid_array *commits,
 		const char *remotes_name, struct string_list *needs_pushing)
 {
 	struct string_list submodules = STRING_LIST_INIT_DUP;
-	struct string_list_item *submodule;
+	struct string_list_item *name;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 
 	/* argv.argv[0] will be ignored by setup_revisions */
@@ -881,12 +879,16 @@ int find_unpushed_submodules(struct oid_array *commits,
 
 	collect_changed_submodules(&submodules, &argv);
 
-	for_each_string_list_item(submodule, &submodules) {
-		struct oid_array *commits = submodule->util;
-		const char *path = submodule->string;
+	for_each_string_list_item(name, &submodules) {
+		struct oid_array *commits = name->util;
+		const struct submodule *submodule;
+
+		submodule = submodule_from_name(&null_oid, name->string);
+		if (!submodule)
+			continue;
 
-		if (submodule_needs_pushing(path, commits))
-			string_list_insert(needs_pushing, path);
+		if (submodule_needs_pushing(submodule->path, commits))
+			string_list_insert(needs_pushing, submodule->path);
 	}
 
 	free_submodules_oids(&submodules);
@@ -1041,7 +1043,7 @@ static void calculate_changed_submodule_paths(void)
 {
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct string_list changed_submodules = STRING_LIST_INIT_DUP;
-	const struct string_list_item *item;
+	const struct string_list_item *name;
 
 	/* No need to check if there are no submodules configured */
 	if (!submodule_from_path(NULL, NULL))
@@ -1056,16 +1058,20 @@ static void calculate_changed_submodule_paths(void)
 
 	/*
 	 * Collect all submodules (whether checked out or not) for which new
-	 * commits have been recorded upstream in "changed_submodule_paths".
+	 * commits have been recorded upstream in "changed_submodule_names".
 	 */
 	collect_changed_submodules(&changed_submodules, &argv);
 
-	for_each_string_list_item(item, &changed_submodules) {
-		struct oid_array *commits = item->util;
-		const char *path = item->string;
+	for_each_string_list_item(name, &changed_submodules) {
+		struct oid_array *commits = name->util;
+		const struct submodule *submodule;
+
+		submodule = submodule_from_name(&null_oid, name->string);
+		if (!submodule)
+			continue;
 
-		if (!submodule_has_commits(path, commits))
-			string_list_append(&changed_submodule_paths, path);
+		if (!submodule_has_commits(submodule->path, commits))
+			string_list_append(&changed_submodule_names, name->string);
 	}
 
 	free_submodules_oids(&changed_submodules);
@@ -1151,7 +1157,8 @@ static int get_next_submodule(struct child_process *cp,
 				if (fetch_recurse == RECURSE_SUBMODULES_OFF)
 					continue;
 				if (fetch_recurse == RECURSE_SUBMODULES_ON_DEMAND) {
-					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
+					if (!unsorted_string_list_lookup(&changed_submodule_names,
+									 submodule->name))
 						continue;
 					default_argv = "on-demand";
 				}
@@ -1159,13 +1166,15 @@ static int get_next_submodule(struct child_process *cp,
 				if (spf->default_option == RECURSE_SUBMODULES_OFF)
 					continue;
 				if (spf->default_option == RECURSE_SUBMODULES_ON_DEMAND) {
-					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
+					if (!unsorted_string_list_lookup(&changed_submodule_names,
+									  submodule->name))
 						continue;
 					default_argv = "on-demand";
 				}
 			}
 		} else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
-			if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
+			if (!unsorted_string_list_lookup(&changed_submodule_names,
+							 submodule->name))
 				continue;
 			default_argv = "on-demand";
 		}
@@ -1258,7 +1267,7 @@ int fetch_populated_submodules(const struct argv_array *options,
 
 	argv_array_clear(&spf.args);
 out:
-	string_list_clear(&changed_submodule_paths, 1);
+	string_list_clear(&changed_submodule_names, 1);
 	return spf.result;
 }
 
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 42251f7f3a..22a7358b6f 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -530,4 +530,39 @@ test_expect_success 'fetching submodule into a broken repository' '
 	test_must_fail git -C dst fetch --recurse-submodules
 '
 
+test_expect_success "fetch new commits when submodule got renamed" '
+	git clone . downstream_rename &&
+	(
+		cd downstream_rename &&
+		git submodule update --init &&
+# NEEDSWORK: we omitted --recursive for the submodule update here since
+# that does not work. See test 7001 for mv "moving nested submodules"
+# for details. Once that is fixed we should add the --recursive option
+# here.
+		git checkout -b rename &&
+		git mv submodule submodule_renamed &&
+		(
+			cd submodule_renamed &&
+			git checkout -b rename_sub &&
+			echo a >a &&
+			git add a &&
+			git commit -ma &&
+			git push origin rename_sub &&
+			git rev-parse HEAD >../../expect
+		) &&
+		git add submodule_renamed &&
+		git commit -m "update renamed submodule" &&
+		git push origin rename
+	) &&
+	(
+		cd downstream &&
+		git fetch --recurse-submodules=on-demand &&
+		(
+			cd submodule &&
+			git rev-parse origin/rename_sub >../../actual
+		)
+	) &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.14.1.145.gb3622a4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH v2 2/2] submodule: simplify decision tree whether to or not to fetch
  2017-09-15 13:18 ` [RFC PATCH v2 " Heiko Voigt
@ 2017-09-15 13:20   ` Heiko Voigt
  2017-09-18 16:47     ` Brandon Williams
  2017-09-18 16:49   ` [RFC PATCH v2 1/2] implement fetching of moved submodules Brandon Williams
  1 sibling, 1 reply; 11+ messages in thread
From: Heiko Voigt @ 2017-09-15 13:20 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, Jens.Lehmann, Brandon Williams

To make extending this logic later easier.

Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
---
 submodule.c | 74 ++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 37 insertions(+), 37 deletions(-)

diff --git a/submodule.c b/submodule.c
index 38b9905e43..fa44fc59f2 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1118,6 +1118,31 @@ struct submodule_parallel_fetch {
 };
 #define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0}
 
+static int get_fetch_recurse_config(const struct submodule *submodule,
+				    struct submodule_parallel_fetch *spf)
+{
+	if (spf->command_line_option != RECURSE_SUBMODULES_DEFAULT)
+		return spf->command_line_option;
+
+	if (submodule) {
+		char *key;
+		const char *value;
+
+		int fetch_recurse = submodule->fetch_recurse;
+		key = xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name);
+		if (!repo_config_get_string_const(the_repository, key, &value)) {
+			fetch_recurse = parse_fetch_recurse_submodules_arg(key, value);
+		}
+		free(key);
+
+		if (fetch_recurse != RECURSE_SUBMODULES_NONE)
+			/* local config overrules everything except commandline */
+			return fetch_recurse;
+	}
+
+	return spf->default_option;
+}
+
 static int get_next_submodule(struct child_process *cp,
 			      struct strbuf *err, void *data, void **task_cb)
 {
@@ -1137,46 +1162,21 @@ static int get_next_submodule(struct child_process *cp,
 
 		submodule = submodule_from_path(&null_oid, ce->name);
 
-		default_argv = "yes";
-		if (spf->command_line_option == RECURSE_SUBMODULES_DEFAULT) {
-			int fetch_recurse = RECURSE_SUBMODULES_NONE;
-
-			if (submodule) {
-				char *key;
-				const char *value;
-
-				fetch_recurse = submodule->fetch_recurse;
-				key = xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name);
-				if (!repo_config_get_string_const(the_repository, key, &value)) {
-					fetch_recurse = parse_fetch_recurse_submodules_arg(key, value);
-				}
-				free(key);
-			}
-
-			if (fetch_recurse != RECURSE_SUBMODULES_NONE) {
-				if (fetch_recurse == RECURSE_SUBMODULES_OFF)
-					continue;
-				if (fetch_recurse == RECURSE_SUBMODULES_ON_DEMAND) {
-					if (!unsorted_string_list_lookup(&changed_submodule_names,
-									 submodule->name))
-						continue;
-					default_argv = "on-demand";
-				}
-			} else {
-				if (spf->default_option == RECURSE_SUBMODULES_OFF)
-					continue;
-				if (spf->default_option == RECURSE_SUBMODULES_ON_DEMAND) {
-					if (!unsorted_string_list_lookup(&changed_submodule_names,
-									  submodule->name))
-						continue;
-					default_argv = "on-demand";
-				}
-			}
-		} else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
-			if (!unsorted_string_list_lookup(&changed_submodule_names,
+		switch (get_fetch_recurse_config(submodule, spf))
+		{
+		default:
+		case RECURSE_SUBMODULES_DEFAULT:
+		case RECURSE_SUBMODULES_ON_DEMAND:
+			if (!submodule || !unsorted_string_list_lookup(&changed_submodule_names,
 							 submodule->name))
 				continue;
 			default_argv = "on-demand";
+			break;
+		case RECURSE_SUBMODULES_ON:
+			default_argv = "yes";
+			break;
+		case RECURSE_SUBMODULES_OFF:
+			continue;
 		}
 
 		strbuf_addf(&submodule_path, "%s/%s", spf->work_tree, ce->name);
-- 
2.14.1.145.gb3622a4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v2 2/2] submodule: simplify decision tree whether to or not to fetch
  2017-09-15 13:20   ` [RFC PATCH v2 2/2] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
@ 2017-09-18 16:47     ` Brandon Williams
  0 siblings, 0 replies; 11+ messages in thread
From: Brandon Williams @ 2017-09-18 16:47 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git, sbeller, jrnieder, Jens.Lehmann

On 09/15, Heiko Voigt wrote:
> To make extending this logic later easier.
> 
> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>

I like the result of this patch, its much cleaner.

> ---
>  submodule.c | 74 ++++++++++++++++++++++++++++++-------------------------------
>  1 file changed, 37 insertions(+), 37 deletions(-)
> 
> diff --git a/submodule.c b/submodule.c
> index 38b9905e43..fa44fc59f2 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -1118,6 +1118,31 @@ struct submodule_parallel_fetch {
>  };
>  #define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0}
>  
> +static int get_fetch_recurse_config(const struct submodule *submodule,
> +				    struct submodule_parallel_fetch *spf)
> +{
> +	if (spf->command_line_option != RECURSE_SUBMODULES_DEFAULT)
> +		return spf->command_line_option;
> +
> +	if (submodule) {
> +		char *key;
> +		const char *value;
> +
> +		int fetch_recurse = submodule->fetch_recurse;
> +		key = xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name);
> +		if (!repo_config_get_string_const(the_repository, key, &value)) {
> +			fetch_recurse = parse_fetch_recurse_submodules_arg(key, value);
> +		}
> +		free(key);
> +
> +		if (fetch_recurse != RECURSE_SUBMODULES_NONE)
> +			/* local config overrules everything except commandline */
> +			return fetch_recurse;
> +	}
> +
> +	return spf->default_option;
> +}
> +
>  static int get_next_submodule(struct child_process *cp,
>  			      struct strbuf *err, void *data, void **task_cb)
>  {
> @@ -1137,46 +1162,21 @@ static int get_next_submodule(struct child_process *cp,
>  
>  		submodule = submodule_from_path(&null_oid, ce->name);
>  
> -		default_argv = "yes";
> -		if (spf->command_line_option == RECURSE_SUBMODULES_DEFAULT) {
> -			int fetch_recurse = RECURSE_SUBMODULES_NONE;
> -
> -			if (submodule) {
> -				char *key;
> -				const char *value;
> -
> -				fetch_recurse = submodule->fetch_recurse;
> -				key = xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name);
> -				if (!repo_config_get_string_const(the_repository, key, &value)) {
> -					fetch_recurse = parse_fetch_recurse_submodules_arg(key, value);
> -				}
> -				free(key);
> -			}
> -
> -			if (fetch_recurse != RECURSE_SUBMODULES_NONE) {
> -				if (fetch_recurse == RECURSE_SUBMODULES_OFF)
> -					continue;
> -				if (fetch_recurse == RECURSE_SUBMODULES_ON_DEMAND) {
> -					if (!unsorted_string_list_lookup(&changed_submodule_names,
> -									 submodule->name))
> -						continue;
> -					default_argv = "on-demand";
> -				}
> -			} else {
> -				if (spf->default_option == RECURSE_SUBMODULES_OFF)
> -					continue;
> -				if (spf->default_option == RECURSE_SUBMODULES_ON_DEMAND) {
> -					if (!unsorted_string_list_lookup(&changed_submodule_names,
> -									  submodule->name))
> -						continue;
> -					default_argv = "on-demand";
> -				}
> -			}
> -		} else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
> -			if (!unsorted_string_list_lookup(&changed_submodule_names,
> +		switch (get_fetch_recurse_config(submodule, spf))
> +		{
> +		default:
> +		case RECURSE_SUBMODULES_DEFAULT:
> +		case RECURSE_SUBMODULES_ON_DEMAND:
> +			if (!submodule || !unsorted_string_list_lookup(&changed_submodule_names,
>  							 submodule->name))
>  				continue;
>  			default_argv = "on-demand";
> +			break;
> +		case RECURSE_SUBMODULES_ON:
> +			default_argv = "yes";
> +			break;
> +		case RECURSE_SUBMODULES_OFF:
> +			continue;
>  		}
>  
>  		strbuf_addf(&submodule_path, "%s/%s", spf->work_tree, ce->name);
> -- 
> 2.14.1.145.gb3622a4
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH v2 1/2] implement fetching of moved submodules
  2017-09-15 13:18 ` [RFC PATCH v2 " Heiko Voigt
  2017-09-15 13:20   ` [RFC PATCH v2 2/2] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
@ 2017-09-18 16:49   ` Brandon Williams
  1 sibling, 0 replies; 11+ messages in thread
From: Brandon Williams @ 2017-09-18 16:49 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: git, sbeller, jrnieder, Jens.Lehmann

On 09/15, Heiko Voigt wrote:
> We store the changed submodules paths to calculate which submodule needs
> fetching. This does not work for moved submodules since their paths do
> not stay the same in case of a moved submodules. In case of new
> submodules we do not have a path in the current checkout, since they
> just appeared in this fetch.
> 
> It is more general to collect the submodule names for changes instead of
> their paths to include the above cases.
> 
> With the change described above we implement 'on-demand' fetching of
> changes in moved submodules.
> 
> Note: This does only work when repositories have a .gitmodules file. In
> other words: It breaks if we do not get a name for a repository.
> IIRC, consensus was that this is a requirement to get nice submodule
> handling these days?
> 
> NEEDSWORK: This breaks t5531 and t5545 because they do not use a
> .gitmodules file. I will add a fallback to paths to help such users.
> 
> Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
> ---
> This an update of the previous series[1] to the current master. The
> fallback is still missing but now it should not conflict with any topics
> in flight anymore (hopefully).

So the idea is to collect changed submodule's name, instead of their
path, so that if they happened to moved you don't have to worry about
the path changing underneath you.  This should be good once those tests
get fixed.

Thanks for working on cleaning this up! :)

> 
> Cheers Heiko
> 
> [1] https://public-inbox.org/git/20170817105349.GC52233@book.hvoigt.net/
> 
>  submodule.c                 | 91 +++++++++++++++++++++++++--------------------
>  t/t5526-fetch-submodules.sh | 35 +++++++++++++++++
>  2 files changed, 85 insertions(+), 41 deletions(-)
> 
> diff --git a/submodule.c b/submodule.c
> index 3cea8221e0..38b9905e43 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -21,7 +21,7 @@
>  #include "parse-options.h"
>  
>  static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
> -static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
> +static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP;
>  static int initialized_fetch_ref_tips;
>  static struct oid_array ref_tips_before_fetch;
>  static struct oid_array ref_tips_after_fetch;
> @@ -667,11 +667,11 @@ const struct submodule *submodule_from_ce(const struct cache_entry *ce)
>  }
>  
>  static struct oid_array *submodule_commits(struct string_list *submodules,
> -					   const char *path)
> +					   const char *name)
>  {
>  	struct string_list_item *item;
>  
> -	item = string_list_insert(submodules, path);
> +	item = string_list_insert(submodules, name);
>  	if (item->util)
>  		return (struct oid_array *) item->util;
>  
> @@ -680,39 +680,34 @@ static struct oid_array *submodule_commits(struct string_list *submodules,
>  	return (struct oid_array *) item->util;
>  }
>  
> +struct collect_changed_submodules_cb_data {
> +	struct string_list *changed;
> +	const struct object_id *commit_oid;
> +};
> +
>  static void collect_changed_submodules_cb(struct diff_queue_struct *q,
>  					  struct diff_options *options,
>  					  void *data)
>  {
> +	struct collect_changed_submodules_cb_data *me = data;
> +	struct string_list *changed = me->changed;
> +	const struct object_id *commit_oid = me->commit_oid;
>  	int i;
> -	struct string_list *changed = data;
>  
>  	for (i = 0; i < q->nr; i++) {
>  		struct diff_filepair *p = q->queue[i];
>  		struct oid_array *commits;
> +		const struct submodule *submodule;
> +
>  		if (!S_ISGITLINK(p->two->mode))
>  			continue;
>  
> -		if (S_ISGITLINK(p->one->mode)) {
> -			/*
> -			 * NEEDSWORK: We should honor the name configured in
> -			 * the .gitmodules file of the commit we are examining
> -			 * here to be able to correctly follow submodules
> -			 * being moved around.
> -			 */
> -			commits = submodule_commits(changed, p->two->path);
> -			oid_array_append(commits, &p->two->oid);
> -		} else {
> -			/* Submodule is new or was moved here */
> -			/*
> -			 * NEEDSWORK: When the .git directories of submodules
> -			 * live inside the superprojects .git directory some
> -			 * day we should fetch new submodules directly into
> -			 * that location too when config or options request
> -			 * that so they can be checked out from there.
> -			 */
> +		submodule = submodule_from_path(commit_oid, p->two->path);
> +		if (!submodule)
>  			continue;
> -		}
> +
> +		commits = submodule_commits(changed, submodule->name);
> +		oid_array_append(commits, &p->two->oid);
>  	}
>  }
>  
> @@ -735,11 +730,14 @@ static void collect_changed_submodules(struct string_list *changed,
>  
>  	while ((commit = get_revision(&rev))) {
>  		struct rev_info diff_rev;
> +		struct collect_changed_submodules_cb_data data;
> +		data.changed = changed;
> +		data.commit_oid = &commit->object.oid;
>  
>  		init_revisions(&diff_rev, NULL);
>  		diff_rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
>  		diff_rev.diffopt.format_callback = collect_changed_submodules_cb;
> -		diff_rev.diffopt.format_callback_data = changed;
> +		diff_rev.diffopt.format_callback_data = &data;
>  		diff_tree_combined_merge(commit, 1, &diff_rev);
>  	}
>  
> @@ -870,7 +868,7 @@ int find_unpushed_submodules(struct oid_array *commits,
>  		const char *remotes_name, struct string_list *needs_pushing)
>  {
>  	struct string_list submodules = STRING_LIST_INIT_DUP;
> -	struct string_list_item *submodule;
> +	struct string_list_item *name;
>  	struct argv_array argv = ARGV_ARRAY_INIT;
>  
>  	/* argv.argv[0] will be ignored by setup_revisions */
> @@ -881,12 +879,16 @@ int find_unpushed_submodules(struct oid_array *commits,
>  
>  	collect_changed_submodules(&submodules, &argv);
>  
> -	for_each_string_list_item(submodule, &submodules) {
> -		struct oid_array *commits = submodule->util;
> -		const char *path = submodule->string;
> +	for_each_string_list_item(name, &submodules) {
> +		struct oid_array *commits = name->util;
> +		const struct submodule *submodule;
> +
> +		submodule = submodule_from_name(&null_oid, name->string);
> +		if (!submodule)
> +			continue;
>  
> -		if (submodule_needs_pushing(path, commits))
> -			string_list_insert(needs_pushing, path);
> +		if (submodule_needs_pushing(submodule->path, commits))
> +			string_list_insert(needs_pushing, submodule->path);
>  	}
>  
>  	free_submodules_oids(&submodules);
> @@ -1041,7 +1043,7 @@ static void calculate_changed_submodule_paths(void)
>  {
>  	struct argv_array argv = ARGV_ARRAY_INIT;
>  	struct string_list changed_submodules = STRING_LIST_INIT_DUP;
> -	const struct string_list_item *item;
> +	const struct string_list_item *name;
>  
>  	/* No need to check if there are no submodules configured */
>  	if (!submodule_from_path(NULL, NULL))
> @@ -1056,16 +1058,20 @@ static void calculate_changed_submodule_paths(void)
>  
>  	/*
>  	 * Collect all submodules (whether checked out or not) for which new
> -	 * commits have been recorded upstream in "changed_submodule_paths".
> +	 * commits have been recorded upstream in "changed_submodule_names".
>  	 */
>  	collect_changed_submodules(&changed_submodules, &argv);
>  
> -	for_each_string_list_item(item, &changed_submodules) {
> -		struct oid_array *commits = item->util;
> -		const char *path = item->string;
> +	for_each_string_list_item(name, &changed_submodules) {
> +		struct oid_array *commits = name->util;
> +		const struct submodule *submodule;
> +
> +		submodule = submodule_from_name(&null_oid, name->string);
> +		if (!submodule)
> +			continue;
>  
> -		if (!submodule_has_commits(path, commits))
> -			string_list_append(&changed_submodule_paths, path);
> +		if (!submodule_has_commits(submodule->path, commits))
> +			string_list_append(&changed_submodule_names, name->string);
>  	}
>  
>  	free_submodules_oids(&changed_submodules);
> @@ -1151,7 +1157,8 @@ static int get_next_submodule(struct child_process *cp,
>  				if (fetch_recurse == RECURSE_SUBMODULES_OFF)
>  					continue;
>  				if (fetch_recurse == RECURSE_SUBMODULES_ON_DEMAND) {
> -					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
> +					if (!unsorted_string_list_lookup(&changed_submodule_names,
> +									 submodule->name))
>  						continue;
>  					default_argv = "on-demand";
>  				}
> @@ -1159,13 +1166,15 @@ static int get_next_submodule(struct child_process *cp,
>  				if (spf->default_option == RECURSE_SUBMODULES_OFF)
>  					continue;
>  				if (spf->default_option == RECURSE_SUBMODULES_ON_DEMAND) {
> -					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
> +					if (!unsorted_string_list_lookup(&changed_submodule_names,
> +									  submodule->name))
>  						continue;
>  					default_argv = "on-demand";
>  				}
>  			}
>  		} else if (spf->command_line_option == RECURSE_SUBMODULES_ON_DEMAND) {
> -			if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
> +			if (!unsorted_string_list_lookup(&changed_submodule_names,
> +							 submodule->name))
>  				continue;
>  			default_argv = "on-demand";
>  		}
> @@ -1258,7 +1267,7 @@ int fetch_populated_submodules(const struct argv_array *options,
>  
>  	argv_array_clear(&spf.args);
>  out:
> -	string_list_clear(&changed_submodule_paths, 1);
> +	string_list_clear(&changed_submodule_names, 1);
>  	return spf.result;
>  }
>  
> diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
> index 42251f7f3a..22a7358b6f 100755
> --- a/t/t5526-fetch-submodules.sh
> +++ b/t/t5526-fetch-submodules.sh
> @@ -530,4 +530,39 @@ test_expect_success 'fetching submodule into a broken repository' '
>  	test_must_fail git -C dst fetch --recurse-submodules
>  '
>  
> +test_expect_success "fetch new commits when submodule got renamed" '
> +	git clone . downstream_rename &&
> +	(
> +		cd downstream_rename &&
> +		git submodule update --init &&
> +# NEEDSWORK: we omitted --recursive for the submodule update here since
> +# that does not work. See test 7001 for mv "moving nested submodules"
> +# for details. Once that is fixed we should add the --recursive option
> +# here.
> +		git checkout -b rename &&
> +		git mv submodule submodule_renamed &&
> +		(
> +			cd submodule_renamed &&
> +			git checkout -b rename_sub &&
> +			echo a >a &&
> +			git add a &&
> +			git commit -ma &&
> +			git push origin rename_sub &&
> +			git rev-parse HEAD >../../expect
> +		) &&
> +		git add submodule_renamed &&
> +		git commit -m "update renamed submodule" &&
> +		git push origin rename
> +	) &&
> +	(
> +		cd downstream &&
> +		git fetch --recurse-submodules=on-demand &&
> +		(
> +			cd submodule &&
> +			git rev-parse origin/rename_sub >../../actual
> +		)
> +	) &&
> +	test_cmp expect actual
> +'
> +
>  test_done
> -- 
> 2.14.1.145.gb3622a4
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-09-18 16:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-17 10:53 [RFC PATCH 1/2] implement fetching of moved submodules Heiko Voigt
2017-08-17 11:00 ` [RFC PATCH 2/2] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
2017-08-17 17:24   ` Stefan Beller
2017-08-17 17:50     ` Brandon Williams
2017-08-18 16:06       ` Heiko Voigt
2017-08-17 17:20 ` [RFC PATCH 1/2] implement fetching of moved submodules Stefan Beller
2017-08-18 16:06   ` Heiko Voigt
2017-09-15 13:18 ` [RFC PATCH v2 " Heiko Voigt
2017-09-15 13:20   ` [RFC PATCH v2 2/2] submodule: simplify decision tree whether to or not to fetch Heiko Voigt
2017-09-18 16:47     ` Brandon Williams
2017-09-18 16:49   ` [RFC PATCH v2 1/2] implement fetching of moved submodules Brandon Williams

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).