git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* Bug in Submodule add
@ 2012-09-26  4:18 Jonathan Johnson
  2012-09-26 20:56 ` Jens Lehmann
  0 siblings, 1 reply; 9+ messages in thread
From: Jonathan Johnson @ 2012-09-26  4:18 UTC (permalink / raw)
  To: git

I believe I have found an issue with the way `submodule add` detects a submodule that already exists in the repository. 

To reproduce

1) add a git submodule in a specific location (we'll say it's at `./submodule/location`)
2) go through the normal steps of removing a submodule, as listed here - https://git.wiki.kernel.org/index.php/GitSubmoduleTutorial
3) Now the submodule is completely removed and there is no reference to it in .gitmodules or .git/config
4) Re-add a different repository at the same location (`./submodule/location`)

Expected - The new submodule repository will be set up at ./submodule/location and have the new repository as its origin

What Actually Happens - The new submodule uses the existing `$gitdir` (old repository) as the actual backing repository to the submodule, but the new repository is reflected in .gitmodules and .git/config.

So to recap, the result is that `git remote show origin`  in the submodule shows a different origin than is in .gitmodules and .git/config

One simple step to remedy this would be to add the deletion of the backing repository from the .git/modules directory, but again, I think an actual command to take care of all of these steps is in order anyways.  Not sure you want to encourage people poking around in the .git directory.

If this is already resolved in the newest versions, please disregard

Thanks!

Jonathan Johnson

http://jondavidjohn.com | me@jondavidjohn.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Bug in Submodule add
  2012-09-26  4:18 Bug in Submodule add Jonathan Johnson
@ 2012-09-26 20:56 ` Jens Lehmann
  2012-09-29 23:04   ` [PATCH 0/2] Let "git submodule add" fail when .git/modules/<name> already exists Jens Lehmann
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Lehmann @ 2012-09-26 20:56 UTC (permalink / raw)
  To: Jonathan Johnson; +Cc: Git Mailing List, Heiko Voigt

Am 26.09.2012 06:18, schrieb Jonathan Johnson:
> I believe I have found an issue with the way `submodule add` detects a submodule that already exists in the repository. 

Yes, this is an issue and thanks for the detailed report.

> To reproduce
> 
> 1) add a git submodule in a specific location (we'll say it's at `./submodule/location`)
> 2) go through the normal steps of removing a submodule, as listed here - https://git.wiki.kernel.org/index.php/GitSubmoduleTutorial
> 3) Now the submodule is completely removed and there is no reference to it in .gitmodules or .git/config
> 4) Re-add a different repository at the same location (`./submodule/location`)
> 
> Expected - The new submodule repository will be set up at ./submodule/location and have the new repository as its origin
> 
> What Actually Happens - The new submodule uses the existing `$gitdir` (old repository) as the actual backing repository to the submodule, but the new repository is reflected in .gitmodules and .git/config.
> 
> So to recap, the result is that `git remote show origin`  in the submodule shows a different origin than is in .gitmodules and .git/config
> 
> One simple step to remedy this would be to add the deletion of the backing repository from the .git/modules directory, but again, I think an actual command to take care of all of these steps is in order anyways.  Not sure you want to encourage people poking around in the .git directory.

Unfortunately just throwing away the old repository under .git/modules,
whether manually or by a git command, is no real solution here: it would
make it impossible to go back to a commit which records the old submodule
and check that out again.

The reason for this issue is that the submodule path is used as its name
by "git submodule add". While we could check this type of conflict locally,
we can't really avoid it due to the distributed nature of git (somebody
else could add a different repo under the same path - and thus the same
name - in another clone of the repo).

The only long term solution I can think of is to use some kind of UUID for
the name, so that the names of newly added submodules won't have a chance
to clash anymore. For the short term aborting "git submodule add" when a
submodule of that name already exists in .git/modules of the superproject
together with the ability to provide a custom name might at least solve
the local clashes.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 0/2] Let "git submodule add" fail when .git/modules/<name> already exists
  2012-09-26 20:56 ` Jens Lehmann
@ 2012-09-29 23:04   ` Jens Lehmann
  2012-09-29 23:05     ` [PATCH 1/2] Teach "git submodule add" the --name option Jens Lehmann
                       ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Jens Lehmann @ 2012-09-29 23:04 UTC (permalink / raw)
  To: Jonathan Johnson; +Cc: Git Mailing List, Heiko Voigt, Junio C Hamano

Am 26.09.2012 22:56, schrieb Jens Lehmann:
> Am 26.09.2012 06:18, schrieb Jonathan Johnson:
>> To reproduce
>>
>> 1) add a git submodule in a specific location (we'll say it's at `./submodule/location`)
>> 2) go through the normal steps of removing a submodule, as listed here - https://git.wiki.kernel.org/index.php/GitSubmoduleTutorial
>> 3) Now the submodule is completely removed and there is no reference to it in .gitmodules or .git/config
>> 4) Re-add a different repository at the same location (`./submodule/location`)
>>
>> Expected - The new submodule repository will be set up at ./submodule/location and have the new repository as its origin
>>
>> What Actually Happens - The new submodule uses the existing `$gitdir` (old repository) as the actual backing repository to the submodule, but the new repository is reflected in .gitmodules and .git/config.
>>
>> So to recap, the result is that `git remote show origin`  in the submodule shows a different origin than is in .gitmodules and .git/config
>>
>> One simple step to remedy this would be to add the deletion of the backing repository from the .git/modules directory, but again, I think an actual command to take care of all of these steps is in order anyways.  Not sure you want to encourage people poking around in the .git directory.
> 
> Unfortunately just throwing away the old repository under .git/modules,
> whether manually or by a git command, is no real solution here: it would
> make it impossible to go back to a commit which records the old submodule
> and check that out again.
> 
> The reason for this issue is that the submodule path is used as its name
> by "git submodule add". While we could check this type of conflict locally,
> we can't really avoid it due to the distributed nature of git (somebody
> else could add a different repo under the same path - and thus the same
> name - in another clone of the repo).
> 
> The only long term solution I can think of is to use some kind of UUID for
> the name, so that the names of newly added submodules won't have a chance
> to clash anymore. For the short term aborting "git submodule add" when a
> submodule of that name already exists in .git/modules of the superproject
> together with the ability to provide a custom name might at least solve
> the local clashes.

This two patch series implements the short term solution described above.

Using some kind of UUID can easily be added in a subsequent patch, we just
have to replace 'sm_name="$sm_path"' with 'sm_name=$(<generate uuid>)' in
line 348 of git-submodule.sh. I think it'll be the best solution to just
use a random UUID for that, as doing anything clever (like using the SHA1
of the url to avoid copies of the same remote repo) might lead to subtle
breakages (e.g. because it assumes the url stays unique forever, which it
sometimes won't). But maybe the short term solution is sufficient as most
of the time people won't produce submodule name conflicts (and names
derived from paths are much more readable that UUIDs). Thoughts?


Jens Lehmann (2):
  Teach "git submodule add" the --name option
  submodule add: Fail when .git/modules/<name> already exists

 Documentation/git-submodule.txt |  7 ++++-
 Documentation/gitmodules.txt    |  4 ++-
 git-submodule.sh                | 35 ++++++++++++++++-------
 t/t7400-submodule-basic.sh      | 63 +++++++++++++++++++++++++++++++++++++++++
 t/t7406-submodule-update.sh     |  2 +-
 5 files changed, 97 insertions(+), 14 deletions(-)

-- 
1.7.12.1.430.g4fd6dc4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] Teach "git submodule add" the --name option
  2012-09-29 23:04   ` [PATCH 0/2] Let "git submodule add" fail when .git/modules/<name> already exists Jens Lehmann
@ 2012-09-29 23:05     ` Jens Lehmann
  2012-09-29 23:07     ` [PATCH 2/2] submodule add: Fail when .git/modules/<name> already exists Jens Lehmann
  2012-09-30  4:47     ` [PATCH 0/2] Let "git submodule add" fail " Junio C Hamano
  2 siblings, 0 replies; 9+ messages in thread
From: Jens Lehmann @ 2012-09-29 23:05 UTC (permalink / raw)
  To: Jonathan Johnson; +Cc: Git Mailing List, Heiko Voigt, Junio C Hamano

"git submodule add" initializes the name of a submodule to its path. This
was ok as long as the .git directory lived inside the submodule's work
tree, but since 1.7.8 it is stored in the .git/modules/<name> directory of
the superproject, making the submodule name survive the removal of the
submodule's work tree. This leads to problems when the user tries to add a
different submodule at the same path - and thus the same name - later, as
that will happily try to restore the submodule from the old repository
instead of the one the user specified and will lead to a checkout of the
wrong repository.

Add the new "--name" option to let the user provide a name for the
submodule. This enables the user to solve this conflict without having to
remove .git/modules/<name> by hand (which is no viable solution as it
makes it impossible to checkout a commit that records the old submodule
and populate it, as that will still check out the new submodule for the
same reason).

To achieve that the submodule's name is added to the parameter list of
the module_clone() helper function. This makes it possible to remove the
call of module_name() there because both callers of module_clone() already
know the name and can provide it as argument number two.

Reported-by: Jonathan Johnson <me@jondavidjohn.com>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---
 Documentation/git-submodule.txt |  7 ++++++-
 Documentation/gitmodules.txt    |  4 +++-
 git-submodule.sh                | 32 ++++++++++++++++++++---------
 t/t7400-submodule-basic.sh      | 45 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 76 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 2de7bf0..22efca0 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -9,7 +9,7 @@ git-submodule - Initialize, update or inspect submodules
 SYNOPSIS
 --------
 [verse]
-'git submodule' [--quiet] add [-b branch] [-f|--force]
+'git submodule' [--quiet] add [-b branch] [-f|--force] [--name <name>]
 	      [--reference <repository>] [--] <repository> [<path>]
 'git submodule' [--quiet] status [--cached] [--recursive] [--] [<path>...]
 'git submodule' [--quiet] init [--] [<path>...]
@@ -266,6 +266,11 @@ OPTIONS
 	Initialize all submodules for which "git submodule init" has not been
 	called so far before updating.

+--name::
+	This option is only valid for the add command. It sets the submodule's
+	name to the given string instead of defaulting to its path. The name
+	must be valid as a directory name and may not end with a '/'.
+
 --reference <repository>::
 	This option is only valid for add and update commands.  These
 	commands sometimes need to clone a remote repository. In this case,
diff --git a/Documentation/gitmodules.txt b/Documentation/gitmodules.txt
index 4effd78..ab3e91c 100644
--- a/Documentation/gitmodules.txt
+++ b/Documentation/gitmodules.txt
@@ -18,7 +18,9 @@ working tree, is a text file with a syntax matching the requirements
 of linkgit:git-config[1].

 The file contains one subsection per submodule, and the subsection value
-is the name of the submodule. Each submodule section also contains the
+is the name of the submodule. The name is set to the path where the
+submodule has been added unless it was customized with the '--name'
+option of 'git submodule add'. Each submodule section also contains the
 following required keys:

 submodule.<name>.path::
diff --git a/git-submodule.sh b/git-submodule.sh
index 3e2045e..22febb1 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -5,7 +5,7 @@
 # Copyright (c) 2007 Lars Hjemli

 dashless=$(basename "$0" | sed -e 's/-/ /')
-USAGE="[--quiet] add [-b branch] [-f|--force] [--reference <repository>] [--] <repository> [<path>]
+USAGE="[--quiet] add [-b branch] [-f|--force] [--name <name>] [--reference <repository>] [--] <repository> [<path>]
    or: $dashless [--quiet] status [--cached] [--recursive] [--] [<path>...]
    or: $dashless [--quiet] init [--] [<path>...]
    or: $dashless [--quiet] update [--init] [-N|--no-fetch] [-f|--force] [--rebase] [--reference <repository>] [--merge] [--recursive] [--] [<path>...]
@@ -29,6 +29,7 @@ files=
 nofetch=
 update=
 prefix=
+custom_name=

 # The function takes at most 2 arguments. The first argument is the
 # URL that navigates to the submodule origin repo. When relative, this URL
@@ -179,8 +180,9 @@ module_name()
 module_clone()
 {
 	sm_path=$1
-	url=$2
-	reference="$3"
+	name=$2
+	url=$3
+	reference="$4"
 	quiet=
 	if test -n "$GIT_QUIET"
 	then
@@ -189,8 +191,6 @@ module_clone()

 	gitdir=
 	gitdir_base=
-	name=$(module_name "$sm_path" 2>/dev/null)
-	test -n "$name" || name="$sm_path"
 	base_name=$(dirname "$name")

 	gitdir=$(git rev-parse --git-dir)
@@ -272,6 +272,11 @@ cmd_add()
 			reference="$1"
 			shift
 			;;
+		--name)
+			case "$2" in '') usage ;; esac
+			custom_name=$2
+			shift
+			;;
 		--)
 			shift
 			break
@@ -336,6 +341,13 @@ Use -f if you really want to add it." >&2
 		exit 1
 	fi

+	if test -n "$custom_name"
+	then
+		sm_name="$custom_name"
+	else
+		sm_name="$sm_path"
+	fi
+
 	# perhaps the path exists and is already a git repo, else clone it
 	if test -e "$sm_path"
 	then
@@ -348,7 +360,7 @@ Use -f if you really want to add it." >&2

 	else

-		module_clone "$sm_path" "$realrepo" "$reference" || exit
+		module_clone "$sm_path" "$sm_name" "$realrepo" "$reference" || exit
 		(
 			clear_local_git_env
 			cd "$sm_path" &&
@@ -359,13 +371,13 @@ Use -f if you really want to add it." >&2
 			esac
 		) || die "$(eval_gettext "Unable to checkout submodule '\$sm_path'")"
 	fi
-	git config submodule."$sm_path".url "$realrepo"
+	git config submodule."$sm_name".url "$realrepo"

 	git add $force "$sm_path" ||
 	die "$(eval_gettext "Failed to add submodule '\$sm_path'")"

-	git config -f .gitmodules submodule."$sm_path".path "$sm_path" &&
-	git config -f .gitmodules submodule."$sm_path".url "$repo" &&
+	git config -f .gitmodules submodule."$sm_name".path "$sm_path" &&
+	git config -f .gitmodules submodule."$sm_name".url "$repo" &&
 	git add --force .gitmodules ||
 	die "$(eval_gettext "Failed to register submodule '\$sm_path'")"
 }
@@ -594,7 +606,7 @@ Maybe you want to use 'update --init'?")"

 		if ! test -d "$sm_path"/.git -o -f "$sm_path"/.git
 		then
-			module_clone "$sm_path" "$url" "$reference"|| exit
+			module_clone "$sm_path" "$name" "$url" "$reference" || exit
 			cloned_modules="$cloned_modules;$name"
 			subsha1=
 		else
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 56a81cd..78bf739 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -681,4 +681,49 @@ test_expect_success 'moving the superproject does not break submodules' '
 	)
 '

+test_expect_success 'submodule add --name allows to replace a submodule with another at the same path' '
+	(
+		cd addtest2 &&
+		(
+			cd repo &&
+			echo "$submodurl/repo" >expect &&
+			git config remote.origin.url >actual &&
+			test_cmp expect actual &&
+			echo "gitdir: ../.git/modules/repo" >expect &&
+			test_cmp expect .git
+		) &&
+		rm -rf repo &&
+		git rm repo &&
+		git submodule add -q --name repo_new "$submodurl/bare.git" repo >actual &&
+		test ! -s actual &&
+		echo "gitdir: ../.git/modules/submod" >expect &&
+		test_cmp expect submod/.git &&
+		(
+			cd repo &&
+			echo "$submodurl/bare.git" >expect &&
+			git config remote.origin.url >actual &&
+			test_cmp expect actual &&
+			echo "gitdir: ../.git/modules/repo_new" >expect &&
+			test_cmp expect .git
+		) &&
+		echo "repo" >expect &&
+		git config -f .gitmodules submodule.repo.path >actual &&
+		test_cmp expect actual &&
+		git config -f .gitmodules submodule.repo_new.path >actual &&
+		test_cmp expect actual&&
+		echo "$submodurl/repo" >expect &&
+		git config -f .gitmodules submodule.repo.url >actual &&
+		test_cmp expect actual &&
+		echo "$submodurl/bare.git" >expect &&
+		git config -f .gitmodules submodule.repo_new.url >actual &&
+		test_cmp expect actual &&
+		echo "$submodurl/repo" >expect &&
+		git config submodule.repo.url >actual &&
+		test_cmp expect actual &&
+		echo "$submodurl/bare.git" >expect &&
+		git config submodule.repo_new.url >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
1.7.12.1.430.g4fd6dc4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2/2] submodule add: Fail when .git/modules/<name> already exists
  2012-09-29 23:04   ` [PATCH 0/2] Let "git submodule add" fail when .git/modules/<name> already exists Jens Lehmann
  2012-09-29 23:05     ` [PATCH 1/2] Teach "git submodule add" the --name option Jens Lehmann
@ 2012-09-29 23:07     ` Jens Lehmann
  2012-09-30  4:47     ` [PATCH 0/2] Let "git submodule add" fail " Junio C Hamano
  2 siblings, 0 replies; 9+ messages in thread
From: Jens Lehmann @ 2012-09-29 23:07 UTC (permalink / raw)
  To: Jonathan Johnson; +Cc: Git Mailing List, Heiko Voigt, Junio C Hamano

When adding a new submodule it can happen that .git/modules/<name> already
contains a submodule repo, e.g. when a submodule is removed from the work
tree and another submodule is added at the same path. But then the work
tree of the submodule will be populated using the existing repository and
not the one the user provided.

Error out in that case and tell the user she should use a different name
for the submodule with the "--name" option to avoid this problem. In one
test in t7406 the --name option had to be added to "git submodule add", as
that test re-adds a formerly removed submodule.

Reported-by: Jonathan Johnson <me@jondavidjohn.com>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---
 git-submodule.sh            |  3 ++-
 t/t7400-submodule-basic.sh  | 18 ++++++++++++++++++
 t/t7406-submodule-update.sh |  2 +-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 22febb1..58cd053 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -359,7 +359,8 @@ Use -f if you really want to add it." >&2
 		fi

 	else
-
+		test ! -d ".git/modules/$sm_name" ||
+		die "$(eval_gettext "Submodule name '\$sm_name' is already used. Please choose another name with the '--name' option.")"
 		module_clone "$sm_path" "$sm_name" "$realrepo" "$reference" || exit
 		(
 			clear_local_git_env
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 78bf739..a031a27 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -726,4 +726,22 @@ test_expect_success 'submodule add --name allows to replace a submodule with ano
 	)
 '

+test_expect_success 'submodule add with an existing name fails' '
+	(
+		cd addtest2 &&
+		rm -rf repo &&
+		test_must_fail git submodule add -q --name repo_new "$submodurl/bare.git" repo &&
+		test ! -d repo &&
+		echo "repo" >expect &&
+		git config -f .gitmodules submodule.repo_new.path >actual &&
+		test_cmp expect actual&&
+		echo "$submodurl/bare.git" >expect &&
+		git config -f .gitmodules submodule.repo_new.url >actual &&
+		test_cmp expect actual &&
+		echo "$submodurl/bare.git" >expect &&
+		git config submodule.repo_new.url >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 1542653..2d44c51 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -627,7 +627,7 @@ test_expect_success 'submodule add properly re-creates deeper level submodules'
 	(cd super &&
 	 git reset --hard master &&
 	 rm -rf deeper/ &&
-	 git submodule add ../submodule deeper/submodule
+	 git submodule add --name deeper/submodule2 ../submodule deeper/submodule
 	)
 '

-- 
1.7.12.1.430.g4fd6dc4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/2] Let "git submodule add" fail when .git/modules/<name> already exists
  2012-09-29 23:04   ` [PATCH 0/2] Let "git submodule add" fail when .git/modules/<name> already exists Jens Lehmann
  2012-09-29 23:05     ` [PATCH 1/2] Teach "git submodule add" the --name option Jens Lehmann
  2012-09-29 23:07     ` [PATCH 2/2] submodule add: Fail when .git/modules/<name> already exists Jens Lehmann
@ 2012-09-30  4:47     ` Junio C Hamano
  2012-09-30 19:19       ` Jens Lehmann
  2 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2012-09-30  4:47 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Jonathan Johnson, Git Mailing List, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

>> The only long term solution I can think of is to use some kind of UUID for
>> the name, so that the names of newly added submodules won't have a chance
>> to clash anymore. For the short term aborting "git submodule add" when a
>> submodule of that name already exists in .git/modules of the superproject
>> together with the ability to provide a custom name might at least solve
>> the local clashes.

That assumes that the addition of the submodule for the second time
is to add a completely different submodule at the same location and
is done on purpose, but is that a sensible assumption?

If a superproject that is about an embedded appliance used to have a
submodule A bound at its path "kernel", but for some reason stopped
shipping with "kernel" and then later reintroduced the directory
"kernel" bound to some submodule B, my gut feeling is that it is
just as likely (if not more likely) that A and B are indeed the same
submodule (i.e. it shares the same history) as they are totally
unrelated.

Could it be that it is a user error combined with the immaturity of
"git submodule" tool that does not yet support "it used to be here,
but it disappears for a while and then it reappears in the history
of the superproject" very well that caused the user to manually add
a "new" submodule which in fact is the same submodule at the same
path?

I think failing with a better error message is a good idea. It
should suggest to either resurrect the submodule that is stashed
away in "$GIT_DIR/modules/$name" if it indeed is the same, or to
give it a different name (perhaps "kernel" used to be pointing at
the Linux kernel history, then the user is replacing it with a
totally different implementation that is really from different
origin and do not share any history, perhaps BSD).  In such a case,
the user may want to pick bsd-kernel or something as its name, to
differentiate it.

> Using some kind of UUID can easily be added in a subsequent patch,...

I would suggest thinking really long and hard before saying UUID.
It is an easy cop-out to ensure uniqueness, but risks to allow two
people (or one person at two different time) to give two unrelated
names to a single thing that actually is the same.

A better alternative might be to use the commit object name at the
root of the history of the submodule, which would catch the simplest
and most common case of the mistake, I would think.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/2] Let "git submodule add" fail when .git/modules/<name> already exists
  2012-09-30  4:47     ` [PATCH 0/2] Let "git submodule add" fail " Junio C Hamano
@ 2012-09-30 19:19       ` Jens Lehmann
  2012-09-30 21:01         ` [PATCH v2 2/2] submodule add: Fail when .git/modules/<name> already exists unless forced Jens Lehmann
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Lehmann @ 2012-09-30 19:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Johnson, Git Mailing List, Heiko Voigt

Am 30.09.2012 06:47, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> 
>>> The only long term solution I can think of is to use some kind of UUID for
>>> the name, so that the names of newly added submodules won't have a chance
>>> to clash anymore. For the short term aborting "git submodule add" when a
>>> submodule of that name already exists in .git/modules of the superproject
>>> together with the ability to provide a custom name might at least solve
>>> the local clashes.
> 
> That assumes that the addition of the submodule for the second time
> is to add a completely different submodule at the same location and
> is done on purpose, but is that a sensible assumption?
> 
> If a superproject that is about an embedded appliance used to have a
> submodule A bound at its path "kernel", but for some reason stopped
> shipping with "kernel" and then later reintroduced the directory
> "kernel" bound to some submodule B, my gut feeling is that it is
> just as likely (if not more likely) that A and B are indeed the same
> submodule (i.e. it shares the same history) as they are totally
> unrelated.
> 
> Could it be that it is a user error combined with the immaturity of
> "git submodule" tool that does not yet support "it used to be here,
> but it disappears for a while and then it reappears in the history
> of the superproject" very well that caused the user to manually add
> a "new" submodule which in fact is the same submodule at the same
> path?
> 
> I think failing with a better error message is a good idea. It
> should suggest to either resurrect the submodule that is stashed
> away in "$GIT_DIR/modules/$name" if it indeed is the same, or to
> give it a different name (perhaps "kernel" used to be pointing at
> the Linux kernel history, then the user is replacing it with a
> totally different implementation that is really from different
> origin and do not share any history, perhaps BSD).  In such a case,
> the user may want to pick bsd-kernel or something as its name, to
> differentiate it.

Good point! I will add a more detailed error message (including
the url of the default remote which is configured for the already
present submodule repo) and teach --force to skip the test and
resurrect that submodule repo.

>> Using some kind of UUID can easily be added in a subsequent patch,...
> 
> I would suggest thinking really long and hard before saying UUID.

Absolutely.

> It is an easy cop-out to ensure uniqueness, but risks to allow two
> people (or one person at two different time) to give two unrelated
> names to a single thing that actually is the same.

I'm not too worried about that (even though it would be good for
the disk footprint). And I couldn't come up with a better way to
solve the problem we currently have when the same name is used
for two different submodule repos.

> A better alternative might be to use the commit object name at the
> root of the history of the submodule, which would catch the simplest
> and most common case of the mistake, I would think.

This won't work well e.g. when one uses a fork of another repo,
that will contain different commits while still having the same
root commit. I was also thinking about hashing the URL, but that
will break when the user reconfigures the URL to somewhere else.
After playing with some ideas I couldn't find a way to let the
submodule's repo provide sufficient uniqueness.

I'd say for now we go with the detection of name clashes and let
the user choose if he wants to resurrect that submodule repo or
if he wants to choose another name. But if we notice further down
the road that collisions are a problem in real life, we can think
again if UUIDs - or something else - might be a solution.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 2/2] submodule add: Fail when .git/modules/<name> already exists unless forced
  2012-09-30 19:19       ` Jens Lehmann
@ 2012-09-30 21:01         ` Jens Lehmann
  2012-10-01  0:06           ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Lehmann @ 2012-09-30 21:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Johnson, Git Mailing List, Heiko Voigt

When adding a new submodule it can happen that .git/modules/<name> already
contains a submodule repo, e.g. when a submodule is removed from the work
tree and another submodule is added at the same path. But then the work
tree of the submodule will be populated using the existing repository and
not the one the user provided, which results in an incorrect work tree. On
the other hand the user might reactivate a submodule removed earlier, then
reusing that .git directory is the Right Thing to do.

As git can't decide what is the case, error out and tell the user she
should use either use a different name for the submodule with the "--name"
option or can reuse the .git directory for the newly added submodule by
providing the --force option (which only makes sense when the upstream
matches, so the error message lists all remotes of .git/modules/<name>).

In one test in t7406 the --force option had to be added to "git submodule
add", as that test re-adds a formerly removed submodule.

Reported-by: Jonathan Johnson <me@jondavidjohn.com>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---

Am 30.09.2012 21:19, schrieb Jens Lehmann:
> Am 30.09.2012 06:47, schrieb Junio C Hamano:
>> I think failing with a better error message is a good idea. It
>> should suggest to either resurrect the submodule that is stashed
>> away in "$GIT_DIR/modules/$name" if it indeed is the same, or to
>> give it a different name (perhaps "kernel" used to be pointing at
>> the Linux kernel history, then the user is replacing it with a
>> totally different implementation that is really from different
>> origin and do not share any history, perhaps BSD).  In such a case,
>> the user may want to pick bsd-kernel or something as its name, to
>> differentiate it.
> 
> Good point! I will add a more detailed error message (including
> the url of the default remote which is configured for the already
> present submodule repo) and teach --force to skip the test and
> resurrect that submodule repo.

The message when "git submodule add" finds .git/modules/<name> is:

A git directory for '<name>' is found locally with remote(s):
  origin	<url(s) from .git/modules/<name>>
If you want to reuse this local git directory instead of cloning again from
  <url given on command line>
use the '--force' option. If the local git directory is not the correct repo
or you are unsure what this means choose another name with the '--name' option.

When run with the --force option the following message is printed:

Reactivating local git directory for submodule '<name>'.


 git-submodule.sh            | 15 ++++++++++++++-
 t/t7400-submodule-basic.sh  | 30 ++++++++++++++++++++++++++++++
 t/t7406-submodule-update.sh |  2 +-
 3 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 22febb1..e8112c8 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -359,7 +359,20 @@ Use -f if you really want to add it." >&2
 		fi

 	else
-
+		if test -d ".git/modules/$sm_name"
+		then
+			if test -z "$force"
+			then
+				echo >&2 "$(eval_gettext "A git directory for '\$sm_name' is found locally with remote(s):")"
+				GIT_DIR=".git/modules/$sm_name" GIT_WORK_TREE=. git remote -v | grep '(fetch)' | sed -e s,^,"  ", -e s,' (fetch)',, >&2
+				echo >&2 "$(eval_gettext "If you want to reuse this local git directory instead of cloning again from")"
+				echo >&2 "  $realrepo"
+				echo >&2 "$(eval_gettext "use the '--force' option. If the local git directory is not the correct repo")"
+				die "$(eval_gettext "or you are unsure what this means choose another name with the '--name' option.")"
+			else
+				echo "$(eval_gettext "Reactivating local git directory for submodule '\$sm_name'.")"
+			fi
+		fi
 		module_clone "$sm_path" "$sm_name" "$realrepo" "$reference" || exit
 		(
 			clear_local_git_env
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 78bf739..f1a94f7 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -726,4 +726,34 @@ test_expect_success 'submodule add --name allows to replace a submodule with ano
 	)
 '

+test_expect_success 'submodule add with an existing name fails unless forced' '
+	(
+		cd addtest2 &&
+		rm -rf repo &&
+		git rm repo &&
+		test_must_fail git submodule add -q --name repo_new "$submodurl/repo.git" repo &&
+		test ! -d repo &&
+		echo "repo" >expect &&
+		git config -f .gitmodules submodule.repo_new.path >actual &&
+		test_cmp expect actual&&
+		echo "$submodurl/bare.git" >expect &&
+		git config -f .gitmodules submodule.repo_new.url >actual &&
+		test_cmp expect actual &&
+		echo "$submodurl/bare.git" >expect &&
+		git config submodule.repo_new.url >actual &&
+		test_cmp expect actual &&
+		git submodule add -f -q --name repo_new "$submodurl/repo.git" repo &&
+		test -d repo &&
+		echo "repo" >expect &&
+		git config -f .gitmodules submodule.repo_new.path >actual &&
+		test_cmp expect actual&&
+		echo "$submodurl/repo.git" >expect &&
+		git config -f .gitmodules submodule.repo_new.url >actual &&
+		test_cmp expect actual &&
+		echo "$submodurl/repo.git" >expect &&
+		git config submodule.repo_new.url >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 1542653..feaec6c 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -627,7 +627,7 @@ test_expect_success 'submodule add properly re-creates deeper level submodules'
 	(cd super &&
 	 git reset --hard master &&
 	 rm -rf deeper/ &&
-	 git submodule add ../submodule deeper/submodule
+	 git submodule add --force ../submodule deeper/submodule
 	)
 '

-- 
1.7.12.1.430.gd057107

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/2] submodule add: Fail when .git/modules/<name> already exists unless forced
  2012-09-30 21:01         ` [PATCH v2 2/2] submodule add: Fail when .git/modules/<name> already exists unless forced Jens Lehmann
@ 2012-10-01  0:06           ` Junio C Hamano
  0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2012-10-01  0:06 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Jonathan Johnson, Git Mailing List, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

>> Good point! I will add a more detailed error message (including
>> the url of the default remote which is configured for the already
>> present submodule repo) and teach --force to skip the test and
>> resurrect that submodule repo.
>
> The message when "git submodule add" finds .git/modules/<name> is:
>
> A git directory for '<name>' is found locally with remote(s):
>   origin	<url(s) from .git/modules/<name>>
> If you want to reuse this local git directory instead of cloning again from
>   <url given on command line>
> use the '--force' option. If the local git directory is not the correct repo
> or you are unsure what this means choose another name with the '--name' option.
>
> When run with the --force option the following message is printed:
>
> Reactivating local git directory for submodule '<name>'.

Thanks, will re-queue.

The approach "submodule rm" takes when removing a project is to
treat the removed submodule as not necessary for the current commit
in the superproject, but it is considered necessary elsewhere in the
history of the superproject, and that is why we stash away the
repository in $GIT_DIR/modules of the superproject.

We may however want to think about another mode of user error where
the user runs "submodule add $path" for a wrong repository, realizes
the mistake _before_ making any commit and try to repoint the $path
to a correct repository.  The behaviour of "submodule add" in this
patch, and the behaviour of existing "submodule rm", assumes that
the user is not stupid and won't make such a mistake, but to recover,
the user may need a way to really nuke the submodule repository that
was added by the earlier misteake (which is not needed anywhere in
the history of the superproject) and $GIT_DIR/module/$name really
replaced with the updated one.

I don't know how important to support a recovery procedure from such
mistakes, though.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-10-01  0:07 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-26  4:18 Bug in Submodule add Jonathan Johnson
2012-09-26 20:56 ` Jens Lehmann
2012-09-29 23:04   ` [PATCH 0/2] Let "git submodule add" fail when .git/modules/<name> already exists Jens Lehmann
2012-09-29 23:05     ` [PATCH 1/2] Teach "git submodule add" the --name option Jens Lehmann
2012-09-29 23:07     ` [PATCH 2/2] submodule add: Fail when .git/modules/<name> already exists Jens Lehmann
2012-09-30  4:47     ` [PATCH 0/2] Let "git submodule add" fail " Junio C Hamano
2012-09-30 19:19       ` Jens Lehmann
2012-09-30 21:01         ` [PATCH v2 2/2] submodule add: Fail when .git/modules/<name> already exists unless forced Jens Lehmann
2012-10-01  0:06           ` Junio C Hamano

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).