git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC][PATCH 0/7] Submodule support in git mv, git rm
@ 2008-07-16 19:11 Petr Baudis
  2008-07-16 19:11 ` [PATCH 1/7] git-mv: Remove dead code branch Petr Baudis
                   ` (6 more replies)
  0 siblings, 7 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-16 19:11 UTC (permalink / raw)
  To: git, git

The following series implements submodule support in git mv and git rm,
plus enhancing the submodules testsuite a bit. I'd appreciate comments,
especially on the git mv change, since the index_path_src_sha1 is
really a horrible hack.

The pinnacle of this series was supposed to be merge-recursive support
for submodule-somethingelese conflicts, however that seems a bit more
complicated than I expected, so I decided to first send the rest for
a review.

---

Petr Baudis (7):
      t7403: Submodule git mv, git rm testsuite
      git rm: Support for removing submodules
      git mv: Support moving submodules
      submodule.*: Introduce simple C interface for submodule lookup by path
      git submodule add: Fix naming clash handling
      t7400: Add short "git submodule add" testsuite
      git-mv: Remove dead code branch


 Documentation/git-rm.txt   |    6 +
 Makefile                   |    2 
 builtin-mv.c               |   67 ++++++++++--
 builtin-rm.c               |   65 ++++++++++--
 cache.h                    |    2 
 git-submodule.sh           |   15 ++-
 sha1_file.c                |   10 ++
 submodule.c                |   50 +++++++++
 submodule.h                |    8 +
 t/t7400-submodule-basic.sh |   39 +++++++
 t/t7403-submodule-mvrm.sh  |  242 ++++++++++++++++++++++++++++++++++++++++++++
 11 files changed, 476 insertions(+), 30 deletions(-)
 create mode 100644 submodule.c
 create mode 100644 submodule.h
 create mode 100755 t/t7403-submodule-mvrm.sh

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH 1/7] git-mv: Remove dead code branch
  2008-07-16 19:11 [RFC][PATCH 0/7] Submodule support in git mv, git rm Petr Baudis
@ 2008-07-16 19:11 ` Petr Baudis
  2008-07-16 19:11 ` [PATCH 2/7] t7400: Add short "git submodule add" testsuite Petr Baudis
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-16 19:11 UTC (permalink / raw)
  To: git

The path list builder had a branch for the case the source is not in index, but
this can happen only if the source was a directory. However, in that case we
have already expanded the list to the directory contents and set mode
to WORKING_DIRECTORY, which is tested earlier.

The patch removes the superfluous branch and adds an assert() instead. git-mv
testsuite still passes.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 builtin-mv.c |   16 +++++++---------
 1 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 5530e11..158a83d 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -227,15 +227,13 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (mode == WORKING_DIRECTORY)
 			continue;
 
-		if (cache_name_pos(src, strlen(src)) >= 0) {
-			path_list_insert(src, &deleted);
-
-			/* destination can be a directory with 1 file inside */
-			if (path_list_has_path(&overwritten, dst))
-				path_list_insert(dst, &changed);
-			else
-				path_list_insert(dst, &added);
-		} else
+		assert(cache_name_pos(src, strlen(src)) >= 0);
+
+		path_list_insert(src, &deleted);
+		/* destination can be a directory with 1 file inside */
+		if (path_list_has_path(&overwritten, dst))
+			path_list_insert(dst, &changed);
+		else
 			path_list_insert(dst, &added);
 	}
 

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 2/7] t7400: Add short "git submodule add" testsuite
  2008-07-16 19:11 [RFC][PATCH 0/7] Submodule support in git mv, git rm Petr Baudis
  2008-07-16 19:11 ` [PATCH 1/7] git-mv: Remove dead code branch Petr Baudis
@ 2008-07-16 19:11 ` Petr Baudis
  2008-07-16 19:11 ` [PATCH 3/7] git submodule add: Fix naming clash handling Petr Baudis
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-16 19:11 UTC (permalink / raw)
  To: git

This patch introduces basic tests for

	git submodule add

covering the basic functionality and the -b parameter.

A trivial update --init test fix freeloads on this commit as well.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 t/t7400-submodule-basic.sh |   28 +++++++++++++++++++++++++++-
 1 files changed, 27 insertions(+), 1 deletions(-)

diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 6c7b902..ab5eb1e 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -200,7 +200,7 @@ test_expect_success 'update --init' '
 
 	mv init init2 &&
 	git config -f .gitmodules submodule.example.url "$(pwd)/init2" &&
-	git config --remove-section submodule.example
+	git config --remove-section submodule.example &&
 	git submodule update init > update.out &&
 	grep "not initialized" update.out &&
 	test ! -d init/.git &&
@@ -209,4 +209,30 @@ test_expect_success 'update --init' '
 
 '
 
+test_expect_success 'submodule add' '
+
+	git submodule add "$(pwd)/init2" init-added &&
+	test -d init-added/.git &&
+	[ "$(git config -f .gitmodules submodule.init-added.url)" = "$(pwd)/init2" ] &&
+	[ "$(git config -f .gitmodules submodule.init-added.path)" = "init-added" ]
+
+'
+
+test_expect_success 'submodule add -b' '
+
+	(
+		cd init2 &&
+		git checkout -b branch &&
+		echo t >s &&
+		git add s &&
+		git commit -m "change branch" &&
+		git checkout master
+	) &&
+	git submodule add -b branch -- "$(pwd)/init2" init-added-b &&
+	test -d init-added-b/.git &&
+	[ "$(git config -f .gitmodules submodule.init-added-b.url)" = "$(pwd)/init2" ] &&
+	[ "$(cd init2 && git rev-parse branch)" = "$(cd init-added-b && git rev-parse HEAD)" ]
+
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 3/7] git submodule add: Fix naming clash handling
  2008-07-16 19:11 [RFC][PATCH 0/7] Submodule support in git mv, git rm Petr Baudis
  2008-07-16 19:11 ` [PATCH 1/7] git-mv: Remove dead code branch Petr Baudis
  2008-07-16 19:11 ` [PATCH 2/7] t7400: Add short "git submodule add" testsuite Petr Baudis
@ 2008-07-16 19:11 ` Petr Baudis
  2008-07-16 19:11 ` [PATCH 4/7] submodule.*: Introduce simple C interface for submodule lookup by path Petr Baudis
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-16 19:11 UTC (permalink / raw)
  To: git

This patch fixes git submodule add behaviour when we add submodule
living at a same path as logical name of existing submodule. This
can happen e.g. in case the user git mv's the previous submodule away
and then git submodule add's another under the same name.

A test-case is obviously included.

This is not completely satisfactory since .git/config cross-commit
conflicts can still occur. A question is whether this is worth
handling, maybe it would be worth adding some kind of randomization
of the autogenerated submodule name, e.g. appending $$ or a timestamp.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 git-submodule.sh           |   15 ++++++++++++---
 t/t7400-submodule-basic.sh |   11 +++++++++++
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 9228f56..a93547b 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -192,10 +192,19 @@ cmd_add()
 	git add "$path" ||
 	die "Failed to add submodule '$path'"
 
-	git config -f .gitmodules submodule."$path".path "$path" &&
-	git config -f .gitmodules submodule."$path".url "$repo" &&
+	name="$path"
+	if git config -f .gitmodules submodule."$name".path; then
+		name="$path~"; i=1;
+		while git config -f .gitmodules submodule."$name".path; do
+			name="$path~$i"
+			i=$((i+1))
+		done
+	fi
+
+	git config -f .gitmodules submodule."$name".path "$path" &&
+	git config -f .gitmodules submodule."$name".url "$repo" &&
 	git add .gitmodules ||
-	die "Failed to register submodule '$path'"
+	die "Failed to register submodule '$path' (name '$name')"
 }
 
 #
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index ab5eb1e..092dffc 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -235,4 +235,15 @@ test_expect_success 'submodule add -b' '
 
 '
 
+test_expect_success 'submodule add auto-naming clash' '
+
+	git submodule add "$(pwd)/init2/.git" example &&
+	test -d example/.git &&
+	[ "$(git config -f .gitmodules submodule.example.url)" = "$(pwd)/init2" ] &&
+	[ "$(git config -f .gitmodules submodule.example.path)" = "init" ]
+	[ "$(git config -f .gitmodules submodule.example~.url)" = "$(pwd)/init2/.git" ] &&
+	[ "$(git config -f .gitmodules submodule.example~.path)" = "example" ]
+
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 4/7] submodule.*: Introduce simple C interface for submodule lookup by path
  2008-07-16 19:11 [RFC][PATCH 0/7] Submodule support in git mv, git rm Petr Baudis
                   ` (2 preceding siblings ...)
  2008-07-16 19:11 ` [PATCH 3/7] git submodule add: Fix naming clash handling Petr Baudis
@ 2008-07-16 19:11 ` Petr Baudis
  2008-07-16 19:11 ` [PATCH 5/7] git mv: Support moving submodules Petr Baudis
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-16 19:11 UTC (permalink / raw)
  To: git

The interface will be used for git-mv and git-rm submodule support.
So far, only the submodule_by_path() function is defined, however more
can be probably expected in the future if/when the git-submodule command
is ported from shell.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 Makefile    |    2 ++
 submodule.c |   50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 submodule.h |    8 ++++++++
 3 files changed, 60 insertions(+), 0 deletions(-)
 create mode 100644 submodule.c
 create mode 100644 submodule.h

diff --git a/Makefile b/Makefile
index 9b52071..742b5cb 100644
--- a/Makefile
+++ b/Makefile
@@ -368,6 +368,7 @@ LIB_H += run-command.h
 LIB_H += sha1-lookup.h
 LIB_H += sideband.h
 LIB_H += strbuf.h
+LIB_H += submodule.h
 LIB_H += tag.h
 LIB_H += transport.h
 LIB_H += tree.h
@@ -458,6 +459,7 @@ LIB_OBJS += sha1_name.o
 LIB_OBJS += shallow.o
 LIB_OBJS += sideband.o
 LIB_OBJS += strbuf.o
+LIB_OBJS += submodule.o
 LIB_OBJS += symlinks.o
 LIB_OBJS += tag.o
 LIB_OBJS += trace.o
diff --git a/submodule.c b/submodule.c
new file mode 100644
index 0000000..2883ae6
--- /dev/null
+++ b/submodule.c
@@ -0,0 +1,50 @@
+#include "cache.h"
+#include "submodule.h"
+
+
+struct gitmodules_info {
+	const char *path;
+	char *key;
+};
+
+static int gitmodules_worker(const char *key, const char *value, void *info_)
+{
+	struct gitmodules_info *info = info_;
+	const char *subkey;
+
+	if (prefixcmp(key, "submodule."))
+		return 0;
+
+	subkey = strrchr(key, '.');
+	if (!subkey)
+		return 0;
+
+	if (strcmp(subkey, ".path"))
+		return 0;
+
+	if (strcmp(value, info->path))
+		return 0;
+
+	/* Found the key to change. */
+	if (info->key) {
+		error("multiple submodules live at path `%s'", info->path);
+		/* The last one is supposed to win. */
+		free(info->key);
+	}
+	info->key = xstrdup(key);
+	return 0;
+}
+
+char *submodule_by_path(const char *path)
+{
+	struct gitmodules_info info = { path, NULL };
+
+	config_exclusive_filename = ".gitmodules";
+	if (git_config(gitmodules_worker, &info))
+		die("cannot process .gitmodules");
+	if (!info.key)
+		die("the submodule of `%s' not found in .gitmodules", path);
+	config_exclusive_filename = NULL;
+
+	return info.key;
+}
diff --git a/submodule.h b/submodule.h
new file mode 100644
index 0000000..bc74fa0
--- /dev/null
+++ b/submodule.h
@@ -0,0 +1,8 @@
+#ifndef SUBMODULE_H
+#define SUBMODULE_H
+
+/* Find submodule living at given path in .gitmodules and return the key
+ * of its path config variable (dynamically allocated). */
+extern char *submodule_by_path(const char *path);
+
+#endif

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 5/7] git mv: Support moving submodules
  2008-07-16 19:11 [RFC][PATCH 0/7] Submodule support in git mv, git rm Petr Baudis
                   ` (3 preceding siblings ...)
  2008-07-16 19:11 ` [PATCH 4/7] submodule.*: Introduce simple C interface for submodule lookup by path Petr Baudis
@ 2008-07-16 19:11 ` Petr Baudis
  2008-07-17  2:37   ` Junio C Hamano
  2008-07-16 19:11 ` [PATCH 6/7] git rm: Support for removing submodules Petr Baudis
  2008-07-16 19:11 ` [PATCH 7/7] t7403: Submodule git mv, git rm testsuite Petr Baudis
  6 siblings, 1 reply; 56+ messages in thread
From: Petr Baudis @ 2008-07-16 19:11 UTC (permalink / raw)
  To: git

This patch adds support for moving submodules to 'git mv', including
rewriting of the .gitmodules file to reflect the movement.

The usage of struct path_list here is a bit abusive, but keeps the
code simple and hopefully still reasonably readable. The horrid
index_path_src_sha1 hack is unfortunately much worse, however the author
is currently unaware of any more reasonable solution of the problem.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 builtin-mv.c |   55 +++++++++++++++++++++++++++++++++++++++++++++++++------
 cache.h      |    2 ++
 sha1_file.c  |   10 ++++++++++
 3 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 158a83d..30c4e7d 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -9,6 +9,7 @@
 #include "cache-tree.h"
 #include "path-list.h"
 #include "parse-options.h"
+#include "submodule.h"
 
 static const char * const builtin_mv_usage[] = {
 	"git-mv [options] <source>... <destination>",
@@ -60,6 +61,24 @@ static const char *add_slash(const char *path)
 	return path;
 }
 
+static int ce_is_gitlink(int i)
+{
+	return i < 0 ? 0 : S_ISGITLINK(active_cache[i]->ce_mode);
+}
+
+static void rename_submodule(struct path_list_item *i)
+{
+	char *key = submodule_by_path(i->path);
+
+	config_exclusive_filename = ".gitmodules";
+	if (git_config_set(key, (const char *) i->util))
+		die("cannot update .gitmodules");
+	config_exclusive_filename = NULL;
+
+	free(key);
+}
+
+
 static struct lock_file lock_file;
 
 int cmd_mv(int argc, const char **argv, const char *prefix)
@@ -77,9 +96,12 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	struct stat st;
 	struct path_list overwritten = {NULL, 0, 0, 0};
 	struct path_list src_for_dst = {NULL, 0, 0, 0};
+	/* .util contains sha1 for submodules */
 	struct path_list added = {NULL, 0, 0, 0};
 	struct path_list deleted = {NULL, 0, 0, 0};
 	struct path_list changed = {NULL, 0, 0, 0};
+	/* .path is source path, .util is destination path */
+	struct path_list submodules = {NULL, 0, 0, 0};
 
 	git_config(git_default_config, NULL);
 
@@ -99,7 +121,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		/* special case: "." was normalized to "" */
 		destination = copy_pathspec(dest_path[0], argv, argc, 1);
 	else if (!lstat(dest_path[0], &st) &&
-			S_ISDIR(st.st_mode)) {
+			S_ISDIR(st.st_mode) &&
+			!ce_is_gitlink(cache_name_pos(dest_path[0], strlen(dest_path[0])))) {
 		dest_path[0] = add_slash(dest_path[0]);
 		destination = copy_pathspec(dest_path[0], argv, argc, 1);
 	} else {
@@ -111,7 +134,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	/* Checking */
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
-		int length, src_is_dir;
+		int length, src_is_dir, src_cache_pos;
 		const char *bad = NULL;
 
 		if (show_only)
@@ -126,7 +149,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		} else if ((src_is_dir = S_ISDIR(st.st_mode))
 				&& lstat(dst, &st) == 0)
 			bad = "cannot move directory over file";
-		else if (src_is_dir) {
+		else if ((src_cache_pos = cache_name_pos(src, length)) < 0 && src_is_dir) {
 			const char *src_w_slash = add_slash(src);
 			int len_w_slash = length + 1;
 			int first, last;
@@ -193,7 +216,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				} else
 					bad = "Cannot overwrite";
 			}
-		} else if (cache_name_pos(src, length) < 0)
+		} else if (src_cache_pos < 0)
 			bad = "not under version control";
 		else if (path_list_has_path(&src_for_dst, dst))
 			bad = "multiple sources for the same target";
@@ -218,6 +241,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
+		struct path_list_item *last = NULL;
+		int j;
 		if (show_only || verbose)
 			printf("Renaming %s to %s\n", src, dst);
 		if (!show_only && mode != INDEX &&
@@ -227,14 +252,24 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (mode == WORKING_DIRECTORY)
 			continue;
 
-		assert(cache_name_pos(src, strlen(src)) >= 0);
+		j = cache_name_pos(src, strlen(src));
+		assert(j >= 0);
 
 		path_list_insert(src, &deleted);
 		/* destination can be a directory with 1 file inside */
 		if (path_list_has_path(&overwritten, dst))
 			path_list_insert(dst, &changed);
 		else
-			path_list_insert(dst, &added);
+			last = path_list_insert(dst, &added);
+		if (ce_is_gitlink(j)) {
+			path_list_insert(src, &submodules)->util = (void *) dst;
+			/* We will note the original HEAD sha1; the submodule
+			 * may not be checked out, in which case we would have
+			 * no way to re-obtain it later when adding the new
+			 * entry. */
+			assert(last);
+			last->util = active_cache[j]->sha1;
+		}
 	}
 
 	if (show_only) {
@@ -254,13 +289,21 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		for (i = 0; i < added.nr; i++) {
 			const char *path = added.items[i].path;
+			index_path_src_sha1 = added.items[i].util;
 			if (add_file_to_cache(path, verbose ? ADD_CACHE_VERBOSE : 0))
 				die("updating index entries failed");
+			index_path_src_sha1 = NULL;
 		}
 
 		for (i = 0; i < deleted.nr; i++)
 			remove_file_from_cache(deleted.items[i].path);
 
+		for (i = 0; i < submodules.nr; i++)
+			rename_submodule(&submodules.items[i]);
+
+		if (submodules.nr > 0 && add_file_to_cache(".gitmodules", 0))
+			die("cannot add new .gitmodules to the index");
+
 		if (active_cache_changed) {
 			if (write_cache(newfd, active_cache, active_nr) ||
 			    commit_locked_index(&lock_file))
diff --git a/cache.h b/cache.h
index a779d92..998b5fb 100644
--- a/cache.h
+++ b/cache.h
@@ -387,6 +387,8 @@ extern int ce_same_name(struct cache_entry *a, struct cache_entry *b);
 extern int ie_match_stat(const struct index_state *, struct cache_entry *, struct stat *, unsigned int);
 extern int ie_modified(const struct index_state *, struct cache_entry *, struct stat *, unsigned int);
 
+extern unsigned char *index_path_src_sha1;
+
 extern int ce_path_match(const struct cache_entry *ce, const char **pathspec);
 extern int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, enum object_type type, const char *path);
 extern int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object);
diff --git a/sha1_file.c b/sha1_file.c
index e281c14..4da6048 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2416,12 +2416,22 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 	return ret;
 }
 
+unsigned char *index_path_src_sha1;
+
 int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object)
 {
 	int fd;
 	char *target;
 	size_t len;
 
+	if (index_path_src_sha1) {
+		/* SHA1 for this path was prepared by the caller specially;
+		 * e.g. comes from original cache entry in some cases of
+		 * a rename. */
+		memcpy(sha1, index_path_src_sha1, 20);
+		return 0;
+	}
+
 	switch (st->st_mode & S_IFMT) {
 	case S_IFREG:
 		fd = open(path, O_RDONLY);

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 6/7] git rm: Support for removing submodules
  2008-07-16 19:11 [RFC][PATCH 0/7] Submodule support in git mv, git rm Petr Baudis
                   ` (4 preceding siblings ...)
  2008-07-16 19:11 ` [PATCH 5/7] git mv: Support moving submodules Petr Baudis
@ 2008-07-16 19:11 ` Petr Baudis
  2008-07-16 22:41   ` Johannes Schindelin
  2008-07-16 19:11 ` [PATCH 7/7] t7403: Submodule git mv, git rm testsuite Petr Baudis
  6 siblings, 1 reply; 56+ messages in thread
From: Petr Baudis @ 2008-07-16 19:11 UTC (permalink / raw)
  To: git

This patch adds support for removing submodules to 'git rm', including
removing the appropriate sections from the .gitmodules file to reflect this

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 Documentation/git-rm.txt |    6 +++-
 builtin-rm.c             |   65 ++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
index 4d0c495..bfc3dfa 100644
--- a/Documentation/git-rm.txt
+++ b/Documentation/git-rm.txt
@@ -20,7 +20,8 @@ and no updates to their contents can be staged in the index,
 though that default behavior can be overridden with the `-f` option.
 When '--cached' is given, the staged content has to
 match either the tip of the branch or the file on disk,
-allowing the file to be removed from just the index.
+allowing the file to be removed from just the index;
+this is always the case when removing submodules.
 
 
 OPTIONS
@@ -56,7 +57,8 @@ OPTIONS
 --cached::
 	Use this option to unstage and remove paths only from the index.
 	Working tree files, whether modified or not, will be
-	left alone.
+	left alone.  Note that this is always assumed when removing
+	a checked-out submodule.
 
 --ignore-unmatch::
 	Exit with a zero status even if no files matched.
diff --git a/builtin-rm.c b/builtin-rm.c
index 22c9bd1..363d1fa 100644
--- a/builtin-rm.c
+++ b/builtin-rm.c
@@ -9,6 +9,7 @@
 #include "cache-tree.h"
 #include "tree-walk.h"
 #include "parse-options.h"
+#include "submodule.h"
 
 static const char * const builtin_rm_usage[] = {
 	"git-rm [options] [--] <file>...",
@@ -17,16 +18,21 @@ static const char * const builtin_rm_usage[] = {
 
 static struct {
 	int nr, alloc;
-	const char **name;
+	struct {
+		const char *name;
+		int is_gitlink;
+	} *info;
 } list;
 
-static void add_list(const char *name)
+static void add_list(const char *name, int is_gitlink)
 {
 	if (list.nr >= list.alloc) {
 		list.alloc = alloc_nr(list.alloc);
-		list.name = xrealloc(list.name, list.alloc * sizeof(const char *));
+		list.info = xrealloc(list.info, list.alloc * sizeof(*list.info));
 	}
-	list.name[list.nr++] = name;
+	list.info[list.nr].name = name;
+	list.info[list.nr].is_gitlink = is_gitlink;
+	list.nr++;
 }
 
 static int remove_file(const char *name)
@@ -38,6 +44,13 @@ static int remove_file(const char *name)
 	if (ret && errno == ENOENT)
 		/* The user has removed it from the filesystem by hand */
 		ret = errno = 0;
+	if (ret && errno == EISDIR) {
+		/* This is a gitlink entry; try to remove at least the
+		 * directory if the submodule is not checked out; we always
+		 * leave the checked out ones as they are */
+		if (!rmdir(name) || errno == ENOTEMPTY)
+			ret = errno = 0;
+	}
 
 	if (!ret && (slash = strrchr(name, '/'))) {
 		char *n = xstrdup(name);
@@ -65,7 +78,7 @@ static int check_local_mod(unsigned char *head, int index_only)
 		struct stat st;
 		int pos;
 		struct cache_entry *ce;
-		const char *name = list.name[i];
+		const char *name = list.info[i].name;
 		unsigned char sha1[20];
 		unsigned mode;
 		int local_changes = 0;
@@ -83,7 +96,7 @@ static int check_local_mod(unsigned char *head, int index_only)
 			/* It already vanished from the working tree */
 			continue;
 		}
-		else if (S_ISDIR(st.st_mode)) {
+		else if (S_ISDIR(st.st_mode) && !S_ISGITLINK(ce->ce_mode)) {
 			/* if a file was removed and it is now a
 			 * directory, that is the same as ENOENT as
 			 * far as git is concerned; we do not track
@@ -122,6 +135,22 @@ static int check_local_mod(unsigned char *head, int index_only)
 	return errs;
 }
 
+static void remove_submodule(const char *name)
+{
+	char *key = submodule_by_path(name);
+	char *sectend = strrchr(key, '.');
+
+	assert(sectend);
+	*sectend = 0;
+
+	config_exclusive_filename = ".gitmodules";
+	if (git_config_rename_section(key, NULL) <= 0)
+		die("cannot remove section `%s' from .gitmodules", key);
+	config_exclusive_filename = NULL;
+
+	free(key);
+}
+
 static struct lock_file lock_file;
 
 static int show_only = 0, force = 0, index_only = 0, recursive = 0, quiet = 0;
@@ -140,7 +169,7 @@ static struct option builtin_rm_options[] = {
 
 int cmd_rm(int argc, const char **argv, const char *prefix)
 {
-	int i, newfd;
+	int i, newfd, subs;
 	const char **pathspec;
 	char *seen;
 
@@ -168,7 +197,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 		struct cache_entry *ce = active_cache[i];
 		if (!match_pathspec(pathspec, ce->name, ce_namelen(ce), 0, seen))
 			continue;
-		add_list(ce->name);
+		add_list(ce->name, S_ISGITLINK(ce->ce_mode));
 	}
 
 	if (pathspec) {
@@ -216,9 +245,11 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	 * the index unless all of them succeed.
 	 */
 	for (i = 0; i < list.nr; i++) {
-		const char *path = list.name[i];
+		const char *path = list.info[i].name;
 		if (!quiet)
-			printf("rm '%s'\n", path);
+			printf("rm%s '%s'\n",
+				list.info[i].is_gitlink ? "dir" : "",
+				path);
 
 		if (remove_file_from_cache(path))
 			die("git-rm: unable to remove %s", path);
@@ -238,7 +269,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	if (!index_only) {
 		int removed = 0;
 		for (i = 0; i < list.nr; i++) {
-			const char *path = list.name[i];
+			const char *path = list.info[i].name;
 			if (!remove_file(path)) {
 				removed = 1;
 				continue;
@@ -248,6 +279,18 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 		}
 	}
 
+	/*
+	 * Get rid of stale submodule setup.
+	 */
+	subs = 0;
+	for (i = 0; i < list.nr; i++)
+		if (list.info[i].is_gitlink) {
+			remove_submodule(list.info[i].name);
+			subs++;
+		}
+	if (subs && add_file_to_cache(".gitmodules", 0))
+		die("cannot add new .gitmodules to the index");
+
 	if (active_cache_changed) {
 		if (write_cache(newfd, active_cache, active_nr) ||
 		    commit_locked_index(&lock_file))

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 7/7] t7403: Submodule git mv, git rm testsuite
  2008-07-16 19:11 [RFC][PATCH 0/7] Submodule support in git mv, git rm Petr Baudis
                   ` (5 preceding siblings ...)
  2008-07-16 19:11 ` [PATCH 6/7] git rm: Support for removing submodules Petr Baudis
@ 2008-07-16 19:11 ` Petr Baudis
  6 siblings, 0 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-16 19:11 UTC (permalink / raw)
  To: git

The testsuite for newly added submodule support in git mv, git rm.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 t/t7403-submodule-mvrm.sh |  242 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 242 insertions(+), 0 deletions(-)
 create mode 100755 t/t7403-submodule-mvrm.sh

diff --git a/t/t7403-submodule-mvrm.sh b/t/t7403-submodule-mvrm.sh
new file mode 100755
index 0000000..9b50d6a
--- /dev/null
+++ b/t/t7403-submodule-mvrm.sh
@@ -0,0 +1,242 @@
+#!/bin/sh
+#
+# Copyright (c) 2008 Johannes Schindelin
+#
+
+test_description='Test submodules support in git mv and git rm'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+
+	(mkdir sub-repo &&
+	 cd sub-repo &&
+	 git init &&
+	 echo file > file &&
+	 git add file &&
+	 git commit -m "sub initial") &&
+	(cp -r sub-repo sub2-repo &&
+	 cd sub2-repo &&
+	 echo file2 > file &&
+	 git add file &&
+	 git commit -m "sub commit2") &&
+	git submodule add "$(pwd)/sub-repo" sub &&
+	git submodule add "$(pwd)/sub2-repo" sub2 &&
+	git commit -m initial &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub" &&
+	test "$(git config -f .gitmodules submodule.sub2.path)" = "sub2"
+
+'
+
+test_expect_success 'git mv of a submodule' '
+
+	git mv sub sub.moved &&
+	! test -d sub &&
+	test -d sub.moved/.git &&
+	! git ls-files --error-unmatch sub &&
+	test "$(git ls-files --stage --error-unmatch sub.moved | cut -d " " -f 1)" = 160000 &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub.moved" &&
+	! git config -f .gitmodules submodule.sub.moved.path
+
+'
+
+test_expect_success 'git submodule add vs. git mv' '
+
+	! git submodule add "$(pwd)/sub2-repo" sub.moved &&
+	git submodule add "$(pwd)/sub2-repo" sub &&
+	test -d sub/.git &&
+	test "$(git config -f .gitmodules submodule.sub.url)" = "$(pwd)/sub-repo" &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub.moved" &&
+	test "$(git config -f .gitmodules submodule.sub~.path)" = "sub"
+
+'
+
+test_expect_success 'git mv onto existing file' '
+
+	echo file > file &&
+	git add file &&
+	! git mv sub.moved file &&
+	test -d sub.moved &&
+	! test -d file/.git &&
+	test "$(git ls-files --stage --error-unmatch file | cut -d " " -f 1)" = 100644 &&
+	test "$(git ls-files --stage --error-unmatch sub.moved | cut -d " " -f 1)" = 160000 &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub.moved"
+
+'
+
+test_expect_success 'git mv onto existing directory' '
+
+	mkdir -p dir &&
+	echo file > dir/file &&
+	git add dir/file &&
+	git mv sub.moved dir &&
+	! test -d sub.moved &&
+	test -d dir/sub.moved/.git &&
+	! git ls-files --error-unmatch sub.moved &&
+	test "$(git ls-files --stage --error-unmatch dir/sub.moved | cut -d " " -f 1)" = 160000 &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "dir/sub.moved" &&
+	git mv dir/sub.moved . &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub.moved"
+
+'
+
+test_expect_success 'git mv onto existing submodule' '
+
+	! git mv sub.moved sub2 &&
+	test -d sub.moved/.git &&
+	! test -d sub2/sub.moved &&
+	test "$(git ls-files --stage --error-unmatch sub2 | cut -d " " -f 1)" = 160000 &&
+	test "$(git ls-files --stage --error-unmatch sub.moved | cut -d " " -f 1)" = 160000 &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub.moved"
+
+'
+
+test_expect_success 'git mv of multiple submodules' '
+
+	mkdir -p dir &&
+	git mv sub.moved sub dir &&
+	! test -d sub.moved &&
+	! test -d sub &&
+	test -d dir/sub.moved/.git &&
+	test -d dir/sub/.git &&
+	! git ls-files --error-unmatch sub.moved sub &&
+	test "$(git ls-files --stage --error-unmatch dir/sub.moved dir/sub | cut -d " " -f 1 | uniq)" = 160000 &&
+	! git config -f .gitmodules submodule.dir.path &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "dir/sub.moved" &&
+	test "$(git config -f .gitmodules submodule.sub~.path)" = "dir/sub"
+
+'
+
+test_expect_success 'git mv of multiple submodules back from a subdir' '
+
+	(cd dir && git mv sub.moved sub .. && cd ..) &&
+	test -d sub.moved &&
+	test -d sub &&
+	! test -d dir/sub.moved/.git &&
+	! test -d dir/sub/.git &&
+	! git ls-files --error-unmatch dir/sub.moved dir/sub &&
+	test "$(git ls-files --stage --error-unmatch sub.moved sub | cut -d " " -f 1 | uniq)" = 160000 &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub.moved" &&
+	test "$(git config -f .gitmodules submodule.sub~.path)" = "sub"
+
+'
+
+test_expect_success 'git mv of non-checked-out submodules' '
+
+	git clone . clone &&
+	(cd clone &&
+	test -d sub &&
+	test -d sub2 &&
+	! test -d sub/.git &&
+	! test -d sub2/.git &&
+	git ls-files --stage --error-unmatch sub sub2 > ls-files.out &&
+	mkdir -p dir &&
+	git mv sub sub2 dir &&
+	! test -d sub &&
+	! test -d sub2 &&
+	test -d dir/sub &&
+	test -d dir/sub2 &&
+	! git ls-files --error-unmatch sub sub2 &&
+	test "$(git ls-files --stage --error-unmatch dir/sub dir/sub2 | cut -d " " -f 1 | uniq)" = 160000 &&
+	git ls-files --stage --error-unmatch dir/sub dir/sub2 | sed "s#dir/##g" | diff - ls-files.out &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "dir/sub" &&
+	test "$(git config -f .gitmodules submodule.sub2.path)" = "dir/sub2" &&
+	(cd dir && git mv sub2 .. && cd ..) &&
+	test -d sub2 &&
+	! test -d dir/sub2 &&
+	! git ls-files --error-unmatch dir/sub2 &&
+	test "$(git ls-files --stage --error-unmatch sub2 | cut -d " " -f 1)" = 160000 &&
+	test "$(git config -f .gitmodules submodule.sub2.path)" = "sub2")
+
+'
+
+test_expect_success 'checkpointing state with git commit' '
+
+	git commit -m"checkpoint" -a &&
+	(cd clone && git commit -m"clone checkpoint" -a)
+
+'
+
+test_expect_success 'git rm of a regular submodule' '
+
+	git rm sub2 &&
+	test -d sub2/.git &&
+	! git ls-files --error-unmatch sub2 &&
+	! git config -f .gitmodules submodule.sub2.path &&
+	! git config -f .gitmodules submodule.sub2.url
+
+'
+
+test_expect_success 'git rm of a submodule with name different from path' '
+
+	git rm sub.moved &&
+	test -d sub.moved/.git &&
+	! git ls-files --error-unmatch sub.moved &&
+	! git config -f .gitmodules submodule.sub.path &&
+	! git config -f .gitmodules submodule.sub.url
+
+'
+
+test_expect_success 'git rm of a modified submodule' '
+
+	git mv sub dir/sub && # more fun with richer path
+	(cd dir/sub &&
+	 echo mod > file &&
+	 git commit -m "sub mod" file) &&
+	git add dir/sub &&
+	! git rm dir/sub &&
+	test -d dir/sub/.git &&
+	test "$(git ls-files --stage --error-unmatch dir/sub | cut -d " " -f 1)" = "160000" &&
+	git config -f .gitmodules submodule.sub~.path &&
+	git config -f .gitmodules submodule.sub~.url &&
+	git rm -f dir/sub &&
+	test -d dir/sub/.git &&
+	! git ls-files --error-unmatch dir/sub &&
+	! git config -f .gitmodules submodule.sub~.path &&
+	! git config -f .gitmodules submodule.sub~.url
+
+'
+
+test_expect_success 'git rm of a submodule from within a subdirectory' '
+
+	git submodule add "$(pwd)/sub-repo" sub-torm &&
+	mkdir -p dir &&
+	# -f since we did not commit the submodule
+	(cd dir && git rm -f ../sub-torm && cd ..) &&
+	test -d sub-torm/.git &&
+	! git ls-files --error-unmatch sub-torm &&
+	! git config -f .gitmodules submodule.sub-torm.path &&
+	! git config -f .gitmodules submodule.sub-torm.url
+
+'
+
+test_expect_success 'git rm of a non-checked-out submodule' '
+
+	(cd clone &&
+	test -d dir/sub &&
+	! test -d dir/sub/.git &&
+	git rm dir/sub &&
+	! test -d dir/sub &&
+	! git ls-files --error-unmatch dir/sub &&
+	! git config -f .gitmodules submodule.sub.path &&
+	! git config -f .gitmodules submodule.sub.url)
+
+'
+
+test_expect_success 'git rm of a non-checked-out submodule w/ different working tree' '
+
+	(cd clone &&
+	rmdir sub2 &&
+	echo cunning > sub2 &&
+	! git rm sub2 &&
+	test -f sub2 &&
+	test "$(git ls-files --stage --error-unmatch sub2 | cut -d " " -f 1)" = "160000" &&
+	git rm -f sub2 &&
+	! test -e sub2 &&
+	! git ls-files --error-unmatch sub2 &&
+	! git config -f .gitmodules submodule.sub2.path &&
+	! git config -f .gitmodules submodule.sub2.url)
+
+'
+
+test_done

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 6/7] git rm: Support for removing submodules
  2008-07-16 19:11 ` [PATCH 6/7] git rm: Support for removing submodules Petr Baudis
@ 2008-07-16 22:41   ` Johannes Schindelin
  2008-07-17 12:35     ` Petr Baudis
  0 siblings, 1 reply; 56+ messages in thread
From: Johannes Schindelin @ 2008-07-16 22:41 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Hi,

On Wed, 16 Jul 2008, Petr Baudis wrote:

> This patch adds support for removing submodules to 'git rm', including 
> removing the appropriate sections from the .gitmodules file to reflect 
> this

I have no time to look at the whole series, or even at the patch, but I 
have concerns.  Do you really remove the whole directory?  Because if you 
do, you remove more than what can be possibly reconstructed from the 
object store.

That is much more dangerous than what "git rm" does.

For example, you can have local branches, remote settings, untracked 
files, etc. in the subdirectory.  And that cannot be recovered once 
deleted.

I wonder if it really makes sense to integrate that into git-rm, and not 
git-submodule, if only to introduce another level of consideration for the 
user before committing what is potentially a big mistake.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 5/7] git mv: Support moving submodules
  2008-07-16 19:11 ` [PATCH 5/7] git mv: Support moving submodules Petr Baudis
@ 2008-07-17  2:37   ` Junio C Hamano
  2008-07-17 13:06     ` Petr Baudis
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2008-07-17  2:37 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Petr Baudis <pasky@suse.cz> writes:

> The usage of struct path_list here is a bit abusive,...

I do not think use of path_list->util is particularly bad.  It is a data
structure to store a bag of random pointers that can be indexed with
string keys, which is exactly what you are doing here.

> ... The horrid
> index_path_src_sha1 hack is unfortunately much worse,

This one certainly is ugly beyond words.

By the way, why is it even necessary to rehash the contents when you run
"mv"?

IOW,

	$ >foo
        $ git add foo
        $ echo 1 >foo
        $ git mv foo bar

makes "foo" disappear from the index and adds "bar", but it makes the
local change added to the index at the same time.

	Side note. It actually is a lot worse than that.  When you move
	something to another existing entry, the code does not even add
	the object name of moved entry recorded in the index, nor rehashes
	the moved file.  This is totally inconsistent!

I personally think these buggy behaviours you are trying to inherit are
making this patch more complex than necessary.  Perhaps this needs to be
fixed up even further (you did some fix with the first patch) before
adding new features?  For example, I think it is a bug that the
"overwrite" codepath does not work with symlinks.

But let's assume that we do not want to "fix" any of these existing
(mis)behaviour, because doing so would change the semantics.

There are a few cases:

 (1) gitlink exists and so is directory but no repository (i.e. the user is
     not interested);

 (2) gitlink exists and there is a repository.  Its HEAD matches what
     gitlink records;

 (3) gitlink exists and there is a repository.  Its HEAD does not match what
     gitlink records;

The (mis)behaviour that automatically adds moved files to the index we saw
earlier suggests that in case (3) you would want to update the gitlink in
the index with whatever HEAD the submodule repository has to be
consistent.

So instead of adding a index_path_src_sha1 hack like that, how about

 * When you see ce_is_gitlink(j), you keep the original gitlink object
   name.  That is very good in order to preserve it for not just (1) but
   also for (3) once we decide to fix this "auto adding" misbehaviour in
   the later round.  But you do not have to do this for case (2) if you
   are going to rehash anyway, don't you?

   So when you see ce_is_gitlink(j), you check if it has repository and
   HEAD, and record the path_list->util only when it is case (1).

 * Then, only for case (1), you do not call add_file_to_cache() -- because
   you know you do not have anything you can rehash; instead, factor out
   the codepath "git-update-index --cacheinfo" uses and call that.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 6/7] git rm: Support for removing submodules
  2008-07-16 22:41   ` Johannes Schindelin
@ 2008-07-17 12:35     ` Petr Baudis
  2008-07-17 12:59       ` Johannes Schindelin
  0 siblings, 1 reply; 56+ messages in thread
From: Petr Baudis @ 2008-07-17 12:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

  Hi,

On Thu, Jul 17, 2008 at 12:41:57AM +0200, Johannes Schindelin wrote:
> On Wed, 16 Jul 2008, Petr Baudis wrote:
> 
> > This patch adds support for removing submodules to 'git rm', including 
> > removing the appropriate sections from the .gitmodules file to reflect 
> > this
> 
> I have no time to look at the whole series, or even at the patch, but I 
> have concerns.  Do you really remove the whole directory?  Because if you 
> do, you remove more than what can be possibly reconstructed from the 
> object store.

  no; I remove only the index entry and the empty directory made for
non-checked-out submodules (I just try rmdir() and ignore ENOTEMPTY
error).  I make that clear in git rm documentation, but not in the patch
description; I will fix that.

> I wonder if it really makes sense to integrate that into git-rm, and not 
> git-submodule, if only to introduce another level of consideration for the 
> user before committing what is potentially a big mistake.

  That is good question and I forgot to elaborate on this in the cover
letter. However, I believe that integrating this functionality into the
basic commands makes for a smoother user experience, following the
principle of the least surprise. It is difficult for me to argue
extensively in further favor of this choice, though. :-)

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 6/7] git rm: Support for removing submodules
  2008-07-17 12:35     ` Petr Baudis
@ 2008-07-17 12:59       ` Johannes Schindelin
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Schindelin @ 2008-07-17 12:59 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Hi,

On Thu, 17 Jul 2008, Petr Baudis wrote:

> On Thu, Jul 17, 2008 at 12:41:57AM +0200, Johannes Schindelin wrote:
> > On Wed, 16 Jul 2008, Petr Baudis wrote:
> > 
> > > This patch adds support for removing submodules to 'git rm', 
> > > including removing the appropriate sections from the .gitmodules 
> > > file to reflect this
> > 
> > I have no time to look at the whole series, or even at the patch, but 
> > I have concerns.  Do you really remove the whole directory?  Because 
> > if you do, you remove more than what can be possibly reconstructed 
> > from the object store.
> 
> no; I remove only the index entry and the empty directory made for 
> non-checked-out submodules (I just try rmdir() and ignore ENOTEMPTY 
> error).  I make that clear in git rm documentation, but not in the patch 
> description; I will fix that.

Thanks for explanation!

> > I wonder if it really makes sense to integrate that into git-rm, and 
> > not git-submodule, if only to introduce another level of consideration 
> > for the user before committing what is potentially a big mistake.
> 
> That is good question and I forgot to elaborate on this in the cover 
> letter. However, I believe that integrating this functionality into the 
> basic commands makes for a smoother user experience, following the 
> principle of the least surprise.

Makes sense,
Dscho

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 5/7] git mv: Support moving submodules
  2008-07-17  2:37   ` Junio C Hamano
@ 2008-07-17 13:06     ` Petr Baudis
  2008-07-17 22:31       ` [PATCH] git-mv: Keep moved index entries inact Petr Baudis
  0 siblings, 1 reply; 56+ messages in thread
From: Petr Baudis @ 2008-07-17 13:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

  Hi!

On Wed, Jul 16, 2008 at 07:37:57PM -0700, Junio C Hamano wrote:
> Petr Baudis <pasky@suse.cz> writes:
> 
> > ... The horrid
> > index_path_src_sha1 hack is unfortunately much worse,
> 
> This one certainly is ugly beyond words.

  ;-)

> By the way, why is it even necessary to rehash the contents when you run
> "mv"?
> 
> IOW,
> 
> 	$ >foo
>         $ git add foo
>         $ echo 1 >foo
>         $ git mv foo bar
> 
> makes "foo" disappear from the index and adds "bar", but it makes the
> local change added to the index at the same time.
> 
> 	Side note. It actually is a lot worse than that.  When you move
> 	something to another existing entry, the code does not even add
> 	the object name of moved entry recorded in the index, nor rehashes
> 	the moved file.  This is totally inconsistent!
> 
> I personally think these buggy behaviours you are trying to inherit are
> making this patch more complex than necessary.  Perhaps this needs to be
> fixed up even further (you did some fix with the first patch) before
> adding new features?  For example, I think it is a bug that the
> "overwrite" codepath does not work with symlinks.

  I agree that it would be much cleaner to fix this; I got puzzled about
this behaviour a bit, but I was afraid to break the traditional
behaviour. However, if you are feeling this brave, patch to follow up
shortly. :-)

>  * Then, only for case (1), you do not call add_file_to_cache() -- because
>    you know you do not have anything you can rehash; instead, factor out
>    the codepath "git-update-index --cacheinfo" uses and call that.

  This is excellent hint, sort of what I hoped for, thanks! I forgot
about --cacheinfo completely, which is truly a shame especially when I
look at the history of this switch. ;-) (BTW, curiously, the commit
lists Linus as an author even though the patch is yours. Maybe this was
merely some imperfection of the early scripts around Git, though.)
I really did not touch git internals for way too long.

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH] git-mv: Keep moved index entries inact
  2008-07-17 13:06     ` Petr Baudis
@ 2008-07-17 22:31       ` Petr Baudis
  2008-07-17 22:34         ` [PATCH] git mv: Support moving submodules Petr Baudis
  2008-07-19 23:54         ` [PATCH] git-mv: Keep moved index entries inact Junio C Hamano
  0 siblings, 2 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-17 22:31 UTC (permalink / raw)
  To: git; +Cc: gitster

The rewrite of git-mv from a shell script to a builtin was perhaps
a little too straightforward: the git add and git rm queues were
emulated directly, which resulted in a rather complicated code and
caused an inconsistent behaviour when moving dirty index entries;
git mv would update the entry based on working tree state,
except in case of overwrites, where the new entry would still have
sha1 of the old file.

This patch introduces rename_index_entry_at() into the index toolkit,
which will rename an entry while removing any entries the new entry
might render duplicate. This is then used in git mv instead
of all the file queues, resulting in a major simplification
of the code and an inevitable change in git mv -n output format.

A new test has been added to the testsuite to reflect this change.
Also, based on suggestion by Junio about desired symlink behaviour
of git mv, I have added two tests for that; however, I do not have
need or desire to spend time fixing this, so they are expected
to fail for now until someone gets around to fixing that.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

  This patch might depend on git-mv: Remove dead code branch.

 builtin-mv.c  |   62 ++++++++-------------------------------------------------
 cache.h       |    2 ++
 read-cache.c  |   15 ++++++++++++++
 t/t7001-mv.sh |   57 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 83 insertions(+), 53 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 33ad082..28ebc9c 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -36,17 +36,6 @@ static const char **copy_pathspec(const char *prefix, const char **pathspec,
 	return get_pathspec(prefix, result);
 }
 
-static void show_list(const char *label, struct path_list *list)
-{
-	if (list->nr > 0) {
-		int i;
-		printf("%s", label);
-		for (i = 0; i < list->nr; i++)
-			printf("%s%s", i > 0 ? ", " : "", list->items[i].path);
-		putchar('\n');
-	}
-}
-
 static const char *add_slash(const char *path)
 {
 	int len = strlen(path);
@@ -75,11 +64,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	const char **source, **destination, **dest_path;
 	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX } *modes;
 	struct stat st;
-	struct path_list overwritten = {NULL, 0, 0, 0};
 	struct path_list src_for_dst = {NULL, 0, 0, 0};
-	struct path_list added = {NULL, 0, 0, 0};
-	struct path_list deleted = {NULL, 0, 0, 0};
-	struct path_list changed = {NULL, 0, 0, 0};
 
 	git_config(git_default_config, NULL);
 
@@ -189,7 +174,6 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 							" will overwrite!\n",
 							bad);
 					bad = NULL;
-					path_list_insert(dst, &overwritten);
 				} else
 					bad = "Cannot overwrite";
 			}
@@ -218,6 +202,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
+		int pos;
 		if (show_only || verbose)
 			printf("Renaming %s to %s\n", src, dst);
 		if (!show_only && mode != INDEX &&
@@ -227,45 +212,16 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (mode == WORKING_DIRECTORY)
 			continue;
 
-		assert(cache_name_pos(src, strlen(src)) >= 0);
-
-		path_list_insert(src, &deleted);
-		/* destination can be a directory with 1 file inside */
-		if (path_list_has_path(&overwritten, dst))
-			path_list_insert(dst, &changed);
-		else
-			path_list_insert(dst, &added);
+		pos = cache_name_pos(src, strlen(src));
+		assert(pos >= 0);
+		if (!show_only)
+			rename_cache_entry_at(pos, dst);
 	}
 
-	if (show_only) {
-		show_list("Changed  : ", &changed);
-		show_list("Adding   : ", &added);
-		show_list("Deleting : ", &deleted);
-	} else {
-		for (i = 0; i < changed.nr; i++) {
-			const char *path = changed.items[i].path;
-			int j = cache_name_pos(path, strlen(path));
-			struct cache_entry *ce = active_cache[j];
-
-			if (j < 0)
-				die ("Huh? Cache entry for %s unknown?", path);
-			refresh_cache_entry(ce, 0);
-		}
-
-		for (i = 0; i < added.nr; i++) {
-			const char *path = added.items[i].path;
-			if (add_file_to_cache(path, verbose ? ADD_CACHE_VERBOSE : 0))
-				die("updating index entries failed");
-		}
-
-		for (i = 0; i < deleted.nr; i++)
-			remove_file_from_cache(deleted.items[i].path);
-
-		if (active_cache_changed) {
-			if (write_cache(newfd, active_cache, active_nr) ||
-			    commit_locked_index(&lock_file))
-				die("Unable to write new index file");
-		}
+	if (active_cache_changed) {
+		if (write_cache(newfd, active_cache, active_nr) ||
+		    commit_locked_index(&lock_file))
+			die("Unable to write new index file");
 	}
 
 	return 0;
diff --git a/cache.h b/cache.h
index a779d92..6f1d003 100644
--- a/cache.h
+++ b/cache.h
@@ -260,6 +260,7 @@ static inline void remove_name_hash(struct cache_entry *ce)
 #define unmerged_cache() unmerged_index(&the_index)
 #define cache_name_pos(name, namelen) index_name_pos(&the_index,(name),(namelen))
 #define add_cache_entry(ce, option) add_index_entry(&the_index, (ce), (option))
+#define rename_cache_entry_at(pos, new_name) rename_index_entry_at(&the_index, (pos), (new_name))
 #define remove_cache_entry_at(pos) remove_index_entry_at(&the_index, (pos))
 #define remove_file_from_cache(path) remove_file_from_index(&the_index, (path))
 #define add_to_cache(path, st, flags) add_to_index(&the_index, (path), (st), (flags))
@@ -370,6 +371,7 @@ extern int index_name_pos(const struct index_state *, const char *name, int name
 #define ADD_CACHE_JUST_APPEND 8		/* Append only; tree.c::read_tree() */
 extern int add_index_entry(struct index_state *, struct cache_entry *ce, int option);
 extern struct cache_entry *refresh_cache_entry(struct cache_entry *ce, int really);
+extern void rename_index_entry_at(struct index_state *, int pos, const char *new_name);
 extern int remove_index_entry_at(struct index_state *, int pos);
 extern int remove_file_from_index(struct index_state *, const char *path);
 #define ADD_CACHE_VERBOSE 1
diff --git a/read-cache.c b/read-cache.c
index 1648428..70e5f57 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -38,6 +38,21 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 	istate->cache_changed = 1;
 }
 
+void rename_index_entry_at(struct index_state *istate, int nr, const char *new_name)
+{
+	struct cache_entry *old = istate->cache[nr], *new;
+	int namelen = strlen(new_name);
+
+	new = xmalloc(cache_entry_size(namelen));
+	copy_cache_entry(new, old);
+	new->ce_flags = (new->ce_flags & ~CE_NAMEMASK) | namelen;
+	memcpy(new->name, new_name, namelen);
+
+	cache_tree_invalidate_path(istate->cache_tree, old->name);
+	remove_index_entry_at(istate, nr);
+	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
+}
+
 /*
  * This only updates the "non-critical" parts of the directory
  * cache, ie the parts that aren't tracked by GIT, and only used
diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index 336cfaa..6b615f8 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -156,4 +156,61 @@ test_expect_success 'absolute pathname outside should fail' '(
 
 )'
 
+# git mv meets angry Git maintainer
+
+test_expect_success 'git mv should not change sha1 of moved cache entry' '
+
+	rm -fr .git &&
+	git init &&
+	echo 1 >dirty &&
+	git add dirty &&
+	entry="$(git ls-files --stage dirty | cut -f 1)"
+	git mv dirty dirty2 &&
+	[ "$entry" = "$(git ls-files --stage dirty2 | cut -f 1)" ] &&
+	echo 2 >dirty2 &&
+	git mv dirty2 dirty &&
+	[ "$entry" = "$(git ls-files --stage dirty | cut -f 1)" ]
+
+'
+
+rm -f dirty dirty2
+
+test_expect_failure 'git mv should overwrite symlink to a file' '
+
+	rm -fr .git &&
+	git init &&
+	echo 1 >moved &&
+	ln -s moved symlink &&
+	git add moved symlink &&
+	! git mv moved symlink &&
+	git mv -f moved symlink &&
+	[ ! -e moved ] &&
+	[ -f symlink ] &&
+	[ $(cat symlink) = 1 ] &&
+	git diff-files --quiet
+
+'
+
+rm -f moved symlink
+
+test_expect_failure 'git mv should follow symlink to a directory' '
+
+	rm -fr .git &&
+	git init &&
+	echo 1 >moved &&
+	mkdir -p dir &&
+	touch dir/.keep &&
+	ln -s dir symlink &&
+	git add moved dir/.keep symlink &&
+	git mv moved symlink &&
+	[ ! -e moved ] &&
+	[ -f symlink/moved ] &&
+	[ $(cat symlink/moved) = 1 ] &&
+	[ "$(ls dir)" = "$(ls symlink)" ] &&
+	git diff-files --quiet
+
+'
+
+rm -rf moved symlink dir
+
 test_done

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH] git mv: Support moving submodules
  2008-07-17 22:31       ` [PATCH] git-mv: Keep moved index entries inact Petr Baudis
@ 2008-07-17 22:34         ` Petr Baudis
  2008-07-19 23:54         ` [PATCH] git-mv: Keep moved index entries inact Junio C Hamano
  1 sibling, 0 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-17 22:34 UTC (permalink / raw)
  To: git; +Cc: gitster

This patch adds support for moving submodules to 'git mv', including
rewriting of the .gitmodules file to reflect the movement.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

  This is the updated version reflecting the recent rewrite. No other
  changes in my queue yet (except expanding the git rm explanation).

 builtin-mv.c |   37 +++++++++++++++++++++++++++++++++----
 1 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 28ebc9c..62d0c95 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -9,6 +9,7 @@
 #include "cache-tree.h"
 #include "path-list.h"
 #include "parse-options.h"
+#include "submodule.h"
 
 static const char * const builtin_mv_usage[] = {
 	"git mv [options] <source>... <destination>",
@@ -49,6 +50,24 @@ static const char *add_slash(const char *path)
 	return path;
 }
 
+static int ce_is_gitlink(int i)
+{
+	return i < 0 ? 0 : S_ISGITLINK(active_cache[i]->ce_mode);
+}
+
+static void rename_submodule(struct path_list_item *i)
+{
+	char *key = submodule_by_path(i->path);
+
+	config_exclusive_filename = ".gitmodules";
+	if (git_config_set(key, (const char *) i->util))
+		die("cannot update .gitmodules");
+	config_exclusive_filename = NULL;
+
+	free(key);
+}
+
+
 static struct lock_file lock_file;
 
 int cmd_mv(int argc, const char **argv, const char *prefix)
@@ -65,6 +84,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX } *modes;
 	struct stat st;
 	struct path_list src_for_dst = {NULL, 0, 0, 0};
+	/* .path is source path, .util is destination path */
+	struct path_list submodules = {NULL, 0, 0, 0};
 
 	git_config(git_default_config, NULL);
 
@@ -84,7 +105,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		/* special case: "." was normalized to "" */
 		destination = copy_pathspec(dest_path[0], argv, argc, 1);
 	else if (!lstat(dest_path[0], &st) &&
-			S_ISDIR(st.st_mode)) {
+			S_ISDIR(st.st_mode) &&
+			!ce_is_gitlink(cache_name_pos(dest_path[0], strlen(dest_path[0])))) {
 		dest_path[0] = add_slash(dest_path[0]);
 		destination = copy_pathspec(dest_path[0], argv, argc, 1);
 	} else {
@@ -96,7 +118,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	/* Checking */
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
-		int length, src_is_dir;
+		int length, src_is_dir, src_cache_pos;
 		const char *bad = NULL;
 
 		if (show_only)
@@ -111,7 +133,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		} else if ((src_is_dir = S_ISDIR(st.st_mode))
 				&& lstat(dst, &st) == 0)
 			bad = "cannot move directory over file";
-		else if (src_is_dir) {
+		else if ((src_cache_pos = cache_name_pos(src, length)) < 0 && src_is_dir) {
 			const char *src_w_slash = add_slash(src);
 			int len_w_slash = length + 1;
 			int first, last;
@@ -177,7 +199,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				} else
 					bad = "Cannot overwrite";
 			}
-		} else if (cache_name_pos(src, length) < 0)
+		} else if (src_cache_pos < 0)
 			bad = "not under version control";
 		else if (path_list_has_path(&src_for_dst, dst))
 			bad = "multiple sources for the same target";
@@ -214,10 +236,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		pos = cache_name_pos(src, strlen(src));
 		assert(pos >= 0);
+		if (ce_is_gitlink(pos))
+			path_list_insert(src, &submodules)->util = (void *) dst;
 		if (!show_only)
 			rename_cache_entry_at(pos, dst);
 	}
 
+	for (i = 0; i < submodules.nr; i++)
+		rename_submodule(&submodules.items[i]);
+	if (submodules.nr > 0 && add_file_to_cache(".gitmodules", 0))
+		die("cannot add new .gitmodules to the index");
+
 	if (active_cache_changed) {
 		if (write_cache(newfd, active_cache, active_nr) ||
 		    commit_locked_index(&lock_file))

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH] git-mv: Keep moved index entries inact
  2008-07-17 22:31       ` [PATCH] git-mv: Keep moved index entries inact Petr Baudis
  2008-07-17 22:34         ` [PATCH] git mv: Support moving submodules Petr Baudis
@ 2008-07-19 23:54         ` Junio C Hamano
  2008-07-21  0:23           ` Petr Baudis
  1 sibling, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2008-07-19 23:54 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Petr Baudis <pasky@suse.cz> writes:

> A new test has been added to the testsuite to reflect this change.
> Also, based on suggestion by Junio about desired symlink behaviour
> of git mv, I have added two tests for that; however, I do not have
> need or desire to spend time fixing this, so they are expected
> to fail for now until someone gets around to fixing that.

Well, somebody would eventually come to help, then ;-).

>  builtin-mv.c  |   62 ++++++++-------------------------------------------------
>  cache.h       |    2 ++
>  read-cache.c  |   15 ++++++++++++++
>  t/t7001-mv.sh |   57 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 83 insertions(+), 53 deletions(-)

Very nice code reduction, isn't it?

> diff --git a/read-cache.c b/read-cache.c
> index 1648428..70e5f57 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -38,6 +38,21 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
>  	istate->cache_changed = 1;
>  }
>  
> +void rename_index_entry_at(struct index_state *istate, int nr, const char *new_name)
> +{
> +	struct cache_entry *old = istate->cache[nr], *new;
> +	int namelen = strlen(new_name);
> +
> +	new = xmalloc(cache_entry_size(namelen));
> +	copy_cache_entry(new, old);
> +	new->ce_flags = (new->ce_flags & ~CE_NAMEMASK) | namelen;
> +	memcpy(new->name, new_name, namelen);
> +
> +	cache_tree_invalidate_path(istate->cache_tree, old->name);
> +	remove_index_entry_at(istate, nr);
> +	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
> +}

Hmm, would this use of copy_cache_entry() kosher, I have to wonder.  This
new copy of cache entry begins its life unhashed, doesn't it?  Shouldn't
we be not copying its hashed/unhashed bits from the old one?

Also setting of that ce_flags looks wrong when namelen does not fit within
the width of CE_NAMEMASK.  Shouldn't it be doing the same thing as
create_ce_flags()?

> diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
> index 336cfaa..6b615f8 100755
> --- a/t/t7001-mv.sh
> +++ b/t/t7001-mv.sh
> @@ -156,4 +156,61 @@ test_expect_success 'absolute pathname outside should fail' '(
>  
>  )'
>  
> +# git mv meets angry Git maintainer

What's this comment about?

> +test_expect_success 'git mv should not change sha1 of moved cache entry' '
> +
> +	rm -fr .git &&
> +	git init &&
> +	echo 1 >dirty &&
> +	git add dirty &&
> +	entry="$(git ls-files --stage dirty | cut -f 1)"

"rev-parse :dirty"?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] git-mv: Keep moved index entries inact
  2008-07-19 23:54         ` [PATCH] git-mv: Keep moved index entries inact Junio C Hamano
@ 2008-07-21  0:23           ` Petr Baudis
  2008-07-21  0:25             ` [PATCHv2] " Petr Baudis
  2008-07-21  1:20             ` [PATCH] git-mv: Keep moved index entries inact Johannes Schindelin
  0 siblings, 2 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-21  0:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sat, Jul 19, 2008 at 04:54:20PM -0700, Junio C Hamano wrote:
> Petr Baudis <pasky@suse.cz> writes:
> 
> > diff --git a/read-cache.c b/read-cache.c
> > index 1648428..70e5f57 100644
> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -38,6 +38,21 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
> >  	istate->cache_changed = 1;
> >  }
> >  
> > +void rename_index_entry_at(struct index_state *istate, int nr, const char *new_name)
> > +{
> > +	struct cache_entry *old = istate->cache[nr], *new;
> > +	int namelen = strlen(new_name);
> > +
> > +	new = xmalloc(cache_entry_size(namelen));
> > +	copy_cache_entry(new, old);
> > +	new->ce_flags = (new->ce_flags & ~CE_NAMEMASK) | namelen;
> > +	memcpy(new->name, new_name, namelen);
> > +
> > +	cache_tree_invalidate_path(istate->cache_tree, old->name);
> > +	remove_index_entry_at(istate, nr);
> > +	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
> > +}
> 
> Hmm, would this use of copy_cache_entry() kosher, I have to wonder.  This
> new copy of cache entry begins its life unhashed, doesn't it?  Shouldn't
> we be not copying its hashed/unhashed bits from the old one?
> 
> Also setting of that ce_flags looks wrong when namelen does not fit within
> the width of CE_NAMEMASK.  Shouldn't it be doing the same thing as
> create_ce_flags()?

I have to admit that I don't clearly understand all the index entry
intricacies. It is good you didn't see my earlier misguided attempts.
:-) I have patched the two mistakes you pointed out. It is too bad I
cannot simply use existing functions for this, but I want to keep a
different set of traits that either copy_cache_entry() or
create_ce_flags() assumes.

> > diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
> > index 336cfaa..6b615f8 100755
> > --- a/t/t7001-mv.sh
> > +++ b/t/t7001-mv.sh
> > @@ -156,4 +156,61 @@ test_expect_success 'absolute pathname outside should fail' '(
> >  
> >  )'
> >  
> > +# git mv meets angry Git maintainer
> 
> What's this comment about?

Oh. Well, you sounded agitated in your original mail, but this actually
just slipped through. :-)

> > +test_expect_success 'git mv should not change sha1 of moved cache entry' '
> > +
> > +	rm -fr .git &&
> > +	git init &&
> > +	echo 1 >dirty &&
> > +	git add dirty &&
> > +	entry="$(git ls-files --stage dirty | cut -f 1)"
> 
> "rev-parse :dirty"?

I want to make sure the whole index entry is intact, not just the sha1.

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-21  0:23           ` Petr Baudis
@ 2008-07-21  0:25             ` Petr Baudis
  2008-07-21  4:36               ` Junio C Hamano
  2008-07-26  6:46               ` Junio C Hamano
  2008-07-21  1:20             ` [PATCH] git-mv: Keep moved index entries inact Johannes Schindelin
  1 sibling, 2 replies; 56+ messages in thread
From: Petr Baudis @ 2008-07-21  0:25 UTC (permalink / raw)
  To: gitster; +Cc: git

The rewrite of git-mv from a shell script to a builtin was perhaps
a little too straightforward: the git add and git rm queues were
emulated directly, which resulted in a rather complicated code and
caused an inconsistent behaviour when moving dirty index entries;
git mv would update the entry based on working tree state,
except in case of overwrites, where the new entry would still have
sha1 of the old file.

This patch introduces rename_index_entry_at() into the index toolkit,
which will rename an entry while removing any entries the new entry
might render duplicate. This is then used in git mv instead
of all the file queues, resulting in a major simplification
of the code and an inevitable change in git mv -n output format.

A new test has been added to the testsuite to reflect this change.
Also, based on suggestion by Junio about desired symlink behaviour
of git mv, I have added two tests for that; however, I do not have
need or desire to spend time fixing this, so they are expected
to fail for now until someone gets around to fixing that.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 builtin-mv.c  |   62 ++++++++-------------------------------------------------
 cache.h       |    2 ++
 read-cache.c  |   17 ++++++++++++++++
 t/t7001-mv.sh |   55 +++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 83 insertions(+), 53 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 33ad082..28ebc9c 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -36,17 +36,6 @@ static const char **copy_pathspec(const char *prefix, const char **pathspec,
 	return get_pathspec(prefix, result);
 }
 
-static void show_list(const char *label, struct path_list *list)
-{
-	if (list->nr > 0) {
-		int i;
-		printf("%s", label);
-		for (i = 0; i < list->nr; i++)
-			printf("%s%s", i > 0 ? ", " : "", list->items[i].path);
-		putchar('\n');
-	}
-}
-
 static const char *add_slash(const char *path)
 {
 	int len = strlen(path);
@@ -75,11 +64,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	const char **source, **destination, **dest_path;
 	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX } *modes;
 	struct stat st;
-	struct path_list overwritten = {NULL, 0, 0, 0};
 	struct path_list src_for_dst = {NULL, 0, 0, 0};
-	struct path_list added = {NULL, 0, 0, 0};
-	struct path_list deleted = {NULL, 0, 0, 0};
-	struct path_list changed = {NULL, 0, 0, 0};
 
 	git_config(git_default_config, NULL);
 
@@ -189,7 +174,6 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 							" will overwrite!\n",
 							bad);
 					bad = NULL;
-					path_list_insert(dst, &overwritten);
 				} else
 					bad = "Cannot overwrite";
 			}
@@ -218,6 +202,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
+		int pos;
 		if (show_only || verbose)
 			printf("Renaming %s to %s\n", src, dst);
 		if (!show_only && mode != INDEX &&
@@ -227,45 +212,16 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (mode == WORKING_DIRECTORY)
 			continue;
 
-		assert(cache_name_pos(src, strlen(src)) >= 0);
-
-		path_list_insert(src, &deleted);
-		/* destination can be a directory with 1 file inside */
-		if (path_list_has_path(&overwritten, dst))
-			path_list_insert(dst, &changed);
-		else
-			path_list_insert(dst, &added);
+		pos = cache_name_pos(src, strlen(src));
+		assert(pos >= 0);
+		if (!show_only)
+			rename_cache_entry_at(pos, dst);
 	}
 
-	if (show_only) {
-		show_list("Changed  : ", &changed);
-		show_list("Adding   : ", &added);
-		show_list("Deleting : ", &deleted);
-	} else {
-		for (i = 0; i < changed.nr; i++) {
-			const char *path = changed.items[i].path;
-			int j = cache_name_pos(path, strlen(path));
-			struct cache_entry *ce = active_cache[j];
-
-			if (j < 0)
-				die ("Huh? Cache entry for %s unknown?", path);
-			refresh_cache_entry(ce, 0);
-		}
-
-		for (i = 0; i < added.nr; i++) {
-			const char *path = added.items[i].path;
-			if (add_file_to_cache(path, verbose ? ADD_CACHE_VERBOSE : 0))
-				die("updating index entries failed");
-		}
-
-		for (i = 0; i < deleted.nr; i++)
-			remove_file_from_cache(deleted.items[i].path);
-
-		if (active_cache_changed) {
-			if (write_cache(newfd, active_cache, active_nr) ||
-			    commit_locked_index(&lock_file))
-				die("Unable to write new index file");
-		}
+	if (active_cache_changed) {
+		if (write_cache(newfd, active_cache, active_nr) ||
+		    commit_locked_index(&lock_file))
+			die("Unable to write new index file");
 	}
 
 	return 0;
diff --git a/cache.h b/cache.h
index a779d92..6f1d003 100644
--- a/cache.h
+++ b/cache.h
@@ -260,6 +260,7 @@ static inline void remove_name_hash(struct cache_entry *ce)
 #define unmerged_cache() unmerged_index(&the_index)
 #define cache_name_pos(name, namelen) index_name_pos(&the_index,(name),(namelen))
 #define add_cache_entry(ce, option) add_index_entry(&the_index, (ce), (option))
+#define rename_cache_entry_at(pos, new_name) rename_index_entry_at(&the_index, (pos), (new_name))
 #define remove_cache_entry_at(pos) remove_index_entry_at(&the_index, (pos))
 #define remove_file_from_cache(path) remove_file_from_index(&the_index, (path))
 #define add_to_cache(path, st, flags) add_to_index(&the_index, (path), (st), (flags))
@@ -370,6 +371,7 @@ extern int index_name_pos(const struct index_state *, const char *name, int name
 #define ADD_CACHE_JUST_APPEND 8		/* Append only; tree.c::read_tree() */
 extern int add_index_entry(struct index_state *, struct cache_entry *ce, int option);
 extern struct cache_entry *refresh_cache_entry(struct cache_entry *ce, int really);
+extern void rename_index_entry_at(struct index_state *, int pos, const char *new_name);
 extern int remove_index_entry_at(struct index_state *, int pos);
 extern int remove_file_from_index(struct index_state *, const char *path);
 #define ADD_CACHE_VERBOSE 1
diff --git a/read-cache.c b/read-cache.c
index 1648428..e93ee3c 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -38,6 +38,23 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 	istate->cache_changed = 1;
 }
 
+void rename_index_entry_at(struct index_state *istate, int nr, const char *new_name)
+{
+	struct cache_entry *old = istate->cache[nr], *new;
+	int namelen = strlen(new_name);
+
+	new = xmalloc(cache_entry_size(namelen));
+	copy_cache_entry(new, old);
+	new->ce_flags = (new->ce_flags & ~CE_HASHED) | CE_UNHASHED;
+	new->ce_flags = (new->ce_flags & ~CE_NAMEMASK)
+	                | (namelen >= CE_NAMEMASK ? CE_NAMEMASK : namelen);
+	memcpy(new->name, new_name, namelen);
+
+	cache_tree_invalidate_path(istate->cache_tree, old->name);
+	remove_index_entry_at(istate, nr);
+	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
+}
+
 /*
  * This only updates the "non-critical" parts of the directory
  * cache, ie the parts that aren't tracked by GIT, and only used
diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index 336cfaa..7e47931 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -156,4 +156,59 @@ test_expect_success 'absolute pathname outside should fail' '(
 
 )'
 
+test_expect_success 'git mv should not change sha1 of moved cache entry' '
+
+	rm -fr .git &&
+	git init &&
+	echo 1 >dirty &&
+	git add dirty &&
+	entry="$(git ls-files --stage dirty | cut -f 1)"
+	git mv dirty dirty2 &&
+	[ "$entry" = "$(git ls-files --stage dirty2 | cut -f 1)" ] &&
+	echo 2 >dirty2 &&
+	git mv dirty2 dirty &&
+	[ "$entry" = "$(git ls-files --stage dirty | cut -f 1)" ]
+
+'
+
+rm -f dirty dirty2
+
+test_expect_failure 'git mv should overwrite symlink to a file' '
+
+	rm -fr .git &&
+	git init &&
+	echo 1 >moved &&
+	ln -s moved symlink &&
+	git add moved symlink &&
+	! git mv moved symlink &&
+	git mv -f moved symlink &&
+	[ ! -e moved ] &&
+	[ -f symlink ] &&
+	[ $(cat symlink) = 1 ] &&
+	git diff-files --quiet
+
+'
+
+rm -f moved symlink
+
+test_expect_failure 'git mv should follow symlink to a directory' '
+
+	rm -fr .git &&
+	git init &&
+	echo 1 >moved &&
+	mkdir -p dir &&
+	touch dir/.keep &&
+	ln -s dir symlink &&
+	git add moved dir/.keep symlink &&
+	git mv moved symlink &&
+	[ ! -e moved ] &&
+	[ -f symlink/moved ] &&
+	[ $(cat symlink/moved) = 1 ] &&
+	[ "$(ls dir)" = "$(ls symlink)" ] &&
+	git diff-files --quiet
+
+'
+
+rm -rf moved symlink dir
+
 test_done

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH] git-mv: Keep moved index entries inact
  2008-07-21  0:23           ` Petr Baudis
  2008-07-21  0:25             ` [PATCHv2] " Petr Baudis
@ 2008-07-21  1:20             ` Johannes Schindelin
  2008-07-21  7:18               ` Petr Baudis
  1 sibling, 1 reply; 56+ messages in thread
From: Johannes Schindelin @ 2008-07-21  1:20 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, git

Hi,

On Mon, 21 Jul 2008, Petr Baudis wrote:

> On Sat, Jul 19, 2008 at 04:54:20PM -0700, Junio C Hamano wrote:
> > Petr Baudis <pasky@suse.cz> writes:
> > 
> > > +test_expect_success 'git mv should not change sha1 of moved cache entry' '
> > > +
> > > +	rm -fr .git &&
> > > +	git init &&
> > > +	echo 1 >dirty &&
> > > +	git add dirty &&
> > > +	entry="$(git ls-files --stage dirty | cut -f 1)"
> > 
> > "rev-parse :dirty"?
> 
> I want to make sure the whole index entry is intact, not just the sha1.

"rev-parse :dirty" will have to read the index to get at the object name 
of "dirty".  So there you have your index validation for you.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-21  0:25             ` [PATCHv2] " Petr Baudis
@ 2008-07-21  4:36               ` Junio C Hamano
  2008-07-26  6:46               ` Junio C Hamano
  1 sibling, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-07-21  4:36 UTC (permalink / raw)
  To: Petr Baudis; +Cc: gitster, git

Petr Baudis <pasky@suse.cz> writes:

> This patch introduces rename_index_entry_at() into the index toolkit,
> which will rename an entry while removing any entries the new entry
> might render duplicate. This is then used in git mv instead
> of all the file queues, resulting in a major simplification
> of the code and an inevitable change in git mv -n output format.
>
> A new test has been added to the testsuite to reflect this change.
> Also, based on suggestion by Junio about desired symlink behaviour
> of git mv, I have added two tests for that; however, I do not have
> need or desire to spend time fixing this, so they are expected
> to fail for now until someone gets around to fixing that.

I haven't looked into the builtin-mv changes to see how involved that fix
would be yet (and I do not use "mv" and this is somewhat lower priority
for me myself), but I did look at changes to read-cache.c.  I've queued a
fixed up version in 'pu' for now but I am hoping that we can see some
comments from more competent people on the patch and move the review
result to 'next' shortly.

This may be a change in semantics, but after thinking about it a bit more,
I think this can even go into maintenance series.  IOW, it is really a
fix.  Nobody sane should have been relying on the old behaviour that new
contents of dirty paths are staged in the index sometimes, which was
simply just crazy.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] git-mv: Keep moved index entries inact
  2008-07-21  1:20             ` [PATCH] git-mv: Keep moved index entries inact Johannes Schindelin
@ 2008-07-21  7:18               ` Petr Baudis
  2008-07-21  7:38                 ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Petr Baudis @ 2008-07-21  7:18 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git

  Hi,

On Mon, Jul 21, 2008 at 03:20:46AM +0200, Johannes Schindelin wrote:
> On Mon, 21 Jul 2008, Petr Baudis wrote:
> > I want to make sure the whole index entry is intact, not just the sha1.
> 
> "rev-parse :dirty" will have to read the index to get at the object name 
> of "dirty".  So there you have your index validation for you.

  it will test if the entry stays _valid_, but not whether it stays the
_same_.

				Petr "Pasky" Baudis

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH] git-mv: Keep moved index entries inact
  2008-07-21  7:18               ` Petr Baudis
@ 2008-07-21  7:38                 ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-07-21  7:38 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, git

Petr Baudis <pasky@suse.cz> writes:

> On Mon, Jul 21, 2008 at 03:20:46AM +0200, Johannes Schindelin wrote:
>> On Mon, 21 Jul 2008, Petr Baudis wrote:
>> > I want to make sure the whole index entry is intact, not just the sha1.
>> 
>> "rev-parse :dirty" will have to read the index to get at the object name 
>> of "dirty".  So there you have your index validation for you.
>
>   it will test if the entry stays _valid_, but not whether it stays the
> _same_.

You are right.  You would want to catch breakages like a file losing its
executable bits, which you cannot detect by grabbing "rev-parse :dirty"
and comparing it with the expected value.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-21  0:25             ` [PATCHv2] " Petr Baudis
  2008-07-21  4:36               ` Junio C Hamano
@ 2008-07-26  6:46               ` Junio C Hamano
  2008-07-27 13:41                 ` Petr Baudis
                                   ` (2 more replies)
  1 sibling, 3 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-07-26  6:46 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Petr Baudis <pasky@suse.cz> writes:

> The rewrite of git-mv from a shell script to a builtin was perhaps
> a little too straightforward: the git add and git rm queues were
> emulated directly, which resulted in a rather complicated code and
> caused an inconsistent behaviour when moving dirty index entries;
> git mv would update the entry based on working tree state,
> except in case of overwrites, where the new entry would still have
> sha1 of the old file.
>
> This patch introduces rename_index_entry_at() into the index toolkit,
> which will rename an entry while removing any entries the new entry
> might render duplicate. This is then used in git mv instead
> of all the file queues, resulting in a major simplification
> of the code and an inevitable change in git mv -n output format.
> ...

Thanks.  I think I've managed to fix the rename_index_entry_at() in a
satisfactory way, and also made builtin-mv to allow "mv -f symlink file"
and "mv -f file symlink".

I do not agree with the semantics of this test seems to want, though.

> +test_expect_failure 'git mv should follow symlink to a directory' '
> +
> +	rm -fr .git &&
> +	git init &&
> +	echo 1 >moved &&
> +	mkdir -p dir &&
> +	touch dir/.keep &&
> +	ln -s dir symlink &&
> +	git add moved dir/.keep symlink &&
> +	git mv moved symlink &&
> +	[ ! -e moved ] &&
> +	[ -f symlink/moved ] &&
> +	[ $(cat symlink/moved) = 1 ] &&
> +	[ "$(ls dir)" = "$(ls symlink)" ] &&
> +	git diff-files --quiet
> +
> +'

A symlink to us is just a different kind of blob, and by definition a blob
is the leaf level of a tree structure that represents the working tree in
the index. There won't be anything hanging below it, and when adding
things to the index we should not dereference the symlink to see where it
leads to.

Traditionally we have been loose about this check, and the normal "git
add" and "git update-index" codepath is still forever broken, and we
allow:

	$ mkdir dir
        $ >dir/file
        $ ln -s dir symlink
        $ git add symlink/file

but some codepaths that matter more (because they do larger damage
unattended, as opposed to the above command sequence that can be avoided
by user education a bit more easily), such as "git apply" and "git
read-tree", have been corrected using has_symlink_leading_path() since mid
2007.  We would need to follow through c40641b (Optimize symlink/directory
detection, 2008-05-09) and further fix "git add" and "git update-index"
codepaths to forbid the above command sequence.

So my take on the above test piece is that after:

	>moved
	mkdir dir
        >dir/file
        ln -s dir symlink
	git add moved dir symlink

This should fail, as it is an overwrite:

	git mv moved symlink

and with "-f", the command should simply remove symlink and replace it
with a regular file whose contents come from the original "moved".

IOW, what a symlink points at should not matter.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-26  6:46               ` Junio C Hamano
@ 2008-07-27 13:41                 ` Petr Baudis
  2008-07-27 13:47                   ` [PATCH] t/t7001-mv.sh: Propose ability to use git-mv on conflicting entries Petr Baudis
  2008-07-28 14:20                 ` [PATCHv2] git-mv: Keep moved index entries inact SZEDER Gábor
  2008-08-04  7:49                 ` Not going beyond symbolic links Junio C Hamano
  2 siblings, 1 reply; 56+ messages in thread
From: Petr Baudis @ 2008-07-27 13:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Jul 25, 2008 at 11:46:02PM -0700, Junio C Hamano wrote:
> Thanks.  I think I've managed to fix the rename_index_entry_at() in a
> satisfactory way, and also made builtin-mv to allow "mv -f symlink file"
> and "mv -f file symlink".

Oh, sorry, I didn't realize there were still problems with the original
one, I would try it on my own in that case.

> So my take on the above test piece is that after:
> 
> 	>moved
> 	mkdir dir
>         >dir/file
>         ln -s dir symlink
> 	git add moved dir symlink
> 
> This should fail, as it is an overwrite:
> 
> 	git mv moved symlink
> 
> and with "-f", the command should simply remove symlink and replace it
> with a regular file whose contents come from the original "moved".
> 
> IOW, what a symlink points at should not matter.

You convinced me, yes. (Especially since I started actually using
symlinks in some of my projects very recently and this would be the
exact semantic I would eventually expect as well.)

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH] t/t7001-mv.sh: Propose ability to use git-mv on conflicting entries
  2008-07-27 13:41                 ` Petr Baudis
@ 2008-07-27 13:47                   ` Petr Baudis
  2008-07-28  1:13                     ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Petr Baudis @ 2008-07-27 13:47 UTC (permalink / raw)
  To: git; +Cc: gitster

Currently, git-mv will declare "not under source control" on an attempt
to move a conflicted index entry. This patch adds an expect_failure
testcase for this case, since this is an artificial restriction. (However,
the scenario is not critical enough for the author to fix right now.)

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

I don't really know if it is ok to make "feature requests" like this by
adding failing testcases...

 t/t7001-mv.sh |   27 +++++++++++++++++++++++++++
 1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index 7e47931..241e9a2 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -173,6 +173,33 @@ test_expect_success 'git mv should not change sha1 of moved cache entry' '
 
 rm -f dirty dirty2
 
+cat >multistage <<EOT
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 1	staged
+100755 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 2	staged
+100755 d00491fd7e5bb6fa28c517a0bb32b8b506539d4d 3	staged
+EOT
+
+# Rationale: I cannot git mv around a conflicted file. This is unnecessary
+# restriction in case another part of conflict resolution requires me to
+# move the file around.
+test_expect_failure 'git mv should move all stages of cache entry' '
+
+	rm -fr .git &&
+	git init &&
+	# git mv requires object to exist in working tree (bug?)
+	touch staged &&
+	git update-index --index-info <multistage &&
+	git ls-files --stage >lsf_output &&
+	test_cmp multistage lsf_output &&
+	git mv staged staged-mv &&
+	sed "s/staged/staged-mv/" <multistage >multistage-mv &&
+	git ls-files --stage >lsf_output &&
+	test_cmp multistage-mv lsf_output
+
+'
+
+rm -f multistage multistage-mv lsf_output staged
+
 test_expect_failure 'git mv should overwrite symlink to a file' '
 
 	rm -fr .git &&

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH] t/t7001-mv.sh: Propose ability to use git-mv on conflicting entries
  2008-07-27 13:47                   ` [PATCH] t/t7001-mv.sh: Propose ability to use git-mv on conflicting entries Petr Baudis
@ 2008-07-28  1:13                     ` Junio C Hamano
  2008-07-28  1:21                       ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2008-07-28  1:13 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, gitster

Petr Baudis <pasky@suse.cz> writes:

> I don't really know if it is ok to make "feature requests" like this by
> adding failing testcases...

Of course it depends on who you are.  It's not Ok for somebody like you,
who is known to be competent, and certainly it is not Ok to do it on a
command that you know you care more about than I do.

I've neglected builtin-mv.c since its introduction, mostly because I never
say "git mv" myself (in other words, I haven't cared very much how broken
it was.  For one thing, its indentation style is peculiar and is hard to
read and maintain).

> +# Rationale: I cannot git mv around a conflicted file. This is unnecessary
> +# restriction in case another part of conflict resolution requires me to
> +# move the file around.

Yes, I would agree this is a reasonable thing to support.  Something like
this patch, perhaps.

---
 builtin-mv.c  |   21 ++++++++++++++-------
 t/t7001-mv.sh |   21 +++++++++++++++++++++
 2 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 4f65b5a..cc9e505 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -96,7 +96,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	/* Checking */
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
-		int length, src_is_dir;
+		int length, src_is_dir, pos;
 		const char *bad = NULL;
 
 		if (show_only)
@@ -177,7 +177,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				} else
 					bad = "Cannot overwrite";
 			}
-		} else if (cache_name_pos(src, length) < 0)
+		} else if (((pos = cache_name_pos(src, length)) < 0) &&
+			   strcmp(active_cache[-1 - pos]->name, src))
 			bad = "not under version control";
 		else if (string_list_has_string(&src_for_dst, dst))
 			bad = "multiple sources for the same target";
@@ -202,7 +203,6 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
-		int pos;
 		if (show_only || verbose)
 			printf("Renaming %s to %s\n", src, dst);
 		if (!show_only && mode != INDEX &&
@@ -212,10 +212,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (mode == WORKING_DIRECTORY)
 			continue;
 
-		pos = cache_name_pos(src, strlen(src));
-		assert(pos >= 0);
-		if (!show_only)
-			rename_cache_entry_at(pos, dst);
+		if (!show_only) {
+			while (1) {
+				int pos = cache_name_pos(src, strlen(src));
+				if (pos < 0)
+					pos = -1 - pos;
+				if ((active_nr <= pos) ||
+				    strcmp(active_cache[pos]->name, src))
+					break;
+				rename_cache_entry_at(pos, dst);
+			}
+		}
 	}
 
 	if (active_cache_changed) {
diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index b0fa407..d538f88 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -173,6 +173,27 @@ test_expect_success 'git mv should not change sha1 of moved cache entry' '
 
 rm -f dirty dirty2
 
+cat >expect <<\EOT
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 1	staged
+100755 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 2	staged
+100755 d00491fd7e5bb6fa28c517a0bb32b8b506539d4d 3	staged
+EOT
+
+test_expect_success 'git mv should move all stages of cache entry' '
+	rm -fr .git &&
+	git init &&
+	>staged &&
+	git update-index --index-info <expect &&
+	git ls-files --stage >actual &&
+	test_cmp expect actual &&
+	git mv staged staged-mv &&
+	sed "s/staged/staged-mv/" <expect >expect-2 &&
+	git ls-files --stage >actual &&
+	test_cmp expect-2 actual
+'
+
+rm -f expect expect-2 staged actual staged-mv
+
 test_expect_success 'git mv should overwrite symlink to a file' '
 
 	rm -fr .git &&

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH] t/t7001-mv.sh: Propose ability to use git-mv on conflicting entries
  2008-07-28  1:13                     ` Junio C Hamano
@ 2008-07-28  1:21                       ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-07-28  1:21 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, gitster

Junio C Hamano <gitster@pobox.com> writes:

>> +# Rationale: I cannot git mv around a conflicted file. This is unnecessary
>> +# restriction in case another part of conflict resolution requires me to
>> +# move the file around.
>
> Yes, I would agree this is a reasonable thing to support.  Something like
> this patch, perhaps.
> ...

Just in case if somebody is inclined to test the patch and polish it into
a shape good enough for inclusion...

> @@ -177,7 +177,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  				} else
>  					bad = "Cannot overwrite";
>  			}
> -		} else if (cache_name_pos(src, length) < 0)
> +		} else if (((pos = cache_name_pos(src, length)) < 0) &&
> +			   strcmp(active_cache[-1 - pos]->name, src))

There is a bug here; "-1 - pos" needs to be checked against active_nr
before strcmp().

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-26  6:46               ` Junio C Hamano
  2008-07-27 13:41                 ` Petr Baudis
@ 2008-07-28 14:20                 ` SZEDER Gábor
  2008-07-28 15:06                   ` Johannes Schindelin
  2008-08-04  7:49                 ` Not going beyond symbolic links Junio C Hamano
  2 siblings, 1 reply; 56+ messages in thread
From: SZEDER Gábor @ 2008-07-28 14:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Petr Baudis, git

Hi,

there is a race somewhere in these 'git-mv: Keep moved index entries
inact' changes.

The test cases 'git mv should overwrite symlink to a file' or 'git mv
should overwrite file with a symlink' fail occasionaly.  It's quite
non-deterministic:  I have run t7001-mv.sh in a loop (see below) and
one or the other usually fails around 50 runs (but sometimes only
after 150).  Adding some tracing echos to the tests shows that both
tests fail when running 'git diff-files' at the end.

Regards,
Gábor


---
#!/bin/bash

ret=0
i=0
while test $ret = 0 ; do
        GIT_TEST_OPTS='--verbose --debug' make t7001-mv.sh
        ret=$?
        i=$((i+1))
done
echo "Failed at ${i}th run"

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-28 14:20                 ` [PATCHv2] git-mv: Keep moved index entries inact SZEDER Gábor
@ 2008-07-28 15:06                   ` Johannes Schindelin
  2008-07-28 15:14                     ` Johannes Schindelin
  2008-07-28 19:19                     ` Junio C Hamano
  0 siblings, 2 replies; 56+ messages in thread
From: Johannes Schindelin @ 2008-07-28 15:06 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, Petr Baudis, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1086 bytes --]

Hi,

On Mon, 28 Jul 2008, SZEDER Gábor wrote:

> there is a race somewhere in these 'git-mv: Keep moved index entries
> inact' changes.
> 
> The test cases 'git mv should overwrite symlink to a file' or 'git mv
> should overwrite file with a symlink' fail occasionaly.  It's quite
> non-deterministic:  I have run t7001-mv.sh in a loop (see below) and
> one or the other usually fails around 50 runs (but sometimes only
> after 150).  Adding some tracing echos to the tests shows that both
> tests fail when running 'git diff-files' at the end.

To make it more convenient to test: with this patch it fails all the time:

-- snipsnap --

 t/t7001-mv.sh |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index b0fa407..6699abd 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -180,6 +180,7 @@ test_expect_success 'git mv should overwrite symlink to a file' '
 	echo 1 >moved &&
 	ln -s moved symlink &&
 	git add moved symlink &&
+	sleep 1 &&
 	test_must_fail git mv moved symlink &&
 	git mv -f moved symlink &&
 	! test -e moved &&

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-28 15:06                   ` Johannes Schindelin
@ 2008-07-28 15:14                     ` Johannes Schindelin
  2008-07-28 18:24                       ` Johannes Schindelin
  2008-07-28 19:19                     ` Junio C Hamano
  1 sibling, 1 reply; 56+ messages in thread
From: Johannes Schindelin @ 2008-07-28 15:14 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, Petr Baudis, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1132 bytes --]

Hi,

On Mon, 28 Jul 2008, Johannes Schindelin wrote:

> On Mon, 28 Jul 2008, SZEDER Gábor wrote:
> 
> > there is a race somewhere in these 'git-mv: Keep moved index entries
> > inact' changes.
> > 
> > The test cases 'git mv should overwrite symlink to a file' or 'git mv
> > should overwrite file with a symlink' fail occasionaly.  It's quite
> > non-deterministic:  I have run t7001-mv.sh in a loop (see below) and
> > one or the other usually fails around 50 runs (but sometimes only
> > after 150).  Adding some tracing echos to the tests shows that both
> > tests fail when running 'git diff-files' at the end.
> 
> To make it more convenient to test: with this patch it fails all the time:

Ooops.  Seems like I changed the test 23 to fail, instead of test 24.  
However, I think it is the same bug: the index is newer by one second, so 
it seems that the patch for builtin-mv.c did not really keep the data 
"intact".

Note that a test case should use test-chmtime to force this scenario, not 
sleep a second.

Unfortunately, I already spent my Git time budget for today, so the ball 
is out of my half for now.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-28 15:14                     ` Johannes Schindelin
@ 2008-07-28 18:24                       ` Johannes Schindelin
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Schindelin @ 2008-07-28 18:24 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, Petr Baudis, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2110 bytes --]

Hi,

On Mon, 28 Jul 2008, Johannes Schindelin wrote:

> On Mon, 28 Jul 2008, Johannes Schindelin wrote:
> 
> > On Mon, 28 Jul 2008, SZEDER Gábor wrote:
> > 
> > > there is a race somewhere in these 'git-mv: Keep moved index entries 
> > > inact' changes.
> > > 
> > > The test cases 'git mv should overwrite symlink to a file' or 'git 
> > > mv should overwrite file with a symlink' fail occasionaly.  It's 
> > > quite non-deterministic:  I have run t7001-mv.sh in a loop (see 
> > > below) and one or the other usually fails around 50 runs (but 
> > > sometimes only after 150).  Adding some tracing echos to the tests 
> > > shows that both tests fail when running 'git diff-files' at the end.
> > 
> > To make it more convenient to test: with this patch it fails all the 
> > time:
> 
> Ooops.  Seems like I changed the test 23 to fail, instead of test 24.  
> However, I think it is the same bug: the index is newer by one second, 
> so it seems that the patch for builtin-mv.c did not really keep the data 
> "intact".
> 
> Note that a test case should use test-chmtime to force this scenario, 
> not sleep a second.
> 
> Unfortunately, I already spent my Git time budget for today, so the ball 
> is out of my half for now.

Hah!  I had a few minutes, and this is my analysis:

Just try to "mv" a file, and look at the _ctime_ before and after.  Yes, 
that is right, at least on my system (ext3) it _changes_.

So the test 23 and 24 in t7001-mv.sh are totally bogus.  They purport to 
test that git-mv retains the whole meta-information in the cache and 
therefore the index does not need to be updated.

However, it _does_ need to be updated, exactly because ctime changed.

Only that the test failed to test what it tried to test, instead 
succeeding erroneously, just because the index was racy most of the time 
and got silently updated.

So, this is the analysis.  The fixes will have to be done by somebody 
else, because /me goes running now.

(Possible fixes I envisage: update ctime via stat() after rename(), or 
just give up and scrap the whole "leave cache_entry inact" thing.)

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-28 15:06                   ` Johannes Schindelin
  2008-07-28 15:14                     ` Johannes Schindelin
@ 2008-07-28 19:19                     ` Junio C Hamano
  2008-07-28 23:41                       ` Johannes Schindelin
  2008-07-29  0:17                       ` Petr Baudis
  1 sibling, 2 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-07-28 19:19 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: SZEDER Gábor, Petr Baudis, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Mon, 28 Jul 2008, SZEDER Gábor wrote:
>
>> there is a race somewhere in these 'git-mv: Keep moved index entries
>> inact' changes.
>> 
>> The test cases 'git mv should overwrite symlink to a file' or 'git mv
>> should overwrite file with a symlink' fail occasionaly.  It's quite
>> non-deterministic:  I have run t7001-mv.sh in a loop (see below) and
>> one or the other usually fails around 50 runs (but sometimes only
>> after 150).  Adding some tracing echos to the tests shows that both
>> tests fail when running 'git diff-files' at the end.
>
> To make it more convenient to test: with this patch it fails all the time:

It's because we rename(2) but do not read back ctime, and reuse the cached
data from the old path that was renamed.  After the failed test that moves
a regular file "move" to "symlink":

$ stat symlink
  File: `symlink'
  Size: 2               Blocks: 8          IO Block: 4096   regular file
Device: 30ah/778d       Inode: 18104337    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1012/   junio)   Gid: (   40/     src)
Access: 2008-07-28 11:49:55.000000000 -0700
Modify: 2008-07-28 11:48:41.000000000 -0700
Change: 2008-07-28 11:48:42.000000000 -0700

But the cached stat information looks like this:

$ ../../git-ls-files --stat
ctime=1217270921, mtime=1217270921, ino=18104337, mode=100644, uid=1012, gid=40symlink

We need to refresh the entry to pick up potential ctime changes.


 read-cache.c       |    7 ++++++-
 builtin-ls-files.c |   21 +++++++++++++++------
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 1cae361..834096f 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -40,7 +40,7 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 
 void rename_index_entry_at(struct index_state *istate, int nr, const char *new_name)
 {
-	struct cache_entry *old = istate->cache[nr], *new;
+	struct cache_entry *old = istate->cache[nr], *new, *refreshed;
 	int namelen = strlen(new_name);
 
 	new = xmalloc(cache_entry_size(namelen));
@@ -51,6 +51,11 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
 
 	cache_tree_invalidate_path(istate->cache_tree, old->name);
 	remove_index_entry_at(istate, nr);
+
+	/* the renaming could have smudged the ctime */
+	refreshed = refresh_cache_entry(new, 0);
+	if (refreshed && refreshed != new)
+		new = refreshed;
 	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
 }
 
diff --git a/builtin-ls-files.c b/builtin-ls-files.c
index e8d568e..a6b30c8 100644
--- a/builtin-ls-files.c
+++ b/builtin-ls-files.c
@@ -16,6 +16,7 @@ static int show_deleted;
 static int show_cached;
 static int show_others;
 static int show_stage;
+static int show_stat;
 static int show_unmerged;
 static int show_modified;
 static int show_killed;
@@ -205,16 +206,20 @@ static void show_ce_entry(const char *tag, struct cache_entry *ce)
 		tag = alttag;
 	}
 
-	if (!show_stage) {
-		fputs(tag, stdout);
-	} else {
+	if (show_stage)
 		printf("%s%06o %s %d\t",
 		       tag,
 		       ce->ce_mode,
 		       abbrev ? find_unique_abbrev(ce->sha1,abbrev)
 				: sha1_to_hex(ce->sha1),
 		       ce_stage(ce));
-	}
+	else if (show_stat)
+		printf("ctime=%u, mtime=%u, ino=%u, mode=%o, uid=%u, gid=%u\t",
+		       ce->ce_ctime, ce->ce_mtime, ce->ce_ino,
+		       ce->ce_mode, ce->ce_uid, ce->ce_gid);
+
+	else
+		fputs(tag, stdout);
 	write_name_quoted(ce->name + offset, stdout, line_terminator);
 }
 
@@ -235,7 +240,7 @@ static void show_files(struct dir_struct *dir, const char *prefix)
 		if (show_killed)
 			show_killed_files(dir);
 	}
-	if (show_cached | show_stage) {
+	if (show_cached | show_stage | show_stat) {
 		for (i = 0; i < active_nr; i++) {
 			struct cache_entry *ce = active_cache[i];
 			int dtype = ce_to_dtype(ce);
@@ -488,6 +493,10 @@ int cmd_ls_files(int argc, const char **argv, const char *prefix)
 			show_stage = 1;
 			continue;
 		}
+		if (!strcmp(arg, "-S") || !strcmp(arg, "--stat")) {
+			show_stat = 1;
+			continue;
+		}
 		if (!strcmp(arg, "-k") || !strcmp(arg, "--killed")) {
 			show_killed = 1;
 			require_work_tree = 1;
@@ -593,7 +602,7 @@ int cmd_ls_files(int argc, const char **argv, const char *prefix)
 
 	/* With no flags, we default to showing the cached files */
 	if (!(show_stage | show_deleted | show_others | show_unmerged |
-	      show_killed | show_modified))
+	      show_killed | show_modified | show_stat))
 		show_cached = 1;
 
 	read_cache();

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-28 19:19                     ` Junio C Hamano
@ 2008-07-28 23:41                       ` Johannes Schindelin
  2008-07-28 23:55                         ` Johannes Schindelin
  2008-07-29  0:17                       ` Petr Baudis
  1 sibling, 1 reply; 56+ messages in thread
From: Johannes Schindelin @ 2008-07-28 23:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, Petr Baudis, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1943 bytes --]

Hi,

On Mon, 28 Jul 2008, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > On Mon, 28 Jul 2008, SZEDER Gábor wrote:
> >
> >> there is a race somewhere in these 'git-mv: Keep moved index entries
> >> inact' changes.
> >> 
> >> The test cases 'git mv should overwrite symlink to a file' or 'git mv
> >> should overwrite file with a symlink' fail occasionaly.  It's quite
> >> non-deterministic:  I have run t7001-mv.sh in a loop (see below) and
> >> one or the other usually fails around 50 runs (but sometimes only
> >> after 150).  Adding some tracing echos to the tests shows that both
> >> tests fail when running 'git diff-files' at the end.
> >
> > To make it more convenient to test: with this patch it fails all the time:
> 
> It's because we rename(2) but do not read back ctime, and reuse the cached
> data from the old path that was renamed.  After the failed test that moves
> a regular file "move" to "symlink":
> 
> $ stat symlink
>   File: `symlink'
>   Size: 2               Blocks: 8          IO Block: 4096   regular file
> Device: 30ah/778d       Inode: 18104337    Links: 1
> Access: (0664/-rw-rw-r--)  Uid: ( 1012/   junio)   Gid: (   40/     src)
> Access: 2008-07-28 11:49:55.000000000 -0700
> Modify: 2008-07-28 11:48:41.000000000 -0700
> Change: 2008-07-28 11:48:42.000000000 -0700
> 
> But the cached stat information looks like this:
> 
> $ ../../git-ls-files --stat
> ctime=1217270921, mtime=1217270921, ino=18104337, mode=100644, uid=1012, gid=40symlink
> 
> We need to refresh the entry to pick up potential ctime changes.

Yep.

Tested-by: me

BTW I have no idea how we could test for this, short of introducing the 
"sleep 1" I did earlier.  Maybe guard it with a TEST_EXPENSIVE_CTIME 
variable or something similar.  Dunno.

And my suggestion to use test-chmtime: please just forget about it, and 
just assume that I had some very good wiid(1) in my pipe(2).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-28 23:41                       ` Johannes Schindelin
@ 2008-07-28 23:55                         ` Johannes Schindelin
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Schindelin @ 2008-07-28 23:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, Petr Baudis, git

Hi,

On Tue, 29 Jul 2008, Johannes Schindelin wrote:

> BTW I have no idea how we could test for this, short of introducing the 
> "sleep 1" I did earlier.  Maybe guard it with a TEST_EXPENSIVE_CTIME 
> variable or something similar.  Dunno.

IOW something like this squashed into your patch:

-- snipsnap --

 t/t7001-mv.sh |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index b0fa407..c749059 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -179,6 +179,10 @@ test_expect_success 'git mv should overwrite symlink to a file' '
 	git init &&
 	echo 1 >moved &&
 	ln -s moved symlink &&
+	if test ! -z "$TEST_EXPENSIVE_CTIME"
+	then
+		sleep 1
+	fi &&
 	git add moved symlink &&
 	test_must_fail git mv moved symlink &&
 	git mv -f moved symlink &&

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-28 19:19                     ` Junio C Hamano
  2008-07-28 23:41                       ` Johannes Schindelin
@ 2008-07-29  0:17                       ` Petr Baudis
  2008-07-29  0:46                         ` Junio C Hamano
  1 sibling, 1 reply; 56+ messages in thread
From: Petr Baudis @ 2008-07-29  0:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, SZEDER Gábor, git

On Mon, Jul 28, 2008 at 12:19:19PM -0700, Junio C Hamano wrote:
> We need to refresh the entry to pick up potential ctime changes.
> 
>  read-cache.c       |    7 ++++++-
>  builtin-ls-files.c |   21 +++++++++++++++------
>  2 files changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/read-cache.c b/read-cache.c
> index 1cae361..834096f 100644

Oops, silly me. Sorry for being slower than you at fixing this. ;-)

> diff --git a/builtin-ls-files.c b/builtin-ls-files.c
> index e8d568e..a6b30c8 100644

If you are going to apply this, you might be interested in squashing
this:

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index f43af41..2ead7af 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -53,7 +53,14 @@ OPTIONS
 
 -s::
 --stage::
-	Show stage files in the output
+	Show cached files in an extended format also listing the entry
+	mode, sha1 and stage number; see below for details.
+
+-S::
+--stat::
+	Show cached files in a special format listing stat information
+	stored in the index; this is useful probably only for Git
+	debugging purposes.
 
 --directory::
 	If a whole directory is classified as "other", show just its

(and even if you aren't, maybe the first part is useful, though it's
minor.)

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-29  0:17                       ` Petr Baudis
@ 2008-07-29  0:46                         ` Junio C Hamano
  2008-07-29  5:23                           ` Junio C Hamano
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2008-07-29  0:46 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, SZEDER Gábor, git

Petr Baudis <pasky@suse.cz> writes:

> On Mon, Jul 28, 2008 at 12:19:19PM -0700, Junio C Hamano wrote:
>> We need to refresh the entry to pick up potential ctime changes.
>> 
>>  read-cache.c       |    7 ++++++-
>>  builtin-ls-files.c |   21 +++++++++++++++------
>>  2 files changed, 21 insertions(+), 7 deletions(-)
>> 
>> diff --git a/read-cache.c b/read-cache.c
>> index 1cae361..834096f 100644
>
> Oops, silly me. Sorry for being slower than you at fixing this. ;-)

I did think about ctime issues when I applied the patch.

rename(2) is hardlink to new name followed by unlink of old name, so
internally link count changes twice (and the first "link to new" can
exceed max links and is even allowed to make the whole thing fail); but I
do not think of any other reason for this change in ctime.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCHv2] git-mv: Keep moved index entries inact
  2008-07-29  0:46                         ` Junio C Hamano
@ 2008-07-29  5:23                           ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-07-29  5:23 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, SZEDER Gábor, git

Junio C Hamano <gitster@pobox.com> writes:

> Petr Baudis <pasky@suse.cz> writes:
>
>> On Mon, Jul 28, 2008 at 12:19:19PM -0700, Junio C Hamano wrote:
>>> We need to refresh the entry to pick up potential ctime changes.
>>> 
>>>  read-cache.c       |    7 ++++++-
>>>  builtin-ls-files.c |   21 +++++++++++++++------
>>>  2 files changed, 21 insertions(+), 7 deletions(-)
>>> 
>>> diff --git a/read-cache.c b/read-cache.c
>>> index 1cae361..834096f 100644
>>
>> Oops, silly me. Sorry for being slower than you at fixing this. ;-)
>
> I did think about ctime issues when I applied the patch.

Actually I changed my mind.

I think it is wrong for something as low-level as rename_index_entry_at()
to unconditionally look at the working tree and try refreshing the entry.
The next caller of this function may not even require a working tree.

I think Dscho is correct; expecting the saved cacheinfo to stay fresh
across rename does not make much sense.  What we care about with "git mv"
is that we keep what the user staged, i.e. kind of blob and the contents.
It cannot be guaranteed if the index entry is stat clean or not, as
st_ctime may or may not be updated across rename(2) according to POSIX.

So let's do this instead.

-- >8 --
t7001: fix "git mv" test 

The test assumed that we can keep the cached stat information fresh across
rename(2); many filesystems however update st_ctime (and POSIX allows them
to do so), so that assumption does not hold.  We can explicitly refresh
the index for the purpose of the test, as the only thing we are interested
in is the staged contents and the mode bits are preserved across "mv".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7001-mv.sh |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index b0fa407..910a28c 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -185,6 +185,7 @@ test_expect_success 'git mv should overwrite symlink to a file' '
 	! test -e moved &&
 	test -f symlink &&
 	test "$(cat symlink)" = 1 &&
+	git update-index --refresh &&
 	git diff-files --quiet
 
 '
@@ -202,6 +203,7 @@ test_expect_success 'git mv should overwrite file with a symlink' '
 	git mv -f symlink moved &&
 	! test -e symlink &&
 	test -h moved &&
+	git update-index --refresh &&
 	git diff-files --quiet
 
 '

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Not going beyond symbolic links
  2008-07-26  6:46               ` Junio C Hamano
  2008-07-27 13:41                 ` Petr Baudis
  2008-07-28 14:20                 ` [PATCHv2] git-mv: Keep moved index entries inact SZEDER Gábor
@ 2008-08-04  7:49                 ` Junio C Hamano
  2008-08-04  7:51                   ` [PATCH 1/2] update-index: refuse to add working tree items beyond symlinks Junio C Hamano
                                     ` (2 more replies)
  2 siblings, 3 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-08-04  7:49 UTC (permalink / raw)
  To: git; +Cc: Petr Baudis, Linus Torvalds, Shawn O. Pearce, Johannes Schindelin

Junio C Hamano <gitster@pobox.com> writes:

> A symlink to us is just a different kind of blob, and by definition a blob
> is the leaf level of a tree structure that represents the working tree in
> the index. There won't be anything hanging below it, and when adding
> things to the index we should not dereference the symlink to see where it
> leads to.
>
> Traditionally we have been loose about this check, and the normal "git
> add" and "git update-index" codepath is still forever broken, and we
> allow:
>
>       $ mkdir dir
>       $ >dir/file
>       $ ln -s dir symlink
>       $ git add symlink/file
>
> but some codepaths that matter more (because they do larger damage
> unattended, as opposed to the above command sequence that can be avoided
> by user education a bit more easily), such as "git apply" and "git
> read-tree", have been corrected using has_symlink_leading_path() since mid
> 2007.  We would need to follow through c40641b (Optimize symlink/directory
> detection, 2008-05-09) and further fix "git add" and "git update-index"
> codepaths to forbid the above command sequence.

I started to revisit this issue and patched "git update-index --add"
and "git add" so far.  Patches follow.

I think we should also check the following and fix them if any of them is
broken.  Under the same "dir/file exists but symlink points at dir"
scenario:

 * "git rm symlink/file" --- should fail without removing dir/file;

 * "git ls-tree $commit -- symlink/file" should *not* fail if $commit does
   have "symlink/file" in it (iow, we cannot add the logic to get_pathspec());

 * "git ls-files --exclude-standard -o -- symlink/file" should not talk
   about "symlink/file".

If we reword the paragraph I quoted at the beginning of this message
slightly:

	A gitlink to is is just a leaf level of a tree structure that
	represents the working tree in the index.  There won't be anything
	hanging below it.

We would need a similar check to stop at module boundary, just like the
helper function we use for these patches, has_symlink_leading_path(),
stops at a symlink.

So without further ado...

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH 1/2] update-index: refuse to add working tree items beyond symlinks
  2008-08-04  7:49                 ` Not going beyond symbolic links Junio C Hamano
@ 2008-08-04  7:51                   ` Junio C Hamano
  2008-08-04  7:52                   ` [PATCH 2/2] add: " Junio C Hamano
  2008-08-05  0:21                   ` Not going beyond symbolic links Linus Torvalds
  2 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-08-04  7:51 UTC (permalink / raw)
  To: git; +Cc: Petr Baudis, Linus Torvalds, Shawn O. Pearce, Johannes Schindelin

When "sym" is a symbolic link that is inside the working tree, and it
points at a directory "dir" that has "path" in it, "update-index --add
sym/path" used to mistakenly add "sym/path" as if "sym" were a normal
directory.

"git apply", "git diff" and "git merge" have been taught about this issue
some time ago, but "update-index" and "add" have been left ignorant for
too long.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-update-index.c     |    5 ++++-
 t/t0055-beyond-symlinks.sh |   20 ++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletions(-)
 create mode 100755 t/t0055-beyond-symlinks.sh

diff --git a/builtin-update-index.c b/builtin-update-index.c
index 38eb53c..434cb8e 100644
--- a/builtin-update-index.c
+++ b/builtin-update-index.c
@@ -194,6 +194,10 @@ static int process_path(const char *path)
 	int len;
 	struct stat st;
 
+	len = strlen(path);
+	if (has_symlink_leading_path(len, path))
+		return error("'%s' is beyond a symbolic link", path);
+
 	/*
 	 * First things first: get the stat information, to decide
 	 * what to do about the pathname!
@@ -201,7 +205,6 @@ static int process_path(const char *path)
 	if (lstat(path, &st) < 0)
 		return process_lstat_error(path, errno);
 
-	len = strlen(path);
 	if (S_ISDIR(st.st_mode))
 		return process_directory(path, len, &st);
 
diff --git a/t/t0055-beyond-symlinks.sh b/t/t0055-beyond-symlinks.sh
new file mode 100755
index 0000000..eb11dd7
--- /dev/null
+++ b/t/t0055-beyond-symlinks.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+
+test_description='update-index refuses to add beyond symlinks'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	>a &&
+	mkdir b &&
+	ln -s b c &&
+	>c/d &&
+	git update-index --add a b/d
+'
+
+test_expect_success 'update-index --add beyond symlinks' '
+	test_must_fail git update-index --add c/d &&
+	! ( git ls-files | grep c/d )
+'
+
+test_done
-- 
1.6.0.rc1.64.g61192

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 2/2] add: refuse to add working tree items beyond symlinks
  2008-08-04  7:49                 ` Not going beyond symbolic links Junio C Hamano
  2008-08-04  7:51                   ` [PATCH 1/2] update-index: refuse to add working tree items beyond symlinks Junio C Hamano
@ 2008-08-04  7:52                   ` Junio C Hamano
  2008-08-05  0:21                   ` Not going beyond symbolic links Linus Torvalds
  2 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-08-04  7:52 UTC (permalink / raw)
  To: git; +Cc: Petr Baudis, Linus Torvalds, Shawn O. Pearce, Johannes Schindelin

This is the same fix for the issue of adding "sym/path" when "sym" is a
symblic link that points at a directory "dir" with "path" in it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-add.c              |   12 +++++++++++-
 dir.c                      |    6 +++++-
 t/t0055-beyond-symlinks.sh |    7 ++++++-
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/builtin-add.c b/builtin-add.c
index fc3f96e..81b64d7 100644
--- a/builtin-add.c
+++ b/builtin-add.c
@@ -153,6 +153,16 @@ static const char **validate_pathspec(int argc, const char **argv, const char *p
 {
 	const char **pathspec = get_pathspec(prefix, argv);
 
+	if (pathspec) {
+		const char **p;
+		for (p = pathspec; *p; p++) {
+			if (has_symlink_leading_path(strlen(*p), *p)) {
+				int len = prefix ? strlen(prefix) : 0;
+				die("'%s' is beyond a symbolic link", *p + len);
+			}
+		}
+	}
+
 	return pathspec;
 }
 
@@ -278,7 +288,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		fprintf(stderr, "Maybe you wanted to say 'git add .'?\n");
 		return 0;
 	}
-	pathspec = get_pathspec(prefix, argv);
+	pathspec = validate_pathspec(argc, argv, prefix);
 
 	/*
 	 * If we are adding new files, we need to scan the working
diff --git a/dir.c b/dir.c
index 29d1d5b..ae7046f 100644
--- a/dir.c
+++ b/dir.c
@@ -727,8 +727,12 @@ static void free_simplify(struct path_simplify *simplify)
 
 int read_directory(struct dir_struct *dir, const char *path, const char *base, int baselen, const char **pathspec)
 {
-	struct path_simplify *simplify = create_simplify(pathspec);
+	struct path_simplify *simplify;
 
+	if (has_symlink_leading_path(strlen(path), path))
+		return dir->nr;
+
+	simplify = create_simplify(pathspec);
 	read_directory_recursive(dir, path, base, baselen, 0, simplify);
 	free_simplify(simplify);
 	qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name);
diff --git a/t/t0055-beyond-symlinks.sh b/t/t0055-beyond-symlinks.sh
index eb11dd7..b29c37a 100755
--- a/t/t0055-beyond-symlinks.sh
+++ b/t/t0055-beyond-symlinks.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-test_description='update-index refuses to add beyond symlinks'
+test_description='update-index and add refuse to add beyond symlinks'
 
 . ./test-lib.sh
 
@@ -17,4 +17,9 @@ test_expect_success 'update-index --add beyond symlinks' '
 	! ( git ls-files | grep c/d )
 '
 
+test_expect_success 'add beyond symlinks' '
+	test_must_fail git add c/d &&
+	! ( git ls-files | grep c/d )
+'
+
 test_done
-- 
1.6.0.rc1.64.g61192

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-04  7:49                 ` Not going beyond symbolic links Junio C Hamano
  2008-08-04  7:51                   ` [PATCH 1/2] update-index: refuse to add working tree items beyond symlinks Junio C Hamano
  2008-08-04  7:52                   ` [PATCH 2/2] add: " Junio C Hamano
@ 2008-08-05  0:21                   ` Linus Torvalds
  2008-08-05  0:54                     ` Junio C Hamano
  2008-08-07  6:52                     ` Junio C Hamano
  2 siblings, 2 replies; 56+ messages in thread
From: Linus Torvalds @ 2008-08-05  0:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Petr Baudis, Shawn O. Pearce, Johannes Schindelin



On Mon, 4 Aug 2008, Junio C Hamano wrote:
> 
> I started to revisit this issue and patched "git update-index --add"
> and "git add" so far.  Patches follow.

Patches look good to me, but did you check the performance impact?

The rewritten 'has_symlink_leading_path()' should do ok, but it migth 
still be a huge performance downside to check all the paths for things 
like "git add -u".

I didn't test, though.

		Linus

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  0:21                   ` Not going beyond symbolic links Linus Torvalds
@ 2008-08-05  0:54                     ` Junio C Hamano
  2008-08-05  1:43                       ` Linus Torvalds
  2008-08-07  6:52                     ` Junio C Hamano
  1 sibling, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2008-08-05  0:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Petr Baudis, Shawn O. Pearce, Johannes Schindelin

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, 4 Aug 2008, Junio C Hamano wrote:
>> 
>> I started to revisit this issue and patched "git update-index --add"
>> and "git add" so far.  Patches follow.
>
> Patches look good to me, but did you check the performance impact?
>
> The rewritten 'has_symlink_leading_path()' should do ok, but it migth 
> still be a huge performance downside to check all the paths for things 
> like "git add -u".

Not yet.

I think this is a necessary "correctness" thing to do regardless of the
performance impact, and adding the logic to stop at submodule boundary
(aka gitlinks) should come before optimization.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  0:54                     ` Junio C Hamano
@ 2008-08-05  1:43                       ` Linus Torvalds
  2008-08-05  1:59                         ` Johannes Schindelin
  2008-08-05  3:01                         ` Junio C Hamano
  0 siblings, 2 replies; 56+ messages in thread
From: Linus Torvalds @ 2008-08-05  1:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Petr Baudis, Shawn O. Pearce, Johannes Schindelin



On Mon, 4 Aug 2008, Junio C Hamano wrote:
> >
> > The rewritten 'has_symlink_leading_path()' should do ok, but it migth 
> > still be a huge performance downside to check all the paths for things 
> > like "git add -u".
> 
> Not yet.
> 
> I think this is a necessary "correctness" thing to do regardless of the
> performance impact, and adding the logic to stop at submodule boundary
> (aka gitlinks) should come before optimization.

Well, "performance" is a feature too, and it's not correct to say that "X 
should be fixed before optimization". If "X" slows things down, the 
question should be whether it really needs fixing..

Yes, we find symlinks when we do _new_ files, but is it really so bad to 
assume that existing directories that we have already added to the index 
are stable? It can easily be seen as a feature too that you can force git 
to ignore the symlink and see it as a real directory.

So that's why it would be interesting to hear about the performance 
impact. Because this is definitely not a black-and-white "one behavior is 
wrong and one behavior is right".

			Linus

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  1:43                       ` Linus Torvalds
@ 2008-08-05  1:59                         ` Johannes Schindelin
  2008-08-05  2:28                           ` Linus Torvalds
  2008-08-05  4:44                           ` Junio C Hamano
  2008-08-05  3:01                         ` Junio C Hamano
  1 sibling, 2 replies; 56+ messages in thread
From: Johannes Schindelin @ 2008-08-05  1:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git, Petr Baudis, Shawn O. Pearce

Hi,

On Mon, 4 Aug 2008, Linus Torvalds wrote:

> On Mon, 4 Aug 2008, Junio C Hamano wrote:
> > >
> > > The rewritten 'has_symlink_leading_path()' should do ok, but it 
> > > migth still be a huge performance downside to check all the paths 
> > > for things like "git add -u".
> > 
> > Not yet.
> > 
> > I think this is a necessary "correctness" thing to do regardless of 
> > the performance impact, and adding the logic to stop at submodule 
> > boundary (aka gitlinks) should come before optimization.
> 
> Well, "performance" is a feature too, and it's not correct to say that 
> "X should be fixed before optimization". If "X" slows things down, the 
> question should be whether it really needs fixing..
> 
> Yes, we find symlinks when we do _new_ files, but is it really so bad to 
> assume that existing directories that we have already added to the index 
> are stable? It can easily be seen as a feature too that you can force git 
> to ignore the symlink and see it as a real directory.

Actually, whatever you want, it needs fixing.

I vividly remember being quite pissed by Git replacing a symbolic link in 
my working directory with a directory, and instead of updating the files 
which were technically outside of the repository, Git populated that newly 
created directory.

However, please note that Junio's patch affects git-add, AFAIR, not 
git-update-index.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  1:59                         ` Johannes Schindelin
@ 2008-08-05  2:28                           ` Linus Torvalds
  2008-08-05  6:11                             ` Junio C Hamano
  2008-08-05  4:44                           ` Junio C Hamano
  1 sibling, 1 reply; 56+ messages in thread
From: Linus Torvalds @ 2008-08-05  2:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git, Petr Baudis, Shawn O. Pearce



On Tue, 5 Aug 2008, Johannes Schindelin wrote:
> 
> I vividly remember being quite pissed by Git replacing a symbolic link in 
> my working directory with a directory, and instead of updating the files 
> which were technically outside of the repository, Git populated that newly 
> created directory.

Well, that can cut both ways. For example, I vividly remember a time in 
the distant past when harddisks were tiny, and I didn't have insanely 
high-end hardware, and I was building the X server, but had to split 
things up over two partitions because each individual partition was 
too full.

IOW, sometimes you may _want_ to use symlinks that way, even within one 
project - with a symlink allowing you to move parts of it around 
"transparently".

Of course, these days under Linux we can just use bind mounts, so the use 
of symlinks to stitch together two or more different trees is fairly 
old-fashioned, but is still the only option on some systems or if you 
don't have root.

(These days harddisks are also generally so big that it never happens. But 
on my EeePC laptop, I still end up with two filesystems, 4GB  and 8GB 
each. So it's not inconceivable to be in that kind of situation even 
today).

			Linus

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  1:43                       ` Linus Torvalds
  2008-08-05  1:59                         ` Johannes Schindelin
@ 2008-08-05  3:01                         ` Junio C Hamano
  2008-08-05  3:04                           ` david
  1 sibling, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2008-08-05  3:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Petr Baudis, Shawn O. Pearce, Johannes Schindelin

Linus Torvalds <torvalds@linux-foundation.org> writes:

> ... Because this is definitely not a black-and-white "one behavior is 
> wrong and one behavior is right".

I wish I could agree with you that this is a feature, but 16a4c61
(read-tree -m -u: avoid getting confused by intermediate symlinks.,
2007-05-10) and 64cab59 (apply: do not get confused by symlinks in the
middle, 2007-05-11) came from real world breakage cases and the root cause
was that we were too lenient to allow such a "feature" that pretends the
symlink not to be there.

Right now, we are being careful only while branch switching and patch
application, but the codepaths that add directly to the index (add and
update-index) are not fixed (or "still has the feature").

I do not see a clean way to keep such a "feature" without hurting users
who suffered the bugs these two commits from May 2007 fixed.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  3:01                         ` Junio C Hamano
@ 2008-08-05  3:04                           ` david
  0 siblings, 0 replies; 56+ messages in thread
From: david @ 2008-08-05  3:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, git, Petr Baudis, Shawn O. Pearce,
	Johannes Schindelin

On Mon, 4 Aug 2008, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> ... Because this is definitely not a black-and-white "one behavior is
>> wrong and one behavior is right".
>
> I wish I could agree with you that this is a feature, but 16a4c61
> (read-tree -m -u: avoid getting confused by intermediate symlinks.,
> 2007-05-10) and 64cab59 (apply: do not get confused by symlinks in the
> middle, 2007-05-11) came from real world breakage cases and the root cause
> was that we were too lenient to allow such a "feature" that pretends the
> symlink not to be there.
>
> Right now, we are being careful only while branch switching and patch
> application, but the codepaths that add directly to the index (add and
> update-index) are not fixed (or "still has the feature").
>
> I do not see a clean way to keep such a "feature" without hurting users
> who suffered the bugs these two commits from May 2007 fixed.

config option?

I think a command line is too much work for too little value, but if the 
check could be ignored based on a config option without costing too much 
it may be reasonable.

David Lang

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  1:59                         ` Johannes Schindelin
  2008-08-05  2:28                           ` Linus Torvalds
@ 2008-08-05  4:44                           ` Junio C Hamano
  2008-08-05 11:23                             ` Johannes Schindelin
  1 sibling, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2008-08-05  4:44 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, git, Petr Baudis, Shawn O. Pearce

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> However, please note that Junio's patch affects git-add, AFAIR, not 
> git-update-index.

Really?  I thought I added a test case to cover the plumbing as well...

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  2:28                           ` Linus Torvalds
@ 2008-08-05  6:11                             ` Junio C Hamano
  2008-08-05 12:54                               ` Dmitry Potapov
  2008-08-05 17:15                               ` Linus Torvalds
  0 siblings, 2 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-08-05  6:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Johannes Schindelin, git, Petr Baudis, Shawn O. Pearce

Linus Torvalds <torvalds@linux-foundation.org> writes:

> IOW, sometimes you may _want_ to use symlinks that way, even within one 
> project - with a symlink allowing you to move parts of it around 
> "transparently".

While I admit that I have managed a large directory split across
partitions grafted via symlinks in pre-git days myself, ever since you
started "tracking" symbolic links with 8ae0a8c (git and symlinks as
tracked content, 2005-05-05), you have pretty much been committed to
"track" symbolic links.

This goes even before that commit. The readdir() loop done in show-files.c
with 8695c8b (Add "show-files" command to show the list of managed (or
non-managed) files., 2005-04-11) does not dereference symbolic links
pointing at a directory elsewhere, which we still have as read_directory()
in dir.c without much change in the basic structure.

I would give some leeway to other people who made comments in this thread,
who may not be so familiar with the low-level git codebase, but I have to
say that if you claim that dereferencing symbolic links in the middle is a
feature, you are not being completely honest, you haven't thought through
the issues, and/or you simply forgot the details.  I'd suspect most likely
it is the last one ;-).

The thing is, the "feature" is not very well supported, even without the
fixes from last night.  If you have a symlink "sym" that points at "dir"
that has "file" in it, and if neither "sym" nor "dir/file" are tracked,
you can "git add sym/file" to add it (I called it a bug).

However:

 (1) starting from the same condition, "git add ." does _not_ add it (you
     get the symbolic link "sym" added to your index instead, as well as
     "dir/file");

 (2) after you add "sym/file" through the bug, if you say "git add .", it
     will be removed from the index and you will instead have "sym" and
     "dir/file" (with an ancient git before 1.5.0, you will get "unable to
     add sym" error instead).

 (3) after you add "sym/file", "git diff" will immediately notice that you
     have removed it (this is a fairly recent fix; 1.5.4.X doesn't notice
     it).

You cannot have it as a reliably usable feature without a major surgery,
and this is fundamental. You simply cannot have it both ways without
telling git which symlink is "tracked" and which are only there for
storage sizing.

If you seriously want to claim that we support such a feature, you would
at least need to:

 (0) have a way for the user to say, "the project tree may have a
     directory D, but I do not want to check it out as a directory because
     my partition is too small.  Whenever you need to create a directory
     there and hang a tree underneath, instead create a symlink that
     points at /export/large/D instead".  Most likely this information
     would go to .git/config;

 (1) whereever we run "create leading directories", we notice and honor
     the above configuration (mostly entry.c::create_directories() called
     from entry.c::checkout_entry());

 (2) whenever we need to check out a file to path D, instead of
     recursively remove everything under it, we remove the symlink and
     deposit the file there (mostly entry.c::remove_subtree() and
     unpack-trees.c::verify_absent());

 (3) whenever we traverse working tree using readdir(), notice that the
     symbolic link we are looking at is the funny "pointing elsewhere but
     this is really a directory" specified in (0) and recurse into the
     directory pointed by it (dir.c::read_directory_recursive() but there
     may be others)

 (4) and we teach has_symlink_leading_path() to special case such a path
     you configured in (0).

I personally do not think adding these to support such a "feature" is such
a high priority, and I do not think it is honest to claim we support such
a feature without doing any of the above.  The current reality is that our
symlink support is still broken in corner cases, and being able to easily
add "sym/path" via "git add" and "git update-index --add" is one of them.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  4:44                           ` Junio C Hamano
@ 2008-08-05 11:23                             ` Johannes Schindelin
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Schindelin @ 2008-08-05 11:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git, Petr Baudis, Shawn O. Pearce

Hi,

On Mon, 4 Aug 2008, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > However, please note that Junio's patch affects git-add, AFAIR, not 
> > git-update-index.
> 
> Really?  I thought I added a test case to cover the plumbing as well...

Right, I missed that one.  I kinda hoped that only porcelain is affected.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  6:11                             ` Junio C Hamano
@ 2008-08-05 12:54                               ` Dmitry Potapov
  2008-08-05 23:57                                 ` Junio C Hamano
  2008-08-05 17:15                               ` Linus Torvalds
  1 sibling, 1 reply; 56+ messages in thread
From: Dmitry Potapov @ 2008-08-05 12:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Johannes Schindelin, git, Petr Baudis,
	Shawn O. Pearce

On Mon, Aug 04, 2008 at 11:11:11PM -0700, Junio C Hamano wrote:
> 
> The thing is, the "feature" is not very well supported, even without the
> fixes from last night.  If you have a symlink "sym" that points at "dir"
> that has "file" in it, and if neither "sym" nor "dir/file" are tracked,
> you can "git add sym/file" to add it (I called it a bug).

Well, I actually used this feature less than a year ago and did not have
any problem with that. Not that it is very important for me now, but I had
makefiles in a separate build directory, and I used a symlink to move
building to a tmpfs partion, which sped up building slightly.

Obviously, if a symlink points to a directory inside of the repository
and then you use "git add sym/file", it is definitely a mistake. OTOH,
let's consider the following situation:

git init
mkdir newdir
touch newdir/foo
git add newdir/foo
git commit -m 'add foo'
mv newdir /tmp/
ln -s /tmp/newdir
touch newdir/bar
git add newdir/bar
git commit -m 'add bar'
git ls-files

And you can see:
newdir/bar
newdir/foo

Git does exactly what I want here!

Anyway, I more concern with performance impact of your patch. If it
is noticeable, then it would be nice to have an option to turn this
check off.

Dmitry

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  6:11                             ` Junio C Hamano
  2008-08-05 12:54                               ` Dmitry Potapov
@ 2008-08-05 17:15                               ` Linus Torvalds
  1 sibling, 0 replies; 56+ messages in thread
From: Linus Torvalds @ 2008-08-05 17:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, git, Petr Baudis, Shawn O. Pearce



On Mon, 4 Aug 2008, Junio C Hamano wrote:
> 
> While I admit that I have managed a large directory split across
> partitions grafted via symlinks in pre-git days myself, ever since you
> started "tracking" symbolic links with 8ae0a8c (git and symlinks as
> tracked content, 2005-05-05), you have pretty much been committed to
> "track" symbolic links.

Yes, but my point is:

 - IF the cost is exorbitant (which was my question that triggered this: 
   it sure as h*ll _would_ have been too high back in the days of the 
   original symlink-matching code) ...

 - ... then there really are valid cases that say that we could just call 
   it a feature.

That's really my whole argument. Saying that "ok, we will assume that 
existing paths that git already knows about are stable" is not really an 
odd feature. It's one we have lived with for years, and it's one that is 
actually pretty dang trivial to work with. Yes, it can cause unexpected 
behaviour, but if you hit it, that unexpected behavior is actually not 
_that_ hard to work around.

This is all I'm arguing for. People claim that there are 'correctness' 
issues. I say that 'performance issues' are important, and we can and we 
_should_ take performance issues very seriously. So seriously that we'll 
make performance a _feature_ if required.

			Linus

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05 12:54                               ` Dmitry Potapov
@ 2008-08-05 23:57                                 ` Junio C Hamano
  0 siblings, 0 replies; 56+ messages in thread
From: Junio C Hamano @ 2008-08-05 23:57 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: Linus Torvalds, Johannes Schindelin, git, Petr Baudis,
	Shawn O. Pearce

Dmitry Potapov <dpotapov@gmail.com> writes:

> Obviously, if a symlink points to a directory inside of the repository
> and then you use "git add sym/file", it is definitely a mistake. OTOH,
> let's consider the following situation:
>
> git init
> mkdir newdir
> touch newdir/foo
> git add newdir/foo
> git commit -m 'add foo'
> mv newdir /tmp/
> ln -s /tmp/newdir
> touch newdir/bar
> git add newdir/bar
> git commit -m 'add bar'
> git ls-files
>
> And you can see:
> newdir/bar
> newdir/foo
>
> Git does exactly what I want here!

Now, do this after the above sequence:

  git diff
  git add . && git ls-files

and tell me what you see.  Does it show exactly what you want here?

I've already told you what kind of changes you would need to make if you
really want to claim this is a feature.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-05  0:21                   ` Not going beyond symbolic links Linus Torvalds
  2008-08-05  0:54                     ` Junio C Hamano
@ 2008-08-07  6:52                     ` Junio C Hamano
  2008-08-08 20:55                       ` Junio C Hamano
  1 sibling, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2008-08-07  6:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Petr Baudis, Shawn O. Pearce, Johannes Schindelin

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, 4 Aug 2008, Junio C Hamano wrote:
>> 
>> I started to revisit this issue and patched "git update-index --add"
>> and "git add" so far.  Patches follow.
>
> Patches look good to me, but did you check the performance impact?
>
> The rewritten 'has_symlink_leading_path()' should do ok, but it migth 
> still be a huge performance downside to check all the paths for things 
> like "git add -u".

I couldn't quite measure much meaningful performance impact.

My test repository was the kernel tree at v2.6.27-rc2-20-g685d87f, without
any build products nor editor temporary crufts.

By the way, if anybody wants to reproduce this, be careful with the tests
that run "rm -f .git/index" before adding everything.  After doing so, if
you check the result with "git diff --stat HEAD", you will notice many
missing files --- I almost got a heart attack before inspecting this file:

	$ cat arch/powerpc/.gitignore
        include

Yes, it excludes 261 already tracked files.  Is it intended?  I dunno.

The file is there since 06f2138 ([POWERPC] Add files build to .gitignore,
2006-11-26).  Not that having tracked files in an entirely ignored
directory is a bug, but the .gitignore entry seems to me an Oops waiting
to happen.  There are three commits that affect this directory that is
entirely ignored after that entry is added:

    b5b9309 (remove unnecessary <linux/hdreg.h> includes, 2008-08-05)
    9c4cb82 (powerpc: Remove use of CONFIG_PPC_MERGE, 2008-08-02)
    b8b572e (powerpc: Move include files to arch/powerpc/include/asm, 2008-08-01)

Anyhow, back on topic.  Here are the numbers.

Test #1: With fully up-to-date .git/index, "git add .", best of 5 runs

* v1.6.0-rc2
0.20user 0.20system 0:00.41elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+3622minor)pagefaults 0swaps

* v1.6.0-rc1-73-g725b060 (with patch)
0.22user 0.19system 0:00.41elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+3635minor)pagefaults 0swaps


Test #2: After "rm -f .git/index", "git add .", best of 5 runs

* v1.6.0-rc2
1.81user 0.55system 0:02.51elapsed 93%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+92855minor)pagefaults 0swaps

* v1.6.0-rc1-73-g725b060 (with patch)
1.76user 0.56system 0:02.58elapsed 89%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+92852minor)pagefaults 0swaps


Test #3: same as #2, after dropping cache with "echo 3 > /proc/sys/vm/drop_caches".
(Yes, I have slow disks).

* v1.6.0-rc2
3.43user 2.29system 2:02.45elapsed 4%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (29611major+63244minor)pagefaults 0swaps

* v1.6.0-rc1-73-g725b060 (with patch)
3.55user 2.33system 1:54.77elapsed 5%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (29616major+63236minor)pagefaults 0swaps

Test #4: same as #1, "strace -c -e lstat"

* v1.6.0-rc2
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.000247           0     23993           lstat

* v1.6.0-rc1-73-g725b060 (with patch)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.000261           0     23993           lstat

Test #5: same as #2, "strace -c -e lstat"

* v1.6.0-rc2
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.000639           0     23993           lstat

* v1.6.0-rc1-73-g725b060 (with patch)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.000427           0     23993           lstat

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-07  6:52                     ` Junio C Hamano
@ 2008-08-08 20:55                       ` Junio C Hamano
  2008-08-08 23:45                         ` Linus Torvalds
  0 siblings, 1 reply; 56+ messages in thread
From: Junio C Hamano @ 2008-08-08 20:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Petr Baudis, Shawn O. Pearce, Johannes Schindelin

Junio C Hamano <gitster@pobox.com> writes:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> On Mon, 4 Aug 2008, Junio C Hamano wrote:
>>> 
>>> I started to revisit this issue and patched "git update-index --add"
>>> and "git add" so far.  Patches follow.
>>
>> Patches look good to me, but did you check the performance impact?
>>
>> The rewritten 'has_symlink_leading_path()' should do ok, but it migth 
>> still be a huge performance downside to check all the paths for things 
>> like "git add -u".
>
> I couldn't quite measure much meaningful performance impact.
>
> My test repository was the kernel tree at v2.6.27-rc2-20-g685d87f, without
> any build products nor editor temporary crufts.
>
> By the way, if anybody wants to reproduce this, be careful with the tests
> that run "rm -f .git/index" before adding everything.  After doing so, if
> you check the result with "git diff --stat HEAD", you will notice many
> missing files --- I almost got a heart attack before inspecting this file:
>
> 	$ cat arch/powerpc/.gitignore
>         include
>
> Yes, it excludes 261 already tracked files.  Is it intended?  I dunno.
> ...

Seeing that you applied my arch/powerpc/.gitignore patch to the kernel
(Yaay, I now have a short-log entry in the kernel history ;-), you have
seen the message with some benchmarks I am replying to as well?

I actually was expecting to see measurable slowdown by checking leading
symbolic link more often, but I couldn't, which means either there is not
much downside that the bugfix would hurt real-life usage, or I was too
incompetent to come up with an example to show that the bugfix actually
hurts the performance.

If the former, I think it is a very good news (If the latter, well, we
have a bigger problem :-<), as the affected codepaths are exactly the ones
that need to be fixed when somebody tries to add files in a submodule
directory by mistake, which is an issue (iirc) Dscho had a separate patch
to fix earlier.  Tracked symbolic links may be rare, but it seems that
many people are using submodules, and that case cannot be hand-waved
around by calling it is a feature.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Not going beyond symbolic links
  2008-08-08 20:55                       ` Junio C Hamano
@ 2008-08-08 23:45                         ` Linus Torvalds
  0 siblings, 0 replies; 56+ messages in thread
From: Linus Torvalds @ 2008-08-08 23:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Petr Baudis, Shawn O. Pearce, Johannes Schindelin



On Fri, 8 Aug 2008, Junio C Hamano wrote:
> 
> Seeing that you applied my arch/powerpc/.gitignore patch to the kernel
> (Yaay, I now have a short-log entry in the kernel history ;-), you have
> seen the message with some benchmarks I am replying to as well?

Yes, I just felt that closed the discussion.

I only brought up the issue in the first place because I worried about the 
performance impact. And I was unhappy about how that worry was dismissed 
as being less important than some specious "correctness" issue (for the 
last 3+ years, performance has mattered a _lot_, and the claimed big 
"correctness" issue has not mattered one whit).

The thing is, sometimes "pi = 3.14" is (a) infinitely faster than the 
"correct" answer and (b) the difference between the "correct" and the 
"wrong" answer is meaningless. And this is why I get upset when somebody 
dismisses performance issues based on "correctness".

The thing is, some specious value of "correctness" is often irrelevant 
because it doesn't matter. While performance almost _always_ matters. And 
I absolutely _detest_ the fact that people so often dismiss performance 
concerns so readily.

But once the performance numbers are in and they don't show any issues, I 
think that simply settles the original query, and I'm happy.

		Linus

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2008-08-08 23:47 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-07-16 19:11 [RFC][PATCH 0/7] Submodule support in git mv, git rm Petr Baudis
2008-07-16 19:11 ` [PATCH 1/7] git-mv: Remove dead code branch Petr Baudis
2008-07-16 19:11 ` [PATCH 2/7] t7400: Add short "git submodule add" testsuite Petr Baudis
2008-07-16 19:11 ` [PATCH 3/7] git submodule add: Fix naming clash handling Petr Baudis
2008-07-16 19:11 ` [PATCH 4/7] submodule.*: Introduce simple C interface for submodule lookup by path Petr Baudis
2008-07-16 19:11 ` [PATCH 5/7] git mv: Support moving submodules Petr Baudis
2008-07-17  2:37   ` Junio C Hamano
2008-07-17 13:06     ` Petr Baudis
2008-07-17 22:31       ` [PATCH] git-mv: Keep moved index entries inact Petr Baudis
2008-07-17 22:34         ` [PATCH] git mv: Support moving submodules Petr Baudis
2008-07-19 23:54         ` [PATCH] git-mv: Keep moved index entries inact Junio C Hamano
2008-07-21  0:23           ` Petr Baudis
2008-07-21  0:25             ` [PATCHv2] " Petr Baudis
2008-07-21  4:36               ` Junio C Hamano
2008-07-26  6:46               ` Junio C Hamano
2008-07-27 13:41                 ` Petr Baudis
2008-07-27 13:47                   ` [PATCH] t/t7001-mv.sh: Propose ability to use git-mv on conflicting entries Petr Baudis
2008-07-28  1:13                     ` Junio C Hamano
2008-07-28  1:21                       ` Junio C Hamano
2008-07-28 14:20                 ` [PATCHv2] git-mv: Keep moved index entries inact SZEDER Gábor
2008-07-28 15:06                   ` Johannes Schindelin
2008-07-28 15:14                     ` Johannes Schindelin
2008-07-28 18:24                       ` Johannes Schindelin
2008-07-28 19:19                     ` Junio C Hamano
2008-07-28 23:41                       ` Johannes Schindelin
2008-07-28 23:55                         ` Johannes Schindelin
2008-07-29  0:17                       ` Petr Baudis
2008-07-29  0:46                         ` Junio C Hamano
2008-07-29  5:23                           ` Junio C Hamano
2008-08-04  7:49                 ` Not going beyond symbolic links Junio C Hamano
2008-08-04  7:51                   ` [PATCH 1/2] update-index: refuse to add working tree items beyond symlinks Junio C Hamano
2008-08-04  7:52                   ` [PATCH 2/2] add: " Junio C Hamano
2008-08-05  0:21                   ` Not going beyond symbolic links Linus Torvalds
2008-08-05  0:54                     ` Junio C Hamano
2008-08-05  1:43                       ` Linus Torvalds
2008-08-05  1:59                         ` Johannes Schindelin
2008-08-05  2:28                           ` Linus Torvalds
2008-08-05  6:11                             ` Junio C Hamano
2008-08-05 12:54                               ` Dmitry Potapov
2008-08-05 23:57                                 ` Junio C Hamano
2008-08-05 17:15                               ` Linus Torvalds
2008-08-05  4:44                           ` Junio C Hamano
2008-08-05 11:23                             ` Johannes Schindelin
2008-08-05  3:01                         ` Junio C Hamano
2008-08-05  3:04                           ` david
2008-08-07  6:52                     ` Junio C Hamano
2008-08-08 20:55                       ` Junio C Hamano
2008-08-08 23:45                         ` Linus Torvalds
2008-07-21  1:20             ` [PATCH] git-mv: Keep moved index entries inact Johannes Schindelin
2008-07-21  7:18               ` Petr Baudis
2008-07-21  7:38                 ` Junio C Hamano
2008-07-16 19:11 ` [PATCH 6/7] git rm: Support for removing submodules Petr Baudis
2008-07-16 22:41   ` Johannes Schindelin
2008-07-17 12:35     ` Petr Baudis
2008-07-17 12:59       ` Johannes Schindelin
2008-07-16 19:11 ` [PATCH 7/7] t7403: Submodule git mv, git rm testsuite Petr Baudis

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).