git@vger.kernel.org mailing list mirror (one of many)
* [WIP v1 0/4] mv: fix out-of-cone file/directory move logic
@ 2022-03-31  9:17 Shaoxuan Yuan
  2022-03-31  9:17 ` [WIP v1 1/4] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
                   ` (10 more replies)
  0 siblings, 11 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-03-31  9:17 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, Shaoxuan Yuan

Before integrating 'mv' with sparse-index, I found some possibly buggy
UX in how 'mv' interacts with 'sparse-checkout'.

So I kept sparse-index off in order to sort things out without a sparse index.
We can proceed to integrate with sparse-index once these changes are solid.

Note that this series is tentative and still has known glitches, but it
illustrates the general approach I intend to take to harmonize 'mv'
with 'sparse-checkout'.

Shaoxuan Yuan (4):
  mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  mv: add check_dir_in_index() and solve general dir check issue
  mv: add advise_to_reapply hint for moving file into cone
  t7002: add tests for moving out-of-cone file/directory

 builtin/mv.c                  | 76 ++++++++++++++++++++++++++++++++---
 t/t7002-mv-sparse-checkout.sh | 72 +++++++++++++++++++++++++++++++++
 2 files changed, 142 insertions(+), 6 deletions(-)


base-commit: 805e0a68082a217f0112db9ee86a022227a9c81b
-- 
2.35.1



* [WIP v1 1/4] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
@ 2022-03-31  9:17 ` Shaoxuan Yuan
  2022-03-31 16:39   ` Victoria Dye
  2022-03-31  9:17 ` [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-03-31  9:17 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, Shaoxuan Yuan

Originally, moving a <source> file that is not on disk but exists in the
index as a cache entry with the SKIP_WORKTREE bit set makes "git mv"
error out with "bad source".

Change the checking logic so that such a <source> file makes "git mv"
warn via advise_on_updating_sparse_paths() instead of failing with
"bad source". The user can now also supply the "--sparse" flag so that
the operation is carried out successfully.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
I found a new problem introduced by this patch; it is described in the TODO
comment. I still haven't found a good way to resolve it. Please enlighten
me on this :-)

 builtin/mv.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 83a465ba83..32ad4d5682 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -185,8 +185,32 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
+			/*
+			 * TODO: for now, when you try to overwrite a <destination>
+			 * with your <source> as a sparse file, if you supply a "--sparse"
+			 * flag, then the action will be done without providing "--force"
+			 * and no warning.
+			 *
+			 * This is mainly because the sparse <source>
+			 * is not on-disk, and this if-else chain will be cut off early in
+			 * this check, thus the "--force" check is ignored. Need fix.
+			 */
+
+			int pos = cache_name_pos(src, length);
+			if (pos >= 0) {
+				const struct cache_entry *ce = active_cache[pos];
+
+				if (ce_skip_worktree(ce)) {
+					if (!ignore_sparse)
+						string_list_append(&only_match_skip_worktree, src);
+					else
+						modes[i] = SPARSE;
+				}
+				else
+					bad = _("bad source");
+			}
 			/* only error if existence is expected. */
-			if (modes[i] != SPARSE)
+			else if (modes[i] != SPARSE)
 				bad = _("bad source");
 		} else if (!strncmp(src, dst, length) &&
 				(dst[length] == 0 || dst[length] == '/')) {
-- 
2.35.1



* [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
  2022-03-31  9:17 ` [WIP v1 1/4] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
@ 2022-03-31  9:17 ` Shaoxuan Yuan
  2022-03-31 10:25   ` Ævar Arnfjörð Bjarmason
  2022-03-31 21:28   ` Victoria Dye
  2022-03-31  9:17 ` [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone Shaoxuan Yuan
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-03-31  9:17 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, Shaoxuan Yuan

Originally, moving a <source> directory that is not on disk because it
lies outside the sparse-checkout cone makes "git mv" error out with
"bad source".

Add a helper function, check_dir_in_index(), to see if a directory
name exists in the index. Also add a SPARSE_DIRECTORY mode to mark
such directories.

Change the checking logic so that such a <source> directory makes
"git mv" warn via advise_on_updating_sparse_paths() instead of failing
with "bad source". The user can now also supply the "--sparse" flag so
that the operation is carried out successfully.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
Since I'm new to the C language (I had not used it before this patch),
the check_dir_in_index() function I added might not be ideal in terms of
safety and correctness. I have been digging into the APIs provided in the
codebase, but I haven't found anything that does this very job: finding out
whether a directory is in the index (am I missing something?).
This is probably because contents are stored in the index as blobs, and
they all represent regular files. So I came up with this dull solution...

 builtin/mv.c | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 32ad4d5682..9da9205e01 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -115,6 +115,25 @@ static int index_range_of_same_dir(const char *src, int length,
 	return last - first;
 }
 
+static int check_dir_in_index(const char *dir)
+{
+	int ret = 0;
+	int length = sizeof(dir) + 1;
+	char *substr = malloc(length);
+
+	for (int i = 0; i < the_index.cache_nr; i++) {
+		memcpy(substr, the_index.cache[i]->name, length);
+		memset(substr + length - 1, 0, 1);
+
+		if (strcmp(dir, substr) == 0) {
+			ret = 1;
+			return ret;
+		}
+	}
+	free(substr);
+	return ret;
+}
+
 int cmd_mv(int argc, const char **argv, const char *prefix)
 {
 	int i, flags, gitmodules_modified = 0;
@@ -129,7 +148,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		OPT_END(),
 	};
 	const char **source, **destination, **dest_path, **submodule_gitfile;
-	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
+	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE,
+	SPARSE_DIRECTORY } *modes;
 	struct stat st;
 	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
 	struct lock_file lock_file = LOCK_INIT;
@@ -197,6 +217,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			 */
 
 			int pos = cache_name_pos(src, length);
+			const char *src_w_slash = add_slash(src);
+
 			if (pos >= 0) {
 				const struct cache_entry *ce = active_cache[pos];
 
@@ -209,6 +231,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				else
 					bad = _("bad source");
 			}
+			else if (check_dir_in_index(src_w_slash) &&
+			!path_in_sparse_checkout(src_w_slash, &the_index)) {
+				modes[i] = SPARSE_DIRECTORY;
+				goto dir_check;
+			}
 			/* only error if existence is expected. */
 			else if (modes[i] != SPARSE)
 				bad = _("bad source");
@@ -219,7 +246,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				&& lstat(dst, &st) == 0)
 			bad = _("cannot move directory over file");
 		else if (src_is_dir) {
-			int first = cache_name_pos(src, length), last;
+			int first, last;
+dir_check:
+			first = cache_name_pos(src, length);
 
 			if (first >= 0)
 				prepare_move_submodule(src, first,
@@ -230,7 +259,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			else { /* last - first >= 1 */
 				int j, dst_len, n;
 
-				modes[i] = WORKING_DIRECTORY;
+				if (!modes[i])
+					modes[i] = WORKING_DIRECTORY;
 				n = argc + last - first;
 				REALLOC_ARRAY(source, n);
 				REALLOC_ARRAY(destination, n);
@@ -332,7 +362,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
 			continue;
-		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
+		if (mode != INDEX && mode != SPARSE && mode != SPARSE_DIRECTORY &&
+		 rename(src, dst) < 0) {
 			if (ignore_errors)
 				continue;
 			die_errno(_("renaming '%s' failed"), src);
@@ -346,7 +377,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 							      1);
 		}
 
-		if (mode == WORKING_DIRECTORY)
+		if (mode == WORKING_DIRECTORY || mode == SPARSE_DIRECTORY)
 			continue;
 
 		pos = cache_name_pos(src, strlen(src));
-- 
2.35.1



* [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
  2022-03-31  9:17 ` [WIP v1 1/4] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
  2022-03-31  9:17 ` [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
@ 2022-03-31  9:17 ` Shaoxuan Yuan
  2022-03-31 10:30   ` Ævar Arnfjörð Bjarmason
                     ` (2 more replies)
  2022-03-31  9:17 ` [WIP v1 4/4] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
                   ` (7 subsequent siblings)
  10 siblings, 3 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-03-31  9:17 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, Shaoxuan Yuan

Originally, the SKIP_WORKTREE bit is not removed when moving an out-of-cone
file into the sparse cone, so the moved file does not show up in the working
tree. Hint the user to run "git sparse-checkout reapply" to reapply the
sparsity rules to the working tree, which removes the SKIP_WORKTREE bit.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
I offered this solution because I'm not sure how to turn a cache_entry's
ce_flags back to a non-sparse state. I tried directly setting it to 0 like this:

	ce->ce_flags = 0;

But the behavior after this seems undefined. The file still won't show up
in the working tree.

And I found that "git sparse-checkout reapply" seems to be a nice fit for the
job. But I guess if there is a way (there must be, but I don't know it) to do it
directly in the code, that could also be nice.

 builtin/mv.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/builtin/mv.c b/builtin/mv.c
index 9da9205e01..5f511fb8da 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -138,6 +138,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 {
 	int i, flags, gitmodules_modified = 0;
 	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0, ignore_sparse = 0;
+	int advise_to_reapply = 0;
 	struct option builtin_mv_options[] = {
 		OPT__VERBOSE(&verbose, N_("be verbose")),
 		OPT__DRY_RUN(&show_only, N_("dry run")),
@@ -383,6 +384,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		pos = cache_name_pos(src, strlen(src));
 		assert(pos >= 0);
 		rename_cache_entry_at(pos, dst);
+		if (!advise_to_reapply &&
+			!path_in_sparse_checkout(src, &the_index) &&
+			path_in_sparse_checkout(dst, &the_index)) {
+				advise_to_reapply = 1;
+			}
 	}
 
 	if (gitmodules_modified)
@@ -392,6 +398,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 		die(_("Unable to write new index file"));
 
+	if (advise_to_reapply)
+		printf(_("Please use \"git sparse-checkout reapply\" to reapply the sparsity.\n"));
+
 	string_list_clear(&src_for_dst, 0);
 	UNLEAK(source);
 	UNLEAK(dest_path);
-- 
2.35.1



* [WIP v1 4/4] t7002: add tests for moving out-of-cone file/directory
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (2 preceding siblings ...)
  2022-03-31  9:17 ` [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone Shaoxuan Yuan
@ 2022-03-31  9:17 ` Shaoxuan Yuan
  2022-03-31 10:33   ` Ævar Arnfjörð Bjarmason
  2022-03-31 22:11   ` Victoria Dye
  2022-03-31  9:28 ` [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-03-31  9:17 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, Shaoxuan Yuan

Add tests for the following situations:

* 'refuse to move out-of-cone directory without --sparse'
* 'can move out-of-cone directory with --sparse'
* 'refuse to move out-of-cone file without --sparse'
* 'can move out-of-cone file with --sparse'

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 t/t7002-mv-sparse-checkout.sh | 72 +++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 1d3d2aca21..efb260d015 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -206,4 +206,76 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
 	test_cmp expect stderr
 '
 
+test_expect_success 'refuse to move out-of-cone directory without --sparse' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	test_must_fail git mv folder1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_success 'can move out-of-cone directory with --sparse' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	git mv --sparse folder1 sub 1>actual 2>stderr &&
+	test_must_be_empty stderr &&
+	echo "Please use \"git sparse-checkout reapply\" to reapply the sparsity."\
+	>expect &&
+	test_cmp actual expect &&
+
+	git sparse-checkout reapply &&
+	test_path_is_dir sub/folder1 &&
+	test_path_is_file sub/folder1/file1
+'
+
+test_expect_success 'refuse to move out-of-cone file without --sparse' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	test_must_fail git mv folder1/file1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_success 'can move out-of-cone file with --sparse' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
+	test_must_be_empty stderr &&
+	echo "Please use \"git sparse-checkout reapply\" to reapply the sparsity."\
+	>expect &&
+	test_cmp actual expect &&
+
+	git sparse-checkout reapply &&
+	! test_path_is_dir sub/folder1 &&
+	test_path_is_file sub/file1
+'
+
 test_done
-- 
2.35.1



* Re: [WIP v1 0/4] mv: fix out-of-cone file/directory move logic
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (3 preceding siblings ...)
  2022-03-31  9:17 ` [WIP v1 4/4] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
@ 2022-03-31  9:28 ` Shaoxuan Yuan
  2022-03-31 22:21 ` Victoria Dye
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-03-31  9:28 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster

On Thu, Mar 31, 2022 at 5:20 PM Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> wrote:
>
> Before integrating 'mv' with sparse-index, I found some possibly buggy
> UX in how 'mv' interacts with 'sparse-checkout'.
>
> So I kept sparse-index off in order to sort things out without a sparse index.
> We can proceed to integrate with sparse-index once these changes are solid.
>
> Note that this series is tentative and still has known glitches, but it
> illustrates the general approach I intend to take to harmonize 'mv'
> with 'sparse-checkout'.
>
> Shaoxuan Yuan (4):
>   mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
>   mv: add check_dir_in_index() and solve general dir check issue
>   mv: add advise_to_reapply hint for moving file into cone
>   t7002: add tests for moving out-of-cone file/directory
>
>  builtin/mv.c                  | 76 ++++++++++++++++++++++++++++++++---
>  t/t7002-mv-sparse-checkout.sh | 72 +++++++++++++++++++++++++++++++++
>  2 files changed, 142 insertions(+), 6 deletions(-)
>
>
> base-commit: 805e0a68082a217f0112db9ee86a022227a9c81b
> --
> 2.35.1
>

The original related RFC patch is [1], and this series should have been
sent with --in-reply-to [2].

[1] https://lore.kernel.org/git/20220315100145.214054-1-shaoxuan.yuan02@gmail.com/
[2] https://lore.kernel.org/git/97a665fe-07c9-c4f6-4ab6-b6c0e1397c31@github.com/

-- 
Thanks & Regards,
Shaoxuan


* Re: [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue
  2022-03-31  9:17 ` [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
@ 2022-03-31 10:25   ` Ævar Arnfjörð Bjarmason
  2022-04-01  3:51     ` Shaoxuan Yuan
  2022-03-31 21:28   ` Victoria Dye
  1 sibling, 1 reply; 95+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-31 10:25 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, vdye, derrickstolee, gitster


On Thu, Mar 31 2022, Shaoxuan Yuan wrote:

> +static int check_dir_in_index(const char *dir)
> +{
> +	int ret = 0;
> +	int length = sizeof(dir) + 1;
> +	char *substr = malloc(length);
> +
> +	for (int i = 0; i < the_index.cache_nr; i++) {

See https://lore.kernel.org/git/xmqqy20r3rv7.fsf@gitster.g/ for how
we're not quite using this syntax yet.

This should also be "unsigned int" to go with the "cache_nr" member.

> +		memcpy(substr, the_index.cache[i]->name, length);
> +		memset(substr + length - 1, 0, 1);
> +
> +		if (strcmp(dir, substr) == 0) {

Style: don't compare against == 0, or == NULL, use !, see CodingGuidelines.

> +			else if (check_dir_in_index(src_w_slash) &&
> +			!path_in_sparse_checkout(src_w_slash, &the_index)) {

Funny indentation, the ! should be aligned with "(".

> -				modes[i] = WORKING_DIRECTORY;
> +				if (!modes[i])
> +					modes[i] = WORKING_DIRECTORY;

This works, but assuming things about enum values (even if 0) always
seems a bit nasty, can this be a comparison to BOTH instead of !? May or
may not be better...

But then again we do xcalloc() to allocate it, so we assume that
already, never mind... :)

(there were also indentation issues below)
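
One further point about the quoted helper: sizeof(dir) yields the size of
a pointer rather than the length of the string, and the early return skips
the free(). As a minimal standalone sketch of the same prefix check (not
the actual git code: toy_dir_in_index() and its plain array of names are
hypothetical stand-ins for the index), the length can come from strlen()
and no temporary buffer is needed:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model, not git's API: "names" stands in for the index
 * entries, and "dir" is expected to end with '/' so that "folder/"
 * does not match "folder1/file1".
 *
 * Returns 1 if any entry name starts with dir, 0 otherwise.
 */
static int toy_dir_in_index(const char **names, size_t nr, const char *dir)
{
	size_t len, i;

	assert(names && dir);
	len = strlen(dir);	/* not sizeof(dir), which is the pointer size */

	for (i = 0; i < nr; i++)
		if (!strncmp(names[i], dir, len))
			return 1;
	return 0;
}
```

Because strncmp() stops at the first NUL byte, entry names shorter than
"dir" are handled without reading past their end, and there is nothing to
allocate or free.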


* Re: [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone
  2022-03-31  9:17 ` [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone Shaoxuan Yuan
@ 2022-03-31 10:30   ` Ævar Arnfjörð Bjarmason
  2022-04-01  4:00     ` Shaoxuan Yuan
  2022-03-31 21:56   ` Victoria Dye
  2022-04-01 14:55   ` Derrick Stolee
  2 siblings, 1 reply; 95+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-31 10:30 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, vdye, derrickstolee, gitster, Tao Klerks


On Thu, Mar 31 2022, Shaoxuan Yuan wrote:

> Originally, the SKIP_WORKTREE bit is not removed when moving an out-of-cone
> file into the sparse cone, so the moved file does not show up in the working
> tree. Hint the user to run "git sparse-checkout reapply" to reapply the
> sparsity rules to the working tree, which removes the SKIP_WORKTREE bit.
>
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
> I offered this solution because I'm not sure how to turn a cache_entry's
> ce_flags back to a non-sparse state. I tried directly setting it to 0 like this:
>
> 	ce->ce_flags = 0;
>
> But the behavior after this seems undefined. The file still won't show up
> in the working tree.
>
> And I found that "git sparse-checkout reapply" seems to be a nice fit for the
> job. But I guess if there is a way (there must be, but I don't know it) to do it
> directly in the code, that could also be nice.
>
>  builtin/mv.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 9da9205e01..5f511fb8da 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -138,6 +138,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  {
>  	int i, flags, gitmodules_modified = 0;
>  	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0, ignore_sparse = 0;
> +	int advise_to_reapply = 0;
>  	struct option builtin_mv_options[] = {
>  		OPT__VERBOSE(&verbose, N_("be verbose")),
>  		OPT__DRY_RUN(&show_only, N_("dry run")),
> @@ -383,6 +384,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		pos = cache_name_pos(src, strlen(src));
>  		assert(pos >= 0);
>  		rename_cache_entry_at(pos, dst);
> +		if (!advise_to_reapply &&
> +			!path_in_sparse_checkout(src, &the_index) &&
> +			path_in_sparse_checkout(dst, &the_index)) {
> +				advise_to_reapply = 1;
> +			}

More odd indentation, and the braces aren't needed.

>  	}
>  
>  	if (gitmodules_modified)
> @@ -392,6 +398,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			       COMMIT_LOCK | SKIP_IF_UNCHANGED))
>  		die(_("Unable to write new index file"));
>  
> +	if (advise_to_reapply)
> +		printf(_("Please use \"git sparse-checkout reapply\" to reapply the sparsity.\n"));
> +

Please see 93026558512 (tracking branches: add advice to ambiguous
refspec error, 2022-03-28) (the OID may change after I send this, as
it's in "seen") for how to add new advice, i.e. we use advise(), add an
enum field, config var etc.


* Re: [WIP v1 4/4] t7002: add tests for moving out-of-cone file/directory
  2022-03-31  9:17 ` [WIP v1 4/4] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
@ 2022-03-31 10:33   ` Ævar Arnfjörð Bjarmason
  2022-03-31 22:11   ` Victoria Dye
  1 sibling, 0 replies; 95+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-31 10:33 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, vdye, derrickstolee, gitster


On Thu, Mar 31 2022, Shaoxuan Yuan wrote:

> +	echo "Please use \"git sparse-checkout reapply\" to reapply the sparsity."\
> +	>expect &&

style: space before that \ at the end, but I think it's much better to put this on one line.

This looks at a glance like it's creating an empty file, until you
notice the \ instead of && at the end..

> +	echo "Please use \"git sparse-checkout reapply\" to reapply the sparsity."\
> +	>expect &&

.. so if you want to keep it wrapped we usually tab-indent the +1th line
to make it clear that it's continuing the above command


* Re: [WIP v1 1/4] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-03-31  9:17 ` [WIP v1 1/4] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
@ 2022-03-31 16:39   ` Victoria Dye
  2022-04-01 14:30     ` Derrick Stolee
  0 siblings, 1 reply; 95+ messages in thread
From: Victoria Dye @ 2022-03-31 16:39 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: derrickstolee, gitster

Shaoxuan Yuan wrote:
> Originally, moving a <source> file that is not on disk but exists in the
> index as a cache entry with the SKIP_WORKTREE bit set makes "git mv"
> error out with "bad source".
> 
> Change the checking logic so that such a <source> file makes "git mv"
> warn via advise_on_updating_sparse_paths() instead of failing with
> "bad source". The user can now also supply the "--sparse" flag so that
> the operation is carried out successfully.
> 

Good commit message, this clearly explains your changes!

> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
> I found a new problem introduced by this patch; it is described in the TODO
> comment. I still haven't found a good way to resolve it. Please enlighten
> me on this :-)
> 
>  builtin/mv.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 83a465ba83..32ad4d5682 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -185,8 +185,32 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  
>  		length = strlen(src);
>  		if (lstat(src, &st) < 0) {
> +			/*
> +			 * TODO: for now, when you try to overwrite a <destination>
> +			 * with your <source> as a sparse file, if you supply a "--sparse"
> +			 * flag, then the action will be done without providing "--force"
> +			 * and no warning.
> +			 *
> +			 * This is mainly because the sparse <source>
> +			 * is not on-disk, and this if-else chain will be cut off early in
> +			 * this check, thus the "--force" check is ignored. Need fix.
> +			 */
> +

I can clarify this a bit. 'mv' is done in two steps: first the file-on-disk
rename (in the call to 'rename()'), then the index entry (in
'rename_cache_entry_at()'). In the case of a sparse file, you're only
dealing with the latter. However, 'rename_cache_entry_at()' moves the index
entry with the flag 'ADD_CACHE_OK_TO_REPLACE', since it leaves it up to
'cmd_mv()' to enforce the "no overwrite" rule. 

So, in the case of moving *to* a SKIP_WORKTREE entry (where a file being
present won't trigger the failure), you'll want to check that the
destination *index entry* doesn't exist in addition to the 'lstat()' check.
It might require some rearranging of if-statements in this block, but I
think it can be done in 'cmd_mv'. 
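
As a rough standalone illustration of that extra check (a sketch under
stated assumptions, not the real cmd_mv() logic: struct toy_index,
toy_index_has() and move_needs_force() are hypothetical names, and the
sorted array only models the index), the "no overwrite" rule would look at
both the working tree and the index:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the index: path names sorted with strcmp(). */
struct toy_index {
	const char **names;
	size_t nr;
};

/* bsearch() comparator: key is a path, elem points at an array slot. */
static int cmp_names(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return 1 if the exact path is present in the toy index. */
static int toy_index_has(const struct toy_index *idx, const char *path)
{
	return bsearch(path, idx->names, idx->nr,
		       sizeof(*idx->names), cmp_names) != NULL;
}

/*
 * A move overwrites something if the destination exists on disk
 * (what lstat() would report, passed in here as dst_on_disk) or if
 * an index entry already exists for it, e.g. a SKIP_WORKTREE entry
 * that has no file in the working tree.
 */
static int move_needs_force(const struct toy_index *idx, const char *dst,
			    int dst_on_disk)
{
	assert(idx && dst);
	return dst_on_disk || toy_index_has(idx, dst);
}
```

With only the on-disk half of the condition, a destination that exists
solely as an index entry would be replaced silently, which is the
situation the TODO comment describes.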

> +			int pos = cache_name_pos(src, length);
> +			if (pos >= 0) {
> +				const struct cache_entry *ce = active_cache[pos];
> +
> +				if (ce_skip_worktree(ce)) {
> +					if (!ignore_sparse)
> +						string_list_append(&only_match_skip_worktree, src);
> +					else
> +						modes[i] = SPARSE;
> +				}
> +				else
> +					bad = _("bad source");

This block is good. At first, I thought it was mishandling the
'!ignore_sparse' case (i.e., that case should have included the "bad source"
assignment), but using the 'only_match_skip_worktree' list is the
appropriate way to handle it.

> +			}
>  			/* only error if existence is expected. */
> -			if (modes[i] != SPARSE)
> +			else if (modes[i] != SPARSE)
>  				bad = _("bad source");
>  		} else if (!strncmp(src, dst, length) &&
>  				(dst[length] == 0 || dst[length] == '/')) {

For a change like this, it would be really helpful to include the tests
showing how sparse file moves should now be treated in this commit. I see
that you've added some in patch 4 - could you move the ones related to this
change into this commit?

Another way you could do this is to put your "add tests" commit first in
this series, changing the condition on the ones that are fixed later in the
series to "test_expect_failure". Then, in each commit that "fixes" a test's
behavior, change that test to "test_expect_success". This approach has the
added benefit of showing that, before this series, the tests would fail and
that this series explicitly fixes those scenarios.


* Re: [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue
  2022-03-31  9:17 ` [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
  2022-03-31 10:25   ` Ævar Arnfjörð Bjarmason
@ 2022-03-31 21:28   ` Victoria Dye
  2022-04-01 12:49     ` Shaoxuan Yuan
  1 sibling, 1 reply; 95+ messages in thread
From: Victoria Dye @ 2022-03-31 21:28 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: derrickstolee, gitster

Shaoxuan Yuan wrote:
> Originally, moving a <source> directory that is not on disk because it
> lies outside the sparse-checkout cone makes "git mv" error out with
> "bad source".
> 
> Add a helper function, check_dir_in_index(), to see if a directory
> name exists in the index. Also add a SPARSE_DIRECTORY mode to mark
> such directories.
> 

Hmm, I think this patch would fit better in your eventual "sparse index
integration" series than this "prerequisite fixes to sparse-checkout"
series. Sparse directories *only* appear when you're using a sparse index
so, theoretically, this shouldn't ever come up (and thus isn't testable)
until you're using a sparse index. 

Since it's here, though, I'm happy to review what you have (even if you
eventually move it to a later series)!

> Change the checking logic so that such a <source> directory makes
> "git mv" warn via advise_on_updating_sparse_paths() instead of failing
> with "bad source". The user can now also supply the "--sparse" flag so
> that the operation is carried out successfully.
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
> Since I'm new to the C language (I had not used it before this patch),
> the check_dir_in_index() function I added might not be ideal in terms of
> safety and correctness. I have been digging into the APIs provided in the
> codebase, but I haven't found anything that does this very job: finding out
> whether a directory is in the index (am I missing something?).
> This is probably because contents are stored in the index as blobs, and
> they all represent regular files. So I came up with this dull solution...
> 
>  builtin/mv.c | 41 ++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 36 insertions(+), 5 deletions(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 32ad4d5682..9da9205e01 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -115,6 +115,25 @@ static int index_range_of_same_dir(const char *src, int length,
>  	return last - first;
>  }
>  
> +static int check_dir_in_index(const char *dir)
> +{

This function can be made a lot simpler - you can use `cache_name_pos()` to
find the index entry - if it's found, the directory exists as a sparse
directory. And, because 'add_slash()' enforces the trailing slash later on,
you don't need to worry about adjusting the name before you look for the
entry.

> +	int ret = 0;
> +	int length = sizeof(dir) + 1;
> +	char *substr = malloc(length);
> +
> +	for (int i = 0; i < the_index.cache_nr; i++) {
> +		memcpy(substr, the_index.cache[i]->name, length);
> +		memset(substr + length - 1, 0, 1);
> +
> +		if (strcmp(dir, substr) == 0) {
> +			ret = 1;
> +			return ret;
> +		}
> +	}
> +	free(substr);
> +	return ret;
> +}
> +
>  int cmd_mv(int argc, const char **argv, const char *prefix)
>  {
>  	int i, flags, gitmodules_modified = 0;
> @@ -129,7 +148,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		OPT_END(),
>  	};
>  	const char **source, **destination, **dest_path, **submodule_gitfile;
> -	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
> +	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE,
> +	SPARSE_DIRECTORY } *modes;
>  	struct stat st;
>  	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
>  	struct lock_file lock_file = LOCK_INIT;
> @@ -197,6 +217,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			 */
>  
>  			int pos = cache_name_pos(src, length);
> +			const char *src_w_slash = add_slash(src);
> +
>  			if (pos >= 0) {
>  				const struct cache_entry *ce = active_cache[pos];
>  
> @@ -209,6 +231,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  				else
>  					bad = _("bad source");
>  			}
> +			else if (check_dir_in_index(src_w_slash) &&
> +			!path_in_sparse_checkout(src_w_slash, &the_index)) {
> +				modes[i] = SPARSE_DIRECTORY;
> +				goto dir_check;
> +			}

In if-statements like this, you'll want to line up the statements in
parentheses on subsequent lines, like:

	else if (check_dir_in_index(src_w_slash) &&
		 !path_in_sparse_checkout(src_w_slash, &the_index)) {

...where the second line is indented 1 (8-space-sized) tab + 1 space. 

In general, if you're trying to align code (in this repository), align first
with as many tabs as possible, then the "remainder" with spaces. Note that
this isn't 100% consistent throughout the repository - older lines might not
have been aligned properly - but you should go for this styling on any new
lines that you add.

>  			/* only error if existence is expected. */
>  			else if (modes[i] != SPARSE)
>  				bad = _("bad source");
> @@ -219,7 +246,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  				&& lstat(dst, &st) == 0)
>  			bad = _("cannot move directory over file");
>  		else if (src_is_dir) {
> -			int first = cache_name_pos(src, length), last;
> +			int first, last;
> +dir_check:
> +			first = cache_name_pos(src, length);
>  
>  			if (first >= 0)
>  				prepare_move_submodule(src, first,
> @@ -230,7 +259,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			else { /* last - first >= 1 */
>  				int j, dst_len, n;
>  
> -				modes[i] = WORKING_DIRECTORY;
> +				if (!modes[i])
> +					modes[i] = WORKING_DIRECTORY;
>  				n = argc + last - first;
>  				REALLOC_ARRAY(source, n);
>  				REALLOC_ARRAY(destination, n);
> @@ -332,7 +362,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			printf(_("Renaming %s to %s\n"), src, dst);
>  		if (show_only)
>  			continue;
> -		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
> +		if (mode != INDEX && mode != SPARSE && mode != SPARSE_DIRECTORY &&
> +		 rename(src, dst) < 0) {
>  			if (ignore_errors)
>  				continue;
>  			die_errno(_("renaming '%s' failed"), src);
> @@ -346,7 +377,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  							      1);
>  		}
>  
> -		if (mode == WORKING_DIRECTORY)
> +		if (mode == WORKING_DIRECTORY || mode == SPARSE_DIRECTORY)

I'm a bit confused - doesn't this mean the sparse dir move will be skipped?
In your commit description, you mention that this 'mv' succeeds with the
'--sparse' option, but I don't see any place where the sparse directory
would be moved. 

>  			continue;
>  
>  		pos = cache_name_pos(src, strlen(src));


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone
  2022-03-31  9:17 ` [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone Shaoxuan Yuan
  2022-03-31 10:30   ` Ævar Arnfjörð Bjarmason
@ 2022-03-31 21:56   ` Victoria Dye
  2022-04-01 14:55   ` Derrick Stolee
  2 siblings, 0 replies; 95+ messages in thread
From: Victoria Dye @ 2022-03-31 21:56 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: derrickstolee, gitster

Shaoxuan Yuan wrote:
> Originally, the SKIP_WORKTREE bit is not removed when moving an out-of-cone
> file into sparse cone, thus the moved file does not show up in the working tree.
> Hint the user to use "git sparse-checkout reapply" to reapply the sparsity rules
> to the working tree, by which the SKIP_WORKTREE bit is removed.
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
> I offered this solution because I'm not sure how to turn a cache_entry's
> ce_flags back to a non-sparse state. I tried directly setting it to 0 like this:
> 
> 	ce->ce_flags = 0;
> 
> But the behavior after this seems undefined. The file still won't show up
> in the working tree.
> 

What (I think) you're looking for is something like this:

	ce->ce_flags &= ~CE_SKIP_WORKTREE;

This disables only the SKIP_WORKTREE flag on the entry (leaving the others
unchanged). Similarly, you can enable SKIP_WORKTREE with:

	ce->ce_flags |= CE_SKIP_WORKTREE;
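
To illustrate why masking differs from assigning 0, here is a minimal standalone sketch. The struct, the function names, and the flag values are hypothetical stand-ins chosen for the example; they are not git's definitions. It only shows that `&= ~FLAG` and `|= FLAG` touch a single bit and leave the rest of the flag word alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for illustration only; git defines its own. */
#define DEMO_FLAG_VALID         (1u << 0)
#define DEMO_FLAG_SKIP_WORKTREE (1u << 1)

/* Hypothetical stand-in for an index entry that carries a flag word. */
struct demo_entry {
	unsigned int flags;
};

/* Clear only the skip-worktree bit; every other bit keeps its value. */
void demo_clear_skip_worktree(struct demo_entry *e)
{
	assert(e != NULL);
	e->flags &= ~DEMO_FLAG_SKIP_WORKTREE;
}

/* Set only the skip-worktree bit; every other bit keeps its value. */
void demo_set_skip_worktree(struct demo_entry *e)
{
	assert(e != NULL);
	e->flags |= DEMO_FLAG_SKIP_WORKTREE;
}
```

By contrast, assigning 0 to the whole flag word would also wipe every unrelated bit stored in it.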

> And I found that "git sparse-checkout reapply" seems to be a nice fit for the
> job. But I guess if there is a way (there must be but I don't know) to do it
> directly in the code, that could also be nice.
> 

This brings up an interesting point - what *do* we want to do when you move
an entry and, based on sparse checkout patterns, the SKIP_WORKTREE status
changes? In the case of sparse-checkout, I like your approach: disable
SKIP_WORKTREE if you're moving inside the sparse checkout definition. And,
this only happens if you're using the '--sparse' flag, so a user will have
to acknowledge that they're moving something sparse to do any of this. 

The other situation to consider is when you're moving something *out* of the
sparse cone; your approach right now is to move it out *without* enabling
SKIP_WORKTREE. I think this is also probably valid - you might not want to
assume a user wants to remove the file from their worktree *just* yet.
However, it does create a situation where a user has a "sparse" file active
in their repo, so *this* might be a situation where you want to advise the
user to call 'git sparse-checkout reapply' to "sparsify" that file.

In any case, you'll only want to do this if the global variable
'core_apply_sparse_checkout' is non-zero. 'SKIP_WORKTREE' can be used when
not in a sparse checkout, so you can't always change the flag based on any
sparse patterns (because there might not be any!). 

>  builtin/mv.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 9da9205e01..5f511fb8da 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -138,6 +138,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  {
>  	int i, flags, gitmodules_modified = 0;
>  	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0, ignore_sparse = 0;
> +	int advise_to_reapply = 0;
>  	struct option builtin_mv_options[] = {
>  		OPT__VERBOSE(&verbose, N_("be verbose")),
>  		OPT__DRY_RUN(&show_only, N_("dry run")),
> @@ -383,6 +384,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		pos = cache_name_pos(src, strlen(src));
>  		assert(pos >= 0);
>  		rename_cache_entry_at(pos, dst);
> +		if (!advise_to_reapply &&
> +			!path_in_sparse_checkout(src, &the_index) &&
> +			path_in_sparse_checkout(dst, &the_index)) {
> +				advise_to_reapply = 1;
> +			}
>  	}
>  
>  	if (gitmodules_modified)
> @@ -392,6 +398,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			       COMMIT_LOCK | SKIP_IF_UNCHANGED))
>  		die(_("Unable to write new index file"));
>  
> +	if (advise_to_reapply)
> +		printf(_("Please use \"git sparse-checkout reapply\" to reapply the sparsity.\n"));

I know you're hoping to change your implementation so that you don't need
this advice. But, if you *do* end up needing it (or some other advice)
somewhere else, you can implement it using the 'advise()' API. For an
example on how that's used, see how 'ADVICE_SKIPPED_CHERRY_PICKS' is used in
'sequencer.c'. 

In your case, you could have something like 'ADVICE_REAPPLY_SPARSE_PATTERNS'
that, if enabled, prints a message like the one you have here.

> +
>  	string_list_clear(&src_for_dst, 0);
>  	UNLEAK(source);
>  	UNLEAK(dest_path);

As with patch 1, it would help paint a clear picture of what this patch does
if you could incorporate the tests from patch 4 into this one.

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 4/4] t7002: add tests for moving out-of-cone file/directory
  2022-03-31  9:17 ` [WIP v1 4/4] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
  2022-03-31 10:33   ` Ævar Arnfjörð Bjarmason
@ 2022-03-31 22:11   ` Victoria Dye
  1 sibling, 0 replies; 95+ messages in thread
From: Victoria Dye @ 2022-03-31 22:11 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: derrickstolee, gitster

Shaoxuan Yuan wrote:
> Add corresponding tests for the following situations:
> 
> * 'refuse to move out-of-cone directory without --sparse'
> * 'can move out-of-cone directory with --sparse'
> * 'refuse to move out-of-cone file without --sparse'
> * 'can move out-of-cone file with --sparse'
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  t/t7002-mv-sparse-checkout.sh | 72 +++++++++++++++++++++++++++++++++++
>  1 file changed, 72 insertions(+)
> 
> diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
> index 1d3d2aca21..efb260d015 100755
> --- a/t/t7002-mv-sparse-checkout.sh
> +++ b/t/t7002-mv-sparse-checkout.sh
> @@ -206,4 +206,76 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
>  	test_cmp expect stderr
>  '
>  
> +test_expect_success 'refuse to move out-of-cone directory without --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	test_must_fail git mv folder1 sub 2>stderr &&
> +	cat sparse_error_header >expect &&
> +	echo folder1/file1 >>expect &&
> +	cat sparse_hint >>expect &&
> +	test_cmp expect stderr
> +'
> +
> +test_expect_success 'can move out-of-cone directory with --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	git mv --sparse folder1 sub 1>actual 2>stderr &&
> +	test_must_be_empty stderr &&
> +	echo "Please use \"git sparse-checkout reapply\" to reapply the sparsity."\
> +	>expect &&
> +	test_cmp actual expect &&
> +
> +	git sparse-checkout reapply &&
> +	test_path_is_dir sub/folder1 &&
> +	test_path_is_file sub/folder1/file1
> +'
> +
> +test_expect_success 'refuse to move out-of-cone file without --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	test_must_fail git mv folder1/file1 sub 2>stderr &&
> +	cat sparse_error_header >expect &&
> +	echo folder1/file1 >>expect &&
> +	cat sparse_hint >>expect &&
> +	test_cmp expect stderr
> +'
> +
> +test_expect_success 'can move out-of-cone file with --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
> +	test_must_be_empty stderr &&
> +	echo "Please use \"git sparse-checkout reapply\" to reapply the sparsity."\
> +	>expect &&
> +	test_cmp actual expect &&
> +
> +	git sparse-checkout reapply &&
> +	! test_path_is_dir sub/folder1 &&
> +	test_path_is_file sub/file1
> +'
> +
>  test_done

Other than my earlier comments about moving the tests to another point in
the series, the content of the tests looks great!

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 0/4] mv: fix out-of-cone file/directory move logic
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (4 preceding siblings ...)
  2022-03-31  9:28 ` [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
@ 2022-03-31 22:21 ` Victoria Dye
  2022-04-01 12:18   ` Shaoxuan Yuan
  2022-04-08 12:22 ` Shaoxuan Yuan
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 95+ messages in thread
From: Victoria Dye @ 2022-03-31 22:21 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: derrickstolee, gitster

Shaoxuan Yuan wrote:
> Before integrating 'mv' with sparse-index, I still find some possibly buggy
> UX when 'mv' is interacting with 'sparse-checkout'. 
> 
> So I kept sparse-index off in order to sort things out without a sparse index.
> We can proceed to integrate with sparse-index once these changes are solid.
> 
> Note that this patch is tentative, and still has known glitches, but it
> illustrates a general approach that I intended to harmonize 'mv' 
> with 'sparse-checkout'.
> 

Thanks for working out some ways to make 'mv' behave more nicely with sparse
checkouts! I did my best to address some of the specific implementation
questions you had in your commit messages. Beyond that, my main points of
feedback (beyond some formatting nits and implementation questions) are:

* Patch 2 deals with sparse directories, which won't show up until you
  enable sparse index; since you can't test that yet, you should save the
  patch for your "sparse index integration" series.
* Patch 4 should either be moved to the beginning of the series (with the
  tests flagged with 'test_expect_failure' until the patch that fixes the
  associated behavior), or split up with the tests associated with a change
  moved into the patch that makes that change.

And, as always, I'm happy to answer any questions and/or clarify weird
behavior you encounter while making changes to this (or subsequent) series!

> Shaoxuan Yuan (4):
>   mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
>   mv: add check_dir_in_index() and solve general dir check issue
>   mv: add advise_to_reapply hint for moving file into cone
>   t7002: add tests for moving out-of-cone file/directory
> 
>  builtin/mv.c                  | 76 ++++++++++++++++++++++++++++++++---
>  t/t7002-mv-sparse-checkout.sh | 72 +++++++++++++++++++++++++++++++++
>  2 files changed, 142 insertions(+), 6 deletions(-)
> 
> 
> base-commit: 805e0a68082a217f0112db9ee86a022227a9c81b


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue
  2022-03-31 10:25   ` Ævar Arnfjörð Bjarmason
@ 2022-04-01  3:51     ` Shaoxuan Yuan
  0 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-04-01  3:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, vdye, derrickstolee, gitster

On Thu, Mar 31, 2022 at 6:30 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Thu, Mar 31 2022, Shaoxuan Yuan wrote:
>
> > +static int check_dir_in_index(const char *dir)
> > +{
> > +     int ret = 0;
> > +     int length = sizeof(dir) + 1;
> > +     char *substr = malloc(length);
> > +
> > +     for (int i = 0; i < the_index.cache_nr; i++) {
>
> See https://lore.kernel.org/git/xmqqy20r3rv7.fsf@gitster.g/ for how
> we're not quite using this syntax yet.
>
> This should also be "unsigned int" to go with the "cache_nr" member.
>
> > +             memcpy(substr, the_index.cache[i]->name, length);
> > +             memset(substr + length - 1, 0, 1);
> > +
> > +             if (strcmp(dir, substr) == 0) {
>
> Style: don't compare against == 0, or == NULL, use !, see CodingGuidelines.
>
> > +                     else if (check_dir_in_index(src_w_slash) &&
> > +                     !path_in_sparse_checkout(src_w_slash, &the_index)) {
>
> Funny indentation, the ! should be aligned with "(".
>
> > -                             modes[i] = WORKING_DIRECTORY;
> > +                             if (!modes[i])
> > +                                     modes[i] = WORKING_DIRECTORY;
>
> This works, but assuming things about enum values (even if 0) always
> seems a bit nasty, can this be a comparison to BOTH instead of !? May or
> may not be better...
>
> But then again we do xcalloc() to allocate it, so we assume that
> already, nevermind... :)
>
> (there were also indentation issues below)

Thanks for the styling reminders! I should go back and reread CodingGuidelines
more often... Things just slipped off my mind since I'm still trying to remember
the guideline...

-- 
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone
  2022-03-31 10:30   ` Ævar Arnfjörð Bjarmason
@ 2022-04-01  4:00     ` Shaoxuan Yuan
  2022-04-01  8:02       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-04-01  4:00 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, vdye, derrickstolee, gitster, Tao Klerks

On Thu, Mar 31, 2022 at 6:31 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> More odd indentation, and the braces aren't needed.

Got me again :-( Will make a change.

> >       }
> >
> >       if (gitmodules_modified)
> > @@ -392,6 +398,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                              COMMIT_LOCK | SKIP_IF_UNCHANGED))
> >               die(_("Unable to write new index file"));
> >
> > +     if (advise_to_reapply)
> > +             printf(_("Please use \"git sparse-checkout reapply\" to reapply the sparsity.\n"));
> > +
>
> Please see 93026558512 (tracking branches: add advice to ambiguous
> refspec error, 2022-03-28) (the OID may change after I send this, as
> it's in "seen") for how to add new advice, i.e. we use advise(), add an
> enum field, config var etc.

I actually did use advise(), but I noticed that it prints to stderr and ...
never mind, I realized that printing to stderr is OK. But can I print to
stdout instead, since I think users should be "reminded" rather than "warned"?

Anyway, I think using advise() is probably better.

-- 
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone
  2022-04-01  4:00     ` Shaoxuan Yuan
@ 2022-04-01  8:02       ` Ævar Arnfjörð Bjarmason
  2022-04-03  2:01         ` Eric Sunshine
  0 siblings, 1 reply; 95+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-04-01  8:02 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, vdye, derrickstolee, gitster, Tao Klerks


On Fri, Apr 01 2022, Shaoxuan Yuan wrote:

> On Thu, Mar 31, 2022 at 6:31 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> More odd indentation, and the braces aren't needed.
>
> Got me again :-( Will make a change.
>
>> >       }
>> >
>> >       if (gitmodules_modified)
>> > @@ -392,6 +398,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>> >                              COMMIT_LOCK | SKIP_IF_UNCHANGED))
>> >               die(_("Unable to write new index file"));
>> >
>> > +     if (advise_to_reapply)
>> > +             printf(_("Please use \"git sparse-checkout reapply\" to reapply the sparsity.\n"));
>> > +
>>
>> Please see 93026558512 (tracking branches: add advice to ambiguous
>> refspec error, 2022-03-28) (the OID may change after I send this, as
>> it's in "seen") for how to add new advice, i.e. we use advise(), add an
>> enum field, config var etc.
>
> I actually did use advise(), but I noticed that it prints to stderr and ...
> never mind, I realized that printing to stderr is OK. But can I print to
> stdout instead, since I think users should be "reminded" rather than "warned"?
>
> Anyway, I think using advise() is probably better.

We've typically used stderr in git not to mean "error", but to
distinguish "chatty" and non-primary output from non-chatty.

So (leaving aside that we're unlikely to add advice to plumbing) if you
emitted a warning() or advice from git-ls-tree you should be able to run
something like:

    git ls-tree -r HEAD >output-for-a-script

And get the advise() on stderr, while the "primary" output is on stdout.

There's a recent-ish (last year or so) thread where I think Jeff King
explained this better than I'm doing here, but I couldn't find it with a
quick search.

In other words, you can just use advise() here, don't worry about it
writing to stderr.
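
The stdout/stderr split described above can be sketched in a few lines of standalone C. The function name and the hint wording are hypothetical, and this is not how git's advise() is implemented; it only shows primary output and "chatty" output going to separate streams, so that redirecting stdout captures the former alone.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the primary message to `out` and the hint
 * to `hint`. Returns 0 if both writes succeeded, -1 otherwise.
 */
int demo_report_move(FILE *out, FILE *hint, const char *src, const char *dst)
{
	assert(out != NULL && hint != NULL && src != NULL && dst != NULL);

	/* Primary output: what a script reading stdout would consume. */
	if (fprintf(out, "Renaming %s to %s\n", src, dst) < 0)
		return -1;

	/* Chatty output: kept on a separate stream from the primary output. */
	if (fprintf(hint, "hint: consider running \"git sparse-checkout reapply\"\n") < 0)
		return -1;

	return 0;
}
```

Called as `demo_report_move(stdout, stderr, src, dst)`, a shell redirection such as `>output-for-a-script` would capture only the "Renaming" line, while the hint would still reach the terminal.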



^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 0/4] mv: fix out-of-cone file/directory move logic
  2022-03-31 22:21 ` Victoria Dye
@ 2022-04-01 12:18   ` Shaoxuan Yuan
  0 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-04-01 12:18 UTC (permalink / raw)
  To: Victoria Dye; +Cc: git, derrickstolee, gitster

On Fri, Apr 1, 2022 at 6:21 AM Victoria Dye <vdye@github.com> wrote:
> Thanks for working out some ways to make 'mv' behave more nicely with sparse
> checkouts! I did my best to address some of the specific implementation
> questions you had in your commit messages. Beyond that, my main points of
> feedback (beyond some formatting nits and implementation questions) are:
>
> * Patch 2 deals with sparse directories, which won't show up until you
>   enable sparse index; since you can't test that yet, you should save the
>   patch for your "sparse index integration" series.
> * Patch 4 should either be moved to the beginning of the series (with the
>   tests flagged with 'test_expect_failure' until the patch that fixes the
>   associated behavior), or split up with the tests associated with a change
>   moved into the patch that makes that change.
>
> And, as always, I'm happy to answer any questions and/or clarify weird
> behavior you encounter while making changes to this (or subsequent) series!

Hi Victoria,

Thanks a lot for the detailed and informative feedback; it is incredibly
helpful in guiding me through this first attempt! I have read it all and am
preparing the corresponding fixes :-)

-- 
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue
  2022-03-31 21:28   ` Victoria Dye
@ 2022-04-01 12:49     ` Shaoxuan Yuan
  2022-04-01 14:49       ` Derrick Stolee
  0 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-04-01 12:49 UTC (permalink / raw)
  To: Victoria Dye; +Cc: git, derrickstolee, gitster

On Fri, Apr 1, 2022 at 5:28 AM Victoria Dye <vdye@github.com> wrote:
>
> Shaoxuan Yuan wrote:
> > Originally, moving a <source> directory which is not on-disk due
> > to its existence outside of sparse-checkout cone, "git mv" command
> > errors out with "bad source".
> >
> > Add a helper check_dir_in_index() function to see if a directory
> > name exists in the index. Also add a SPARSE_DIRECTORY bit to mark
> > such directories.
> >
>
> Hmm, I think this patch would fit better in your eventual "sparse index
> integration" series than this "prerequisite fixes to sparse-checkout"
> series. Sparse directories *only* appear when you're using a sparse index
> so, theoretically, this shouldn't ever come up (and thus isn't testable)
> until you're using a sparse index.

After reading your feedback, I realized that I totally misused the phrase
"sparse directory". Clearly, this patch series does not deal with sparse-index
yet, and a "sparse directory" is a cache entry that points to a tree, which
only exists if sparse-index is enabled. Silly me ;)

What I was *actually* trying to say is: I want to change the checking logic
for moving a "directory that exists outside of the sparse-checkout cone", and
I misused "sparse directory" to refer to such a thing.

> Since it's here, though, I'm happy to review what you have (even if you
> eventually move it to a later series)!

Thanks!

> > diff --git a/builtin/mv.c b/builtin/mv.c
> > index 32ad4d5682..9da9205e01 100644
> > --- a/builtin/mv.c
> > +++ b/builtin/mv.c
> > @@ -115,6 +115,25 @@ static int index_range_of_same_dir(const char *src, int length,
> >       return last - first;
> >  }
> >
> > +static int check_dir_in_index(const char *dir)
> > +{
>
> This function can be made a lot simpler - you can use `cache_name_pos()` to
> find the index entry - if it's found, the directory exists as a sparse
> directory. And, because 'add_slash()' enforces the trailing slash later on,
> you don't need to worry about adjusting the name before you look for the
> entry.

Yes, if I had used the phrase "sparse directory" correctly, but I did not...
I think I can use 'cache_name_pos()' to check a directory *iff* it is a
legit sparse directory when using sparse-index?

In my case, I just want to check a regular directory that is not in the
worktree because the cone pattern excludes it. In a non-sparse index, cache
entries point only to blobs, not trees, and that's the reason I wrote this
weird function to look into the index. I understand that this doesn't sound
compatible with how git manages the index, but all I want to know is "does
this directory exist in the index?" (this question is also only
quasi-correct).

I tried to find an existing API for this job, but I failed to find
any. Though I have a hunch
that there must be something to do it...
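
As a rough illustration of the question "does any indexed path live under this directory?", here is a standalone sketch over a plain sorted array of path strings. The function name and the array are hypothetical stand-ins, not git's index API. It assumes the paths are sorted in strcmp() order and that the directory name already ends in '/'. It measures the prefix with strlen(), since sizeof on a pointer yields the pointer's size rather than the string's length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for an index: `names` holds `nr` paths sorted in
 * strcmp() order. Returns 1 if any path starts with `dir` (which is
 * expected to end in '/'), and 0 otherwise.
 */
int demo_dir_in_sorted_names(const char *const *names, size_t nr,
			     const char *dir)
{
	size_t len, lo = 0, hi = nr;

	assert(names != NULL && dir != NULL);
	len = strlen(dir);

	/* Binary search for the first name that sorts at or after `dir`. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(names[mid], dir) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Paths under `dir` form one contiguous run starting at `lo`. */
	return lo < nr && !strncmp(names[lo], dir, len);
}
```

Because the array is sorted, the lookup avoids scanning every entry and needs no temporary buffer, so there is nothing to free on the early-return path.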

> > +     int ret = 0;
> > +     int length = sizeof(dir) + 1;
> > +     char *substr = malloc(length);
> > +
> > +     for (int i = 0; i < the_index.cache_nr; i++) {
> > +             memcpy(substr, the_index.cache[i]->name, length);
> > +             memset(substr + length - 1, 0, 1);
> > +
> > +             if (strcmp(dir, substr) == 0) {
> > +                     ret = 1;
> > +                     return ret;
> > +             }
> > +     }
> > +     free(substr);
> > +     return ret;
> > +}
> > +
> >  int cmd_mv(int argc, const char **argv, const char *prefix)
> >  {
> >       int i, flags, gitmodules_modified = 0;
> > @@ -129,7 +148,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >               OPT_END(),
> >       };
> >       const char **source, **destination, **dest_path, **submodule_gitfile;
> > -     enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
> > +     enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE,
> > +     SPARSE_DIRECTORY } *modes;
> >       struct stat st;
> >       struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
> >       struct lock_file lock_file = LOCK_INIT;
> > @@ -197,6 +217,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                        */
> >
> >                       int pos = cache_name_pos(src, length);
> > +                     const char *src_w_slash = add_slash(src);
> > +
> >                       if (pos >= 0) {
> >                               const struct cache_entry *ce = active_cache[pos];
> >
> > @@ -209,6 +231,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                               else
> >                                       bad = _("bad source");
> >                       }
> > +                     else if (check_dir_in_index(src_w_slash) &&
> > +                     !path_in_sparse_checkout(src_w_slash, &the_index)) {
> > +                             modes[i] = SPARSE_DIRECTORY;
> > +                             goto dir_check;
> > +                     }
>
> In if-statements like this, you'll want to line up the statements in
> parentheses on subsequent lines, like:
>
>         else if (check_dir_in_index(src_w_slash) &&
>                  !path_in_sparse_checkout(src_w_slash, &the_index)) {
>
> ...where the second line is indented 1 (8-space-sized) tab + 1 space.
>
> In general, if you're trying to align code (in this repository), align first
> with as many tabs as possible, then the "remainder" with spaces. Note that
> this isn't 100% consistent throughout the repository - older lines might not
> have been aligned properly - but you should go for this styling on any new
> lines that you add.

Will do.

>
> >                       /* only error if existence is expected. */
> >                       else if (modes[i] != SPARSE)
> >                               bad = _("bad source");
> > @@ -219,7 +246,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                               && lstat(dst, &st) == 0)
> >                       bad = _("cannot move directory over file");
> >               else if (src_is_dir) {
> > -                     int first = cache_name_pos(src, length), last;
> > +                     int first, last;
> > +dir_check:
> > +                     first = cache_name_pos(src, length);
> >
> >                       if (first >= 0)
> >                               prepare_move_submodule(src, first,
> > @@ -230,7 +259,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                       else { /* last - first >= 1 */
> >                               int j, dst_len, n;
> >
> > -                             modes[i] = WORKING_DIRECTORY;
> > +                             if (!modes[i])
> > +                                     modes[i] = WORKING_DIRECTORY;
> >                               n = argc + last - first;
> >                               REALLOC_ARRAY(source, n);
> >                               REALLOC_ARRAY(destination, n);
> > @@ -332,7 +362,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                       printf(_("Renaming %s to %s\n"), src, dst);
> >               if (show_only)
> >                       continue;
> > -             if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
> > +             if (mode != INDEX && mode != SPARSE && mode != SPARSE_DIRECTORY &&
> > +              rename(src, dst) < 0) {
> >                       if (ignore_errors)
> >                               continue;
> >                       die_errno(_("renaming '%s' failed"), src);
> > @@ -346,7 +377,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                                                             1);
> >               }
> >
> > -             if (mode == WORKING_DIRECTORY)
> > +             if (mode == WORKING_DIRECTORY || mode == SPARSE_DIRECTORY)
>
> I'm a bit confused - doesn't this mean the sparse dir move will be skipped?
> In your commit description, you mention that this 'mv' succeeds with the
> '--sparse' option, but I don't see any place where the sparse directory
> would be moved.

Well, you know the drill: I did not use "sparse directory" correctly, let
alone the 'SPARSE_DIRECTORY' enum value in this hunk. I think it makes some
sense if you apply my actual meaning of 'SPARSE_DIRECTORY' here (it should be
named something like OUT_OF_CONE_WORKING_DIRECTORY)? Because such a directory
is not on disk, it cannot be "rename()"d, and it should also skip the
"rename_cache_entry_at()" call. If all the files under the directory are
moved/renamed, then (in my opinion) the directory itself has been moved to the
destination, both in the worktree and in the index.

-- 
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 1/4] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-03-31 16:39   ` Victoria Dye
@ 2022-04-01 14:30     ` Derrick Stolee
  0 siblings, 0 replies; 95+ messages in thread
From: Derrick Stolee @ 2022-04-01 14:30 UTC (permalink / raw)
  To: Victoria Dye, Shaoxuan Yuan, git; +Cc: gitster

On 3/31/2022 12:39 PM, Victoria Dye wrote:
> Shaoxuan Yuan wrote:

>>  		if (lstat(src, &st) < 0) {
>> +			/*
>> +			 * TODO: for now, when you try to overwrite a <destination>
>> +			 * with your <source> as a sparse file, if you supply a "--sparse"
>> +			 * flag, then the action will be done without providing "--force"
>> +			 * and no warning.
>> +			 *
>> +			 * This is mainly because the sparse <source>
>> +			 * is not on-disk, and this if-else chain will be cut off early in
>> +			 * this check, thus the "--force" check is ignored. Need fix.
>> +			 */
>> +
> 
> I can clarify this a bit. 'mv' is done in two steps: first the file-on-disk
> rename (in the call to 'rename()'), then the index entry (in
> 'rename_cache_entry_at()'). In the case of a sparse file, you're only
> dealing with the latter. However, 'rename_cache_entry_at()' moves the index
> entry with the flag 'ADD_CACHE_OK_TO_REPLACE', since it leaves it up to
> 'cmd_mv()' to enforce the "no overwrite" rule. 
> 
> So, in the case of moving *to* a SKIP_WORKTREE entry (where a file being
> present won't trigger the failure), you'll want to check that the
> destination *index entry* doesn't exist in addition to the 'lstat()' check.
> It might require some rearranging of if-statements in this block, but I
> think it can be done in 'cmd_mv'. 

This also explains the issue when going from sparse to non-sparse: the
file move is the expected way to populate the end-result, but we skip that
part in the sparse case. We need to do an extra step to populate the file
from the version in the index (after moving the cache entry).
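
To make the destination check above concrete, here is a minimal sketch.
It is a toy model, not code from builtin/mv.c: toy_check_destination()
and its three integer arguments are hypothetical stand-ins for the
lstat() result, an index lookup of <destination>, and the --force flag.

```c
#include <stddef.h>

/*
 * Toy model (hypothetical helper, not part of git): decide whether a
 * move must be refused because <destination> already exists.
 *
 * dst_on_disk:  nonzero if lstat(dst) would succeed
 * dst_in_index: nonzero if dst has an index entry (it may be
 *               SKIP_WORKTREE and therefore missing from disk)
 * force:        nonzero if --force was given
 *
 * Returns NULL if the move may proceed, or an error string otherwise.
 */
static const char *toy_check_destination(int dst_on_disk, int dst_in_index,
					 int force)
{
	/* Nothing to overwrite, neither on disk nor in the index. */
	if (!dst_on_disk && !dst_in_index)
		return NULL;

	/* Something would be overwritten: only allowed with --force. */
	return force ? NULL : "destination exists";
}
```

The point of the sketch is only that the on-disk check and the index
check are OR-ed together before "--force" is consulted, so a sparse
destination no longer slips past the "no overwrite" rule.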

Related to this chain of if/else if/else blocks, it might be worth
refactoring them to be sequential "if ()" blocks where we jump to a
"cleanup:" label via a 'goto' if we know that we are in a failure mode.

The previous organization made sense because any of the if () or else if
() conditions were a failure mode. However, it might be better to
rearrange things to be clearer about the situation.

Here is a diff from what I was playing with. It's... unclear if this is a
better arrangement, but I thought it worth discussing.

--- >8 ---

diff --git a/builtin/mv.c b/builtin/mv.c
index 83a465ba831..683a412a3fc 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -186,15 +186,22 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
 			/* only error if existence is expected. */
-			if (modes[i] != SPARSE)
+			if (modes[i] != SPARSE) {
 				bad = _("bad source");
-		} else if (!strncmp(src, dst, length) &&
+				goto checked_move;
+			}
+		}
+		if (!strncmp(src, dst, length) &&
 				(dst[length] == 0 || dst[length] == '/')) {
 			bad = _("can not move directory into itself");
-		} else if ((src_is_dir = S_ISDIR(st.st_mode))
-				&& lstat(dst, &st) == 0)
+			goto checked_move;
+		}
+		if ((src_is_dir = S_ISDIR(st.st_mode))
+				&& lstat(dst, &st) == 0) {
 			bad = _("cannot move directory over file");
-		else if (src_is_dir) {
+			goto checked_move;
+		}
+		if (src_is_dir) {
 			int first = cache_name_pos(src, length), last;
 
 			if (first >= 0)
@@ -227,11 +234,18 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				}
 				argc += last - first;
 			}
-		} else if (!(ce = cache_file_exists(src, length, 0))) {
+
+			goto checked_move;
+		}
+		if (!(ce = cache_file_exists(src, length, 0))) {
 			bad = _("not under version control");
-		} else if (ce_stage(ce)) {
+			goto checked_move;
+		}
+		if (ce_stage(ce)) {
 			bad = _("conflicted");
-		} else if (lstat(dst, &st) == 0 &&
+			goto checked_move;
+		}
+		if (lstat(dst, &st) == 0 &&
 			 (!ignore_case || strcasecmp(src, dst))) {
 			bad = _("destination exists");
 			if (force) {
@@ -246,34 +260,40 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				} else
 					bad = _("Cannot overwrite");
 			}
-		} else if (string_list_has_string(&src_for_dst, dst))
+			goto checked_move;
+		}
+		if (string_list_has_string(&src_for_dst, dst)) {
 			bad = _("multiple sources for the same target");
-		else if (is_dir_sep(dst[strlen(dst) - 1]))
+			goto checked_move;
+		}
+		if (is_dir_sep(dst[strlen(dst) - 1])) {
 			bad = _("destination directory does not exist");
-		else {
-			/*
-			 * We check if the paths are in the sparse-checkout
-			 * definition as a very final check, since that
-			 * allows us to point the user to the --sparse
-			 * option as a way to have a successful run.
-			 */
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(src, &the_index)) {
-				string_list_append(&only_match_skip_worktree, src);
-				skip_sparse = 1;
-			}
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(dst, &the_index)) {
-				string_list_append(&only_match_skip_worktree, dst);
-				skip_sparse = 1;
-			}
-
-			if (skip_sparse)
-				goto remove_entry;
+			goto checked_move;
+		}
 
-			string_list_insert(&src_for_dst, dst);
+		/*
+		 * We check if the paths are in the sparse-checkout
+		 * definition as a very final check, since that
+		 * allows us to point the user to the --sparse
+		 * option as a way to have a successful run.
+		 */
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(src, &the_index)) {
+			string_list_append(&only_match_skip_worktree, src);
+			skip_sparse = 1;
+		}
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(dst, &the_index)) {
+			string_list_append(&only_match_skip_worktree, dst);
+			skip_sparse = 1;
 		}
 
+		if (skip_sparse)
+			goto remove_entry;
+
+		string_list_insert(&src_for_dst, dst);
+
+checked_move:
 		if (!bad)
 			continue;
 		if (!ignore_errors)

--- >8 ---

>> +			}
>>  			/* only error if existence is expected. */
>> -			if (modes[i] != SPARSE)
>> +			else if (modes[i] != SPARSE)
>>  				bad = _("bad source");
>>  		} else if (!strncmp(src, dst, length) &&
>>  				(dst[length] == 0 || dst[length] == '/')) {
> 
> For a change like this, it would be really helpful to include the tests
> showing how sparse file moves should now be treated in this commit. I see
> that you've added some in patch 4 - could you move the ones related to this
> change into this commit?

I completely agree: it's nice to see how behavior is intended to change
next to your code change.

> Another way you could do this is to put your "add tests" commit first in
> this series, changing the condition on the ones that are fixed later in the
> series to "test_expect_failure". Then, in each commit that "fixes" a test's
> behavior, change that test to "test_expect_success". This approach had the
> added benefit of showing that, before this series, the tests would fail and
> that this series explicitly fixes those scenarios.

And it would be easy to adapt your current patch structure to this model:
move the last commit to be first, but flip the expectation. Then modify the
expectation for the tests that pass as you go.

This only works as long as you can make an entire test pass with each change.
If multiple changes are needed to make any one test pass, then we don't get
the benefit we're looking for. In that case, your test might be covering too
much behavior in a single test, so it would be worth rewriting the tests to
check a smaller part of the behavior.

Thanks,
-Stolee

^ permalink raw reply related	[flat|nested] 95+ messages in thread

* Re: [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue
  2022-04-01 12:49     ` Shaoxuan Yuan
@ 2022-04-01 14:49       ` Derrick Stolee
  2022-04-04  7:25         ` Shaoxuan Yuan
  0 siblings, 1 reply; 95+ messages in thread
From: Derrick Stolee @ 2022-04-01 14:49 UTC (permalink / raw)
  To: Shaoxuan Yuan, Victoria Dye; +Cc: git, gitster

On 4/1/2022 8:49 AM, Shaoxuan Yuan wrote:
> On Fri, Apr 1, 2022 at 5:28 AM Victoria Dye <vdye@github.com> wrote:
>>
>> Shaoxuan Yuan wrote:
>>> Originally, moving a <source> directory which is not on-disk due
>>> to its existence outside of sparse-checkout cone, "git mv" command
>>> errors out with "bad source".
>>>
>>> Add a helper check_dir_in_index() function to see if a directory
>>> name exists in the index. Also add a SPARSE_DIRECTORY bit to mark
>>> such directories.
>>>
>>
>> Hmm, I think this patch would fit better in your eventual "sparse index
>> integration" series than this "prerequisite fixes to sparse-checkout"
>> series. Sparse directories *only* appear when you're using a sparse index
>> so, theoretically, this shouldn't ever come up (and thus isn't testable)
>> until you're using a sparse index.
> 
> After reading your feedback, I realized that I totally misused
> the phrase "sparse directory". Clearly, this patch series does not
> deal with sparse-
> index yet, as "sparse directory" is a cache entry that points to a
> tree, if sparse-index
> is enabled. Silly me ;)
> 
> What I was *actually* trying to say is: I want to change the checking
> logic of moving
> a "directory that exists outside of sparse-checkout cone", and I
> apparently misused
> "sparse directory" to reference such a thing.

In the case of a full index (or an expanded sparse index, which is
currently always the case for `git mv`), you detect a sparse directory
by looking for the directory in the index, _not_ finding it, and then
seeing if the cache entry at the position where the directory _would_
exist is marked with the SKIP_WORKTREE bit.

This works in cone mode and the old mode because I assume you've already
checked for the existence of the directory, so if there _was_ any
non-SKIP_WORKTREE cache entry within the directory, then the directory
would exist in the worktree.

(These are good details to include in the commit message.)

>>> +static int check_dir_in_index(const char *dir)
>>> +{
>>
>> This function can be made a lot simpler - you can use `cache_name_pos()` to
>> find the index entry - if it's found, the directory exists as a sparse
>> directory. And, because 'add_slash()' enforces the trailing slash later on,
>> you don't need to worry about adjusting the name before you look for the
>> entry.
> 
> Yes, if I correctly used the phrase "sparse directory", but I did not...
> I think I can use 'cache_name_pos()' to
> check a directory *iff* it is a legit sparse directory when using sparse-index?
> 
> In my case, I just want to check a regular directory that is not in
> the worktree,
> since the cone pattern excludes it. And in a non-sparse index, cache
> entry points only
> to blobs, not trees, and that's the reason I wrote this weird function
> to look into the
> index. I understand that sounds not compatible with how git manages
> index, but all
> I want to know is "does this directory exist in the index?" (this
> question is also quasi-correct).
> 
> I tried to find an existing API for this job, but I failed to find
> any. Though I have a hunch
> that there must be something to do it...

You can still use cache_name_pos() and if the resulting value is negative,
then you can "flip it" (pos = -1 - pos) to get the array index where the
directory _would_ be inserted.

For example, here is a case in unpack-trees.c (that uses the synonym
index_name_pos()):

static int locate_in_src_index(const struct cache_entry *ce,
			       struct unpack_trees_options *o)
{
	struct index_state *index = o->src_index;
	int len = ce_namelen(ce);
	int pos = index_name_pos(index, ce->name, len);
	if (pos < 0)
		pos = -1 - pos;
	return pos;
}

This uses a binary search inside the method, so it will be much faster
than the loop you wrote here.

If you have this helper, then you can integrate with the sparse index later
by checking for a sparse directory entry when pos is non-negative. But that
can wait for the next series.
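
As a self-contained illustration of the "flip" above, here is a toy
version of the lookup. toy_name_pos() and toy_insert_pos() are
hypothetical names, and the sorted array of strings only stands in for
the index; the real index_name_pos() works on cache entries and compares
names by length as well.

```c
#include <string.h>

/*
 * Toy stand-in for index_name_pos(): binary-search a sorted array of
 * path names. Returns the position if the name is found, otherwise
 * -1 - <position where the name would be inserted>.
 */
static int toy_name_pos(const char **names, int nr, const char *name)
{
	int first = 0, last = nr;

	while (first < last) {
		int mid = first + (last - first) / 2;
		int cmp = strcmp(name, names[mid]);

		if (!cmp)
			return mid;
		if (cmp < 0)
			last = mid;
		else
			first = mid + 1;
	}
	return -1 - first;
}

/* Map a "not found" result back to the would-be insertion position. */
static int toy_insert_pos(int pos)
{
	return pos < 0 ? -1 - pos : pos;
}
```

With the names "a", "b", "d/file1", "e/file1", looking up "d/" is not an
exact match, but the flipped position points at "d/file1", which is
where the entries under that directory start.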

>>>                       /* only error if existence is expected. */
>>>                       else if (modes[i] != SPARSE)
>>>                               bad = _("bad source");
>>> @@ -219,7 +246,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>>                               && lstat(dst, &st) == 0)
>>>                       bad = _("cannot move directory over file");
>>>               else if (src_is_dir) {
>>> -                     int first = cache_name_pos(src, length), last;
>>> +                     int first, last;
>>> +dir_check:
>>> +                     first = cache_name_pos(src, length);
>>>
>>>                       if (first >= 0)
>>>                               prepare_move_submodule(src, first,
>>> @@ -230,7 +259,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>>                       else { /* last - first >= 1 */
>>>                               int j, dst_len, n;
>>>
>>> -                             modes[i] = WORKING_DIRECTORY;
>>> +                             if (!modes[i])
>>> +                                     modes[i] = WORKING_DIRECTORY;

It is curious that we could get here with an existing mode. I wonder if
it would be worthwhile to turn the enum into "flags" (each state
is a different bit in the word) so instead of

	modes[i] = WORKING_DIRECTORY;

we would write

	modes[i] |= WORKING_DIRECTORY;

>>>                               n = argc + last - first;
>>>                               REALLOC_ARRAY(source, n);
>>>                               REALLOC_ARRAY(destination, n);
>>> @@ -332,7 +362,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>>                       printf(_("Renaming %s to %s\n"), src, dst);
>>>               if (show_only)
>>>                       continue;
>>> -             if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
>>> +             if (mode != INDEX && mode != SPARSE && mode != SPARSE_DIRECTORY &&

And here we would write something like

	if (!(mode & (INDEX | SPARSE | SPARSE_DIRECTORY)) &&

>>> +              rename(src, dst) < 0) {
>>>                       if (ignore_errors)
>>>                               continue;
>>>                       die_errno(_("renaming '%s' failed"), src);
>>> @@ -346,7 +377,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>>                                                             1);
>>>               }
>>>
>>> -             if (mode == WORKING_DIRECTORY)
>>> +             if (mode == WORKING_DIRECTORY || mode == SPARSE_DIRECTORY)

and here:

	if (mode & (WORKING_DIRECTORY | SPARSE_DIRECTORY))

This requires changing your enum definition. It got lost in the previous
quoting, it seems, but here it is again:

>>> @@ -129,7 +148,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>>  		OPT_END(),
>>>  	};
>>>  	const char **source, **destination, **dest_path, **submodule_gitfile;
>>> -	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
>>> +	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE,
>>> +	SPARSE_DIRECTORY } *modes;
>>>  	struct stat st;
>>>  	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
>>>  	struct lock_file lock_file = LOCK_INIT;

I think it is time to split out "enum update_mode" to the top of the file
instead of defining it inline here. Here is what it could look like:

enum update_mode {
	BOTH = 0,
	WORKING_DIRECTORY = (1 << 1),
	INDEX = (1 << 2),
	SPARSE = (1 << 3),
};

(This is how it would look before adding your new value.)

I can imagine making a new commit that does the following:

* Move update_mode to the top and set the values to be independent bits.
* Change "modes[i] =" to "modes[i] |="
* Change "mode ==" checks to "mode &" checks.

Think about it.
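
To show how the checks would read once the states are independent bits,
here is a small standalone sketch. The TOY_ names and the two helper
functions are made up for illustration, and TOY_SKIP_WORKTREE_DIR is
only a placeholder for whatever the new state ends up being called.

```c
/* Toy flags enum (hypothetical names), one bit per state. */
enum toy_update_mode {
	TOY_BOTH = 0,
	TOY_WORKING_DIRECTORY = (1 << 1),
	TOY_INDEX = (1 << 2),
	TOY_SPARSE = (1 << 3),
	TOY_SKIP_WORKTREE_DIR = (1 << 4),
};

/* The on-disk rename() is skipped if any index-only or sparse bit is set. */
static int toy_needs_disk_rename(unsigned int mode)
{
	return !(mode & (TOY_INDEX | TOY_SPARSE | TOY_SKIP_WORKTREE_DIR));
}

/* Directory moves have no index entry of their own to rename. */
static int toy_is_directory_move(unsigned int mode)
{
	return !!(mode & (TOY_WORKING_DIRECTORY | TOY_SKIP_WORKTREE_DIR));
}
```

One consequence worth noticing: with "|=", a single entry can carry
several states at once, so a check like "mode == TOY_WORKING_DIRECTORY"
would silently stop matching, which is why the "==" checks have to
become "&" checks in the same commit.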

>> I'm a bit confused - doesn't this mean the sparse dir move will be skipped?
>> In your commit description, you mention that this 'mv' succeeds with the
>> '--sparse' option, but I don't see any place where the sparse directory
>> would be moved.
> 
> Well, you know the drill, I did not use "sparse directory" correctly, let alone
> 'SPARSE_DIRECTORY' enum bit in this hunk. I think it makes some sense
> if you apply my actual meaning of 'SPARSE_DIRECTORY' here (it should be
> something like OUT_OF_CONE_WORKING_DIRECTORY)? Because such
> directory is not on disk, it cannot be "rename()"d, and should also skip the
> "rename_cache_entry_at()" function. If all the files under the directory are
> moved/renamed, then (in my opinion) the directory is moved to the
> destination, both in the worktree and in the index.

Perhaps a better name would be SKIP_WORKTREE_DIR.

But yes, we need to make sure that all cache entries under the directory
have their SKIP_WORKTREE bits re-checked and any that lose the bit need to
be written to the worktree.

I wonder if it is as simple as marking a boolean that says "I moved at
least one sparse entry" and then calling update_sparsity() at the end of
cmd_mv() if that boolean is true.
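
Roughly, the shape I have in mind is below. This is a toy model only:
struct toy_entry, the prefix-based toy_path_in_cone() and the two
functions are hypothetical, and the real update_sparsity() in
unpack-trees.c does far more (pattern matching, writing and removing
worktree files, handling dirty files).

```c
#include <string.h>

/* Toy cache entry: a path plus its SKIP_WORKTREE state. */
struct toy_entry {
	const char *name;
	int skip_worktree;
};

/* Toy cone check: a path is "in cone" if it starts with cone_dir. */
static int toy_path_in_cone(const char *path, const char *cone_dir)
{
	return !strncmp(path, cone_dir, strlen(cone_dir));
}

/*
 * Re-check every entry against the cone. Entries that are in the cone
 * lose their SKIP_WORKTREE bit; the return value counts how many files
 * would have to be written to the worktree as a result.
 */
static int toy_reapply_sparsity(struct toy_entry *entries, int nr,
				const char *cone_dir)
{
	int i, to_write = 0;

	for (i = 0; i < nr; i++) {
		if (entries[i].skip_worktree &&
		    toy_path_in_cone(entries[i].name, cone_dir)) {
			entries[i].skip_worktree = 0;
			to_write++;
		}
	}
	return to_write;
}

/* Only pay for the re-check if the move touched a sparse entry. */
static int toy_finish_move(struct toy_entry *entries, int nr,
			   const char *cone_dir, int moved_sparse_entry)
{
	if (!moved_sparse_entry)
		return 0;
	return toy_reapply_sparsity(entries, nr, cone_dir);
}
```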

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone
  2022-03-31  9:17 ` [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone Shaoxuan Yuan
  2022-03-31 10:30   ` Ævar Arnfjörð Bjarmason
  2022-03-31 21:56   ` Victoria Dye
@ 2022-04-01 14:55   ` Derrick Stolee
  2 siblings, 0 replies; 95+ messages in thread
From: Derrick Stolee @ 2022-04-01 14:55 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: vdye, gitster

On 3/31/2022 5:17 AM, Shaoxuan Yuan wrote:
> Originally, the SKIP_WORKTREE bit is not removed when moving an out-of-cone
> file into sparse cone, thus the moved file does not show up in the working tree.
> Hint the user to use "git sparse-checkout reapply" to reapply the sparsity rules
> to the working tree, by which the SKIP_WORKTREE bit is removed.
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
> I offered this solution because I'm not sure how to turn a cache_entry's
> ce_flags back to a non-sparse state. I tried directly setting it to 0 like this:
> 
> 	ce->ce_flags = 0;
> 
> But the behavior after this seems undefined. The file still won't show up
> in the working tree.
> 
> And I found that "git sparse-checkout reapply" seems to be a nice fit for the
> job. But I guess if there is a way (there must be but I don't know) to do it
> directly in the code, that could also be nice.

I point out lower down the API that the 'reapply' command uses to gain
this functionality.

>  	if (gitmodules_modified)
> @@ -392,6 +398,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			       COMMIT_LOCK | SKIP_IF_UNCHANGED))
>  		die(_("Unable to write new index file"));
>  
> +	if (advise_to_reapply)
> +		printf(_("Please use \"git sparse-checkout reapply\" to reapply the sparsity.\n"));
> +

I think we can skip the advice here and instead run update_sparsity()
from unpack-trees.c (see update_working_directory() in builtin/sparse-checkout.c
for an example using it). Something along those lines would double-check the
new cache entries against the sparse-checkout patterns and update the worktree
to match expectation.

I originally thought we wouldn't want to run update_sparsity() unless we
interacted with a cache entry with SKIP_WORKTREE, but this could be important
for moving things from inside the cone to outside. In addition, in non-cone
mode there is no way to predict how the new cache entry could match the
patterns once its name changes.

Hopefully this gives you something to try, especially because it should be
simple to check that a file exists or not after the 'git mv' call, making
testing relatively easy.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone
  2022-04-01  8:02       ` Ævar Arnfjörð Bjarmason
@ 2022-04-03  2:01         ` Eric Sunshine
  0 siblings, 0 replies; 95+ messages in thread
From: Eric Sunshine @ 2022-04-03  2:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Shaoxuan Yuan, Git List, Victoria Dye, Derrick Stolee,
	Junio C Hamano, Tao Klerks

On Sat, Apr 2, 2022 at 11:15 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Fri, Apr 01 2022, Shaoxuan Yuan wrote:
> > I actually did use advise(), but I noticed that it prints to stderr
> > and ... nevermind,
>
> We've typically used stderr in git not to mean "error", but to
> distinguish "chatty" and non-primary output from non-chatty.
>
> So (leaving aside that we're unlikely to add advice to plumbing) if you
> emitted a warning() or advice from git-ls-tree you should be able to run
> something like:
>
>     git ls-tree -r HEAD >output-for-a-script
>
> And get the advise() on stderr, while the "primary" output is on stdout.
>
> There's a recent-ish (last year or so) thread where I think Jeff King
> explained this better than I'm doing here, but I couldn't find it with a
> quick search.

You're probably thinking of [1] in which Peff referenced an earlier
email[2] in which he stated his thoughts on the subject. The result of
[1] was that git-workree was changed[3] to use stderr rather than
stdout for its chatty messages, and (significant to this discussion)
CodingGuidelines was updated[4] to cover this topic, so we can now
refer people to CodingGuidelines.

[1]: https://lore.kernel.org/git/YaaN0pibKWgjcVk3@coredump.intra.peff.net/
[2]: https://lore.kernel.org/git/20110907215716.GJ13364@sigill.intra.peff.net/
[3]: https://lore.kernel.org/git/20211203034420.47447-2-sunshine@sunshineco.com/
[4]: https://lore.kernel.org/git/20211202223110.22062-1-sunshine@sunshineco.com/

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue
  2022-04-01 14:49       ` Derrick Stolee
@ 2022-04-04  7:25         ` Shaoxuan Yuan
  2022-04-04  7:49           ` Shaoxuan Yuan
  0 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-04-04  7:25 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Victoria Dye, git, gitster

On Fri, Apr 1, 2022 at 10:49 PM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 4/1/2022 8:49 AM, Shaoxuan Yuan wrote:
> > On Fri, Apr 1, 2022 at 5:28 AM Victoria Dye <vdye@github.com> wrote:
> >>
> >> Shaoxuan Yuan wrote:
> >>> Originally, moving a <source> directory which is not on-disk due
> >>> to its existence outside of sparse-checkout cone, "git mv" command
> >>> errors out with "bad source".
> >>>
> >>> Add a helper check_dir_in_index() function to see if a directory
> >>> name exists in the index. Also add a SPARSE_DIRECTORY bit to mark
> >>> such directories.
> >>>
> >>
> >> Hmm, I think this patch would fit better in your eventual "sparse index
> >> integration" series than this "prerequisite fixes to sparse-checkout"
> >> series. Sparse directories *only* appear when you're using a sparse index
> >> so, theoretically, this shouldn't ever come up (and thus isn't testable)
> >> until you're using a sparse index.
> >
> > After reading your feedback, I realized that I totally misused
> > the phrase "sparse directory". Clearly, this patch series does not
> > deal with sparse-
> > index yet, as "sparse directory" is a cache entry that points to a
> > tree, if sparse-index
> > is enabled. Silly me ;)
> >
> > What I was *actually* trying to say is: I want to change the checking
> > logic of moving
> > a "directory that exists outside of sparse-checkout cone", and I
> > apparently misused
> > "sparse directory" to reference such a thing.
>
> In the case of a full index (or an expanded sparse index, which is
> currently always the case for `git mv`), you detect a sparse directory
> by looking for the directory in the index, _not_ finding it, and then
> seeing if the cache entry at the position where the directory _would_
> exist is marked with the SKIP_WORKTREE bit.
>
> This works in cone mode and the old mode because I assume you've already
> checked for the existence of the directory, so if there _was_ any
> non-SKIP_WORKTREE cache entry within the directory, then the directory
> would exist in the worktree.
>
> (These are good details to include in the commit message.)

I read and thought about this part a few times, but I'm still confused.

As Victoria pointed out earlier, and I quote, "Sparse directories *only* appear
when you're using a sparse index, so, theoretically, this shouldn't ever
come up (and thus isn't testable) until you're using a sparse index."
So I'm not sure what you mean by putting "full index" and "sparse
directory" together.

Thus, I went ahead and tried to detect a directory that is outside of
the sparse-checkout cone, without sparse-index enabled.

I found a problem with using cache_name_pos() for this detection.
Consider the following example (I'm trying to imitate the output of
"git ls-files -t"):

H a
H b
S d/file1
H e/file1

So in this index, I use cache_name_pos() to find a directory "c/". I imagine the
value returned would be -3, which indicates this directory would be inserted
at index position 2. However, the cache entry at position 2 is "d/file1", which
is marked with SKIP_WORKTREE, and this fact cannot guarantee that "c/" is
a sparse directory, since "c/" is not in the index per se.

Probably I'm missing something, or I'm just dumb.

> Thanks,
> -Stolee

-- 
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue
  2022-04-04  7:25         ` Shaoxuan Yuan
@ 2022-04-04  7:49           ` Shaoxuan Yuan
  2022-04-04 12:43             ` Derrick Stolee
  0 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-04-04  7:49 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Victoria Dye, git, gitster

On Mon, Apr 4, 2022 at 3:25 PM Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> wrote:
> I read and think about this part a few times, but I'm still confused.
>
> As Victoria pointed out earlier, and I quote, "Sparse directories *only* appear
> when you're using a sparse index, so, theoretically, this shouldn't ever
> come up (and thus isn't testable) until you're using a sparse index."
> So I'm not so sure what do you mean by putting "full index" and "sparse
> directory" together.
>
> Thus, I go ahead and try to detect a directory that is outside of
> sparse-checkout cone, without sparse-index enabled.
>
> I found a problem that if you use cache_name_pos() to do this
> detection, I imagined the following example (I'm trying to imitate an
> output of "git ls-files -t"):
>
> H a
> H b
> S d/file1
> H e/file1
>
> So in this index, I use cache_name_pos() to find a directory "c/". I imagine the
> value returned would be -3, which indicates this directory would be inserted
> at index position 2. However, the cache entry at position 2 is "d/file1", which
> is marked with SKIP_WORKTREE, and this fact cannot guarantee that "c/" is
> a sparse directory, since "c/" is not in the index per se.
>
> Probably I'm missing something, or I'm just dumb.

Though I think doing a strncmp() after the cache_name_pos()
can get the job done :)

-- 
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue
  2022-04-04  7:49           ` Shaoxuan Yuan
@ 2022-04-04 12:43             ` Derrick Stolee
  0 siblings, 0 replies; 95+ messages in thread
From: Derrick Stolee @ 2022-04-04 12:43 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: Victoria Dye, git, gitster

On 4/4/2022 3:49 AM, Shaoxuan Yuan wrote:
> On Mon, Apr 4, 2022 at 3:25 PM Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> wrote:
>> I read and think about this part a few times, but I'm still confused.
>>
>> As Victoria pointed out earlier, and I quote, "Sparse directories *only* appear
>> when you're using a sparse index, so, theoretically, this shouldn't ever
>> come up (and thus isn't testable) until you're using a sparse index."
>> So I'm not so sure what do you mean by putting "full index" and "sparse
>> directory" together.
>>
>> Thus, I go ahead and try to detect a directory that is outside of
>> sparse-checkout cone, without sparse-index enabled.
>>
>> I found a problem that if you use cache_name_pos() to do this
>> detection, I imagined the following example (I'm trying to imitate an
>> output of "git ls-files -t"):
>>
>> H a
>> H b
>> S d/file1
>> H e/file1
>>
>> So in this index, I use cache_name_pos() to find a directory "c/". I imagine the
>> value returned would be -3, which indicates this directory would be inserted
>> at index position 2. However, the cache entry at position 2 is "d/file1", which
>> is marked with SKIP_WORKTREE, and this fact cannot guarantee that "c/" is
>> a sparse directory, since "c/" is not in the index per se.

I was thinking more about the case where you find a tracked directory that
has all contained files marked with SKIP_WORKTREE.

If you search for "d/" you will also get -3. Then, you will see that at
position 2 the cache entry d/file1 has SKIP_WORKTREE. The directory "d/"
should exist on disk if there are any entries starting with "d/" that do
not have the SKIP_WORKTREE bit.

Interesting things happen if you are in the scenario where d/f is in the
cone-mode sparse-checkout definition and you see this list of cache entries:

S d/a
H d/f/b
S d/g

Again, d/f/b should imply the existence of the directory d/ in the worktree.

>> Probably I'm missing something, or I'm just dumb.
> 
> Though I think doing a strncmp() after the cache_name_pos()
> can get the job done :)
 
I think this is the way to do it, including index_range_of_same_dir() in
builtin/mv.c.
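
Putting the pieces together, a minimal standalone sketch of "look up the
position, then compare prefixes" might look like this. Everything here
is a toy: struct toy_entry, toy_entry_pos() and
toy_dir_is_skip_worktree() are hypothetical names, the sorted array
stands in for the index, and "dir" is assumed to end with a slash.

```c
#include <string.h>

/* Toy cache entry: a path plus its SKIP_WORKTREE state. */
struct toy_entry {
	const char *name;
	int skip_worktree;
};

/* Binary search; returns the position or -1 - <insertion position>. */
static int toy_entry_pos(const struct toy_entry *entries, int nr,
			 const char *name)
{
	int first = 0, last = nr;

	while (first < last) {
		int mid = first + (last - first) / 2;
		int cmp = strcmp(name, entries[mid].name);

		if (!cmp)
			return mid;
		if (cmp < 0)
			last = mid;
		else
			first = mid + 1;
	}
	return -1 - first;
}

/*
 * Returns 1 if at least one entry lives under "dir" (which must end
 * with '/') and every such entry is SKIP_WORKTREE; 0 otherwise.
 */
static int toy_dir_is_skip_worktree(const struct toy_entry *entries, int nr,
				    const char *dir)
{
	size_t len = strlen(dir);
	int pos = toy_entry_pos(entries, nr, dir);
	int i;

	if (pos < 0)
		pos = -1 - pos;

	/* The strncmp() rules out the "c/" case: nothing starts with dir. */
	if (pos >= nr || strncmp(entries[pos].name, dir, len))
		return 0;

	/* Walk the whole range so that a case like d/f/b is not missed. */
	for (i = pos; i < nr && !strncmp(entries[i].name, dir, len); i++)
		if (!entries[i].skip_worktree)
			return 0;
	return 1;
}
```

For the first listing above this returns 0 for "c/" and 1 for "d/"; for
the second listing it returns 0 for "d/" because of d/f/b.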

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v1 0/4] mv: fix out-of-cone file/directory move logic
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (5 preceding siblings ...)
  2022-03-31 22:21 ` Victoria Dye
@ 2022-04-08 12:22 ` Shaoxuan Yuan
  2022-05-27 10:07 ` [WIP v2 0/5] " Shaoxuan Yuan
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-04-08 12:22 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster

On Thu, Mar 31, 2022 at 5:20 PM Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> wrote:
>
> Before integrating 'mv' with sparse-index, I still find some possibly buggy
> UX when 'mv' is interacting with 'sparse-checkout'.
>
> So I kept sparse-index off in order to sort things out without a sparse index.
> We can proceed to integrate with sparse-index once these changes are solid.
>
> Note that this patch is tentative, and still have known glitches, but it
> illustrates a general approach that I intended to harmonize 'mv'
> with 'sparse-checkout'.
>
> Shaoxuan Yuan (4):
>   mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
>   mv: add check_dir_in_index() and solve general dir check issue
>   mv: add advise_to_reapply hint for moving file into cone
>   t7002: add tests for moving out-of-cone file/directory
>
>  builtin/mv.c                  | 76 ++++++++++++++++++++++++++++++++---
>  t/t7002-mv-sparse-checkout.sh | 72 +++++++++++++++++++++++++++++++++
>  2 files changed, 142 insertions(+), 6 deletions(-)
>
>
> base-commit: 805e0a68082a217f0112db9ee86a022227a9c81b
> --
> 2.35.1
>

Hi, to whom it may concern,

I'm writing to say that I'm making useful (possibly) progress on this topic.

I've been busy composing a GSoC proposal last week, so the progress
paused for a while. And I'm going to have a trip for around a week, starting
next week, so I may not have enough time next week to push forward on
this topic.

But I will always be available through email ;-)


--
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [WIP v2 0/5] mv: fix out-of-cone file/directory move logic
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (6 preceding siblings ...)
  2022-04-08 12:22 ` Shaoxuan Yuan
@ 2022-05-27 10:07 ` Shaoxuan Yuan
  2022-05-27 10:08   ` [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
                     ` (4 more replies)
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (2 subsequent siblings)
  10 siblings, 5 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-05-27 10:07 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, newren, Shaoxuan Yuan

## Changes since WIP v1 ##

1. Move t7002 tests to the front and turn corresponding tests to 
   test_expect_success along with corresponding commits.

2. Add two tests to t7002.

3. Update check_dir_in_index() and add corresponding documentation.

4. Turn update_mode into enum flags.

5. Use update_sparsity() to replace advise*() function after touching
   sparse contents (this change is INCOMPLETE, NEED FIX).

6. Fix some format issues.

## Limitations ##

This series does not *yet* handle moving a file from the in-cone area to the
out-of-cone area; that direction is neither covered nor tested. The plan is to
add the "in-cone to out-of-cone" functionality later, since this series is
still WIP.

Shaoxuan Yuan (5):
  t7002: add tests for moving out-of-cone file/directory
  mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  mv: check if <destination> exists in index to handle overwriting
  mv: add check_dir_in_index() and solve general dir check issue
  mv: use update_sparsity() after touching sparse contents

 builtin/mv.c                  | 104 +++++++++++++++++++++++++++++--
 t/t7002-mv-sparse-checkout.sh | 114 ++++++++++++++++++++++++++++++++++
 2 files changed, 212 insertions(+), 6 deletions(-)

Range-diff against v1:
4:  1dd2fcb234 ! 1:  485d1e9102 t7002: add tests for moving out-of-cone file/directory
    @@ Commit message
         * 'can move out-of-cone directory with --sparse'
         * 'refuse to move out-of-cone file without --sparse'
         * 'can move out-of-cone file with --sparse'
    +    * 'refuse to move sparse file to existing destination'
    +    * 'move sparse file to existing destination with --force and --sparse'
     
         Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
      	test_cmp expect stderr
      '
      
    -+test_expect_success 'refuse to move out-of-cone directory without --sparse' '
    ++test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
     +	git sparse-checkout disable &&
     +	git reset --hard &&
     +	mkdir folder1 &&
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +	test_cmp expect stderr
     +'
     +
    -+test_expect_success 'can move out-of-cone directory with --sparse' '
    ++test_expect_failure 'can move out-of-cone directory with --sparse' '
     +	git sparse-checkout disable &&
     +	git reset --hard &&
     +	mkdir folder1 &&
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +
     +	git mv --sparse folder1 sub 1>actual 2>stderr &&
     +	test_must_be_empty stderr &&
    -+	echo "Please use \"git sparse-checkout reapply\" to reapply the sparsity."\
    -+	>expect &&
    -+	test_cmp actual expect &&
     +
     +	git sparse-checkout reapply &&
     +	test_path_is_dir sub/folder1 &&
     +	test_path_is_file sub/folder1/file1
     +'
     +
    -+test_expect_success 'refuse to move out-of-cone file without --sparse' '
    ++test_expect_failure 'refuse to move out-of-cone file without --sparse' '
     +	git sparse-checkout disable &&
     +	git reset --hard &&
     +	mkdir folder1 &&
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +	test_cmp expect stderr
     +'
     +
    -+test_expect_success 'can move out-of-cone file with --sparse' '
    ++test_expect_failure 'can move out-of-cone file with --sparse' '
     +	git sparse-checkout disable &&
     +	git reset --hard &&
     +	mkdir folder1 &&
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +
     +	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
     +	test_must_be_empty stderr &&
    -+	echo "Please use \"git sparse-checkout reapply\" to reapply the sparsity."\
    -+	>expect &&
    -+	test_cmp actual expect &&
     +
     +	git sparse-checkout reapply &&
     +	! test_path_is_dir sub/folder1 &&
     +	test_path_is_file sub/file1
     +'
    ++
    ++test_expect_failure 'refuse to move sparse file to existing destination' '
    ++	git sparse-checkout disable &&
    ++	git reset --hard &&
    ++	mkdir folder1 &&
    ++	touch folder1/file1 &&
    ++	touch sub/file1 &&
    ++	git add folder1 sub/file1 &&
    ++	git sparse-checkout init --cone &&
    ++	git sparse-checkout set sub &&
    ++
    ++	test_must_fail git mv --sparse folder1/file1 sub 2>stderr &&
    ++	echo "fatal: destination exists, source=folder1/file1, destination=sub/file1" >expect &&
    ++	test_cmp expect stderr
    ++'
    ++
    ++test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
    ++	git sparse-checkout disable &&
    ++	git reset --hard &&
    ++	mkdir folder1 &&
    ++	touch folder1/file1 &&
    ++	touch sub/file1 &&
    ++	echo "overwrite" >folder1/file1 &&
    ++	git add folder1 sub/file1 &&
    ++	git sparse-checkout init --cone &&
    ++	git sparse-checkout set sub &&
    ++
    ++	git mv --sparse --force folder1/file1 sub 2>stderr &&
    ++	test_must_be_empty stderr &&
    ++	echo "overwrite" >expect &&
    ++	test_cmp expect sub/file1
    ++'
     +
      test_done
1:  5cf6b860e3 ! 2:  c99df4fc1a mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      				bad = _("bad source");
      		} else if (!strncmp(src, dst, length) &&
      				(dst[length] == 0 || dst[length] == '/')) {
    +
    + ## t/t7002-mv-sparse-checkout.sh ##
    +@@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'can move out-of-cone directory with --sparse' '
    + 	test_path_is_file sub/folder1/file1
    + '
    + 
    +-test_expect_failure 'refuse to move out-of-cone file without --sparse' '
    ++test_expect_success 'refuse to move out-of-cone file without --sparse' '
    + 	git sparse-checkout disable &&
    + 	git reset --hard &&
    + 	mkdir folder1 &&
    +@@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'refuse to move out-of-cone file without --sparse' '
    + 	test_cmp expect stderr
    + '
    + 
    +-test_expect_failure 'can move out-of-cone file with --sparse' '
    ++test_expect_success 'can move out-of-cone file with --sparse' '
    + 	git sparse-checkout disable &&
    + 	git reset --hard &&
    + 	mkdir folder1 &&
-:  ---------- > 3:  8f1193188b mv: check if <destination> exists in index to handle overwriting
2:  7b3c931f3f ! 4:  e195bfbc73 mv: add check_dir_in_index() and solve general dir check issue
    @@ Commit message
         errors out with "bad source".
     
         Add a helper check_dir_in_index() function to see if a directory
    -    name exists in the index. Also add a SPARSE_DIRECTORY bit to mark
    +    name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
         such directories.
     
         Change the checking logic, so that such <source> directory makes
    @@ Commit message
         instead of "bad source"; also user now can supply a "--sparse" flag so
         this operation can be carried out successfully.
     
    +    Also, as suggested by Derrick [1],
    +    move the in-line definition of "enum update_mode" to the top
    +    of the file and make it use "flags" mode (each state is a different
    +    bit in the word).
    +
    +    [1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
    +
         Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
      ## builtin/mv.c ##
    +@@ builtin/mv.c: static const char * const builtin_mv_usage[] = {
    + 	NULL
    + };
    + 
    ++enum update_mode {
    ++	BOTH = 0,
    ++	WORKING_DIRECTORY = (1 << 1),
    ++	INDEX = (1 << 2),
    ++	SPARSE = (1 << 3),
    ++	SKIP_WORKTREE_DIR = (1 << 4),
    ++};
    ++
    + #define DUP_BASENAME 1
    + #define KEEP_TRAILING_SLASH 2
    + 
     @@ builtin/mv.c: static int index_range_of_same_dir(const char *src, int length,
      	return last - first;
      }
      
    -+static int check_dir_in_index(const char *dir)
    ++/*
    ++ * Check if an out-of-cone directory should be in the index. Imagine this case
    ++ * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
    ++ * and thus the directory is sparsified.
    ++ *
    ++ * Return 0 if such directory exist (i.e. with any of its contained files not
    ++ * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
    ++ * Return 1 otherwise.
    ++ */
    ++static int check_dir_in_index(const char *name, int namelen)
     +{
    -+	int ret = 0;
    -+	int length = sizeof(dir) + 1;
    -+	char *substr = malloc(length);
    ++	int ret = 1;
    ++	const char *with_slash = add_slash(name);
    ++	int length = namelen + 1;
     +
    -+	for (int i = 0; i < the_index.cache_nr; i++) {
    -+		memcpy(substr, the_index.cache[i]->name, length);
    -+		memset(substr + length - 1, 0, 1);
    ++	int pos = cache_name_pos(with_slash, length);
    ++	const struct cache_entry *ce;
     +
    -+		if (strcmp(dir, substr) == 0) {
    -+			ret = 1;
    ++	if (pos < 0) {
    ++		pos = -pos - 1;
    ++		if (pos >= the_index.cache_nr)
     +			return ret;
    -+		}
    ++		ce = active_cache[pos];
    ++		if (strncmp(with_slash, ce->name, length))
    ++			return ret;
    ++		if (ce_skip_worktree(ce))
    ++			return ret = 0;
     +	}
    -+	free(substr);
     +	return ret;
     +}
     +
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      	};
      	const char **source, **destination, **dest_path, **submodule_gitfile;
     -	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
    -+	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE,
    -+	SPARSE_DIRECTORY } *modes;
    ++	enum update_mode *modes;
      	struct stat st;
      	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
      	struct lock_file lock_file = LOCK_INIT;
     @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 			 */
    + 		if (lstat(src, &st) < 0) {
      
      			int pos = cache_name_pos(src, length);
     +			const char *src_w_slash = add_slash(src);
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      				else
      					bad = _("bad source");
      			}
    -+			else if (check_dir_in_index(src_w_slash) &&
    -+			!path_in_sparse_checkout(src_w_slash, &the_index)) {
    -+				modes[i] = SPARSE_DIRECTORY;
    ++			else if (!check_dir_in_index(src, length) &&
    ++					 !path_in_sparse_checkout(src_w_slash, &the_index)) {
    ++				modes[i] = SKIP_WORKTREE_DIR;
     +				goto dir_check;
     +			}
      			/* only error if existence is expected. */
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      
     -				modes[i] = WORKING_DIRECTORY;
     +				if (!modes[i])
    -+					modes[i] = WORKING_DIRECTORY;
    ++					modes[i] |= WORKING_DIRECTORY;
      				n = argc + last - first;
      				REALLOC_ARRAY(source, n);
      				REALLOC_ARRAY(destination, n);
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      		if (show_only)
      			continue;
     -		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
    -+		if (mode != INDEX && mode != SPARSE && mode != SPARSE_DIRECTORY &&
    -+		 rename(src, dst) < 0) {
    ++		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
    ++		 	rename(src, dst) < 0) {
      			if (ignore_errors)
      				continue;
      			die_errno(_("renaming '%s' failed"), src);
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      		}
      
     -		if (mode == WORKING_DIRECTORY)
    -+		if (mode == WORKING_DIRECTORY || mode == SPARSE_DIRECTORY)
    ++		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
      			continue;
      
      		pos = cache_name_pos(src, strlen(src));
    +
    + ## t/t7002-mv-sparse-checkout.sh ##
    +@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
    + 	test_cmp expect stderr
    + '
    + 
    +-test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
    ++test_expect_success 'refuse to move out-of-cone directory without --sparse' '
    + 	git sparse-checkout disable &&
    + 	git reset --hard &&
    + 	mkdir folder1 &&
    +@@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
    + 	test_cmp expect stderr
    + '
    + 
    +-test_expect_failure 'can move out-of-cone directory with --sparse' '
    ++test_expect_success 'can move out-of-cone directory with --sparse' '
    + 	git sparse-checkout disable &&
    + 	git reset --hard &&
    + 	mkdir folder1 &&
3:  4be4c4f34d < -:  ---------- mv: add advise_to_reapply hint for moving file into cone
-:  ---------- > 5:  aa82ba56b0 mv: use update_sparsity() after touching sparse contents
-- 
2.35.1



* [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory
  2022-05-27 10:07 ` [WIP v2 0/5] " Shaoxuan Yuan
@ 2022-05-27 10:08   ` Shaoxuan Yuan
  2022-05-27 12:07     ` Ævar Arnfjörð Bjarmason
                       ` (2 more replies)
  2022-05-27 10:08   ` [WIP v2 2/5] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
                     ` (3 subsequent siblings)
  4 siblings, 3 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-05-27 10:08 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, newren, Shaoxuan Yuan

Add tests covering the following situations:

* 'refuse to move out-of-cone directory without --sparse'
* 'can move out-of-cone directory with --sparse'
* 'refuse to move out-of-cone file without --sparse'
* 'can move out-of-cone file with --sparse'
* 'refuse to move sparse file to existing destination'
* 'move sparse file to existing destination with --force and --sparse'

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 t/t7002-mv-sparse-checkout.sh | 98 +++++++++++++++++++++++++++++++++++
 1 file changed, 98 insertions(+)

diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 1d3d2aca21..963cb512e2 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -206,4 +206,102 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
 	test_cmp expect stderr
 '
 
+test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	test_must_fail git mv folder1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'can move out-of-cone directory with --sparse' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	git mv --sparse folder1 sub 1>actual 2>stderr &&
+	test_must_be_empty stderr &&
+
+	git sparse-checkout reapply &&
+	test_path_is_dir sub/folder1 &&
+	test_path_is_file sub/folder1/file1
+'
+
+test_expect_failure 'refuse to move out-of-cone file without --sparse' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	test_must_fail git mv folder1/file1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'can move out-of-cone file with --sparse' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
+	test_must_be_empty stderr &&
+
+	git sparse-checkout reapply &&
+	! test_path_is_dir sub/folder1 &&
+	test_path_is_file sub/file1
+'
+
+test_expect_failure 'refuse to move sparse file to existing destination' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	touch sub/file1 &&
+	git add folder1 sub/file1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	test_must_fail git mv --sparse folder1/file1 sub 2>stderr &&
+	echo "fatal: destination exists, source=folder1/file1, destination=sub/file1" >expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
+	git sparse-checkout disable &&
+	git reset --hard &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	touch sub/file1 &&
+	echo "overwrite" >folder1/file1 &&
+	git add folder1 sub/file1 &&
+	git sparse-checkout init --cone &&
+	git sparse-checkout set sub &&
+
+	git mv --sparse --force folder1/file1 sub 2>stderr &&
+	test_must_be_empty stderr &&
+	echo "overwrite" >expect &&
+	test_cmp expect sub/file1
+'
+
 test_done
-- 
2.35.1



* [WIP v2 2/5] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-05-27 10:07 ` [WIP v2 0/5] " Shaoxuan Yuan
  2022-05-27 10:08   ` [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
@ 2022-05-27 10:08   ` Shaoxuan Yuan
  2022-05-27 15:13     ` Derrick Stolee
  2022-05-27 10:08   ` [WIP v2 3/5] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-05-27 10:08 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, newren, Shaoxuan Yuan

Originally, when moving a <source> file that is not on disk but exists in
the index as a SKIP_WORKTREE-enabled cache entry, the "git mv" command
errors out with "bad source".

Change the checking logic so that such a <source> file makes "git mv"
warn via advise_on_updating_sparse_paths() instead of reporting "bad
source"; the user can now also supply the "--sparse" flag so that the
operation is carried out successfully.
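
The decision made for a <source> that fails lstat() can be summarized as a
small pure function. This is only a sketch of the branch structure in the
diff below, not the code in builtin/mv.c: the enum outcome labels and the
function name classify_missing_source() are hypothetical, and the four
integer parameters stand in for the cache_name_pos() result, the
ce_skip_worktree() check, the ignore_sparse option variable, and an
existing modes[i] == SPARSE.

```c
/* Hypothetical outcome labels; builtin/mv.c sets modes[i] or "bad" instead. */
enum outcome {
	BAD_SOURCE,     /* report "bad source" */
	ADVISE_SPARSE,  /* queue the path for advise_on_updating_sparse_paths() */
	MOVE_SPARSE     /* treat the move as SPARSE and carry on */
};

/*
 * Classify a <source> that is missing from the working tree.
 *
 * in_index:       the path has an exact entry in the index
 * skip_worktree:  that entry carries the SKIP_WORKTREE bit
 * ignore_sparse:  the caller asked to operate on sparse paths
 * already_sparse: the mode was already SPARSE before this check
 */
static enum outcome classify_missing_source(int in_index, int skip_worktree,
					     int ignore_sparse, int already_sparse)
{
	if (!in_index)
		return already_sparse ? MOVE_SPARSE : BAD_SOURCE;
	if (!skip_worktree)
		return BAD_SOURCE;
	return ignore_sparse ? MOVE_SPARSE : ADVISE_SPARSE;
}
```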

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 26 +++++++++++++++++++++++++-
 t/t7002-mv-sparse-checkout.sh |  4 ++--
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 83a465ba83..32ad4d5682 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -185,8 +185,32 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
+			/*
+			 * TODO: for now, when you try to overwrite a <destination>
+			 * with your <source> as a sparse file, if you supply a "--sparse"
+			 * flag, then the action will be done without providing "--force"
+			 * and no warning.
+			 *
+			 * This is mainly because the sparse <source>
+			 * is not on-disk, and this if-else chain will be cut off early in
+			 * this check, thus the "--force" check is ignored. Need fix.
+			 */
+
+			int pos = cache_name_pos(src, length);
+			if (pos >= 0) {
+				const struct cache_entry *ce = active_cache[pos];
+
+				if (ce_skip_worktree(ce)) {
+					if (!ignore_sparse)
+						string_list_append(&only_match_skip_worktree, src);
+					else
+						modes[i] = SPARSE;
+				}
+				else
+					bad = _("bad source");
+			}
 			/* only error if existence is expected. */
-			if (modes[i] != SPARSE)
+			else if (modes[i] != SPARSE)
 				bad = _("bad source");
 		} else if (!strncmp(src, dst, length) &&
 				(dst[length] == 0 || dst[length] == '/')) {
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 963cb512e2..581ef4c0f6 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -239,7 +239,7 @@ test_expect_failure 'can move out-of-cone directory with --sparse' '
 	test_path_is_file sub/folder1/file1
 '
 
-test_expect_failure 'refuse to move out-of-cone file without --sparse' '
+test_expect_success 'refuse to move out-of-cone file without --sparse' '
 	git sparse-checkout disable &&
 	git reset --hard &&
 	mkdir folder1 &&
@@ -255,7 +255,7 @@ test_expect_failure 'refuse to move out-of-cone file without --sparse' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'can move out-of-cone file with --sparse' '
+test_expect_success 'can move out-of-cone file with --sparse' '
 	git sparse-checkout disable &&
 	git reset --hard &&
 	mkdir folder1 &&
-- 
2.35.1



* [WIP v2 3/5] mv: check if <destination> exists in index to handle overwriting
  2022-05-27 10:07 ` [WIP v2 0/5] " Shaoxuan Yuan
  2022-05-27 10:08   ` [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
  2022-05-27 10:08   ` [WIP v2 2/5] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
@ 2022-05-27 10:08   ` Shaoxuan Yuan
  2022-05-27 22:04     ` Victoria Dye
  2022-05-27 10:08   ` [WIP v2 4/5] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
  2022-05-27 10:08   ` [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents Shaoxuan Yuan
  4 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-05-27 10:08 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, newren, Shaoxuan Yuan

Originally, moving a sparse file into the cone could silently overwrite
an existing entry. The expected behavior is that if the <destination>
exists in the index, the user should be prompted to supply [-f|--force]
to carry out the operation; otherwise the operation should fail.

Add a check to do that.
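
Reduced to its inputs, the new check behaves like the sketch below. The
enum verdict labels and the name check_sparse_destination() are
hypothetical and used only here; dst_in_index stands in for
"cache_name_pos(dst, strlen(dst)) >= 0" and force for the [-f|--force]
option.

```c
/* Hypothetical verdicts; the patch sets modes[i] = SPARSE or a "bad" message. */
enum verdict {
	ALLOW_SPARSE_MOVE,   /* proceed with the sparse move */
	DESTINATION_EXISTS   /* refuse: "destination exists" */
};

/*
 * Decide whether a sparse <source> may be moved onto <destination>.
 * Only an existing index entry at the destination without --force is refused.
 */
static enum verdict check_sparse_destination(int dst_in_index, int force)
{
	if (dst_in_index && !force)
		return DESTINATION_EXISTS;
	return ALLOW_SPARSE_MOVE;
}
```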

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 23 +++++++++++------------
 t/t7002-mv-sparse-checkout.sh |  2 +-
 2 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 32ad4d5682..62284e3f86 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -185,16 +185,6 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
-			/*
-			 * TODO: for now, when you try to overwrite a <destination>
-			 * with your <source> as a sparse file, if you supply a "--sparse"
-			 * flag, then the action will be done without providing "--force"
-			 * and no warning.
-			 *
-			 * This is mainly because the sparse <source>
-			 * is not on-disk, and this if-else chain will be cut off early in
-			 * this check, thus the "--force" check is ignored. Need fix.
-			 */
 
 			int pos = cache_name_pos(src, length);
 			if (pos >= 0) {
@@ -203,8 +193,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				if (ce_skip_worktree(ce)) {
 					if (!ignore_sparse)
 						string_list_append(&only_match_skip_worktree, src);
-					else
-						modes[i] = SPARSE;
+					else {
+						/* Check if dst exists in index */
+						if (cache_name_pos(dst, strlen(dst)) >= 0) {
+							if (force)
+								modes[i] = SPARSE;
+							else
+								bad = _("destination exists");
+						}
+						else
+							modes[i] = SPARSE;
+					}
 				}
 				else
 					bad = _("bad source");
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 581ef4c0f6..2c9008573a 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -272,7 +272,7 @@ test_expect_success 'can move out-of-cone file with --sparse' '
 	test_path_is_file sub/file1
 '
 
-test_expect_failure 'refuse to move sparse file to existing destination' '
+test_expect_success 'refuse to move sparse file to existing destination' '
 	git sparse-checkout disable &&
 	git reset --hard &&
 	mkdir folder1 &&
-- 
2.35.1



* [WIP v2 4/5] mv: add check_dir_in_index() and solve general dir check issue
  2022-05-27 10:07 ` [WIP v2 0/5] " Shaoxuan Yuan
                     ` (2 preceding siblings ...)
  2022-05-27 10:08   ` [WIP v2 3/5] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
@ 2022-05-27 10:08   ` Shaoxuan Yuan
  2022-05-27 15:27     ` Derrick Stolee
  2022-05-27 10:08   ` [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents Shaoxuan Yuan
  4 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-05-27 10:08 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, newren, Shaoxuan Yuan

Originally, when moving a <source> directory that is not on disk because
it lies outside the sparse-checkout cone, the "git mv" command errors
out with "bad source".

Add a helper function, check_dir_in_index(), to see if a directory
name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
such directories.

Change the checking logic so that such a <source> directory makes
"git mv" warn via advise_on_updating_sparse_paths() instead of
reporting "bad source"; the user can now also supply the "--sparse"
flag so that the operation is carried out successfully.

Also, as suggested by Derrick [1],
move the in-line definition of "enum update_mode" to the top
of the file and make it use "flags" mode (each state is a different
bit in the word).

[1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
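
The directory lookup relies on a sorted-index property: searching for
"dir/" never matches exactly, but the negative result encodes the
insertion point, and the entry at that point is the first path under the
directory, if any. The sketch below models this with a plain sorted array
rather than Git's index. Everything in it is an assumption made for
illustration: struct entry, name_pos() and dir_is_sparse_in_index() are
hypothetical names, and the sketch returns 1 when a sparse directory is
found, which is the opposite of the return convention used by
check_dir_in_index() in this patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cache entry: a path plus its skip-worktree bit. */
struct entry {
	const char *name;
	int skip_worktree;
};

/*
 * Binary search over entries sorted bytewise by name. Returns the index of
 * an exact match, or -(insertion point) - 1 when the name is absent.
 */
static int name_pos(const struct entry *e, int nr, const char *name)
{
	int lo = 0, hi = nr;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(e[mid].name, name);

		if (!cmp)
			return mid;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -lo - 1;
}

/*
 * Return 1 if the first entry under "dir/" exists and is marked
 * skip-worktree, 0 otherwise. dir_with_slash must end in '/'.
 */
static int dir_is_sparse_in_index(const struct entry *e, int nr,
				  const char *dir_with_slash)
{
	size_t len = strlen(dir_with_slash);
	int pos = name_pos(e, nr, dir_with_slash);

	if (pos >= 0)
		return 0; /* an exact "dir/" entry is not expected here */
	pos = -pos - 1;   /* decode the insertion point */
	if (pos >= nr)
		return 0; /* nothing sorts at or after "dir/" */
	if (strncmp(dir_with_slash, e[pos].name, len))
		return 0; /* the next entry lives in another directory */
	return e[pos].skip_worktree;
}
```

Like the patch, the sketch inspects only the first entry under the
directory, so a directory whose first entry is skip-worktree but whose
later entries are not would still be reported as sparse.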

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 60 ++++++++++++++++++++++++++++++++---
 t/t7002-mv-sparse-checkout.sh |  4 +--
 2 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 62284e3f86..e64f251a69 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -19,6 +19,14 @@ static const char * const builtin_mv_usage[] = {
 	NULL
 };
 
+enum update_mode {
+	BOTH = 0,
+	WORKING_DIRECTORY = (1 << 1),
+	INDEX = (1 << 2),
+	SPARSE = (1 << 3),
+	SKIP_WORKTREE_DIR = (1 << 4),
+};
+
 #define DUP_BASENAME 1
 #define KEEP_TRAILING_SLASH 2
 
@@ -115,6 +123,37 @@ static int index_range_of_same_dir(const char *src, int length,
 	return last - first;
 }
 
+/*
+ * Check if an out-of-cone directory should be in the index. Imagine this case
+ * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
+ * and thus the directory is sparsified.
+ *
+ * Return 0 if such directory exist (i.e. with any of its contained files not
+ * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
+ * Return 1 otherwise.
+ */
+static int check_dir_in_index(const char *name, int namelen)
+{
+	int ret = 1;
+	const char *with_slash = add_slash(name);
+	int length = namelen + 1;
+
+	int pos = cache_name_pos(with_slash, length);
+	const struct cache_entry *ce;
+
+	if (pos < 0) {
+		pos = -pos - 1;
+		if (pos >= the_index.cache_nr)
+			return ret;
+		ce = active_cache[pos];
+		if (strncmp(with_slash, ce->name, length))
+			return ret;
+		if (ce_skip_worktree(ce))
+			return ret = 0;
+	}
+	return ret;
+}
+
 int cmd_mv(int argc, const char **argv, const char *prefix)
 {
 	int i, flags, gitmodules_modified = 0;
@@ -129,7 +168,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		OPT_END(),
 	};
 	const char **source, **destination, **dest_path, **submodule_gitfile;
-	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
+	enum update_mode *modes;
 	struct stat st;
 	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
 	struct lock_file lock_file = LOCK_INIT;
@@ -187,6 +226,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (lstat(src, &st) < 0) {
 
 			int pos = cache_name_pos(src, length);
+			const char *src_w_slash = add_slash(src);
+
 			if (pos >= 0) {
 				const struct cache_entry *ce = active_cache[pos];
 
@@ -208,6 +249,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				else
 					bad = _("bad source");
 			}
+			else if (!check_dir_in_index(src, length) &&
+					 !path_in_sparse_checkout(src_w_slash, &the_index)) {
+				modes[i] = SKIP_WORKTREE_DIR;
+				goto dir_check;
+			}
 			/* only error if existence is expected. */
 			else if (modes[i] != SPARSE)
 				bad = _("bad source");
@@ -218,7 +264,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				&& lstat(dst, &st) == 0)
 			bad = _("cannot move directory over file");
 		else if (src_is_dir) {
-			int first = cache_name_pos(src, length), last;
+			int first, last;
+dir_check:
+			first = cache_name_pos(src, length);
 
 			if (first >= 0)
 				prepare_move_submodule(src, first,
@@ -229,7 +277,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			else { /* last - first >= 1 */
 				int j, dst_len, n;
 
-				modes[i] = WORKING_DIRECTORY;
+				if (!modes[i])
+					modes[i] |= WORKING_DIRECTORY;
 				n = argc + last - first;
 				REALLOC_ARRAY(source, n);
 				REALLOC_ARRAY(destination, n);
@@ -331,7 +380,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
 			continue;
-		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
+		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
+		 	rename(src, dst) < 0) {
 			if (ignore_errors)
 				continue;
 			die_errno(_("renaming '%s' failed"), src);
@@ -345,7 +395,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 							      1);
 		}
 
-		if (mode == WORKING_DIRECTORY)
+		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
 			continue;
 
 		pos = cache_name_pos(src, strlen(src));
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 2c9008573a..cf2f5dc46f 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -206,7 +206,7 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
+test_expect_success 'refuse to move out-of-cone directory without --sparse' '
 	git sparse-checkout disable &&
 	git reset --hard &&
 	mkdir folder1 &&
@@ -222,7 +222,7 @@ test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'can move out-of-cone directory with --sparse' '
+test_expect_success 'can move out-of-cone directory with --sparse' '
 	git sparse-checkout disable &&
 	git reset --hard &&
 	mkdir folder1 &&
-- 
2.35.1



* [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents
  2022-05-27 10:07 ` [WIP v2 0/5] " Shaoxuan Yuan
                     ` (3 preceding siblings ...)
  2022-05-27 10:08   ` [WIP v2 4/5] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
@ 2022-05-27 10:08   ` Shaoxuan Yuan
  2022-05-27 12:10     ` Ævar Arnfjörð Bjarmason
  2022-05-27 19:36     ` Victoria Dye
  4 siblings, 2 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-05-27 10:08 UTC (permalink / raw)
  To: git; +Cc: vdye, derrickstolee, gitster, newren, Shaoxuan Yuan

Originally, using "git mv" to move a sparse file/directory from
out-of-cone to in-cone (or vice versa) does not update the sparsity
according to the sparse-checkout patterns.

Use update_sparsity() after touching sparse contents, so the sparsity
will be updated after the move.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 19 +++++++++++++++++++
 t/t7002-mv-sparse-checkout.sh | 16 ++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/builtin/mv.c b/builtin/mv.c
index e64f251a69..2c02120941 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -13,6 +13,7 @@
 #include "string-list.h"
 #include "parse-options.h"
 #include "submodule.h"
+#include "unpack-trees.h"
 
 static const char * const builtin_mv_usage[] = {
 	N_("git mv [<options>] <source>... <destination>"),
@@ -158,6 +159,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 {
 	int i, flags, gitmodules_modified = 0;
 	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0, ignore_sparse = 0;
+	int sparse_moved = 0;
 	struct option builtin_mv_options[] = {
 		OPT__VERBOSE(&verbose, N_("be verbose")),
 		OPT__DRY_RUN(&show_only, N_("dry run")),
@@ -376,6 +378,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
 		int pos;
+		if (!sparse_moved && mode & (SPARSE | SKIP_WORKTREE_DIR))
+			sparse_moved = 1;
 		if (show_only || verbose)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
@@ -403,6 +407,21 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		rename_cache_entry_at(pos, dst);
 	}
 
+	if (sparse_moved) {
+		struct unpack_trees_options o;
+		memset(&o, 0, sizeof(o));
+		o.verbose_update = isatty(2);
+		o.update = 1;
+		o.head_idx = -1;
+		o.src_index = &the_index;
+		o.dst_index = &the_index;
+		o.skip_sparse_checkout = 0;
+		o.pl = the_index.sparse_checkout_patterns;
+		setup_unpack_trees_porcelain(&o, "mv");
+		update_sparsity(&o);
+		clear_unpack_trees_porcelain(&o);
+	}
+
 	if (gitmodules_modified)
 		stage_updated_gitmodules(&the_index);
 
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index cf2f5dc46f..1fd3e3c0fc 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -287,6 +287,22 @@ test_expect_success 'refuse to move sparse file to existing destination' '
 	test_cmp expect stderr
 '
 
+# Need fix.
+#
+# The *expected* behavior:
+#
+# Using --sparse to accept a sparse file, --force to overwrite the destination.
+# The folder1/file1 should replace the sub/file1 without error.
+#
+# The *actual* behavior:
+#
+# It emits a warning:
+#
+# warning: Path ' sub/file1
+# ' already present; will not overwrite with sparse update.
+# After fixing the above paths, you may want to run `git sparse-checkout
+# reapply`.
+
 test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
 	git sparse-checkout disable &&
 	git reset --hard &&
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* Re: [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory
  2022-05-27 10:08   ` [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
@ 2022-05-27 12:07     ` Ævar Arnfjörð Bjarmason
  2022-05-27 14:48     ` Derrick Stolee
  2022-05-27 15:51     ` Victoria Dye
  2 siblings, 0 replies; 95+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-27 12:07 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, vdye, derrickstolee, gitster, newren


On Fri, May 27 2022, Shaoxuan Yuan wrote:

> +test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&

We don't usually use "touch file", don't you mean just ">file" here and
below?

> +	git add folder1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	test_must_fail git mv folder1 sub 2>stderr &&
> +	cat sparse_error_header >expect &&
> +	echo folder1/file1 >>expect &&
> +	cat sparse_hint >>expect &&
> +	test_cmp expect stderr
> +'
> +
> +test_expect_failure 'can move out-of-cone directory with --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	git mv --sparse folder1 sub 1>actual 2>stderr &&

use e.g. "out" and "err" instead of "actual" and "stderr". I.e. when we
test both we use something like that usually.

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents
  2022-05-27 10:08   ` [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents Shaoxuan Yuan
@ 2022-05-27 12:10     ` Ævar Arnfjörð Bjarmason
  2022-05-27 19:36     ` Victoria Dye
  1 sibling, 0 replies; 95+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-27 12:10 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, vdye, derrickstolee, gitster, newren


On Fri, May 27 2022, Shaoxuan Yuan wrote:

> +	if (sparse_moved) {
> +		struct unpack_trees_options o;
> +		memset(&o, 0, sizeof(o));
> +		o.verbose_update = isatty(2);
> +		o.update = 1;
> +		o.head_idx = -1;
> +		o.src_index = &the_index;
> +		o.dst_index = &the_index;
> +		o.skip_sparse_checkout = 0;
> +		o.pl = the_index.sparse_checkout_patterns;

You can drop the memset here and use the designated init syntax
instead. I.e. "struct x o = { .verbose_update .... };"
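
A minimal sketch of what I mean, using a hypothetical stand-in struct
("demo_options", not the real "struct unpack_trees_options") so that it
compiles on its own. Members that are not named in the initializer are
zero-initialized, which is why the memset() and the explicit
"skip_sparse_checkout = 0" can both go away:

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the few unpack_trees_options members
 * touched in the patch; the real struct has many more fields.
 */
struct demo_options {
	int verbose_update;
	int update;
	int head_idx;
	int skip_sparse_checkout;
	const void *pl;
};

/*
 * Build the options with a designated initializer. Any member that is
 * not named (here: skip_sparse_checkout) is implicitly set to zero.
 */
static struct demo_options make_demo_options(int is_tty, const void *patterns)
{
	struct demo_options o = {
		.verbose_update = is_tty,
		.update = 1,
		.head_idx = -1,
		.pl = patterns,
	};
	return o;
}
```

In the patch itself this would be a plain local declaration rather than a
helper function; the helper only exists to keep the sketch self-contained.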

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory
  2022-05-27 10:08   ` [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
  2022-05-27 12:07     ` Ævar Arnfjörð Bjarmason
@ 2022-05-27 14:48     ` Derrick Stolee
  2022-05-27 15:51     ` Victoria Dye
  2 siblings, 0 replies; 95+ messages in thread
From: Derrick Stolee @ 2022-05-27 14:48 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: vdye, gitster, newren

On 5/27/2022 6:08 AM, Shaoxuan Yuan wrote:
> Add corresponding tests to test following situations:
> 
> * 'refuse to move out-of-cone directory without --sparse'
> * 'can move out-of-cone directory with --sparse'
> * 'refuse to move out-of-cone file without --sparse'
> * 'can move out-of-cone file with --sparse'
> * 'refuse to move sparse file to existing destination'
> * 'move sparse file to existing destination with --force and --sparse'

Style nit: bulleted lists like this don't add too much value
on top of reading the patch. You can use prose to describe how
you decided that these tests are the ones to write. Something
like:

  We do not have sufficient coverage of moving files outside
  of a sparse-checkout cone. Create new tests covering this
  behavior, keeping in mind that the user can include --sparse
  (or not), move a file or directory, and the destination can
  already exist in the index.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 2/5] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-05-27 10:08   ` [WIP v2 2/5] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
@ 2022-05-27 15:13     ` Derrick Stolee
  2022-05-27 22:38       ` Victoria Dye
  2022-05-31  8:06       ` Shaoxuan Yuan
  0 siblings, 2 replies; 95+ messages in thread
From: Derrick Stolee @ 2022-05-27 15:13 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: vdye, gitster, newren



On 5/27/2022 6:08 AM, Shaoxuan Yuan wrote:
> Originally, moving a <source> file which is not on-disk but exists in
> index as a SKIP_WORKTREE enabled cache entry, "git mv" command errors
> out with "bad source".
> 
> Change the checking logic, so that such <source>
> file makes "git mv" command warn with "advise_on_updating_sparse_paths()"
> instead of "bad source"; also user now can supply a "--sparse" flag so
> this operation can be carried out successfully.
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  builtin/mv.c                  | 26 +++++++++++++++++++++++++-
>  t/t7002-mv-sparse-checkout.sh |  4 ++--
>  2 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 83a465ba83..32ad4d5682 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -185,8 +185,32 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  
>  		length = strlen(src);
>  		if (lstat(src, &st) < 0) {
> +			/*
> +			 * TODO: for now, when you try to overwrite a <destination>
> +			 * with your <source> as a sparse file, if you supply a "--sparse"
> +			 * flag, then the action will be done without providing "--force"
> +			 * and no warning.
> +			 *
> +			 * This is mainly because the sparse <source>
> +			 * is not on-disk, and this if-else chain will be cut off early in
> +			 * this check, thus the "--force" check is ignored. Need fix.
> +			 */

I wonder if this is worth the comment here, or if we'd rather see
the mention in the commit message. You have documented tests that
fail in this case, so we already have something that marks this
as "TODO" in a more discoverable place.

> +			int pos = cache_name_pos(src, length);
> +			if (pos >= 0) {
> +				const struct cache_entry *ce = active_cache[pos];
> +
> +				if (ce_skip_worktree(ce)) {
> +					if (!ignore_sparse)
> +						string_list_append(&only_match_skip_worktree, src);
> +					else
> +						modes[i] = SPARSE;


> +				}
> +				else
> +					bad = _("bad source");

style nit:

	} else {
		bad = _("bad source");
	}

> +			}
>  			/* only error if existence is expected. */
> -			if (modes[i] != SPARSE)
> +			else if (modes[i] != SPARSE)
>  				bad = _("bad source");

For this one, the comment makes it difficult to connect the 'else
if' to its corresponding 'if'. Perhaps:

	} else if (modes[i] != SPARSE) {
		/* only error if existence is expected. */
		bad = _("bad source");
	}

>  		} else if (!strncmp(src, dst, length) &&
>  				(dst[length] == 0 || dst[length] == '/')) {

In general, I found this if/else-if chain hard to grok, and
a lot of it is because we have "simple" cases at the end
and the complicated parts have ever-increasing nesting. This
is mostly due to the existing if/else-if chain in this method.

Here is a diff that replaces that if/else-if chain with a
'goto' trick to jump ahead, allowing some code to decrease in
tabbing:

---- >8 ----

diff --git a/builtin/mv.c b/builtin/mv.c
index 83a465ba831..1ca2c21da89 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -186,53 +186,68 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
 			/* only error if existence is expected. */
-			if (modes[i] != SPARSE)
+			if (modes[i] != SPARSE) {
 				bad = _("bad source");
-		} else if (!strncmp(src, dst, length) &&
-				(dst[length] == 0 || dst[length] == '/')) {
+				goto act_on_entry;
+			}
+		}
+		if (!strncmp(src, dst, length) &&
+		    (dst[length] == 0 || dst[length] == '/')) {
 			bad = _("can not move directory into itself");
-		} else if ((src_is_dir = S_ISDIR(st.st_mode))
-				&& lstat(dst, &st) == 0)
+			goto act_on_entry;
+		}
+		if ((src_is_dir = S_ISDIR(st.st_mode))
+		    && lstat(dst, &st) == 0) {
 			bad = _("cannot move directory over file");
-		else if (src_is_dir) {
+			goto act_on_entry;
+		}
+		if (src_is_dir) {
+			int j, dst_len, n;
 			int first = cache_name_pos(src, length), last;
 
-			if (first >= 0)
+			if (first >= 0) {
 				prepare_move_submodule(src, first,
 						       submodule_gitfile + i);
-			else if (index_range_of_same_dir(src, length,
-							 &first, &last) < 1)
+				goto act_on_entry;
+			} else if (index_range_of_same_dir(src, length,
+							   &first, &last) < 1) {
 				bad = _("source directory is empty");
-			else { /* last - first >= 1 */
-				int j, dst_len, n;
-
-				modes[i] = WORKING_DIRECTORY;
-				n = argc + last - first;
-				REALLOC_ARRAY(source, n);
-				REALLOC_ARRAY(destination, n);
-				REALLOC_ARRAY(modes, n);
-				REALLOC_ARRAY(submodule_gitfile, n);
-
-				dst = add_slash(dst);
-				dst_len = strlen(dst);
-
-				for (j = 0; j < last - first; j++) {
-					const struct cache_entry *ce = active_cache[first + j];
-					const char *path = ce->name;
-					source[argc + j] = path;
-					destination[argc + j] =
-						prefix_path(dst, dst_len, path + length + 1);
-					modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
-					submodule_gitfile[argc + j] = NULL;
-				}
-				argc += last - first;
+				goto act_on_entry;
 			}
-		} else if (!(ce = cache_file_exists(src, length, 0))) {
+
+			/* last - first >= 1 */
+			modes[i] = WORKING_DIRECTORY;
+			n = argc + last - first;
+			REALLOC_ARRAY(source, n);
+			REALLOC_ARRAY(destination, n);
+			REALLOC_ARRAY(modes, n);
+			REALLOC_ARRAY(submodule_gitfile, n);
+
+			dst = add_slash(dst);
+			dst_len = strlen(dst);
+
+			for (j = 0; j < last - first; j++) {
+				const struct cache_entry *ce = active_cache[first + j];
+				const char *path = ce->name;
+				source[argc + j] = path;
+				destination[argc + j] =
+					prefix_path(dst, dst_len, path + length + 1);
+				modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
+				submodule_gitfile[argc + j] = NULL;
+			}
+			argc += last - first;
+			goto act_on_entry;
+		}
+		if (!(ce = cache_file_exists(src, length, 0))) {
 			bad = _("not under version control");
-		} else if (ce_stage(ce)) {
+			goto act_on_entry;
+		}
+		if (ce_stage(ce)) {
 			bad = _("conflicted");
-		} else if (lstat(dst, &st) == 0 &&
-			 (!ignore_case || strcasecmp(src, dst))) {
+			goto act_on_entry;
+		}
+		if (lstat(dst, &st) == 0 &&
+		    (!ignore_case || strcasecmp(src, dst))) {
 			bad = _("destination exists");
 			if (force) {
 				/*
@@ -246,34 +261,40 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				} else
 					bad = _("Cannot overwrite");
 			}
-		} else if (string_list_has_string(&src_for_dst, dst))
+			goto act_on_entry;
+		}
+		if (string_list_has_string(&src_for_dst, dst)) {
 			bad = _("multiple sources for the same target");
-		else if (is_dir_sep(dst[strlen(dst) - 1]))
+			goto act_on_entry;
+		}
+		if (is_dir_sep(dst[strlen(dst) - 1])) {
 			bad = _("destination directory does not exist");
-		else {
-			/*
-			 * We check if the paths are in the sparse-checkout
-			 * definition as a very final check, since that
-			 * allows us to point the user to the --sparse
-			 * option as a way to have a successful run.
-			 */
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(src, &the_index)) {
-				string_list_append(&only_match_skip_worktree, src);
-				skip_sparse = 1;
-			}
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(dst, &the_index)) {
-				string_list_append(&only_match_skip_worktree, dst);
-				skip_sparse = 1;
-			}
-
-			if (skip_sparse)
-				goto remove_entry;
+			goto act_on_entry;
+		}
 
-			string_list_insert(&src_for_dst, dst);
+		/*
+		 * We check if the paths are in the sparse-checkout
+		 * definition as a very final check, since that
+		 * allows us to point the user to the --sparse
+		 * option as a way to have a successful run.
+		 */
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(src, &the_index)) {
+			string_list_append(&only_match_skip_worktree, src);
+			skip_sparse = 1;
+		}
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(dst, &the_index)) {
+			string_list_append(&only_match_skip_worktree, dst);
+			skip_sparse = 1;
 		}
 
+		if (skip_sparse)
+			goto remove_entry;
+
+		string_list_insert(&src_for_dst, dst);
+
+act_on_entry:
 		if (!bad)
 			continue;
 		if (!ignore_errors)

---- >8 ----

But mostly the reason for this refactor is that the following
diff should be equivalent to yours:

---- >8 ----

diff --git a/builtin/mv.c b/builtin/mv.c
index d8b5c24fb5..add48e23b4 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -185,11 +185,28 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
-			/* only error if existence is expected. */
-			if (modes[i] != SPARSE) {
+			int pos;
+			const struct cache_entry *ce;
+
+			pos = cache_name_pos(src, length);
+			if (pos < 0) {
+				/* only error if existence is expected. */
+				if (modes[i] != SPARSE)
+					bad = _("bad source");
+				goto act_on_entry;
+			}
+
+			ce = active_cache[pos];
+			if (!ce_skip_worktree(ce)) {
 				bad = _("bad source");
 				goto act_on_entry;
 			}
+
+			if (!ignore_sparse)
+				string_list_append(&only_match_skip_worktree, src);
+			else
+				modes[i] = SPARSE;
+			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
 		    (dst[length] == 0 || dst[length] == '/')) {
---- >8 ----

To me, this is a bit easier to parse, since we find the error
cases and jump to the action before continuing on the "happy
path". It does require doing that big refactor first, so I'd
like to hear opinions of other contributors before you jump to
taking this suggestion.

Thanks,
-Stolee

^ permalink raw reply related	[flat|nested] 95+ messages in thread

* Re: [WIP v2 4/5] mv: add check_dir_in_index() and solve general dir check issue
  2022-05-27 10:08   ` [WIP v2 4/5] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
@ 2022-05-27 15:27     ` Derrick Stolee
  2022-05-31  9:56       ` Shaoxuan Yuan
  0 siblings, 1 reply; 95+ messages in thread
From: Derrick Stolee @ 2022-05-27 15:27 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: vdye, gitster, newren

On 5/27/2022 6:08 AM, Shaoxuan Yuan wrote:
> +/*
> + * Check if an out-of-cone directory should be in the index. Imagine this case
> + * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
> + * and thus the directory is sparsified.
> + *
> + * Return 0 if such directory exist (i.e. with any of its contained files not
> + * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
> + * Return 1 otherwise.
> + */
> +static int check_dir_in_index(const char *name, int namelen)
> +{
> +	int ret = 1;
> +	const char *with_slash = add_slash(name);
> +	int length = namelen + 1;
> +
> +	int pos = cache_name_pos(with_slash, length);
> +	const struct cache_entry *ce;
> +
> +	if (pos < 0) {
> +		pos = -pos - 1;
> +		if (pos >= the_index.cache_nr)
> +			return ret;
> +		ce = active_cache[pos];
> +		if (strncmp(with_slash, ce->name, length))
> +			return ret;
> +		if (ce_skip_worktree(ce))
> +			return ret = 0;

This appears to check if the _first_ entry under the directory
is sparse, but not if _all_ entries are sparse. These are not
the same thing, even in cone-mode sparse-checkout. The t1092
test directory has files like "folder1/0/0/a" but if
"folder1/1" is in the sparse-checkout cone, then that first
entry has the skip-worktree bit, but "folder1/1/a" and "folder1/a"
do not.

> +	}
> +	return ret;

At the moment, it doesn't seem like we need 'ret' since the
only place you set it is in "return ret = 0;" (which could
just be "return 0;" while the others are "return 1;"). But,
perhaps you intended to create a loop over 'pos' while
with_slash is a prefix of the cache entry?
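
A rough sketch of such a loop, assuming a simplified stand-in entry type
("demo_entry") and a sorted array instead of the real index API. Note that
it returns 1 when the directory is fully sparse, which is the opposite
polarity of check_dir_in_index() in the patch; all names here are
hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a cache entry. */
struct demo_entry {
	const char *name;
	int skip_worktree;
};

/*
 * Return 1 if at least one entry lives under "dir_with_slash" and
 * every such entry has the skip-worktree bit; return 0 otherwise.
 * "entries" must be sorted by name, as the index is.
 */
static int demo_dir_is_all_sparse(const struct demo_entry *entries, size_t nr,
				  const char *dir_with_slash)
{
	size_t len = strlen(dir_with_slash);
	size_t pos = 0;
	int seen = 0;

	/* Linear stand-in for the binary search's insertion point. */
	while (pos < nr && strncmp(entries[pos].name, dir_with_slash, len) < 0)
		pos++;

	/* Walk every entry that has the directory as a prefix. */
	for (; pos < nr && !strncmp(entries[pos].name, dir_with_slash, len); pos++) {
		if (!entries[pos].skip_worktree)
			return 0;
		seen = 1;
	}
	return seen;
}
```

With the t1092-style layout above (only "folder1/0/0/a" sparse), this
reports that "folder1/" is not fully sparse, whereas looking at the first
entry alone would say it is.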

> +			else if (!check_dir_in_index(src, length) &&
> +					 !path_in_sparse_checkout(src_w_slash, &the_index)) {

style-nit: You'll want to align the different parts of your
logical statement to agree with the end of the "else if (",

	else if (A &&
		 B) {


> +				modes[i] = SKIP_WORKTREE_DIR;

If we are moving to a flags-based model, should we convert all
"modes[i] =" to "modes[i] |=" as a first step (before adding the
SKIP_WORKTREE_DIR flag)?

> +				goto dir_check;

Hm. While I did recommend using 'goto' to jump to a common end
place in the loop body, I'm not sure about jumping into another
else-if statement. This might be a good time to extract the
code from "else if (src_is_dir)" below into a helper method that
can be used in both places.

> +			}
>  			/* only error if existence is expected. */
>  			else if (modes[i] != SPARSE)
>  				bad = _("bad source");
> @@ -218,7 +264,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  				&& lstat(dst, &st) == 0)
>  			bad = _("cannot move directory over file");
>  		else if (src_is_dir) {
> -			int first = cache_name_pos(src, length), last;
> +			int first, last;
> +dir_check:
> +			first = cache_name_pos(src, length);
>  
>  			if (first >= 0)
>  				prepare_move_submodule(src, first,
> @@ -229,7 +277,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			else { /* last - first >= 1 */
>  				int j, dst_len, n;
>  
> -				modes[i] = WORKING_DIRECTORY;
> +				if (!modes[i])
> +					modes[i] |= WORKING_DIRECTORY;

This appears to only add the WORKING_DIRECTORY flag if modes[i] is
already zero. This maybe implies that we wouldn't understand
"WORKING_DIRECTORY | SKIP_WORKTREE_DIR" as a value.

>  				n = argc + last - first;
>  				REALLOC_ARRAY(source, n);
>  				REALLOC_ARRAY(destination, n);
> @@ -331,7 +380,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			printf(_("Renaming %s to %s\n"), src, dst);
>  		if (show_only)
>  			continue;
> -		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
> +		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
> +		 	rename(src, dst) < 0) {

style-nit: align your logical statements.

>  			if (ignore_errors)
>  				continue;
>  			die_errno(_("renaming '%s' failed"), src);
> @@ -345,7 +395,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  							      1);
>  		}
>  
> -		if (mode == WORKING_DIRECTORY)
> +		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
>  			continue;

Ok, here you check if _either_ mode is enabled, which is good. Maybe
you don't need the "if (!modes[i])" part above.
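
To illustrate the flags-based model, here is a small sketch. The enum
names and bit values are assumptions made up for this example (they are
not the values in builtin/mv.c); the point is only that "|=" keeps
existing flags while "&" tests for any of several flags:

```c
/* Hypothetical flag values; each mode occupies its own bit. */
enum demo_update_mode {
	DEMO_WORKING_DIRECTORY = (1 << 0),
	DEMO_INDEX = (1 << 1),
	DEMO_SPARSE = (1 << 2),
	DEMO_SKIP_WORKTREE_DIR = (1 << 3),
};

/* Add a flag without discarding the ones that are already set. */
static unsigned demo_add_mode(unsigned modes, enum demo_update_mode flag)
{
	return modes | (unsigned)flag;
}

/* Mirror of the rename() guard: true if no on-disk rename is needed. */
static int demo_skips_rename(unsigned modes)
{
	return (modes & (DEMO_INDEX | DEMO_SPARSE | DEMO_SKIP_WORKTREE_DIR)) != 0;
}
```

With this shape, "WORKING_DIRECTORY | SKIP_WORKTREE_DIR" is a meaningful
value and each consumer tests only the bits it cares about.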

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory
  2022-05-27 10:08   ` [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
  2022-05-27 12:07     ` Ævar Arnfjörð Bjarmason
  2022-05-27 14:48     ` Derrick Stolee
@ 2022-05-27 15:51     ` Victoria Dye
  2 siblings, 0 replies; 95+ messages in thread
From: Victoria Dye @ 2022-05-27 15:51 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: derrickstolee, gitster, newren

Shaoxuan Yuan wrote:
> Add corresponding tests to test following situations:
> 
> * 'refuse to move out-of-cone directory without --sparse'
> * 'can move out-of-cone directory with --sparse'
> * 'refuse to move out-of-cone file without --sparse'
> * 'can move out-of-cone file with --sparse'
> * 'refuse to move sparse file to existing destination'
> * 'move sparse file to existing destination with --force and --sparse'
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  t/t7002-mv-sparse-checkout.sh | 98 +++++++++++++++++++++++++++++++++++
>  1 file changed, 98 insertions(+)
> 
> diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
> index 1d3d2aca21..963cb512e2 100755
> --- a/t/t7002-mv-sparse-checkout.sh
> +++ b/t/t7002-mv-sparse-checkout.sh
> @@ -206,4 +206,102 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
>  	test_cmp expect stderr
>  '
>  

Apologies in advance for adding more comments (after I said the last version
looked good)! I'm always learning things about Git, so hopefully my
suggestions are at least better than in my last review. :) 

> +test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout init --cone &&

Note that 'init' is now deprecated [1] (I think that happened between your
v1 and now, FWIW). You can use 'git sparse-checkout set --cone sub', to do
the same thing as this + the subsequent line.

[1] https://lore.kernel.org/git/9d96da855ea70e7e8a54bb68e710cc60a2f50376.1639454952.git.gitgitgadget@gmail.com/

> +	git sparse-checkout set sub &&
> +

While the tests don't automatically "reset" between them (and therefore, you
don't need to disable & re-enable sparse-checkout), I like that you're
explicitly clearing & re-establishing the sparse-checkout state! It makes
selectively running tests *much* easier and generally avoids hard-to-see
side effects from prior tests.

As for test content, this particular setup block is (almost) identically
repeated in all of the tests. You could pull that into functions to reduce
duplication, e.g.:

setup_sparse_checkout () {
	mkdir folder1 &&
	touch folder1/file1 &&
	git add folder1 &&
	git sparse-checkout set --cone sub
}

cleanup_sparse_checkout () {
	git sparse-checkout disable &&
	git reset --hard
}

You may want to consider using "test_when_finished <some cleanup function>"
to clean up each one *at the end* of the test, rather than the beginning of
the next one. E.g.:

test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
	test_when_finished "cleanup_sparse_checkout" &&
	setup_sparse_checkout &&

	# ...the rest of the test
'

> +	test_must_fail git mv folder1 sub 2>stderr &&
> +	cat sparse_error_header >expect &&
> +	echo folder1/file1 >>expect &&
> +	cat sparse_hint >>expect &&
> +	test_cmp expect stderr
> +'
> +
> +test_expect_failure 'can move out-of-cone directory with --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	git mv --sparse folder1 sub 1>actual 2>stderr &&
> +	test_must_be_empty stderr &&
> +
> +	git sparse-checkout reapply &&
> +	test_path_is_dir sub/folder1 &&
> +	test_path_is_file sub/folder1/file1
> +'
> +
> +test_expect_failure 'refuse to move out-of-cone file without --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	test_must_fail git mv folder1/file1 sub 2>stderr &&
> +	cat sparse_error_header >expect &&
> +	echo folder1/file1 >>expect &&
> +	cat sparse_hint >>expect &&
> +	test_cmp expect stderr
> +'
> +
> +test_expect_failure 'can move out-of-cone file with --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
> +	test_must_be_empty stderr &&
> +
> +	git sparse-checkout reapply &&
> +	! test_path_is_dir sub/folder1 &&
> +	test_path_is_file sub/file1
> +'
> +
> +test_expect_failure 'refuse to move sparse file to existing destination' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	touch sub/file1 &&
> +	git add folder1 sub/file1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	test_must_fail git mv --sparse folder1/file1 sub 2>stderr &&
> +	echo "fatal: destination exists, source=folder1/file1, destination=sub/file1" >expect &&
> +	test_cmp expect stderr
> +'
> +
> +test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
> +	git sparse-checkout disable &&
> +	git reset --hard &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	touch sub/file1 &&
> +	echo "overwrite" >folder1/file1 &&
> +	git add folder1 sub/file1 &&
> +	git sparse-checkout init --cone &&
> +	git sparse-checkout set sub &&
> +
> +	git mv --sparse --force folder1/file1 sub 2>stderr &&
> +	test_must_be_empty stderr &&
> +	echo "overwrite" >expect &&
> +	test_cmp expect sub/file1
> +'
> +
>  test_done

These tests clearly establish the behavior you want to implement for 'git
mv'. As in V1, nice work!

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents
  2022-05-27 10:08   ` [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents Shaoxuan Yuan
  2022-05-27 12:10     ` Ævar Arnfjörð Bjarmason
@ 2022-05-27 19:36     ` Victoria Dye
  2022-05-27 19:59       ` Junio C Hamano
  1 sibling, 1 reply; 95+ messages in thread
From: Victoria Dye @ 2022-05-27 19:36 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: derrickstolee, gitster, newren

Shaoxuan Yuan wrote:
> Originally, moving a sparse file/directory with "git mv" from out-of-cone
> to in-cone (or vice versa) does not update its sparsity to follow the
> sparse-checkout patterns.
> 

I generally agree with the intent here - that, if you move a non-sparse file
out-of-cone, it should become sparse (and vice versa). However, that result
can be reached by simply flipping the 'SKIP_WORKTREE' bit(s) on the
resultant index entry/entries (which you already have, since they're renamed
with 'rename_cache_entry_at()' below). 

Note that you'll also probably need to check out the file(s) (if moving into
the cone) or remove them from disk (if moving out of cone). If you don't,
files moved into cone will appear "deleted" on-disk, and files moved
out-of-cone that still appear on disk will have 'SKIP_WORKTREE'
automatically disabled (see [1]).

For reference, I'd advise against reapplying the sparsity patterns - as you
do below - because it involves a much more expensive traversal of the entire
repository. It also has the possibly unwanted side effect of resetting the
'SKIP_WORKTREE' bit to match the sparse patterns on *all* files, not just
the one(s) you moved. 

[1] https://lore.kernel.org/git/11d46a399d26c913787b704d2b7169cafc28d639.1642175983.git.gitgitgadget@gmail.com/
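
As a very rough sketch of the "flip the bit on just the moved entry" idea:
the struct, the flag value, and the prefix-based in-cone check below are
all hypothetical simplifications (real cone matching goes through the
sparse-checkout patterns and also includes files in parent directories),
and the sketch leaves out the on-disk checkout/removal mentioned above:

```c
#include <string.h>

/* Hypothetical flag bit; not the real CE_SKIP_WORKTREE value. */
#define DEMO_SKIP_WORKTREE (1u << 0)

/* Hypothetical, simplified stand-in for a renamed index entry. */
struct demo_moved_entry {
	const char *name;
	unsigned flags;
};

/* Simplified in-cone test: the path lives under one cone directory. */
static int demo_path_in_cone(const char *path, const char *cone_with_slash)
{
	return !strncmp(path, cone_with_slash, strlen(cone_with_slash));
}

/*
 * After the rename, set or clear the bit on this one entry based on
 * its new name, leaving every other entry in the index untouched.
 */
static void demo_update_sparsity_bit(struct demo_moved_entry *e,
				     const char *cone_with_slash)
{
	if (demo_path_in_cone(e->name, cone_with_slash))
		e->flags &= ~DEMO_SKIP_WORKTREE;
	else
		e->flags |= DEMO_SKIP_WORKTREE;
}
```

Only the entries that 'git mv' actually renamed are visited, so the cost
scales with the number of moved paths rather than the size of the index.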

> Use update_sparsity() after touching sparse contents, so the sparsity
> will be updated after the move.
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  builtin/mv.c                  | 19 +++++++++++++++++++
>  t/t7002-mv-sparse-checkout.sh | 16 ++++++++++++++++
>  2 files changed, 35 insertions(+)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index e64f251a69..2c02120941 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -13,6 +13,7 @@
>  #include "string-list.h"
>  #include "parse-options.h"
>  #include "submodule.h"
> +#include "unpack-trees.h"
>  
>  static const char * const builtin_mv_usage[] = {
>  	N_("git mv [<options>] <source>... <destination>"),
> @@ -158,6 +159,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  {
>  	int i, flags, gitmodules_modified = 0;
>  	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0, ignore_sparse = 0;
> +	int sparse_moved = 0;
>  	struct option builtin_mv_options[] = {
>  		OPT__VERBOSE(&verbose, N_("be verbose")),
>  		OPT__DRY_RUN(&show_only, N_("dry run")),
> @@ -376,6 +378,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		const char *src = source[i], *dst = destination[i];
>  		enum update_mode mode = modes[i];
>  		int pos;
> +		if (!sparse_moved && mode & (SPARSE | SKIP_WORKTREE_DIR))
> +			sparse_moved = 1;
>  		if (show_only || verbose)
>  			printf(_("Renaming %s to %s\n"), src, dst);
>  		if (show_only)
> @@ -403,6 +407,21 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		rename_cache_entry_at(pos, dst);
>  	}
>  
> +	if (sparse_moved) {
> +		struct unpack_trees_options o;
> +		memset(&o, 0, sizeof(o));
> +		o.verbose_update = isatty(2);
> +		o.update = 1;
> +		o.head_idx = -1;
> +		o.src_index = &the_index;
> +		o.dst_index = &the_index;
> +		o.skip_sparse_checkout = 0;
> +		o.pl = the_index.sparse_checkout_patterns;
> +		setup_unpack_trees_porcelain(&o, "mv");
> +		update_sparsity(&o);
> +		clear_unpack_trees_porcelain(&o);
> +	}
> +
>  	if (gitmodules_modified)
>  		stage_updated_gitmodules(&the_index);
>  
> diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
> index cf2f5dc46f..1fd3e3c0fc 100755
> --- a/t/t7002-mv-sparse-checkout.sh
> +++ b/t/t7002-mv-sparse-checkout.sh
> @@ -287,6 +287,22 @@ test_expect_success 'refuse to move sparse file to existing destination' '
>  	test_cmp expect stderr
>  '
>  
> +# Need fix.
> +#
> +# The *expected* behavior:
> +#
> +# Using --sparse to accept a sparse file, --force to overwrite the destination.
> +# The folder1/file1 should replace the sub/file1 without error.
> +#
> +# The *actual* behavior:
> +#
> +# It emits a warning:
> +#
> +# warning: Path ' sub/file1
> +# ' already present; will not overwrite with sparse update.
> +# After fixing the above paths, you may want to run `git sparse-checkout
> +# reapply`.
> +
>  test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
>  	git sparse-checkout disable &&
>  	git reset --hard &&

This error is (I think) part of 'update_sparsity()'. If you change the
approach to only modifying the 'SKIP_WORKTREE' bit, hopefully you'll get the
behavior you're looking for.

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents
  2022-05-27 19:36     ` Victoria Dye
@ 2022-05-27 19:59       ` Junio C Hamano
  2022-05-27 21:24         ` Victoria Dye
  0 siblings, 1 reply; 95+ messages in thread
From: Junio C Hamano @ 2022-05-27 19:59 UTC (permalink / raw)
  To: Victoria Dye; +Cc: Shaoxuan Yuan, git, derrickstolee, newren

Victoria Dye <vdye@github.com> writes:

> Note that you'll also probably need to check out the file(s) (if moving into
> the cone) or remove them from disk (if moving out of cone). If you don't,
> files moved into cone will appear "deleted" on-disk, and files moved
> out-of-cone that still appear on disk will have 'SKIP_WORKTREE'
> automatically disabled (see [1]).

Does it also imply that we should forbid "git mv" of a dirty path
out of the cone?  Or is that too draconian and it suffices to tweak
the rule slightly to "remove from the worktree when moving a clean
path out of cone", perhaps?  When a dirty path is moved out of cone,
we would trigger the "SKIP_WORKTREE automatically disabled" behaviour
and that would be a good thing, I imagine?


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents
  2022-05-27 19:59       ` Junio C Hamano
@ 2022-05-27 21:24         ` Victoria Dye
  2022-06-16 13:51           ` Shaoxuan Yuan
  0 siblings, 1 reply; 95+ messages in thread
From: Victoria Dye @ 2022-05-27 21:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shaoxuan Yuan, git, derrickstolee, newren

Junio C Hamano wrote:
> Victoria Dye <vdye@github.com> writes:
> 
>> Note that you'll also probably need to check out the file(s) (if moving into
>> the cone) or remove them from disk (if moving out of cone). If you don't,
>> files moved into cone will appear "deleted" on-disk, and files moved
>> out-of-cone that still appear on disk will have 'SKIP_WORKTREE'
>> automatically disabled (see [1]).
> 
> Does it also imply that we should forbid "git mv" of a dirty path
> out of the cone?  Or is that too draconian and it suffices to tweak
> the rule slightly to "remove from the worktree when moving a clean
> path out of cone", perhaps?  When a dirty path is moved out of cone,
> we would trigger the "SKIP_WORKTREE automatically disabled" behaviour
> and that would be a good thing, I imagine?
> 

I like the idea of the modified rule as an option since it *does* complete
the move in accordance with '--force', but doesn't result in silently lost
information. 

An alternative might be 'mv' refusing to move a modified file out-of-cone
(despite '--force'), printing something like
'WARNING_SPARSE_NOT_UPTODATE_FILE' ("Path 'x' not uptodate; will not remove
from working tree").

I'm not sure which would provide a more vs. less frustrating experience, but
both are at least safe in terms of preserving unstaged changes.


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 3/5] mv: check if <destination> exists in index to handle overwriting
  2022-05-27 10:08   ` [WIP v2 3/5] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
@ 2022-05-27 22:04     ` Victoria Dye
  0 siblings, 0 replies; 95+ messages in thread
From: Victoria Dye @ 2022-05-27 22:04 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: derrickstolee, gitster, newren

Shaoxuan Yuan wrote:
> Originally, moving a sparse file into cone can result in unwarned
> overwrite of existing entry. The expected behavior is that if the
> <destination> exists in the entry, user should be prompted to supply
> a [-f|--force] to carry out the operation, or the operation should
> fail.
> 
> Add a check mechanism to do that.
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  builtin/mv.c                  | 23 +++++++++++------------
>  t/t7002-mv-sparse-checkout.sh |  2 +-
>  2 files changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 32ad4d5682..62284e3f86 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -185,16 +185,6 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  
>  		length = strlen(src);
>  		if (lstat(src, &st) < 0) {
> -			/*
> -			 * TODO: for now, when you try to overwrite a <destination>
> -			 * with your <source> as a sparse file, if you supply a "--sparse"
> -			 * flag, then the action will be done without providing "--force"
> -			 * and no warning.
> -			 *
> -			 * This is mainly because the sparse <source>
> -			 * is not on-disk, and this if-else chain will be cut off early in
> -			 * this check, thus the "--force" check is ignored. Need fix.
> -			 */
>  

Given that this removes the "TODO" comment you just added in the previous
patch, I agree with Stolee's suggestion [1] that you mention this context in
the patch 2 commit message rather than a code comment. The commit message of
*this* patch already explains the behavior you're correcting, so I don't
think any other changes would be needed here.

[1] https://lore.kernel.org/git/0884b97b-0745-5cad-3034-a679be5d6c3a@github.com/

>  			int pos = cache_name_pos(src, length);
>  			if (pos >= 0) {
> @@ -203,8 +193,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  				if (ce_skip_worktree(ce)) {
>  					if (!ignore_sparse)
>  						string_list_append(&only_match_skip_worktree, src);
> -					else
> -						modes[i] = SPARSE;
> +					else {
> +						/* Check if dst exists in index */
> +						if (cache_name_pos(dst, strlen(dst)) >= 0) {
> +							if (force)
> +								modes[i] = SPARSE;
> +							else
> +								bad = _("destination exists");
> +						}
> +						else
> +							modes[i] = SPARSE;
> +					}
>  				}
>  				else
>  					bad = _("bad source");
> diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
> index 581ef4c0f6..2c9008573a 100755
> --- a/t/t7002-mv-sparse-checkout.sh
> +++ b/t/t7002-mv-sparse-checkout.sh
> @@ -272,7 +272,7 @@ test_expect_success 'can move out-of-cone file with --sparse' '
>  	test_path_is_file sub/file1
>  '
>  
> -test_expect_failure 'refuse to move sparse file to existing destination' '
> +test_expect_success 'refuse to move sparse file to existing destination' '
>  	git sparse-checkout disable &&
>  	git reset --hard &&
>  	mkdir folder1 &&

The rest of this looks good to me!

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 2/5] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-05-27 15:13     ` Derrick Stolee
@ 2022-05-27 22:38       ` Victoria Dye
  2022-05-31  8:06       ` Shaoxuan Yuan
  1 sibling, 0 replies; 95+ messages in thread
From: Victoria Dye @ 2022-05-27 22:38 UTC (permalink / raw)
  To: Derrick Stolee, Shaoxuan Yuan, git; +Cc: gitster, newren

Derrick Stolee wrote:
> 
> 
> On 5/27/2022 6:08 AM, Shaoxuan Yuan wrote:
>> Originally, when <source> is a file that is not on-disk but exists in
>> the index as a SKIP_WORKTREE-enabled cache entry, the "git mv" command
>> errors out with "bad source".
>>
>> Change the checking logic so that such a <source> file makes "git mv"
>> warn with "advise_on_updating_sparse_paths()" instead of "bad source";
>> the user can now also supply a "--sparse" flag so that the operation
>> is carried out successfully.
>>
>> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
>> ---
>>  builtin/mv.c                  | 26 +++++++++++++++++++++++++-
>>  t/t7002-mv-sparse-checkout.sh |  4 ++--
>>  2 files changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/builtin/mv.c b/builtin/mv.c
>> index 83a465ba83..32ad4d5682 100644
>> --- a/builtin/mv.c
>> +++ b/builtin/mv.c
>> @@ -185,8 +185,32 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>  
>>  		length = strlen(src);
>>  		if (lstat(src, &st) < 0) {
>> +			/*
>> +			 * TODO: for now, when you try to overwrite a <destination>
>> +			 * with your <source> as a sparse file, if you supply a "--sparse"
>> +			 * flag, then the action will be done without providing "--force"
>> +			 * and no warning.
>> +			 *
>> +			 * This is mainly because the sparse <source>
>> +			 * is not on-disk, and this if-else chain will be cut off early in
>> +			 * this check, thus the "--force" check is ignored. Need fix.
>> +			 */
> 
> I wonder if this is worth the comment here, or if we'd rather see
> the mention in the commit message. You have documented tests that
> fail in this case, so we already have something that marks this
> as "TODO" in a more discoverable place.
> 
>> +			int pos = cache_name_pos(src, length);
>> +			if (pos >= 0) {
>> +				const struct cache_entry *ce = active_cache[pos];
>> +
>> +				if (ce_skip_worktree(ce)) {
>> +					if (!ignore_sparse)
>> +						string_list_append(&only_match_skip_worktree, src);
>> +					else
>> +						modes[i] = SPARSE;
> 
> 
>> +				}
>> +				else
>> +					bad = _("bad source");
> 
> style nit:
> 
> 	} else {
> 		bad = _("bad source");
> 	}
> 

In case this advice seems contradictory with past style suggestions, here is
the relevant rule from 'Documentation/CodingGuidelines':

	- When there are multiple arms to a conditional and some of them
	  require braces, enclose even a single line block in braces for
	  consistency. E.g.:

		if (foo) {
			doit();
		} else {
			one();
			two();
			three();
		}

>> +			}
>>  			/* only error if existence is expected. */
>> -			if (modes[i] != SPARSE)
>> +			else if (modes[i] != SPARSE)
>>  				bad = _("bad source");
> 
> For this one, the comment makes it difficult to connect the 'else
> if' to its corresponding 'if'. Perhaps:
> 
> 	} else if (modes[i] != SPARSE) {
> 		/* only error if existence is expected. */
> 		bad = _("bad source");
> 	}
> 
>>  		} else if (!strncmp(src, dst, length) &&
>>  				(dst[length] == 0 || dst[length] == '/')) {
> 
> In general, I found this if/else-if chain hard to grok, and
> a lot of it is because we have "simple" cases at the end
> and the complicated parts have ever-increasing nesting. This
> is mostly due to the existing if/else-if chain in this method.
> 

Agreed that the if/else-if chains make 'cmd_mv' complicated. The most
frustrating thing about its current state (unrelated to this patch) is how
unclear it is whether any given conditions are mutually-exclusive vs.
dependent vs. one taking precedence over another. On that note... 

> Here is a diff that replaces that if/else-if chain with a
> 'goto' trick to jump ahead, allowing some code to decrease in
> tabbing:
> 

...while I'm usually hesitant to add more 'goto' labels to the code if it
can be avoided, I think that model fits this use case well.

> ---- >8 ----

[cutting the proposed refactor for space]

> ---- >8 ---
> 
> To me, this is a bit easier to parse, since we find the error
> cases and jump to the action before continuing on the "happy
> path". It does involve that first big refactor first, so I'd
> like to hear opinions of other contributors before you jump to
> taking this suggestion.
> 

I like how the refactored version simplifies 'cmd_mv', and how it
correspondingly simplifies the new checks in this (Shaoxuan's) patch. It
does still leave us with one big, monolithic 'cmd_mv', so in an ideal world
I'd probably lean towards pulling the innards of the main for-loop(s) into a
few dedicated functions (like 'validate_move_candidate', 'move_entry').
However, I'm happy with any improvement, and this refactor would certainly
give us that!

> Thanks,
> -Stolee


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 2/5] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-05-27 15:13     ` Derrick Stolee
  2022-05-27 22:38       ` Victoria Dye
@ 2022-05-31  8:06       ` Shaoxuan Yuan
  1 sibling, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-05-31  8:06 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, vdye, gitster, newren

On Fri, May 27, 2022 at 11:13 PM Derrick Stolee
<derrickstolee@github.com> wrote:
> > diff --git a/builtin/mv.c b/builtin/mv.c
> > index 83a465ba83..32ad4d5682 100644
> > --- a/builtin/mv.c
> > +++ b/builtin/mv.c
> > @@ -185,8 +185,32 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >
> >               length = strlen(src);
> >               if (lstat(src, &st) < 0) {
> > +                     /*
> > +                      * TODO: for now, when you try to overwrite a <destination>
> > +                      * with your <source> as a sparse file, if you supply a "--sparse"
> > +                      * flag, then the action will be done without providing "--force"
> > +                      * and no warning.
> > +                      *
> > +                      * This is mainly because the sparse <source>
> > +                      * is not on-disk, and this if-else chain will be cut off early in
> > +                      * this check, thus the "--force" check is ignored. Need fix.
> > +                      */
>
> I wonder if this is worth the comment here, or if we'd rather see
> the mention in the commit message. You have documented tests that
> fail in this case, so we already have something that marks this
> as "TODO" in a more discoverable place.

This comment was added during my local development; it should be
removed.

> > +                     int pos = cache_name_pos(src, length);
> > +                     if (pos >= 0) {
> > +                             const struct cache_entry *ce = active_cache[pos];
> > +
> > +                             if (ce_skip_worktree(ce)) {
> > +                                     if (!ignore_sparse)
> > +                                             string_list_append(&only_match_skip_worktree, src);
> > +                                     else
> > +                                             modes[i] = SPARSE;
>
>
> > +                             }
> > +                             else
> > +                                     bad = _("bad source");
>
> style nit:
>
>         } else {
>                 bad = _("bad source");
>         }
>
> > +                     }
> >                       /* only error if existence is expected. */
> > -                     if (modes[i] != SPARSE)
> > +                     else if (modes[i] != SPARSE)
> >                               bad = _("bad source");
>
> For this one, the comment makes it difficult to connect the 'else
> if' to its corresponding 'if'. Perhaps:
>
>         } else if (modes[i] != SPARSE) {
>                 /* only error if existence is expected. */
>                 bad = _("bad source");
>         }
>
> >               } else if (!strncmp(src, dst, length) &&
> >                               (dst[length] == 0 || dst[length] == '/')) {
>
> In general, I found this if/else-if chain hard to grok, and
> a lot of it is because we have "simple" cases at the end
> and the complicated parts have ever-increasing nesting. This
> is mostly due to the existing if/else-if chain in this method.
>
> Here is a diff that replaces that if/else-if chain with a
> 'goto' trick to jump ahead, allowing some code to decrease in
> tabbing:
>
> ---- >8 ----

[cutting the proposed refactor for space]

> ---- >8 ----
>
> But mostly the reason for this refactor is that the following
> diff should be equivalent to yours:
>
> ---- >8 ----
>
> diff --git a/builtin/mv.c b/builtin/mv.c
> index d8b5c24fb5..add48e23b4 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -185,11 +185,28 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>
>                 length = strlen(src);
>                 if (lstat(src, &st) < 0) {
> -                       /* only error if existence is expected. */
> -                       if (modes[i] != SPARSE) {
> +                       int pos;
> +                       const struct cache_entry *ce;
> +
> +                       pos = cache_name_pos(src, length);
> +                       if (pos < 0) {
> +                               /* only error if existence is expected. */
> +                               if (modes[i] != SPARSE)
> +                                       bad = _("bad source");
> +                               goto act_on_entry;
> +                       }
> +
> +                       ce = active_cache[pos];
> +                       if (!ce_skip_worktree(ce)) {
>                                 bad = _("bad source");
>                                 goto act_on_entry;
>                         }
> +
> +                       if (!ignore_sparse)
> +                               string_list_append(&only_match_skip_worktree, src);
> +                       else
> +                               modes[i] = SPARSE;
> +                       goto act_on_entry;
>                 }
>                 if (!strncmp(src, dst, length) &&
>                     (dst[length] == 0 || dst[length] == '/')) {
> ---- >8 ---
>
> To me, this is a bit easier to parse, since we find the error
> cases and jump to the action before continuing on the "happy
> path". It does involve that first big refactor first, so I'd
> like to hear opinions of other contributors before you jump to
> taking this suggestion.

True. I also find it easier to read. Though Victoria mentioned the
goto hazard, the gotos here decouple the huge chain, which brings
clarity and makes it easier to extend.

-- 
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 4/5] mv: add check_dir_in_index() and solve general dir check issue
  2022-05-27 15:27     ` Derrick Stolee
@ 2022-05-31  9:56       ` Shaoxuan Yuan
  2022-05-31 15:49         ` Derrick Stolee
  0 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-05-31  9:56 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, vdye, gitster, newren

On Fri, May 27, 2022 at 11:27 PM Derrick Stolee
<derrickstolee@github.com> wrote:
>
> On 5/27/2022 6:08 AM, Shaoxuan Yuan wrote:
> > +/*
> > + * Check if an out-of-cone directory should be in the index. Imagine this case
> > + * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
> > + * and thus the directory is sparsified.
> > + *
> > + * Return 0 if such directory exist (i.e. with any of its contained files not
> > + * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
> > + * Return 1 otherwise.
> > + */
> > +static int check_dir_in_index(const char *name, int namelen)
> > +{
> > +     int ret = 1;
> > +     const char *with_slash = add_slash(name);
> > +     int length = namelen + 1;
> > +
> > +     int pos = cache_name_pos(with_slash, length);
> > +     const struct cache_entry *ce;
> > +
> > +     if (pos < 0) {
> > +             pos = -pos - 1;
> > +             if (pos >= the_index.cache_nr)
> > +                     return ret;
> > +             ce = active_cache[pos];
> > +             if (strncmp(with_slash, ce->name, length))
> > +                     return ret;
> > +             if (ce_skip_worktree(ce))
> > +                     return ret = 0;
>
> This appears to check if the _first_ entry under the directory
> is sparse, but not if _all_ entries are sparse. These are not
> the same thing, even in cone-mode sparse-checkout. The t1092
> test directory has files like "folder1/0/0/a" but if
> "folder1/1" is in the sparse-checkout cone, then that first
> entry has the skip-worktree bit, but "folder1/1/a" and "folder1/a"
> do not.

Yes, it is checking the first entry and this would not work without the
lstat in the front. But I think the "lstat < 0" makes sure that this directory
cannot be partially sparsified.

It is either missing both in the worktree and index, or missing in the worktree
but present in index (with all its content sparsified). And because of that,
I think only the first entry needs to be checked.

> > +     }
> > +     return ret;
>
> At the moment, it doesn't seem like we need 'ret' since the
> only place you set it is in "return ret = 0;" (which could
> just be "return 0;" while the others are "return 1;"). But,
> perhaps you intended to create a loop over 'pos' while
> with_slash is a prefix of the cache entry?

I agree that this variable is redundant. But I fail to understand
the logical relation between the part before "But," and the part after
it. Could you please elaborate on that?

> > +                     else if (!check_dir_in_index(src, length) &&
> > +                                      !path_in_sparse_checkout(src_w_slash, &the_index)) {
>
> style-nit: You'll want to align the different parts of your
> logical statement to agree with the end of the "else if (",
>
>         else if (A &&
>                  B) {
>

This one is interesting because it appears just alright in my VSCode editor.
Later I found that it is because git-diff is using a tab size of 8 or something,
but my VSCode uses tab size of 4. After I configured the git-diff tab rendering
size, it looks alright. Same for another style nit down below.

> > +                             modes[i] = SKIP_WORKTREE_DIR;
>
> If we are moving to a flags-based model, should we convert all
> "modes[i] =" to "modes[i] |=" as a first step (before adding the
> SKIP_WORKTREE_DIR flag)?
>
> > +                             goto dir_check;
>
> Hm. While I did recommend using 'goto' to jump to a common end
> place in the loop body, I'm not sure about jumping into another
> else-if statement. This might be a good time to extract the
> code from "else if (src_is_dir)" below into a helper method that
> can be used in both places.

Right, this is suspicious. I wasn't familiar at all with C/C++, and being able
to do this inter-if-else jump also startled me.
I agree that it should be something more legitimate, like extracting a
method for it.

> > +                     }
> >                       /* only error if existence is expected. */
> >                       else if (modes[i] != SPARSE)
> >                               bad = _("bad source");
> > @@ -218,7 +264,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                               && lstat(dst, &st) == 0)
> >                       bad = _("cannot move directory over file");
> >               else if (src_is_dir) {
> > -                     int first = cache_name_pos(src, length), last;
> > +                     int first, last;
> > +dir_check:
> > +                     first = cache_name_pos(src, length);
> >
> >                       if (first >= 0)
> >                               prepare_move_submodule(src, first,
> > @@ -229,7 +277,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                       else { /* last - first >= 1 */
> >                               int j, dst_len, n;
> >
> > -                             modes[i] = WORKING_DIRECTORY;
> > +                             if (!modes[i])
> > +                                     modes[i] |= WORKING_DIRECTORY;
>
> This appears to only add the WORKING_DIRECTORY flag if modes[i] is
> already zero. This maybe implies that we wouldn't understand
> "WORKING_DIRECTORY | SKIP_WORKTREE_DIR" as a value.

At this point, I cannot think of the reason for writing it this way. And yes,
this does not make sense...
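
To illustrate the concern, here is a minimal sketch of the flags-based
model. The flag values and helper names below are hypothetical, chosen only
for illustration, and are not taken from builtin/mv.c. If every assignment
becomes "|=", both flags can be set on the same entry and the later
"mode & (...)" checks still behave as intended:

```c
/*
 * Hypothetical flag values for illustration only; each mode occupies
 * its own bit so that several modes can be combined with "|=".
 */
enum update_mode {
	WORKING_DIRECTORY = (1 << 1),
	INDEX = (1 << 2),
	SPARSE = (1 << 3),
	SKIP_WORKTREE_DIR = (1 << 4),
};

/* Nonzero when the on-disk rename() step should be skipped. */
static int skips_disk_rename(unsigned mode)
{
	return (mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) != 0;
}

/* Nonzero when the entry stands for a directory rather than a file. */
static int is_directory_entry(unsigned mode)
{
	return (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR)) != 0;
}
```

With an unconditional "modes[i] |= WORKING_DIRECTORY", an entry that already
carries SKIP_WORKTREE_DIR keeps that bit, so both helpers above report it
correctly.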

> >                               n = argc + last - first;
> >                               REALLOC_ARRAY(source, n);
> >                               REALLOC_ARRAY(destination, n);
> > @@ -331,7 +380,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                       printf(_("Renaming %s to %s\n"), src, dst);
> >               if (show_only)
> >                       continue;
> > -             if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
> > +             if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
> > +                     rename(src, dst) < 0) {
>
> style-nit: align your logical statements.
>
> >                       if (ignore_errors)
> >                               continue;
> >                       die_errno(_("renaming '%s' failed"), src);
> > @@ -345,7 +395,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                                                             1);
> >               }
> >
> > -             if (mode == WORKING_DIRECTORY)
> > +             if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
> >                       continue;
>
> Ok, here you check if _either_ mode is enabled, which is good. Maybe
> you don't need the "if (!modes[i])" part above.
>
> Thanks,
> -Stolee

-- 
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 4/5] mv: add check_dir_in_index() and solve general dir check issue
  2022-05-31  9:56       ` Shaoxuan Yuan
@ 2022-05-31 15:49         ` Derrick Stolee
  0 siblings, 0 replies; 95+ messages in thread
From: Derrick Stolee @ 2022-05-31 15:49 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, vdye, gitster, newren

On 5/31/2022 5:56 AM, Shaoxuan Yuan wrote:
> On Fri, May 27, 2022 at 11:27 PM Derrick Stolee
> <derrickstolee@github.com> wrote:
>>
>> On 5/27/2022 6:08 AM, Shaoxuan Yuan wrote:
...
>> This appears to check if the _first_ entry under the directory
>> is sparse, but not if _all_ entries are sparse. These are not
>> the same thing, even in cone-mode sparse-checkout. The t1092
>> test directory has files like "folder1/0/0/a" but if
>> "folder1/1" is in the sparse-checkout cone, then that first
>> entry has the skip-worktree bit, but "folder1/1/a" and "folder1/a"
>> do not.
> 
> Yes, it is checking the first entry and this would not work without the
> lstat in the front. But I think the "lstat < 0" makes sure that this directory
> cannot be partially sparsified.
> 
> It is either missing both in the worktree and index, or missing in the worktree
> but present in index (with all its content sparsified). And because of that,
> I think only the first entry needs to be checked.

Ah! Good thinking. I hadn't considered that extra detail, so
we get to save some cycles here.
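
To spell out the reasoning, here is a standalone sketch of the first-entry
check. Everything named toy_* is a hypothetical stand-in (for the cache
entry, for cache_name_pos(), and for check_dir_in_index()), not the real
index API. It assumes, as discussed above, that the caller has already seen
lstat() fail on the directory, so inspecting only the first entry under
"<dir>/" is enough:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cache entry: a path plus its sparse bit. */
struct toy_entry {
	const char *name;
	int skip_worktree;
};

/*
 * Return the position of the first entry whose name sorts at or after
 * 'key'. The array must be sorted by name, as index entries are.
 */
static size_t toy_lower_bound(const struct toy_entry *e, size_t nr,
			      const char *key)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(e[mid].name, key) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Same return convention as in the patch: 0 if the first entry under
 * 'dir_with_slash' exists and is skip-worktree, 1 otherwise.
 */
static int toy_check_dir_in_index(const struct toy_entry *e, size_t nr,
				  const char *dir_with_slash)
{
	size_t len = strlen(dir_with_slash);
	size_t pos = toy_lower_bound(e, nr, dir_with_slash);

	if (pos >= nr)
		return 1; /* nothing sorts at or after "<dir>/" */
	if (strncmp(dir_with_slash, e[pos].name, len))
		return 1; /* the next entry is not under "<dir>/" */
	return e[pos].skip_worktree ? 0 : 1;
}
```

Without the lstat() precondition this check would not be sufficient, since
a partially sparse directory can have a sparse first entry and non-sparse
later ones.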

>>> +     }
>>> +     return ret;
>>
>> At the moment, it doesn't seem like we need 'ret' since the
>> only place you set it is in "return ret = 0;" (which could
>> just be "return 0;" while the others are "return 1;"). But,
>> perhaps you intended to create a loop over 'pos' while
>> with_slash is a prefix of the cache entry?
> 
> I agree that this variable is redundant. But I fail to understand
> the logical relation between before "But," and after "But,". Please
> elaborate on that?

I was just thinking that if you intended to write a loop as
I had suggested, then 'ret' could be modified or used in more
places. Feel free to ignore since we resolved that.

>>> +                     else if (!check_dir_in_index(src, length) &&
>>> +                                      !path_in_sparse_checkout(src_w_slash, &the_index)) {
>>
>> style-nit: You'll want to align the different parts of your
>> logical statement to agree with the end of the "else if (",
>>
>>         else if (A &&
>>                  B) {
>>
> 
> This one is interesting because it appears just alright in my VSCode editor.
> Later I found that it is because git-diff is using a tab size of 8 or something,
> but my VSCode uses tab size of 4. After I configured the git-diff tab rendering
> size, it looks alright. Same for another style nit down below.

That'll do it. You can double-check the alignment in your GGG
PR, which should use the correct tab width.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents
  2022-05-27 21:24         ` Victoria Dye
@ 2022-06-16 13:51           ` Shaoxuan Yuan
  2022-06-16 16:42             ` Victoria Dye
  0 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-16 13:51 UTC (permalink / raw)
  To: Victoria Dye; +Cc: Junio C Hamano, git, derrickstolee, newren

On Sat, May 28, 2022 at 5:24 AM Victoria Dye <vdye@github.com> wrote:
>
> Junio C Hamano wrote:
> > Victoria Dye <vdye@github.com> writes:
> >
> >> Note that you'll also probably need to check out the file(s) (if moving into
> >> the cone) or remove them from disk (if moving out of cone). If you don't,
> >> files moved into cone will appear "deleted" on-disk, and files moved
> >> out-of-cone that still appear on disk will have 'SKIP_WORKTREE'
> >> automatically disabled (see [1]).
> >
> > Does it also imply that we should forbid "git mv" of a dirty path
> > out of the cone?  Or is that too draconian and it suffices to tweak
> > the rule slightly to "remove from the worktree when moving a clean
> > path out of cone", perhaps?  When a dirty path is moved out of cone,
> > we would trigger the "SKIP_WORKTREE automatically disabled" behaviour
> > and that would be a good thing, I imagine?
> >
>
> I like the idea of the modified rule as an option since it *does* complete
> the move in accordance with '--force', but doesn't result in silently lost
> information.
>
> An alternative might be 'mv' refusing to move a modified file out-of-cone
> (despite '--force'), printing something like
> 'WARNING_SPARSE_NOT_UPTODATE_FILE' ("Path 'x' not uptodate; will not remove
> from working tree").
>
> I'm not sure which would provide a more vs. less frustrating experience, but
> both are at least safe in terms of preserving unstaged changes.

For me, the alternative provides a less frustrating experience, since it
is more explicit (it gives a message and directly says NO). Also,
`sparse-checkout` users should expect the moved file to be missing from
the working tree, as opposed to being present.

And the tweaked rule suggested by Junio [1] might need an extra
`git sparse-checkout reapply` to re-sparsify the file that was moved
out-of-cone, after staging its change?

[1] https://lore.kernel.org/git/xmqq8rqm3fxa.fsf@gitster.g/
-- 
Thanks & Regards,
Shaoxuan

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents
  2022-06-16 13:51           ` Shaoxuan Yuan
@ 2022-06-16 16:42             ` Victoria Dye
  2022-06-17  2:15               ` Shaoxuan Yuan
  0 siblings, 1 reply; 95+ messages in thread
From: Victoria Dye @ 2022-06-16 16:42 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: Junio C Hamano, git, derrickstolee, newren

Shaoxuan Yuan wrote:
> On Sat, May 28, 2022 at 5:24 AM Victoria Dye <vdye@github.com> wrote:
>>
>> Junio C Hamano wrote:
>>> Victoria Dye <vdye@github.com> writes:
>>>
>>>> Note that you'll also probably need to check out the file(s) (if moving into
>>>> the cone) or remove them from disk (if moving out of cone). If you don't,
>>>> files moved into cone will appear "deleted" on-disk, and files moved
>>>> out-of-cone that still appear on disk will have 'SKIP_WORKTREE'
>>>> automatically disabled (see [1]).
>>>
>>> Does it also imply that we should forbid "git mv" of a dirty path
>>> out of the cone?  Or is that too draconian and it suffices to tweak
>>> the rule slightly to "remove from the worktree when moving a clean
>>> path out of cone", perhaps?  When a dirty path is moved out of cone,
>>> we would trigger the "SKIP_WORKTREE automatically disabled" behaviour
>>> and that would be a good thing, I imagine?
>>>
>>
>> I like the idea of the modified rule as an option since it *does* complete
>> the move in accordance with '--force', but doesn't result in silently lost
>> information.
>>
>> An alternative might be 'mv' refusing to move a modified file out-of-cone
>> (despite '--force'), printing something like
>> 'WARNING_SPARSE_NOT_UPTODATE_FILE' ("Path 'x' not uptodate; will not remove
>> from working tree").
>>
>> I'm not sure which would provide a more vs. less frustrating experience, but
>> both are at least safe in terms of preserving unstaged changes.
> 
> For me, the alternative provides a less frustrating experience.
> 
> Since it is more explicit (giving a message and directly saying NO).
> Also, the `sparse-checkout` users should expect the moved file to be
> missing in the working tree, as opposed to being present.
> 

Good point, since the sparseness of the destination file would be different
depending on whether it had local modifications or not (with no indication
from 'mv' of the different treatment).

If you're interested, maybe there's a middle-ground option? Suppose you want
to move a file 'file1' to an out-of-cone location:

1. If 'file1' is clean, regardless of use of '--force', move the file & make
   it sparse.
2. If 'file1' is *not* clean and '--force' is *not* used, refuse to move the
   file (with a "Path 'file1' not uptodate; will not move. Use '--force' to
   override." type of error).
3. If 'file1' is *not* clean and '--force' is used, move the file but do not
   make it sparse.

That way, '--force' really does force the move to happen, but users are
generally warned against it. I'm still not sure what the "right" approach
is, but to your point I think it should err on the side of not surprising
the user.
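
The three cases above can be sketched as a small shell function. This is
only an illustration of the proposed behavior, not code from 'mv': the
function name 'mv_out_of_cone_action' and the action strings it prints are
made up for this sketch.

```shell
# Hypothetical helper, not part of git: prints the action 'git mv' would
# take when moving an in-cone file to an out-of-cone destination under
# the middle-ground rules.
# $1: "clean" or "dirty"; $2: "force" or "noforce"
mv_out_of_cone_action () {
	state=$1 force=$2
	if test "$state" = clean
	then
		# Case 1: clean file, '--force' does not matter.
		echo "move, make sparse"
	elif test "$force" = force
	then
		# Case 3: dirty file with '--force'.
		echo "move, keep non-sparse"
	else
		# Case 2: dirty file without '--force'.
		echo "refuse"
	fi
}

# Print the whole decision table.
for state in clean dirty
do
	for force in noforce force
	do
		printf '%s %s: %s\n' "$state" "$force" \
			"$(mv_out_of_cone_action "$state" "$force")"
	done
done
```

Only the dirty case depends on '--force', which is what keeps the unstaged
changes on disk in every branch.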

> And the tweaked rule suggested by Junio [1] might need an extra
>  `git sparse-checkout reapply` to re-sparsify the file that moved out-of-cone
> after staging its change?
> 

Just so I understand correctly, do you mean 'git sparse-checkout reapply'
*as part of* the 'mv' operation? Or are you thinking that a user might want
to manually run 'git sparse-checkout reapply' after running 'mv'? 

If it's the former (internally calling 'git sparse-checkout reapply' in
'mv'), then no, you wouldn't want to do that. In Junio's suggestion, he said
(emphasis mine):

> When a dirty path is moved out of cone, we would trigger the
> "SKIP_WORKTREE automatically disabled" behaviour *and that would be a
> good thing, I imagine?*

We don't want the file moved out-of-cone to be sparse again because it has
local (on-disk) modifications that would disappear (since a file needs to be
removed from disk to be "sparse" in the eyes of 'sparse-checkout'). It's
*completely valid* behavior to have an out-of-cone file become non-sparse if
a user does something to cause that; it doesn't cause any bugs/corruption
with the repo. And, even if you did want to make the file sparse, it should
be done by manually setting 'SKIP_WORKTREE' and individually removing the
file from disk (for all the reasons I mentioned in my upthread comment [1]).

On the other hand, if you're talking about a user manually running 'git
sparse-checkout reapply' after the fact, that wouldn't work either - they'd
get an error:

warning: The following paths are not up to date and were left despite sparse patterns:
        <out-of-cone modified file>

[1] https://lore.kernel.org/git/077a0579-903e-32ad-029c-48572d471c84@github.com/

> [1] https://lore.kernel.org/git/xmqq8rqm3fxa.fsf@gitster.g/
> 


* Re: [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents
  2022-06-16 16:42             ` Victoria Dye
@ 2022-06-17  2:15               ` Shaoxuan Yuan
  0 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-17  2:15 UTC (permalink / raw)
  To: Victoria Dye; +Cc: Junio C Hamano, git, derrickstolee, newren

On Fri, Jun 17, 2022 at 12:42 AM Victoria Dye <vdye@github.com> wrote:
*Truncated messages*
> > For me, the alternative provides a less frustrating experience.
> >
> > Since it is more explicit (giving a message and directly saying NO).
> >> Also, the `sparse-checkout` users should expect the moved file to be
> > missing in the working tree, as opposed to being present.
> >
>
> Good point, since the sparseness of the destination file would be different
> depending on whether it had local modifications or not (with no indication
> from 'mv' of the different treatment).
>
> If you're interested, maybe there's a middle-ground option? Suppose you want
> to move a file 'file1' to an out-of-cone location:
>
> 1. If 'file1' is clean, regardless of use of '--force', move the file & make
>    it sparse.
> 2. If 'file1' is *not* clean and '--force' is *not* used, refuse to move the
>    file (with a "Path 'file1' not uptodate; will not move. Use '--force' to
>    override." type of error).
> 3. If 'file1' is *not* clean and '--force' is used, move the file but do not
>    make it sparse.
>
> That way, '--force' really does force the move to happen, but users are
> generally warned against it. I'm still not sure what the "right" approach
> is, but to your point I think it should err on the side of not surprising
> the user.

I generally think this middle-ground option is good. Though I think options
for "messing with sparse contents" should be handled by '--sparse' instead
of '--force', since the latter is used to "force move/rename even if target
exists". Mixing the two usages may cause confusion?

> > And the tweaked rule suggested by Junio [1] might need an extra
> >  `git sparse-checkout reapply` to re-sparsify the file that moved out-of-cone
> > after staging its change?
> >
>
> Just so I understand correctly, do you mean 'git sparse-checkout reapply'
> *as part of* the 'mv' operation? Or are you thinking that a user might want
> to manually run 'git sparse-checkout reapply' after running 'mv'?
>
> If it's the former (internally calling 'git sparse-checkout reapply' in
> 'mv'), then no, you wouldn't want to do that. In Junio's suggestion, he said
> (emphasis mine):
>
> > When a dirty path is moved out of cone, we would trigger the
> > "SKIP_WORKTREE automatically disabled" behaviour" *and that would be a
> > good thing, I imagine?*
>
> We don't want the file moved out-of-cone to be sparse again because it has
> local (on-disk) modifications that would disappear (since a file needs to be
> removed from disk to be "sparse" in the eyes of 'sparse-checkout'). It's
> *completely valid* behavior to have an out-of-cone file become non-sparse if
> a user does something to cause that; it doesn't cause any bugs/corruption
> with the repo. And, even if you did want to make the file sparse, it should
> be done by manually setting 'SKIP_WORKTREE' and individually removing the
> file from disk (for all the reasons I mentioned in my upthread comment [1]).
>
> On the other hand, if you're talking about a user manually running 'git
> sparse-checkout reapply' after the fact, that wouldn't work either - they'd
> get an error:
> warning: The following paths are not up to date and were left despite sparse patterns:
>         <out-of-cone modified file>

This is what I meant, a user manually running `git sparse-checkout reapply`.
Though I did say users should only do this "after staging its change".

I propose this solution, which sounds good to me:

1. If 'file1' is clean, move the file & make it sparse, but only if
   '--sparse' is used.
2. If 'file1' is dirty, move the file & *do not* make it sparse, again only
   if '--sparse' is used; instead advise something like
   "file1 is not up to date, keep it non-sparse.
   Stage file1 then run `git sparse-checkout reapply` to re-sparsify it."
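
To make the proposal concrete, here is a rough shell sketch of what the user
would see. It is an illustration only: 'mv_out_of_cone_proposal' is a
made-up name, the "hint:" prefix is a guess, and refusing the move when
'--sparse' is absent is my reading of the "iff" above.

```shell
# Hypothetical helper, not part of git: prints what 'git mv' would do
# when moving $1 out of cone under the proposal above.
# $2: "clean" or "dirty"; $3: "sparse" if '--sparse' was given.
mv_out_of_cone_proposal () {
	file=$1 state=$2 sparse=$3
	if test "$sparse" != sparse
	then
		# Assumption: without '--sparse' nothing is moved.
		echo "refuse to move $file"
	elif test "$state" = clean
	then
		echo "move $file, make it sparse"
	else
		# Dirty file: keep it on disk and advise the user.
		echo "move $file, keep it non-sparse"
		echo "hint: $file is not up to date, keep it non-sparse."
		echo "hint: Stage $file then run 'git sparse-checkout reapply' to re-sparsify it."
	fi
}

mv_out_of_cone_proposal file1 dirty sparse
```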

> [1] https://lore.kernel.org/git/077a0579-903e-32ad-029c-48572d471c84@github.com/
>
> > [1] https://lore.kernel.org/git/xmqq8rqm3fxa.fsf@gitster.g/
> >
-- 
Thanks & Regards,
Shaoxuan


* [WIP v3 0/7] mv: fix out-of-cone file/directory move logic
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (7 preceding siblings ...)
  2022-05-27 10:07 ` [WIP v2 0/5] " Shaoxuan Yuan
@ 2022-06-19  3:25 ` Shaoxuan Yuan
  2022-06-19  3:25   ` [WIP v3 1/7] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
                     ` (7 more replies)
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
  10 siblings, 8 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-19  3:25 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye, newren

The range-diff seems a bit messy: some of the changes are so big that I had
to raise --creation-factor high enough to reveal them.

## Changes since WIP v2 ##

1. Write helper functions for t7002 to reuse some code.

2. Refactor/decouple the if/else-if checking chain.

3. Separate out the 'update_mode' refactor into a single commit.

4. Stop using update_sparsity() and instead update the SKIP_WORKTREE
   bit for each cache_entry and check it out to the working tree.
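
As a rough illustration of item 4, clearing one flag bit leaves the other
bits of an entry's flags untouched. The sketch below uses plain shell
arithmetic rather than the real index code; 'clear_skip_worktree' and
'OTHER_FLAG' are made-up names, and the bit values are only for the demo.

```shell
# Illustrative flag values; OTHER_FLAG is made up for this demo.
CE_SKIP_WORKTREE=$((1 << 30))
OTHER_FLAG=$((1 << 3))

# Hypothetical helper: print the flags word $1 with the SKIP_WORKTREE
# bit cleared and every other bit left as it was.
clear_skip_worktree () {
	echo $(( $1 & ~CE_SKIP_WORKTREE ))
}

flags=$((CE_SKIP_WORKTREE | OTHER_FLAG))
flags=$(clear_skip_worktree "$flags")
echo "$flags"
```

In the series itself this is followed by checking the entry out to the
working tree, which the sketch does not attempt to model.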

## Limitations ##

At this point, we still don't have in-cone to out-of-cone move, which
I don't think is too much of a problem, since the title says this series is
about out-of-cone paths as the <source>.

But I think it is worth discussing whether we should implement in-cone to
out-of-cone move, since it would be nice (naturally) to have it working.

However, I noticed this from the mv man page:

"In the second form, the last argument has to be an existing directory; 
the given sources will be moved into this directory."

I think when trying to move out-of-cone, the last argument has to be a
non-existent directory? I'm a bit confused: should we update some of mv's
basic logic to accomplish this?

Shaoxuan Yuan (7):
  t7002: add tests for moving out-of-cone file/directory
  mv: decouple if/else-if checks using goto
  mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  mv: check if <destination> exists in index to handle overwriting
  mv: use flags mode for update_mode
  mv: add check_dir_in_index() and solve general dir check issue
  mv: update sparsity after moving from out-of-cone to in-cone

 builtin/mv.c                  | 244 +++++++++++++++++++++++++---------
 t/t7002-mv-sparse-checkout.sh |  85 ++++++++++++
 2 files changed, 264 insertions(+), 65 deletions(-)

Range-diff against v2:
1:  271445205d ! 1:  a08ce96935 t7002: add tests for moving out-of-cone file/directory
    @@ Commit message
     
         Add corresponding tests to test following situations:
     
    -    * 'refuse to move out-of-cone directory without --sparse'
    -    * 'can move out-of-cone directory with --sparse'
    -    * 'refuse to move out-of-cone file without --sparse'
    -    * 'can move out-of-cone file with --sparse'
    -    * 'refuse to move sparse file to existing destination'
    -    * 'move sparse file to existing destination with --force and --sparse'
    +    We do not have sufficient coverage of moving files outside
    +    of a sparse-checkout cone. Create new tests covering this
    +    behavior, keeping in mind that the user can include --sparse
    +    (or not), move a file or directory, and the destination can
    +    already exist in the index (in this case user can use --force
    +    to overwrite existing entry).
     
    +    Helped-by: Victoria Dye <vdye@github.com>
    +    Helped-by: Derrick Stolee <derrickstolee@github.com>
         Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
      ## t/t7002-mv-sparse-checkout.sh ##
    +@@ t/t7002-mv-sparse-checkout.sh: test_description='git mv in sparse working trees'
    + 
    + . ./test-lib.sh
    + 
    ++setup_sparse_checkout () {
    ++	mkdir folder1 &&
    ++	touch folder1/file1 &&
    ++	git add folder1 &&
    ++	git sparse-checkout set --cone sub
    ++}
    ++
    ++cleanup_sparse_checkout () {
    ++	git sparse-checkout disable &&
    ++	git reset --hard
    ++}
    ++
    + test_expect_success 'setup' "
    + 	mkdir -p sub/dir sub/dir2 &&
    + 	touch a b c sub/d sub/dir/e sub/dir2/e &&
    +@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move files to non-sparse dir' '
    + '
    + 
    + test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
    ++	test_when_finished "cleanup_sparse_checkout" &&
    + 	git reset --hard &&
    + 	git sparse-checkout init --no-cone &&
    + 	git sparse-checkout set a !/x y/ !x/y/z &&
     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
      	test_cmp expect stderr
      '
      
     +test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
    -+	git sparse-checkout disable &&
    -+	git reset --hard &&
    -+	mkdir folder1 &&
    -+	touch folder1/file1 &&
    -+	git add folder1 &&
    -+	git sparse-checkout init --cone &&
    -+	git sparse-checkout set sub &&
    ++	test_when_finished "cleanup_sparse_checkout" &&
    ++	setup_sparse_checkout &&
     +
     +	test_must_fail git mv folder1 sub 2>stderr &&
     +	cat sparse_error_header >expect &&
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +'
     +
     +test_expect_failure 'can move out-of-cone directory with --sparse' '
    -+	git sparse-checkout disable &&
    -+	git reset --hard &&
    -+	mkdir folder1 &&
    -+	touch folder1/file1 &&
    -+	git add folder1 &&
    -+	git sparse-checkout init --cone &&
    -+	git sparse-checkout set sub &&
    ++	test_when_finished "cleanup_sparse_checkout" &&
    ++	setup_sparse_checkout &&
     +
     +	git mv --sparse folder1 sub 1>actual 2>stderr &&
     +	test_must_be_empty stderr &&
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +'
     +
     +test_expect_failure 'refuse to move out-of-cone file without --sparse' '
    -+	git sparse-checkout disable &&
    -+	git reset --hard &&
    -+	mkdir folder1 &&
    -+	touch folder1/file1 &&
    -+	git add folder1 &&
    -+	git sparse-checkout init --cone &&
    -+	git sparse-checkout set sub &&
    ++	test_when_finished "cleanup_sparse_checkout" &&
    ++	setup_sparse_checkout &&
     +
     +	test_must_fail git mv folder1/file1 sub 2>stderr &&
     +	cat sparse_error_header >expect &&
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +'
     +
     +test_expect_failure 'can move out-of-cone file with --sparse' '
    -+	git sparse-checkout disable &&
    -+	git reset --hard &&
    -+	mkdir folder1 &&
    -+	touch folder1/file1 &&
    -+	git add folder1 &&
    -+	git sparse-checkout init --cone &&
    -+	git sparse-checkout set sub &&
    ++	test_when_finished "cleanup_sparse_checkout" &&
    ++	setup_sparse_checkout &&
     +
     +	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
     +	test_must_be_empty stderr &&
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +'
     +
     +test_expect_failure 'refuse to move sparse file to existing destination' '
    -+	git sparse-checkout disable &&
    -+	git reset --hard &&
    ++	test_when_finished "cleanup_sparse_checkout" &&
     +	mkdir folder1 &&
     +	touch folder1/file1 &&
     +	touch sub/file1 &&
     +	git add folder1 sub/file1 &&
    -+	git sparse-checkout init --cone &&
    -+	git sparse-checkout set sub &&
    ++	git sparse-checkout set --cone sub &&
     +
     +	test_must_fail git mv --sparse folder1/file1 sub 2>stderr &&
     +	echo "fatal: destination exists, source=folder1/file1, destination=sub/file1" >expect &&
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +'
     +
     +test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
    -+	git sparse-checkout disable &&
    -+	git reset --hard &&
    ++	test_when_finished "cleanup_sparse_checkout" &&
     +	mkdir folder1 &&
     +	touch folder1/file1 &&
     +	touch sub/file1 &&
     +	echo "overwrite" >folder1/file1 &&
     +	git add folder1 sub/file1 &&
    -+	git sparse-checkout init --cone &&
    -+	git sparse-checkout set sub &&
    ++	git sparse-checkout set --cone sub &&
     +
     +	git mv --sparse --force folder1/file1 sub 2>stderr &&
     +	test_must_be_empty stderr &&
-:  ---------- > 2:  8065fbc232 mv: decouple if/else-if checks using goto
2:  80f485f146 ! 3:  e227fe717b mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      
      		length = strlen(src);
      		if (lstat(src, &st) < 0) {
    -+			/*
    -+			 * TODO: for now, when you try to overwrite a <destination>
    -+			 * with your <source> as a sparse file, if you supply a "--sparse"
    -+			 * flag, then the action will be done without providing "--force"
    -+			 * and no warning.
    -+			 *
    -+			 * This is mainly because the sparse <source>
    -+			 * is not on-disk, and this if-else chain will be cut off early in
    -+			 * this check, thus the "--force" check is ignored. Need fix.
    -+			 */
    +-			/* only error if existence is expected. */
    +-			if (modes[i] != SPARSE) {
    ++			int pos;
    ++			const struct cache_entry *ce;
     +
    -+			int pos = cache_name_pos(src, length);
    -+			if (pos >= 0) {
    -+				const struct cache_entry *ce = active_cache[pos];
    -+
    -+				if (ce_skip_worktree(ce)) {
    -+					if (!ignore_sparse)
    -+						string_list_append(&only_match_skip_worktree, src);
    -+					else
    -+						modes[i] = SPARSE;
    -+				}
    -+				else
    ++			pos = cache_name_pos(src, length);
    ++			if (pos < 0) {
    ++				/* only error if existence is expected. */
    ++				if (modes[i] != SPARSE)
     +					bad = _("bad source");
    ++				goto act_on_entry;
     +			}
    - 			/* only error if existence is expected. */
    --			if (modes[i] != SPARSE)
    -+			else if (modes[i] != SPARSE)
    ++
    ++			ce = active_cache[pos];
    ++			if (!ce_skip_worktree(ce)) {
      				bad = _("bad source");
    - 		} else if (!strncmp(src, dst, length) &&
    - 				(dst[length] == 0 || dst[length] == '/')) {
    + 				goto act_on_entry;
    + 			}
    ++
    ++			if (!ignore_sparse)
    ++				string_list_append(&only_match_skip_worktree, src);
    ++			else
    ++				modes[i] = SPARSE;
    ++			goto act_on_entry;
    + 		}
    + 		if (!strncmp(src, dst, length) &&
    + 		    (dst[length] == 0 || dst[length] == '/')) {
     
      ## t/t7002-mv-sparse-checkout.sh ##
     @@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'can move out-of-cone directory with --sparse' '
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'can move out-of-cone directo
      
     -test_expect_failure 'refuse to move out-of-cone file without --sparse' '
     +test_expect_success 'refuse to move out-of-cone file without --sparse' '
    - 	git sparse-checkout disable &&
    - 	git reset --hard &&
    - 	mkdir folder1 &&
    + 	test_when_finished "cleanup_sparse_checkout" &&
    + 	setup_sparse_checkout &&
    + 
     @@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'refuse to move out-of-cone file without --sparse' '
      	test_cmp expect stderr
      '
      
     -test_expect_failure 'can move out-of-cone file with --sparse' '
     +test_expect_success 'can move out-of-cone file with --sparse' '
    - 	git sparse-checkout disable &&
    - 	git reset --hard &&
    - 	mkdir folder1 &&
    + 	test_when_finished "cleanup_sparse_checkout" &&
    + 	setup_sparse_checkout &&
    + 
3:  04572e5e6b ! 4:  d0de7678e3 mv: check if <destination> exists in index to handle overwriting
    @@ Commit message
     
      ## builtin/mv.c ##
     @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 
    - 		length = strlen(src);
    - 		if (lstat(src, &st) < 0) {
    --			/*
    --			 * TODO: for now, when you try to overwrite a <destination>
    --			 * with your <source> as a sparse file, if you supply a "--sparse"
    --			 * flag, then the action will be done without providing "--force"
    --			 * and no warning.
    --			 *
    --			 * This is mainly because the sparse <source>
    --			 * is not on-disk, and this if-else chain will be cut off early in
    --			 * this check, thus the "--force" check is ignored. Need fix.
    --			 */
    - 
    - 			int pos = cache_name_pos(src, length);
    - 			if (pos >= 0) {
    -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 				if (ce_skip_worktree(ce)) {
    - 					if (!ignore_sparse)
    - 						string_list_append(&only_match_skip_worktree, src);
    --					else
    --						modes[i] = SPARSE;
    -+					else {
    -+						/* Check if dst exists in index */
    -+						if (cache_name_pos(dst, strlen(dst)) >= 0) {
    -+							if (force)
    -+								modes[i] = SPARSE;
    -+							else
    -+								bad = _("destination exists");
    -+						}
    -+						else
    -+							modes[i] = SPARSE;
    -+					}
    - 				}
    - 				else
    - 					bad = _("bad source");
    + 				bad = _("bad source");
    + 				goto act_on_entry;
    + 			}
    +-
    +-			if (!ignore_sparse)
    ++			if (!ignore_sparse) {
    + 				string_list_append(&only_match_skip_worktree, src);
    +-			else
    ++				goto act_on_entry;
    ++			}
    ++			/* Check if dst exists in index */
    ++			if (cache_name_pos(dst, strlen(dst)) < 0) {
    + 				modes[i] = SPARSE;
    ++				goto act_on_entry;
    ++			}
    ++			if (!force) {
    ++				bad = _("destination exists");
    ++				goto act_on_entry;
    ++			}
    ++			modes[i] = SPARSE;
    + 			goto act_on_entry;
    + 		}
    + 		if (!strncmp(src, dst, length) &&
     
      ## t/t7002-mv-sparse-checkout.sh ##
     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone file with --sparse' '
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone file wi
      
     -test_expect_failure 'refuse to move sparse file to existing destination' '
     +test_expect_success 'refuse to move sparse file to existing destination' '
    - 	git sparse-checkout disable &&
    - 	git reset --hard &&
    + 	test_when_finished "cleanup_sparse_checkout" &&
      	mkdir folder1 &&
    + 	touch folder1/file1 &&
-:  ---------- > 5:  70540957b6 mv: use flags mode for update_mode
4:  4eeae40186 ! 6:  f8302f64e0 mv: add check_dir_in_index() and solve general dir check issue
    @@ Commit message
         instead of "bad source"; also user now can supply a "--sparse" flag so
         this operation can be carried out successfully.
     
    -    Also, as suggested by Derrick [1],
    -    move the in-line definition of "enum update_mode" to the top
    -    of the file and make it use "flags" mode (each state is a different
    -    bit in the word).
    -
    -    [1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
    -
         Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
      ## builtin/mv.c ##
    -@@ builtin/mv.c: static const char * const builtin_mv_usage[] = {
    - 	NULL
    - };
    - 
    -+enum update_mode {
    -+	BOTH = 0,
    -+	WORKING_DIRECTORY = (1 << 1),
    -+	INDEX = (1 << 2),
    -+	SPARSE = (1 << 3),
    -+	SKIP_WORKTREE_DIR = (1 << 4),
    -+};
    -+
    - #define DUP_BASENAME 1
    - #define KEEP_TRAILING_SLASH 2
    - 
     @@ builtin/mv.c: static int index_range_of_same_dir(const char *src, int length,
      	return last - first;
      }
    @@ builtin/mv.c: static int index_range_of_same_dir(const char *src, int length,
      {
      	int i, flags, gitmodules_modified = 0;
     @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 		OPT_END(),
    - 	};
    - 	const char **source, **destination, **dest_path, **submodule_gitfile;
    --	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
    -+	enum update_mode *modes;
    - 	struct stat st;
    - 	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
    - 	struct lock_file lock_file = LOCK_INIT;
    -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 		if (lstat(src, &st) < 0) {
    - 
    - 			int pos = cache_name_pos(src, length);
    -+			const char *src_w_slash = add_slash(src);
    -+
    - 			if (pos >= 0) {
    - 				const struct cache_entry *ce = active_cache[pos];
    + 	/* Checking */
    + 	for (i = 0; i < argc; i++) {
    + 		const char *src = source[i], *dst = destination[i];
    +-		int length, src_is_dir;
    ++		int length;
    + 		const char *bad = NULL;
    + 		int skip_sparse = 0;
      
     @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 				else
    + 
    + 			pos = cache_name_pos(src, length);
    + 			if (pos < 0) {
    ++				const char *src_w_slash = add_slash(src);
    ++				if (!check_dir_in_index(src, length) &&
    ++					!path_in_sparse_checkout(src_w_slash, &the_index)) {
    ++					modes[i] |= SKIP_WORKTREE_DIR;
    ++					goto dir_check;
    ++				}
    + 				/* only error if existence is expected. */
    + 				if (!(modes[i] & SPARSE))
      					bad = _("bad source");
    + 				goto act_on_entry;
      			}
    -+			else if (!check_dir_in_index(src, length) &&
    -+					 !path_in_sparse_checkout(src_w_slash, &the_index)) {
    -+				modes[i] = SKIP_WORKTREE_DIR;
    -+				goto dir_check;
    -+			}
    - 			/* only error if existence is expected. */
    - 			else if (modes[i] != SPARSE)
    +-
    + 			ce = active_cache[pos];
    + 			if (!ce_skip_worktree(ce)) {
      				bad = _("bad source");
     @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 				&& lstat(dst, &st) == 0)
    + 			bad = _("can not move directory into itself");
    + 			goto act_on_entry;
    + 		}
    +-		if ((src_is_dir = S_ISDIR(st.st_mode))
    ++		if (S_ISDIR(st.st_mode)
    + 		    && lstat(dst, &st) == 0) {
      			bad = _("cannot move directory over file");
    - 		else if (src_is_dir) {
    + 			goto act_on_entry;
    + 		}
    +-		if (src_is_dir) {
    ++
    ++dir_check:
    ++		if (S_ISDIR(st.st_mode)) {
    + 			int j, dst_len, n;
     -			int first = cache_name_pos(src, length), last;
     +			int first, last;
    -+dir_check:
     +			first = cache_name_pos(src, length);
      
    - 			if (first >= 0)
    + 			if (first >= 0) {
      				prepare_move_submodule(src, first,
    -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 			else { /* last - first >= 1 */
    - 				int j, dst_len, n;
    - 
    --				modes[i] = WORKING_DIRECTORY;
    -+				if (!modes[i])
    -+					modes[i] |= WORKING_DIRECTORY;
    - 				n = argc + last - first;
    - 				REALLOC_ARRAY(source, n);
    - 				REALLOC_ARRAY(destination, n);
    -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 			printf(_("Renaming %s to %s\n"), src, dst);
    - 		if (show_only)
    - 			continue;
    --		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
    -+		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
    -+		 	rename(src, dst) < 0) {
    - 			if (ignore_errors)
    - 				continue;
    - 			die_errno(_("renaming '%s' failed"), src);
    -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - 							      1);
    - 		}
    - 
    --		if (mode == WORKING_DIRECTORY)
    -+		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
    - 			continue;
    - 
    - 		pos = cache_name_pos(src, strlen(src));
     
      ## t/t7002-mv-sparse-checkout.sh ##
     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
      
     -test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
     +test_expect_success 'refuse to move out-of-cone directory without --sparse' '
    - 	git sparse-checkout disable &&
    - 	git reset --hard &&
    - 	mkdir folder1 &&
    + 	test_when_finished "cleanup_sparse_checkout" &&
    + 	setup_sparse_checkout &&
    + 
     @@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
      	test_cmp expect stderr
      '
      
     -test_expect_failure 'can move out-of-cone directory with --sparse' '
     +test_expect_success 'can move out-of-cone directory with --sparse' '
    - 	git sparse-checkout disable &&
    - 	git reset --hard &&
    - 	mkdir folder1 &&
    + 	test_when_finished "cleanup_sparse_checkout" &&
    + 	setup_sparse_checkout &&
    + 
5:  a3a296c3ef ! 7:  bc996931e2 mv: use update_sparsity() after touching sparse contents
    @@ Metadata
     Author: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
      ## Commit message ##
    -    mv: use update_sparsity() after touching sparse contents
    +    mv: update sparsity after moving from out-of-cone to in-cone
     
    -    Originally, "git mv" a sparse file/directory from out/in-cone to
    -    in/out-cone does not update the sparsity following the sparse-checkout
    -    patterns.
    +    Originally, "git mv" a sparse file from out-of-cone to
    +    in-cone does not update the moved file's sparsity (remove its
    +    SKIP_WORKTREE bit). And the corresponding cache entry is, unexpectedly,
    +    not checked out in the working tree.
     
    -    Use update_sparsity() after touching sparse contents, so the sparsity
    -    will be updated after the move.
    +    Update the behavior so that:
    +    1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
    +       corresponding cache entry.
    +    2. The moved cache entry is checked out in the working tree to reflect
    +       the updated sparsity.
     
         Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
    @@ builtin/mv.c
      #include "string-list.h"
      #include "parse-options.h"
      #include "submodule.h"
    -+#include "unpack-trees.h"
    ++#include "entry.h"
      
      static const char * const builtin_mv_usage[] = {
      	N_("git mv [<options>] <source>... <destination>"),
    -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    - {
    - 	int i, flags, gitmodules_modified = 0;
    - 	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0, ignore_sparse = 0;
    -+	int sparse_moved = 0;
    - 	struct option builtin_mv_options[] = {
    - 		OPT__VERBOSE(&verbose, N_("be verbose")),
    - 		OPT__DRY_RUN(&show_only, N_("dry run")),
     @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      		const char *src = source[i], *dst = destination[i];
      		enum update_mode mode = modes[i];
      		int pos;
    -+		if (!sparse_moved && mode & (SPARSE | SKIP_WORKTREE_DIR))
    -+			sparse_moved = 1;
    ++		struct checkout state = CHECKOUT_INIT;
    ++		state.istate = &the_index;
    ++
    ++		if (force)
    ++			state.force = 1;
      		if (show_only || verbose)
      			printf(_("Renaming %s to %s\n"), src, dst);
      		if (show_only)
     @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    + 		pos = cache_name_pos(src, strlen(src));
    + 		assert(pos >= 0);
      		rename_cache_entry_at(pos, dst);
    ++
    ++		if (mode & SPARSE) {
    ++			if (path_in_sparse_checkout(dst, &the_index)) {
    ++				int dst_pos;
    ++
    ++				dst_pos = cache_name_pos(dst, strlen(dst));
    ++				active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
    ++
    ++				if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
    ++					die(_("cannot checkout %s"), ce->name);
    ++			}
    ++		}
      	}
      
    -+	if (sparse_moved) {
    -+		struct unpack_trees_options o;
    -+		memset(&o, 0, sizeof(o));
    -+		o.verbose_update = isatty(2);
    -+		o.update = 1;
    -+		o.head_idx = -1;
    -+		o.src_index = &the_index;
    -+		o.dst_index = &the_index;
    -+		o.skip_sparse_checkout = 0;
    -+		o.pl = the_index.sparse_checkout_patterns;
    -+		setup_unpack_trees_porcelain(&o, "mv");
    -+		update_sparsity(&o);
    -+		clear_unpack_trees_porcelain(&o);
    -+	}
    -+
      	if (gitmodules_modified)
    - 		stage_updated_gitmodules(&the_index);
    - 
     
      ## t/t7002-mv-sparse-checkout.sh ##
    +@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone directory with --sparse' '
    + 	git mv --sparse folder1 sub 1>actual 2>stderr &&
    + 	test_must_be_empty stderr &&
    + 
    +-	git sparse-checkout reapply &&
    + 	test_path_is_dir sub/folder1 &&
    + 	test_path_is_file sub/folder1/file1
    + '
    +@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone file with --sparse' '
    + 	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
    + 	test_must_be_empty stderr &&
    + 
    +-	git sparse-checkout reapply &&
    + 	! test_path_is_dir sub/folder1 &&
    + 	test_path_is_file sub/file1
    + '
     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move sparse file to existing destination' '
      	test_cmp expect stderr
      '
      
    -+# Need fix.
    -+#
    -+# The *expected* behavior:
    -+#
    -+# Using --sparse to accept a sparse file, --force to overwrite the destination.
    -+# The folder1/file1 should replace the sub/file1 without error.
    -+#
    -+# The *actual* behavior:
    -+#
    -+# It emits a warning:
    -+#
    -+# warning: Path ' sub/file1
    -+# ' already present; will not overwrite with sparse update.
    -+# After fixing the above paths, you may want to run `git sparse-checkout
    -+# reapply`.
    -+
    - test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
    - 	git sparse-checkout disable &&
    - 	git reset --hard &&
    +-test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
    ++test_expect_success 'move sparse file to existing destination with --force and --sparse' '
    + 	test_when_finished "cleanup_sparse_checkout" &&
    + 	mkdir folder1 &&
    + 	touch folder1/file1 &&

base-commit: 4f6db706e6ad145a9bf6b26a1ca0970bed27bb72
-- 
2.35.1



* [WIP v3 1/7] t7002: add tests for moving out-of-cone file/directory
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
@ 2022-06-19  3:25   ` Shaoxuan Yuan
  2022-06-21 21:23     ` Victoria Dye
  2022-06-19  3:25   ` [WIP v3 2/7] mv: decouple if/else-if checks using goto Shaoxuan Yuan
                     ` (6 subsequent siblings)
  7 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-19  3:25 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye, newren

We do not have sufficient coverage of moving files outside
of a sparse-checkout cone. Create new tests covering this
behavior, keeping in mind that the user can include --sparse
(or not), move a file or directory, and the destination can
already exist in the index (in this case the user can use --force
to overwrite the existing entry).

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 t/t7002-mv-sparse-checkout.sh | 87 +++++++++++++++++++++++++++++++++++
 1 file changed, 87 insertions(+)

diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index f0f7cbfcdb..d6e7315a5a 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -4,6 +4,18 @@ test_description='git mv in sparse working trees'
 
 . ./test-lib.sh
 
+setup_sparse_checkout () {
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout set --cone sub
+}
+
+cleanup_sparse_checkout () {
+	git sparse-checkout disable &&
+	git reset --hard
+}
+
 test_expect_success 'setup' "
 	mkdir -p sub/dir sub/dir2 &&
 	touch a b c sub/d sub/dir/e sub/dir2/e &&
@@ -196,6 +208,7 @@ test_expect_success 'can move files to non-sparse dir' '
 '
 
 test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
+	test_when_finished "cleanup_sparse_checkout" &&
 	git reset --hard &&
 	git sparse-checkout init --no-cone &&
 	git sparse-checkout set a !/x y/ !x/y/z &&
@@ -206,4 +219,78 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
 	test_cmp expect stderr
 '
 
+test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	test_must_fail git mv folder1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'can move out-of-cone directory with --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	git mv --sparse folder1 sub 1>actual 2>stderr &&
+	test_must_be_empty stderr &&
+
+	git sparse-checkout reapply &&
+	test_path_is_dir sub/folder1 &&
+	test_path_is_file sub/folder1/file1
+'
+
+test_expect_failure 'refuse to move out-of-cone file without --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	test_must_fail git mv folder1/file1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'can move out-of-cone file with --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
+	test_must_be_empty stderr &&
+
+	git sparse-checkout reapply &&
+	! test_path_is_dir sub/folder1 &&
+	test_path_is_file sub/file1
+'
+
+test_expect_failure 'refuse to move sparse file to existing destination' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	touch sub/file1 &&
+	git add folder1 sub/file1 &&
+	git sparse-checkout set --cone sub &&
+
+	test_must_fail git mv --sparse folder1/file1 sub 2>stderr &&
+	echo "fatal: destination exists, source=folder1/file1, destination=sub/file1" >expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	touch sub/file1 &&
+	echo "overwrite" >folder1/file1 &&
+	git add folder1 sub/file1 &&
+	git sparse-checkout set --cone sub &&
+
+	git mv --sparse --force folder1/file1 sub 2>stderr &&
+	test_must_be_empty stderr &&
+	echo "overwrite" >expect &&
+	test_cmp expect sub/file1
+'
+
 test_done
-- 
2.35.1



* [WIP v3 2/7] mv: decouple if/else-if checks using goto
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
  2022-06-19  3:25   ` [WIP v3 1/7] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
@ 2022-06-19  3:25   ` Shaoxuan Yuan
  2022-06-19  3:25   ` [WIP v3 3/7] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-19  3:25 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye, newren

The previous if/else-if chain is deeply nested and hard to develop or extend.

Refactor to decouple this chain into separate checks, each using goto to
jump ahead to the common act_on_entry label.
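
For illustration only, the shape of this refactoring can be sketched outside
of builtin/mv.c. The helper name check_move() and its reduced set of checks
are assumptions made for this sketch; only the prefix test and the error
strings are taken from the patch, and no filesystem or index access is
modeled:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of builtin/mv.c: return an error string
 * for an invalid (src, dst) pair, or NULL when every check passes.
 * Each check stands alone and jumps to the shared label on failure.
 */
static const char *check_move(const char *src, const char *dst)
{
	const char *bad = NULL;
	size_t length, dst_len;

	assert(src && dst);
	length = strlen(src);
	dst_len = strlen(dst);

	if (!length) {
		bad = "bad source";
		goto act_on_entry;
	}
	/* dst is src itself or a path below src (same test as the patch) */
	if (!strncmp(src, dst, length) &&
	    (dst[length] == 0 || dst[length] == '/')) {
		bad = "can not move directory into itself";
		goto act_on_entry;
	}
	/* simplified: the real code only gets here for a missing directory */
	if (dst_len && dst[dst_len - 1] == '/') {
		bad = "destination directory does not exist";
		goto act_on_entry;
	}

act_on_entry:
	return bad;
}
```

With this layout a new check can be added as one more independent block
before the label, without re-indenting the checks around it.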

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c | 139 +++++++++++++++++++++++++++++----------------------
 1 file changed, 80 insertions(+), 59 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 83a465ba83..1ca2c21da8 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -186,53 +186,68 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
 			/* only error if existence is expected. */
-			if (modes[i] != SPARSE)
+			if (modes[i] != SPARSE) {
 				bad = _("bad source");
-		} else if (!strncmp(src, dst, length) &&
-				(dst[length] == 0 || dst[length] == '/')) {
+				goto act_on_entry;
+			}
+		}
+		if (!strncmp(src, dst, length) &&
+		    (dst[length] == 0 || dst[length] == '/')) {
 			bad = _("can not move directory into itself");
-		} else if ((src_is_dir = S_ISDIR(st.st_mode))
-				&& lstat(dst, &st) == 0)
+			goto act_on_entry;
+		}
+		if ((src_is_dir = S_ISDIR(st.st_mode))
+		    && lstat(dst, &st) == 0) {
 			bad = _("cannot move directory over file");
-		else if (src_is_dir) {
+			goto act_on_entry;
+		}
+		if (src_is_dir) {
+			int j, dst_len, n;
 			int first = cache_name_pos(src, length), last;
 
-			if (first >= 0)
+			if (first >= 0) {
 				prepare_move_submodule(src, first,
 						       submodule_gitfile + i);
-			else if (index_range_of_same_dir(src, length,
-							 &first, &last) < 1)
+				goto act_on_entry;
+			} else if (index_range_of_same_dir(src, length,
+							   &first, &last) < 1) {
 				bad = _("source directory is empty");
-			else { /* last - first >= 1 */
-				int j, dst_len, n;
-
-				modes[i] = WORKING_DIRECTORY;
-				n = argc + last - first;
-				REALLOC_ARRAY(source, n);
-				REALLOC_ARRAY(destination, n);
-				REALLOC_ARRAY(modes, n);
-				REALLOC_ARRAY(submodule_gitfile, n);
-
-				dst = add_slash(dst);
-				dst_len = strlen(dst);
-
-				for (j = 0; j < last - first; j++) {
-					const struct cache_entry *ce = active_cache[first + j];
-					const char *path = ce->name;
-					source[argc + j] = path;
-					destination[argc + j] =
-						prefix_path(dst, dst_len, path + length + 1);
-					modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
-					submodule_gitfile[argc + j] = NULL;
-				}
-				argc += last - first;
+				goto act_on_entry;
 			}
-		} else if (!(ce = cache_file_exists(src, length, 0))) {
+
+			/* last - first >= 1 */
+			modes[i] = WORKING_DIRECTORY;
+			n = argc + last - first;
+			REALLOC_ARRAY(source, n);
+			REALLOC_ARRAY(destination, n);
+			REALLOC_ARRAY(modes, n);
+			REALLOC_ARRAY(submodule_gitfile, n);
+
+			dst = add_slash(dst);
+			dst_len = strlen(dst);
+
+			for (j = 0; j < last - first; j++) {
+				const struct cache_entry *ce = active_cache[first + j];
+				const char *path = ce->name;
+				source[argc + j] = path;
+				destination[argc + j] =
+					prefix_path(dst, dst_len, path + length + 1);
+				modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
+				submodule_gitfile[argc + j] = NULL;
+			}
+			argc += last - first;
+			goto act_on_entry;
+		}
+		if (!(ce = cache_file_exists(src, length, 0))) {
 			bad = _("not under version control");
-		} else if (ce_stage(ce)) {
+			goto act_on_entry;
+		}
+		if (ce_stage(ce)) {
 			bad = _("conflicted");
-		} else if (lstat(dst, &st) == 0 &&
-			 (!ignore_case || strcasecmp(src, dst))) {
+			goto act_on_entry;
+		}
+		if (lstat(dst, &st) == 0 &&
+		    (!ignore_case || strcasecmp(src, dst))) {
 			bad = _("destination exists");
 			if (force) {
 				/*
@@ -246,34 +261,40 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				} else
 					bad = _("Cannot overwrite");
 			}
-		} else if (string_list_has_string(&src_for_dst, dst))
+			goto act_on_entry;
+		}
+		if (string_list_has_string(&src_for_dst, dst)) {
 			bad = _("multiple sources for the same target");
-		else if (is_dir_sep(dst[strlen(dst) - 1]))
+			goto act_on_entry;
+		}
+		if (is_dir_sep(dst[strlen(dst) - 1])) {
 			bad = _("destination directory does not exist");
-		else {
-			/*
-			 * We check if the paths are in the sparse-checkout
-			 * definition as a very final check, since that
-			 * allows us to point the user to the --sparse
-			 * option as a way to have a successful run.
-			 */
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(src, &the_index)) {
-				string_list_append(&only_match_skip_worktree, src);
-				skip_sparse = 1;
-			}
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(dst, &the_index)) {
-				string_list_append(&only_match_skip_worktree, dst);
-				skip_sparse = 1;
-			}
-
-			if (skip_sparse)
-				goto remove_entry;
+			goto act_on_entry;
+		}
 
-			string_list_insert(&src_for_dst, dst);
+		/*
+		 * We check if the paths are in the sparse-checkout
+		 * definition as a very final check, since that
+		 * allows us to point the user to the --sparse
+		 * option as a way to have a successful run.
+		 */
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(src, &the_index)) {
+			string_list_append(&only_match_skip_worktree, src);
+			skip_sparse = 1;
+		}
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(dst, &the_index)) {
+			string_list_append(&only_match_skip_worktree, dst);
+			skip_sparse = 1;
 		}
 
+		if (skip_sparse)
+			goto remove_entry;
+
+		string_list_insert(&src_for_dst, dst);
+
+act_on_entry:
 		if (!bad)
 			continue;
 		if (!ignore_errors)
-- 
2.35.1



* [WIP v3 3/7] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
  2022-06-19  3:25   ` [WIP v3 1/7] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
  2022-06-19  3:25   ` [WIP v3 2/7] mv: decouple if/else-if checks using goto Shaoxuan Yuan
@ 2022-06-19  3:25   ` Shaoxuan Yuan
  2022-06-19  3:25   ` [WIP v3 4/7] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-19  3:25 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye, newren

Originally, when moving a <source> file that is not on disk but exists in
the index as a cache entry with the SKIP_WORKTREE bit set, "git mv" errors
out with "bad source".

Change the checking logic so that, for such a <source> file, "git mv"
advises the user via advise_on_updating_sparse_paths() instead of failing
with "bad source"; the user can also now supply "--sparse" so that the
operation is carried out successfully.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 21 +++++++++++++++++++--
 t/t7002-mv-sparse-checkout.sh |  4 ++--
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 1ca2c21da8..9d8494a2e4 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -185,11 +185,28 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
-			/* only error if existence is expected. */
-			if (modes[i] != SPARSE) {
+			int pos;
+			const struct cache_entry *ce;
+
+			pos = cache_name_pos(src, length);
+			if (pos < 0) {
+				/* only error if existence is expected. */
+				if (modes[i] != SPARSE)
+					bad = _("bad source");
+				goto act_on_entry;
+			}
+
+			ce = active_cache[pos];
+			if (!ce_skip_worktree(ce)) {
 				bad = _("bad source");
 				goto act_on_entry;
 			}
+
+			if (!ignore_sparse)
+				string_list_append(&only_match_skip_worktree, src);
+			else
+				modes[i] = SPARSE;
+			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
 		    (dst[length] == 0 || dst[length] == '/')) {
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index d6e7315a5a..1984cf131d 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -242,7 +242,7 @@ test_expect_failure 'can move out-of-cone directory with --sparse' '
 	test_path_is_file sub/folder1/file1
 '
 
-test_expect_failure 'refuse to move out-of-cone file without --sparse' '
+test_expect_success 'refuse to move out-of-cone file without --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
@@ -253,7 +253,7 @@ test_expect_failure 'refuse to move out-of-cone file without --sparse' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'can move out-of-cone file with --sparse' '
+test_expect_success 'can move out-of-cone file with --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
-- 
2.35.1



* [WIP v3 4/7] mv: check if <destination> exists in index to handle overwriting
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                     ` (2 preceding siblings ...)
  2022-06-19  3:25   ` [WIP v3 3/7] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
@ 2022-06-19  3:25   ` Shaoxuan Yuan
  2022-06-19  3:25   ` [WIP v3 5/7] mv: use flags mode for update_mode Shaoxuan Yuan
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-19  3:25 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye, newren

Originally, moving a sparse file into the cone can silently overwrite an
existing entry. The expected behavior is that if the <destination>
already exists in the index, the user must supply [-f|--force] to carry
out the operation; otherwise the operation should fail.

Add a check to enforce that.
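
As a rough summary, the decisions made at this point in the series for a
<source> that is missing on disk can be sketched as a standalone function.
The names enum outcome, struct missing_src and classify_missing_source()
are assumptions made for this sketch and do not exist in builtin/mv.c. The
sketch also simplifies one case: the real code does not report "bad source"
for a path that is absent from the index but already marked SPARSE:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the checking loop decides for one source. */
enum outcome {
	BAD_SOURCE,		/* error out with "bad source" */
	ADVISE_SPARSE,		/* skip the entry and advise using --sparse */
	DESTINATION_EXISTS,	/* error out with "destination exists" */
	MOVE_AS_SPARSE		/* mark the entry SPARSE and move it */
};

/* Hypothetical inputs; in builtin/mv.c they come from the index and options. */
struct missing_src {
	bool in_index;		/* cache_name_pos(src) >= 0 */
	bool skip_worktree;	/* ce_skip_worktree(ce) */
	bool dst_in_index;	/* cache_name_pos(dst) >= 0 */
	bool sparse_opt;	/* --sparse given */
	bool force_opt;		/* -f/--force given */
};

static enum outcome classify_missing_source(const struct missing_src *s)
{
	assert(s);
	if (!s->in_index || !s->skip_worktree)
		return BAD_SOURCE;
	if (!s->sparse_opt)
		return ADVISE_SPARSE;
	if (s->dst_in_index && !s->force_opt)
		return DESTINATION_EXISTS;
	return MOVE_AS_SPARSE;
}
```

The order of the tests mirrors the order of the checks in the hunk below:
the --sparse check comes before the destination check, so a user who omits
--sparse sees the sparse-checkout advice first.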

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 15 ++++++++++++---
 t/t7002-mv-sparse-checkout.sh |  2 +-
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 9d8494a2e4..abb90d3266 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -201,11 +201,20 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				bad = _("bad source");
 				goto act_on_entry;
 			}
-
-			if (!ignore_sparse)
+			if (!ignore_sparse) {
 				string_list_append(&only_match_skip_worktree, src);
-			else
+				goto act_on_entry;
+			}
+			/* Check if dst exists in index */
+			if (cache_name_pos(dst, strlen(dst)) < 0) {
 				modes[i] = SPARSE;
+				goto act_on_entry;
+			}
+			if (!force) {
+				bad = _("destination exists");
+				goto act_on_entry;
+			}
+			modes[i] = SPARSE;
 			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 1984cf131d..5b61fbad5f 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -265,7 +265,7 @@ test_expect_success 'can move out-of-cone file with --sparse' '
 	test_path_is_file sub/file1
 '
 
-test_expect_failure 'refuse to move sparse file to existing destination' '
+test_expect_success 'refuse to move sparse file to existing destination' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	mkdir folder1 &&
 	touch folder1/file1 &&
-- 
2.35.1



* [WIP v3 5/7] mv: use flags mode for update_mode
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                     ` (3 preceding siblings ...)
  2022-06-19  3:25   ` [WIP v3 4/7] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
@ 2022-06-19  3:25   ` Shaoxuan Yuan
  2022-06-21 22:32     ` Victoria Dye
  2022-06-19  3:25   ` [WIP v3 6/7] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-19  3:25 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye, newren

As suggested by Derrick [1],
move the in-line definition of "enum update_mode" to the top
of the file and make it use "flags" mode (each state is a different
bit in the word).

[1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
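
To illustrate why the comparisons have to change along with the enum, here
is a minimal standalone sketch. The enum values are copied from this patch;
the helpers skips_rename() and leaves_index_alone() are hypothetical names
introduced only for this sketch, and they take an unsigned int so that
several bits can be combined in one value:

```c
#include <assert.h>
#include <stdbool.h>

enum update_mode {
	BOTH = 0,
	WORKING_DIRECTORY = (1 << 1),
	INDEX = (1 << 2),
	SPARSE = (1 << 3),
	SKIP_WORKTREE_DIR = (1 << 4),
};

/* Hypothetical helper: true when the on-disk rename() step is skipped. */
static bool skips_rename(unsigned int mode)
{
	return (mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) != 0;
}

/* Hypothetical helper: true when the index entry itself is not renamed. */
static bool leaves_index_alone(unsigned int mode)
{
	return (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR)) != 0;
}
```

Once a mode can carry more than one bit, a test such as "mode == SPARSE"
no longer matches a value like "WORKING_DIRECTORY | SPARSE", which is why
the equality tests in the hunks below become mask tests. Note that BOTH is
0, so it cannot be detected with a mask and only means "no bit set".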

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index abb90d3266..7ce7992d6c 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -19,6 +19,14 @@ static const char * const builtin_mv_usage[] = {
 	NULL
 };
 
+enum update_mode {
+	BOTH = 0,
+	WORKING_DIRECTORY = (1 << 1),
+	INDEX = (1 << 2),
+	SPARSE = (1 << 3),
+	SKIP_WORKTREE_DIR = (1 << 4),
+};
+
 #define DUP_BASENAME 1
 #define KEEP_TRAILING_SLASH 2
 
@@ -129,7 +137,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		OPT_END(),
 	};
 	const char **source, **destination, **dest_path, **submodule_gitfile;
-	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
+	enum update_mode *modes;
 	struct stat st;
 	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
 	struct lock_file lock_file = LOCK_INIT;
@@ -191,7 +199,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			pos = cache_name_pos(src, length);
 			if (pos < 0) {
 				/* only error if existence is expected. */
-				if (modes[i] != SPARSE)
+				if (!(modes[i] & SPARSE))
 					bad = _("bad source");
 				goto act_on_entry;
 			}
@@ -207,14 +215,14 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			}
 			/* Check if dst exists in index */
 			if (cache_name_pos(dst, strlen(dst)) < 0) {
-				modes[i] = SPARSE;
+				modes[i] |= SPARSE;
 				goto act_on_entry;
 			}
 			if (!force) {
 				bad = _("destination exists");
 				goto act_on_entry;
 			}
-			modes[i] = SPARSE;
+			modes[i] |= SPARSE;
 			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
@@ -242,7 +250,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			}
 
 			/* last - first >= 1 */
-			modes[i] = WORKING_DIRECTORY;
+			modes[i] |= WORKING_DIRECTORY;
 			n = argc + last - first;
 			REALLOC_ARRAY(source, n);
 			REALLOC_ARRAY(destination, n);
@@ -258,7 +266,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				source[argc + j] = path;
 				destination[argc + j] =
 					prefix_path(dst, dst_len, path + length + 1);
-				modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
+				memset(modes + argc + j, 0, sizeof(enum update_mode));
+				modes[argc + j] |= ce_skip_worktree(ce) ? SPARSE : INDEX;
 				submodule_gitfile[argc + j] = NULL;
 			}
 			argc += last - first;
@@ -355,7 +364,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
 			continue;
-		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
+		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
+			rename(src, dst) < 0) {
 			if (ignore_errors)
 				continue;
 			die_errno(_("renaming '%s' failed"), src);
@@ -369,7 +379,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 							      1);
 		}
 
-		if (mode == WORKING_DIRECTORY)
+		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
 			continue;
 
 		pos = cache_name_pos(src, strlen(src));
-- 
2.35.1



* [WIP v3 6/7] mv: add check_dir_in_index() and solve general dir check issue
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                     ` (4 preceding siblings ...)
  2022-06-19  3:25   ` [WIP v3 5/7] mv: use flags mode for update_mode Shaoxuan Yuan
@ 2022-06-19  3:25   ` Shaoxuan Yuan
  2022-06-21 22:55     ` Victoria Dye
  2022-06-19  3:25   ` [WIP v3 7/7] mv: update sparsity after moving from out-of-cone to in-cone Shaoxuan Yuan
  2022-06-21 23:30   ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Victoria Dye
  7 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-19  3:25 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye, newren

Originally, when moving a <source> directory that is not on disk because
it lies outside of the sparse-checkout cone, "git mv" errors out with
"bad source".

Add a helper function, check_dir_in_index(), to see if a directory
name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
such directories.

Change the checking logic so that, for such a <source> directory, "git mv"
advises the user via advise_on_updating_sparse_paths() instead of failing
with "bad source"; the user can also now supply "--sparse" so that the
operation is carried out successfully.
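
The lookup relies on the index being sorted by name: searching for "dir/"
normally finds nothing, but the negative result encodes where the first
entry below that directory would sit. A minimal standalone sketch of the
idea follows. struct entry, name_pos() and dir_is_sparse_in_index() are
hypothetical stand-ins for the cache entry, cache_name_pos() and the new
helper. Unlike check_dir_in_index(), the sketch returns 1 for "found and
sparse", and like the helper it inspects only the first entry under the
directory:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cache entry. */
struct entry {
	const char *name;
	int skip_worktree;
};

/* Binary search: the position if found, else -(insertion point) - 1. */
static int name_pos(const struct entry *e, int nr, const char *name)
{
	int lo = 0, hi = nr;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(e[mid].name, name);

		if (!cmp)
			return mid;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -lo - 1;
}

/* Return 1 if the first entry under "dir/" exists and is skip-worktree. */
static int dir_is_sparse_in_index(const struct entry *e, int nr,
				  const char *dir)
{
	char with_slash[256];
	size_t len;
	int pos;

	assert(e && dir);
	len = strlen(dir);
	if (len + 2 > sizeof(with_slash))
		return 0;
	memcpy(with_slash, dir, len);
	with_slash[len] = '/';
	with_slash[len + 1] = '\0';

	pos = name_pos(e, nr, with_slash);
	if (pos >= 0)
		return 0;	/* "dir/" is not expected as an entry name */
	pos = -pos - 1;		/* where "dir/" would be inserted */
	if (pos >= nr)
		return 0;
	if (strncmp(with_slash, e[pos].name, len + 1))
		return 0;	/* nothing in the index lives under "dir/" */
	return e[pos].skip_worktree;
}
```

For a sorted array holding "a", "folder1/file1" (skip-worktree), "sub/d" and
"sub/dir/e", the query "folder1" lands on "folder1/file1" and reports a
sparse directory, while "sub" lands on "sub/d" and does not.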

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 49 +++++++++++++++++++++++++++++++----
 t/t7002-mv-sparse-checkout.sh |  4 +--
 2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 7ce7992d6c..cb3441c7cb 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -123,6 +123,37 @@ static int index_range_of_same_dir(const char *src, int length,
 	return last - first;
 }
 
+/*
+ * Check if an out-of-cone directory should be in the index. Imagine this case
+ * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
+ * and thus the directory is sparsified.
+ *
+ * Return 0 if such directory exist (i.e. with any of its contained files not
+ * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
+ * Return 1 otherwise.
+ */
+static int check_dir_in_index(const char *name, int namelen)
+{
+	int ret = 1;
+	const char *with_slash = add_slash(name);
+	int length = namelen + 1;
+
+	int pos = cache_name_pos(with_slash, length);
+	const struct cache_entry *ce;
+
+	if (pos < 0) {
+		pos = -pos - 1;
+		if (pos >= the_index.cache_nr)
+			return ret;
+		ce = active_cache[pos];
+		if (strncmp(with_slash, ce->name, length))
+			return ret;
+		if (ce_skip_worktree(ce))
+			return ret = 0;
+	}
+	return ret;
+}
+
 int cmd_mv(int argc, const char **argv, const char *prefix)
 {
 	int i, flags, gitmodules_modified = 0;
@@ -184,7 +215,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	/* Checking */
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
-		int length, src_is_dir;
+		int length;
 		const char *bad = NULL;
 		int skip_sparse = 0;
 
@@ -198,12 +229,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 			pos = cache_name_pos(src, length);
 			if (pos < 0) {
+				const char *src_w_slash = add_slash(src);
+				if (!check_dir_in_index(src, length) &&
+					!path_in_sparse_checkout(src_w_slash, &the_index)) {
+					modes[i] |= SKIP_WORKTREE_DIR;
+					goto dir_check;
+				}
 				/* only error if existence is expected. */
 				if (!(modes[i] & SPARSE))
 					bad = _("bad source");
 				goto act_on_entry;
 			}
-
 			ce = active_cache[pos];
 			if (!ce_skip_worktree(ce)) {
 				bad = _("bad source");
@@ -230,14 +266,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			bad = _("can not move directory into itself");
 			goto act_on_entry;
 		}
-		if ((src_is_dir = S_ISDIR(st.st_mode))
+		if (S_ISDIR(st.st_mode)
 		    && lstat(dst, &st) == 0) {
 			bad = _("cannot move directory over file");
 			goto act_on_entry;
 		}
-		if (src_is_dir) {
+
+dir_check:
+		if (S_ISDIR(st.st_mode)) {
 			int j, dst_len, n;
-			int first = cache_name_pos(src, length), last;
+			int first, last;
+			first = cache_name_pos(src, length);
 
 			if (first >= 0) {
 				prepare_move_submodule(src, first,
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 5b61fbad5f..30e13b9979 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -219,7 +219,7 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
+test_expect_success 'refuse to move out-of-cone directory without --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
@@ -230,7 +230,7 @@ test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'can move out-of-cone directory with --sparse' '
+test_expect_success 'can move out-of-cone directory with --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
-- 
2.35.1



* [WIP v3 7/7] mv: update sparsity after moving from out-of-cone to in-cone
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                     ` (5 preceding siblings ...)
  2022-06-19  3:25   ` [WIP v3 6/7] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
@ 2022-06-19  3:25   ` Shaoxuan Yuan
  2022-06-21 23:11     ` Victoria Dye
  2022-06-21 23:30   ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Victoria Dye
  7 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-19  3:25 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye, newren

Originally, running "git mv" on a sparse file to move it from out-of-cone
to in-cone does not update the moved file's sparsity (i.e. remove its
SKIP_WORKTREE bit), and the corresponding cache entry is, unexpectedly,
not checked out in the working tree.

Update the behavior so that:
1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
   the corresponding cache entry.
2. The moved cache entry is checked out in the working tree to reflect
   the updated sparsity.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 18 ++++++++++++++++++
 t/t7002-mv-sparse-checkout.sh |  4 +---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index cb3441c7cb..a8b9f55654 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -13,6 +13,7 @@
 #include "string-list.h"
 #include "parse-options.h"
 #include "submodule.h"
+#include "entry.h"
 
 static const char * const builtin_mv_usage[] = {
 	N_("git mv [<options>] <source>... <destination>"),
@@ -399,6 +400,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
 		int pos;
+		struct checkout state = CHECKOUT_INIT;
+		state.istate = &the_index;
+
+		if (force)
+			state.force = 1;
 		if (show_only || verbose)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
@@ -424,6 +430,18 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		pos = cache_name_pos(src, strlen(src));
 		assert(pos >= 0);
 		rename_cache_entry_at(pos, dst);
+
+		if (mode & SPARSE) {
+			if (path_in_sparse_checkout(dst, &the_index)) {
+				int dst_pos;
+
+				dst_pos = cache_name_pos(dst, strlen(dst));
+				active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
+
+				if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
+					die(_("cannot checkout %s"), ce->name);
+			}
+		}
 	}
 
 	if (gitmodules_modified)
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 30e13b9979..7734119197 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -237,7 +237,6 @@ test_expect_success 'can move out-of-cone directory with --sparse' '
 	git mv --sparse folder1 sub 1>actual 2>stderr &&
 	test_must_be_empty stderr &&
 
-	git sparse-checkout reapply &&
 	test_path_is_dir sub/folder1 &&
 	test_path_is_file sub/folder1/file1
 '
@@ -260,7 +259,6 @@ test_expect_success 'can move out-of-cone file with --sparse' '
 	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
 	test_must_be_empty stderr &&
 
-	git sparse-checkout reapply &&
 	! test_path_is_dir sub/folder1 &&
 	test_path_is_file sub/file1
 '
@@ -278,7 +276,7 @@ test_expect_success 'refuse to move sparse file to existing destination' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
+test_expect_success 'move sparse file to existing destination with --force and --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	mkdir folder1 &&
 	touch folder1/file1 &&
-- 
2.35.1



* Re: [WIP v3 1/7] t7002: add tests for moving out-of-cone file/directory
  2022-06-19  3:25   ` [WIP v3 1/7] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
@ 2022-06-21 21:23     ` Victoria Dye
  0 siblings, 0 replies; 95+ messages in thread
From: Victoria Dye @ 2022-06-21 21:23 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: derrickstolee, git, gitster, newren

Shaoxuan Yuan wrote:
> We do not have sufficient coverage of moving files outside
> of a sparse-checkout cone. Create new tests covering this
> behavior, keeping in mind that the user can include --sparse
> (or not), move a file or directory, and the destination can
> already exist in the index (in this case the user can use --force
> to overwrite the existing entry).
> 
> Helped-by: Victoria Dye <vdye@github.com>
> Helped-by: Derrick Stolee <derrickstolee@github.com>
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  t/t7002-mv-sparse-checkout.sh | 87 +++++++++++++++++++++++++++++++++++
>  1 file changed, 87 insertions(+)
> 
> diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
> index f0f7cbfcdb..d6e7315a5a 100755
> --- a/t/t7002-mv-sparse-checkout.sh
> +++ b/t/t7002-mv-sparse-checkout.sh
> @@ -4,6 +4,18 @@ test_description='git mv in sparse working trees'
>  
>  . ./test-lib.sh
>  
> +setup_sparse_checkout () {
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	git add folder1 &&
> +	git sparse-checkout set --cone sub
> +}
> +
> +cleanup_sparse_checkout () {
> +	git sparse-checkout disable &&
> +	git reset --hard
> +}
> +
>  test_expect_success 'setup' "
>  	mkdir -p sub/dir sub/dir2 &&
>  	touch a b c sub/d sub/dir/e sub/dir2/e &&
> @@ -196,6 +208,7 @@ test_expect_success 'can move files to non-sparse dir' '
>  '
>  
>  test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
> +	test_when_finished "cleanup_sparse_checkout" &&
>  	git reset --hard &&
>  	git sparse-checkout init --no-cone &&
>  	git sparse-checkout set a !/x y/ !x/y/z &&
> @@ -206,4 +219,78 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
>  	test_cmp expect stderr
>  '
>  
> +test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
> +	test_when_finished "cleanup_sparse_checkout" &&
> +	setup_sparse_checkout &&

The setup and cleanup approach looks good - thanks for updating it!

> +
> +	test_must_fail git mv folder1 sub 2>stderr &&
> +	cat sparse_error_header >expect &&
> +	echo folder1/file1 >>expect &&
> +	cat sparse_hint >>expect &&
> +	test_cmp expect stderr
> +'
> +
> +test_expect_failure 'can move out-of-cone directory with --sparse' '
> +	test_when_finished "cleanup_sparse_checkout" &&
> +	setup_sparse_checkout &&
> +
> +	git mv --sparse folder1 sub 1>actual 2>stderr &&
> +	test_must_be_empty stderr &&
> +
> +	git sparse-checkout reapply &&

You shouldn't need to run 'reapply' here (you remove it in Patch 7, but it
should probably be dropped here instead).

> +	test_path_is_dir sub/folder1 &&
> +	test_path_is_file sub/folder1/file1
> +'
> +
> +test_expect_failure 'refuse to move out-of-cone file without --sparse' '
> +	test_when_finished "cleanup_sparse_checkout" &&
> +	setup_sparse_checkout &&
> +
> +	test_must_fail git mv folder1/file1 sub 2>stderr &&
> +	cat sparse_error_header >expect &&
> +	echo folder1/file1 >>expect &&
> +	cat sparse_hint >>expect &&
> +	test_cmp expect stderr
> +'
> +
> +test_expect_failure 'can move out-of-cone file with --sparse' '
> +	test_when_finished "cleanup_sparse_checkout" &&
> +	setup_sparse_checkout &&
> +
> +	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
> +	test_must_be_empty stderr &&
> +
> +	git sparse-checkout reapply &&
> +	! test_path_is_dir sub/folder1 &&
> +	test_path_is_file sub/file1

You can also drop the 'reapply' here (same reason as above), but you'll also
probably need to drop the '! test_path_is_dir sub/folder1'. Based on some
rough testing of the command in its current state, 'git mv' doesn't delete a
directory if you 'mv' the last remaining file in that directory. In this test,
the directory being deleted is a result of 'sparse-checkout reapply', not
'mv'.

> +'
> +
> +test_expect_failure 'refuse to move sparse file to existing destination' '
> +	test_when_finished "cleanup_sparse_checkout" &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	touch sub/file1 &&
> +	git add folder1 sub/file1 &&
> +	git sparse-checkout set --cone sub &&
> +
> +	test_must_fail git mv --sparse folder1/file1 sub 2>stderr &&
> +	echo "fatal: destination exists, source=folder1/file1, destination=sub/file1" >expect &&
> +	test_cmp expect stderr
> +'
> +
> +test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
> +	test_when_finished "cleanup_sparse_checkout" &&
> +	mkdir folder1 &&
> +	touch folder1/file1 &&
> +	touch sub/file1 &&
> +	echo "overwrite" >folder1/file1 &&
> +	git add folder1 sub/file1 &&
> +	git sparse-checkout set --cone sub &&
> +
> +	git mv --sparse --force folder1/file1 sub 2>stderr &&
> +	test_must_be_empty stderr &&
> +	echo "overwrite" >expect &&
> +	test_cmp expect sub/file1
> +'
> +
>  test_done



* Re: [WIP v3 5/7] mv: use flags mode for update_mode
  2022-06-19  3:25   ` [WIP v3 5/7] mv: use flags mode for update_mode Shaoxuan Yuan
@ 2022-06-21 22:32     ` Victoria Dye
  2022-06-22  9:37       ` Shaoxuan Yuan
  0 siblings, 1 reply; 95+ messages in thread
From: Victoria Dye @ 2022-06-21 22:32 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: derrickstolee, git, gitster, newren

Shaoxuan Yuan wrote:
> As suggested by Derrick [1],
> move the in-line definition of "enum update_mode" to the top
> of the file and make it use "flags" mode (each state is a different
> bit in the word).
> 

This message doesn't quite cover all of what's done in the commit. In
addition to moving the enum definition, you introduce a 'SKIP_WORKTREE_DIR'
flag and change the flag assignments to '|=' (additive) from '=' (single
assignment). If those changes belong in this commit (not a later one), they
should be explained in the message here.

> [1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  builtin/mv.c | 26 ++++++++++++++++++--------
>  1 file changed, 18 insertions(+), 8 deletions(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index abb90d3266..7ce7992d6c 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -19,6 +19,14 @@ static const char * const builtin_mv_usage[] = {
>  	NULL
>  };
>  
> +enum update_mode {
> +	BOTH = 0,

I know this comes from the original inline enum, but I don't see 'BOTH' used
anywhere. The name itself is somewhat confusing (I have no idea what "both"
is referring to - possibly "both" 'WORKING_DIRECTORY' and 'INDEX'??), so
would you mind removing it in the next re-roll? 

> +	WORKING_DIRECTORY = (1 << 1),
> +	INDEX = (1 << 2),
> +	SPARSE = (1 << 3),
> +	SKIP_WORKTREE_DIR = (1 << 4),

You're not introducing any assignment of 'SKIP_WORKTREE_DIR' in this commit
(looks like that's done in the next one, patch [6/7]), so you should
probably introduce 'SKIP_WORKTREE_DIR' and its corresponding usage in that
patch instead of this one.

> +};

When the update modes were mutually-exclusive, it made sense for them to be
represented by an enum. Now that they're flags that can be combined, should
they instead be pre-processor '#define' values (e.g., like the 'RESET_*'
modes in 'reset.h' or 'CE_*' flags in 'cache.h')? I don't actually know what
the standard is, since I also see one or two examples of using enums as
flags (e.g., 'commit_graph_split_flags' in 'commit-graph.h'). Maybe another
contributor could clarify? 
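
For illustration, here is a minimal standalone sketch of the two styles
discussed above. It is not code from 'mv.c': the 'MODE_*' and 'MV_*' names
and both helper functions are made up for the example. It also shows why
'|=' matters once the modes are flags, and that with '#define' values a mode
is a plain 'unsigned int' that can simply be assigned '0'.

```c
/* Style 1: an enum whose members are distinct bits (hypothetical names). */
enum update_mode_sketch {
	MODE_WORKING_DIRECTORY = (1 << 1),
	MODE_INDEX = (1 << 2),
	MODE_SPARSE = (1 << 3),
};

/* Style 2: preprocessor flags, stored in a plain unsigned int. */
#define MV_WORKING_DIRECTORY (1u << 1)
#define MV_INDEX (1u << 2)
#define MV_SPARSE (1u << 3)

/*
 * Return 1 if an on-disk rename is needed, i.e. neither the INDEX nor
 * the SPARSE bit is set. With flags, the old "mode != INDEX" style of
 * comparison has to become a bit test like this one.
 */
static int needs_rename(unsigned int mode)
{
	return !(mode & (MV_INDEX | MV_SPARSE));
}

/* '|=' keeps bits that are already set; a plain '=' would discard them. */
static unsigned int add_flag(unsigned int mode, unsigned int flag)
{
	mode |= flag;
	return mode;
}
```

With these definitions, 'add_flag(MV_WORKING_DIRECTORY, MV_SPARSE)' has both
bits set, so 'needs_rename()' returns 0 for it, while a mode of '0' (no
flags) still requires the rename.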

> +
>  #define DUP_BASENAME 1
>  #define KEEP_TRAILING_SLASH 2
>  
> @@ -129,7 +137,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		OPT_END(),
>  	};
>  	const char **source, **destination, **dest_path, **submodule_gitfile;
> -	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
> +	enum update_mode *modes;
>  	struct stat st;
>  	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
>  	struct lock_file lock_file = LOCK_INIT;
> @@ -191,7 +199,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			pos = cache_name_pos(src, length);
>  			if (pos < 0) {
>  				/* only error if existence is expected. */
> -				if (modes[i] != SPARSE)
> +				if (!(modes[i] & SPARSE))
>  					bad = _("bad source");
>  				goto act_on_entry;
>  			}
> @@ -207,14 +215,14 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			}
>  			/* Check if dst exists in index */
>  			if (cache_name_pos(dst, strlen(dst)) < 0) {
> -				modes[i] = SPARSE;
> +				modes[i] |= SPARSE;
>  				goto act_on_entry;
>  			}
>  			if (!force) {
>  				bad = _("destination exists");
>  				goto act_on_entry;
>  			}
> -			modes[i] = SPARSE;
> +			modes[i] |= SPARSE;
>  			goto act_on_entry;
>  		}
>  		if (!strncmp(src, dst, length) &&
> @@ -242,7 +250,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			}
>  
>  			/* last - first >= 1 */
> -			modes[i] = WORKING_DIRECTORY;
> +			modes[i] |= WORKING_DIRECTORY;
>  			n = argc + last - first;
>  			REALLOC_ARRAY(source, n);
>  			REALLOC_ARRAY(destination, n);
> @@ -258,7 +266,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  				source[argc + j] = path;
>  				destination[argc + j] =
>  					prefix_path(dst, dst_len, path + length + 1);
> -				modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
> +				memset(modes + argc + j, 0, sizeof(enum update_mode));

One benefit of using '#define' values would be that 'modes' would just be an
array of unsigned ints, so you could just assign '0' rather than using
memset. In terms of the implementation as-is, though, I think what you have
is correct.

> +				modes[argc + j] |= ce_skip_worktree(ce) ? SPARSE : INDEX;
>  				submodule_gitfile[argc + j] = NULL;
>  			}
>  			argc += last - first;
> @@ -355,7 +364,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			printf(_("Renaming %s to %s\n"), src, dst);
>  		if (show_only)
>  			continue;
> -		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
> +		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
> +			rename(src, dst) < 0) {

Nit: could you align 'rename' with the line above it (per the highlighted
section in the CodingGuidelines [1])? As far as I can tell, the "align with
tabs and spaces" approach is what's *intended* to be used in 'mv.c'
(although it's admittedly pretty inconsistent).

[1] https://github.com/git/git/blob/master/Documentation/CodingGuidelines#L371-L383 

>  			if (ignore_errors)
>  				continue;
>  			die_errno(_("renaming '%s' failed"), src);
> @@ -369,7 +379,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  							      1);
>  		}
>  
> -		if (mode == WORKING_DIRECTORY)
> +		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
>  			continue;
>  
>  		pos = cache_name_pos(src, strlen(src));



* Re: [WIP v3 6/7] mv: add check_dir_in_index() and solve general dir check issue
  2022-06-19  3:25   ` [WIP v3 6/7] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
@ 2022-06-21 22:55     ` Victoria Dye
  0 siblings, 0 replies; 95+ messages in thread
From: Victoria Dye @ 2022-06-21 22:55 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: derrickstolee, git, gitster, newren

Shaoxuan Yuan wrote:
> Originally, moving a <source> directory which is not on-disk due
> to its existence outside of sparse-checkout cone, "git mv" command
> errors out with "bad source".
> 
> Add a helper check_dir_in_index() function to see if a directory
> name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
> such directories.
> 
> Change the checking logic, so that such <source> directory makes
> "git mv" command warn with "advise_on_updating_sparse_paths()"
> instead of "bad source"; also user now can supply a "--sparse" flag so
> this operation can be carried out successfully.
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  builtin/mv.c                  | 49 +++++++++++++++++++++++++++++++----
>  t/t7002-mv-sparse-checkout.sh |  4 +--
>  2 files changed, 46 insertions(+), 7 deletions(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 7ce7992d6c..cb3441c7cb 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -123,6 +123,37 @@ static int index_range_of_same_dir(const char *src, int length,
>  	return last - first;
>  }
>  
> +/*
> + * Check if an out-of-cone directory should be in the index. Imagine this case
> + * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
> + * and thus the directory is sparsified.
> + *
> + * Return 0 if such directory exist (i.e. with any of its contained files not
> + * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
> + * Return 1 otherwise.
> + */

This explanation is helpful in clarifying that you don't mean *sparse
directories* (that is, directory entries in a sparse index), you mean
directories whose contents are all sparse. It's a tricky distinction, but
you handled it nicely here.

> +static int check_dir_in_index(const char *name, int namelen)
> +{
> +	int ret = 1;
> +	const char *with_slash = add_slash(name);
> +	int length = namelen + 1;
> +
> +	int pos = cache_name_pos(with_slash, length);
> +	const struct cache_entry *ce;
> +
> +	if (pos < 0) {
> +		pos = -pos - 1;
> +		if (pos >= the_index.cache_nr)
> +			return ret;
> +		ce = active_cache[pos];
> +		if (strncmp(with_slash, ce->name, length))
> +			return ret;
> +		if (ce_skip_worktree(ce))
> +			return ret = 0;
> +	}
> +	return ret;

The way 'ret' is handled here is a bit difficult to follow. Would you be
opposed to returning hardcoded '0' or '1', rather than changing the value of
'ret' throughout? Something like:

static int check_dir_in_index(const char *name, int namelen)
{
	int pos, length = namelen + 1;
	const struct cache_entry *ce;
	const char *with_slash = add_slash(name);

	pos = cache_name_pos(with_slash, length);
	if (pos < 0) {
		pos = -pos - 1;
		if (pos >= the_index.cache_nr)
			return 1;
		ce = active_cache[pos];
		if (strncmp(with_slash, ce->name, length))
			return 1;
		if (ce_skip_worktree(ce))
			return 0;
	}
	return 1;
}
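
To make the "negative position" convention easier to follow, here is a
standalone toy model of the same lookup. It is only a sketch: 'toy_entry',
'toy_index', 'toy_name_pos()' and 'toy_check_dir()' are hypothetical names,
the index is a small hard-coded sorted array, and names are compared with
plain 'strcmp()' rather than the real index comparison. Like the function
above, it only inspects the first entry at the insertion point.

```c
#include <string.h>

struct toy_entry {
	const char *name;
	int skip_worktree;
};

/* A tiny index, sorted by name; only 'folder1/file1' is skip-worktree. */
static const struct toy_entry toy_index[] = {
	{ "a", 0 },
	{ "folder1/file1", 1 },
	{ "sub/d", 0 },
	{ "sub/dir/e", 0 },
};
static const int toy_index_nr = 4;

/*
 * Binary search by name. Return the position when found, or
 * -(insert_pos) - 1 when the name is absent.
 */
static int toy_name_pos(const char *name)
{
	int lo = 0, hi = toy_index_nr;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(name, toy_index[mid].name);

		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -lo - 1;
}

/*
 * 'dir' must end with a slash, e.g. "folder1/". Return 0 if the entry
 * at the insertion point lives under 'dir' and is skip-worktree;
 * return 1 otherwise.
 */
static int toy_check_dir(const char *dir)
{
	int pos = toy_name_pos(dir);

	if (pos >= 0)
		return 1; /* exact match: not a "contents-only" directory */
	pos = -pos - 1;
	if (pos >= toy_index_nr)
		return 1;
	if (strncmp(dir, toy_index[pos].name, strlen(dir)))
		return 1;
	return toy_index[pos].skip_worktree ? 0 : 1;
}
```

Here 'toy_check_dir("folder1/")' returns 0 because the insertion point lands
on 'folder1/file1', which is skip-worktree, whereas 'toy_check_dir("sub/")'
returns 1 because 'sub/d' is not.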

> +}
> +
>  int cmd_mv(int argc, const char **argv, const char *prefix)
>  {
>  	int i, flags, gitmodules_modified = 0;
> @@ -184,7 +215,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  	/* Checking */
>  	for (i = 0; i < argc; i++) {
>  		const char *src = source[i], *dst = destination[i];
> -		int length, src_is_dir;
> +		int length;
>  		const char *bad = NULL;
>  		int skip_sparse = 0;
>  
> @@ -198,12 +229,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  
>  			pos = cache_name_pos(src, length);
>  			if (pos < 0) {
> +				const char *src_w_slash = add_slash(src);
> +				if (!check_dir_in_index(src, length) &&
> +					!path_in_sparse_checkout(src_w_slash, &the_index)) {

In checks like these, the less "expensive" one should come first (so that if
it returns 'false', we completely skip the more expensive one). Since
'check_dir_in_index()' requires binary searching the index, it's likely to
be more expensive than 'path_in_sparse_checkout()', so the condition order
should be flipped:

				if (!path_in_sparse_checkout(src_w_slash, &the_index) &&
				    !check_dir_in_index(src, length)) {

Also nit: alignment (more details on why/how in my last message [1]).

[1] https://lore.kernel.org/git/01b39c63-5652-4293-0424-ff99b6f9f7d2@github.com/
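
As a small standalone illustration of the condition-ordering point (not code
from 'mv.c'; 'cheap_check()', 'expensive_check()' and the call counters are
made up for the example), '&&' stops evaluating as soon as the left operand
is false, so the right-hand check is never called in that case:

```c
/* Counters so the effect of short-circuiting can be observed. */
static int cheap_calls, expensive_calls;

/* Stand-in for the check assumed to be inexpensive. */
static int cheap_check(int value)
{
	cheap_calls++;
	return value;
}

/* Stand-in for the check assumed to be costly (e.g. a search). */
static int expensive_check(int value)
{
	expensive_calls++;
	return value;
}

/*
 * Same shape as the suggested condition: true only when both checks
 * return false. If cheap_check() returns true, '!cheap_check()' is
 * false and expensive_check() is skipped entirely.
 */
static int both_false(int cheap_value, int expensive_value)
{
	return !cheap_check(cheap_value) &&
	       !expensive_check(expensive_value);
}
```

Calling 'both_false(1, 0)' increments only 'cheap_calls'; the expensive
check runs only when the cheap one has already returned false.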

> +					modes[i] |= SKIP_WORKTREE_DIR;
> +					goto dir_check;
> +				}
>  				/* only error if existence is expected. */
>  				if (!(modes[i] & SPARSE))
>  					bad = _("bad source");
>  				goto act_on_entry;
>  			}
> -
>  			ce = active_cache[pos];
>  			if (!ce_skip_worktree(ce)) {
>  				bad = _("bad source");
> @@ -230,14 +266,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  			bad = _("can not move directory into itself");
>  			goto act_on_entry;
>  		}
> -		if ((src_is_dir = S_ISDIR(st.st_mode))
> +		if (S_ISDIR(st.st_mode)
>  		    && lstat(dst, &st) == 0) {
>  			bad = _("cannot move directory over file");
>  			goto act_on_entry;
>  		}
> -		if (src_is_dir) {
> +
> +dir_check:
> +		if (S_ISDIR(st.st_mode)) {
>  			int j, dst_len, n;
> -			int first = cache_name_pos(src, length), last;
> +			int first, last;
> +			first = cache_name_pos(src, length);

Super-nit: why did this line change? It looks like it just rearranges the
lines for no functional purpose.

>  
>  			if (first >= 0) {
>  				prepare_move_submodule(src, first,
> diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
> index 5b61fbad5f..30e13b9979 100755
> --- a/t/t7002-mv-sparse-checkout.sh
> +++ b/t/t7002-mv-sparse-checkout.sh
> @@ -219,7 +219,7 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
>  	test_cmp expect stderr
>  '
>  
> -test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
> +test_expect_success 'refuse to move out-of-cone directory without --sparse' '
>  	test_when_finished "cleanup_sparse_checkout" &&
>  	setup_sparse_checkout &&
>  
> @@ -230,7 +230,7 @@ test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
>  	test_cmp expect stderr
>  '
>  
> -test_expect_failure 'can move out-of-cone directory with --sparse' '
> +test_expect_success 'can move out-of-cone directory with --sparse' '
>  	test_when_finished "cleanup_sparse_checkout" &&
>  	setup_sparse_checkout &&
>  



* Re: [WIP v3 7/7] mv: update sparsity after moving from out-of-cone to in-cone
  2022-06-19  3:25   ` [WIP v3 7/7] mv: update sparsity after moving from out-of-cone to in-cone Shaoxuan Yuan
@ 2022-06-21 23:11     ` Victoria Dye
  0 siblings, 0 replies; 95+ messages in thread
From: Victoria Dye @ 2022-06-21 23:11 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: derrickstolee, git, gitster, newren

Shaoxuan Yuan wrote:
> Originally, "git mv" a sparse file from out-of-cone to
> in-cone does not update the moved file's sparsity (remove its
> SKIP_WORKTREE bit). And the corresponding cache entry is, unexpectedly,
> not checked out in the working tree.
> 
> Update the behavior so that:
> 1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
>    corresponding cache entry.
> 2. The moved cache entry is checked out in the working tree to reflect
>    the updated sparsity.
> 
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  builtin/mv.c                  | 18 ++++++++++++++++++
>  t/t7002-mv-sparse-checkout.sh |  4 +---
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index cb3441c7cb..a8b9f55654 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -13,6 +13,7 @@
>  #include "string-list.h"
>  #include "parse-options.h"
>  #include "submodule.h"
> +#include "entry.h"
>  
>  static const char * const builtin_mv_usage[] = {
>  	N_("git mv [<options>] <source>... <destination>"),
> @@ -399,6 +400,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		const char *src = source[i], *dst = destination[i];
>  		enum update_mode mode = modes[i];
>  		int pos;
> +		struct checkout state = CHECKOUT_INIT;
> +		state.istate = &the_index;
> +
> +		if (force)
> +			state.force = 1;
>  		if (show_only || verbose)
>  			printf(_("Renaming %s to %s\n"), src, dst);
>  		if (show_only)
> @@ -424,6 +430,18 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		pos = cache_name_pos(src, strlen(src));
>  		assert(pos >= 0);
>  		rename_cache_entry_at(pos, dst);

At first I wasn't sure how this would handle moving whole "sparse"
directories (i.e., directories containing all 'SKIP_WORKTREE' entries),
since this loop only iterates over 'argc'. The good news is: it does work,
successfully moving each file in the directory individually! Unfortunately,
the *reason* it works is because 'mv' changes the value of 'argc' to include
the new directories.

All this to say - your implementation is good (and IMO doesn't require any
changes), it just happens to sit alongside somewhat questionable code. :)
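
For readers unfamiliar with that pattern, here is a toy standalone sketch of
a loop whose bound grows while it runs. It is only a model under simplifying
assumptions: 'toy_args', 'toy_append()', 'toy_walk()' and 'toy_demo()' are
hypothetical names, the list has a fixed capacity, and "expanding a
directory" just appends two made-up file names.

```c
#include <stdio.h>
#include <string.h>

#define TOY_MAX 16

struct toy_args {
	char names[TOY_MAX][32];
	int nr; /* plays the role of 'argc' */
};

/* Append "<dir>/<child>" to the list; return 0 on success, -1 if full. */
static int toy_append(struct toy_args *args, const char *dir, const char *child)
{
	if (args->nr >= TOY_MAX)
		return -1;
	snprintf(args->names[args->nr], sizeof(args->names[0]),
		 "%s/%s", dir, child);
	args->nr++;
	return 0;
}

/*
 * Visit every entry. An entry named "dir" is expanded by appending its
 * two files to the end of the list, which bumps 'nr'. Because the loop
 * condition re-reads 'nr', the appended entries are visited too.
 * Return the number of entries visited.
 */
static int toy_walk(struct toy_args *args)
{
	int i, visited = 0;

	for (i = 0; i < args->nr; i++) {
		if (!strcmp(args->names[i], "dir")) {
			toy_append(args, "dir", "a");
			toy_append(args, "dir", "b");
		}
		visited++;
	}
	return visited;
}

/* Start with two arguments, "dir" and "file", and walk the list. */
static int toy_demo(void)
{
	struct toy_args args = { { "dir", "file" }, 2 };

	return toy_walk(&args);
}
```

'toy_demo()' starts with two entries but visits four, because the two files
appended while handling "dir" extend the bound of the same loop.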

> +
> +		if (mode & SPARSE) {
> +			if (path_in_sparse_checkout(dst, &the_index)) {

Nit: this can be consolidated into a single condition:

		if ((mode & SPARSE) && 
		    path_in_sparse_checkout(dst, &the_index)) {

> +				int dst_pos;
> +
> +				dst_pos = cache_name_pos(dst, strlen(dst));
> +				active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
> +
> +				if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
> +					die(_("cannot checkout %s"), ce->name);
> +			}
> +		}
>  	}
>  
>  	if (gitmodules_modified)
> diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
> index 30e13b9979..7734119197 100755
> --- a/t/t7002-mv-sparse-checkout.sh
> +++ b/t/t7002-mv-sparse-checkout.sh
> @@ -237,7 +237,6 @@ test_expect_success 'can move out-of-cone directory with --sparse' '
>  	git mv --sparse folder1 sub 1>actual 2>stderr &&
>  	test_must_be_empty stderr &&
>  
> -	git sparse-checkout reapply &&
>  	test_path_is_dir sub/folder1 &&
>  	test_path_is_file sub/folder1/file1
>  '
> @@ -260,7 +259,6 @@ test_expect_success 'can move out-of-cone file with --sparse' '
>  	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
>  	test_must_be_empty stderr &&
>  
> -	git sparse-checkout reapply &&
>  	! test_path_is_dir sub/folder1 &&
>  	test_path_is_file sub/file1
>  '
> @@ -278,7 +276,7 @@ test_expect_success 'refuse to move sparse file to existing destination' '
>  	test_cmp expect stderr
>  '
>  
> -test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
> +test_expect_success 'move sparse file to existing destination with --force and --sparse' '
>  	test_when_finished "cleanup_sparse_checkout" &&
>  	mkdir folder1 &&
>  	touch folder1/file1 &&



* Re: [WIP v3 0/7] mv: fix out-of-cone file/directory move logic
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                     ` (6 preceding siblings ...)
  2022-06-19  3:25   ` [WIP v3 7/7] mv: update sparsity after moving from out-of-cone to in-cone Shaoxuan Yuan
@ 2022-06-21 23:30   ` Victoria Dye
  2022-06-23 15:06     ` Derrick Stolee
  7 siblings, 1 reply; 95+ messages in thread
From: Victoria Dye @ 2022-06-21 23:30 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: derrickstolee, git, gitster, newren

Shaoxuan Yuan wrote:
> The range-diff seems a bit messy, because some of the changes are so big
> that I had to tune --creation-factor high enough to reveal these changes.
> 
> ## Changes since WIP v2 ##
> 
> 1. Write helper functions for t7002 to reuse some code.
> 
> 2. Refactor/decouple the if/else-if checking chain.
> 
> 3. Separate out the 'update_mode' refactor into a single commit.
> 
> 4. Stop using update_sparsity() and instead update the SKIP_WORKTREE
>    bit for each cache_entry and check it out to the working tree.

All of this looks great! My comments are all minor syntax nits - the
functionality is sound. Thanks again for working through this tangled mess
of a problem!

> 
> ## Limitations ##
> 
> At this point, we still don't have in-cone to out-of-cone move, which
> I don't think is too much a problem, since the title says this series is
> around out-of-cone as the <source>.
> 

While it would be nice to include in-cone -> out-of-cone here, I'm also
content with you moving it to a later series (the first couple patches in a
sparse index integration or as a standalone series; up to you). 

> But I think it is worth discussing whether we should implement in-cone to
> out-of-cone move, since it will be nice (naturally) to have it working.
> 
> However, I noticed this from the mv man page:
> 
> "In the second form, the last argument has to be an existing directory; 
> the given sources will be moved into this directory."
> 
> I think when moving out-of-cone, the last argument has to be a non-existent
> directory? I'm a bit confused: should we update some of mv basic logic to 
> accomplish this?
> 

I suspect this requirement is related to the POSIX 'mv' [1] (and
corresponding 'rename()', used in 'git mv'), which also requires that the
destination directory exists. I personally don't think this requirement
needs to apply to 'git mv' at all, but note that changing the behavior would
require first creating the necessary directories before calling 'rename()'. 

As a more conservative solution, you could do the parent directory creation
*only* in the case of moving to a sparse contents-only directory (using
something like the 'check_dir_in_index()' function you introduced to
identify).
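
As a rough standalone sketch of "create the parent directories, then call
'rename()'" (POSIX only, and not how 'git mv' does or should implement it;
git has its own helpers that a real patch would presumably use instead), the
hypothetical 'make_leading_dirs()' and 'rename_with_parents()' below show
the general shape:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Create every missing leading directory of 'path', but not its last
 * component. Return 0 on success, -1 on error.
 */
static int make_leading_dirs(const char *path)
{
	char buf[4096];
	char *p;
	size_t len = strlen(path);

	if (!len || len >= sizeof(buf))
		return -1;
	memcpy(buf, path, len + 1);

	/* Start at buf + 1 so a leading '/' is not treated as a component. */
	for (p = buf + 1; *p; p++) {
		if (*p != '/')
			continue;
		*p = '\0';
		/* An already existing directory is fine. */
		if (mkdir(buf, 0777) < 0 && errno != EEXIST)
			return -1;
		*p = '/';
	}
	return 0;
}

/* Make sure the parents of 'dst' exist, then rename 'src' to 'dst'. */
static int rename_with_parents(const char *src, const char *dst)
{
	if (make_leading_dirs(dst) < 0)
		return -1;
	return rename(src, dst);
}
```

The more conservative variant described above would call something like
'rename_with_parents()' only when the destination is a directory whose
contents are all sparse, and keep the plain 'rename()' everywhere else.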

I'm also interested in hearing what others have to say, especially regarding
historical context/use cases of 'git mv'.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html

> Shaoxuan Yuan (7):
>   t7002: add tests for moving out-of-cone file/directory
>   mv: decouple if/else-if checks using goto
>   mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
>   mv: check if <destination> exists in index to handle overwriting
>   mv: use flags mode for update_mode
>   mv: add check_dir_in_index() and solve general dir check issue
>   mv: update sparsity after moving from out-of-cone to in-cone
> 
>  builtin/mv.c                  | 244 +++++++++++++++++++++++++---------
>  t/t7002-mv-sparse-checkout.sh |  85 ++++++++++++
>  2 files changed, 264 insertions(+), 65 deletions(-)
> 
> Range-diff against v2:
> 1:  271445205d ! 1:  a08ce96935 t7002: add tests for moving out-of-cone file/directory
>     @@ Commit message
>      
>          Add corresponding tests to test following situations:
>      
>     -    * 'refuse to move out-of-cone directory without --sparse'
>     -    * 'can move out-of-cone directory with --sparse'
>     -    * 'refuse to move out-of-cone file without --sparse'
>     -    * 'can move out-of-cone file with --sparse'
>     -    * 'refuse to move sparse file to existing destination'
>     -    * 'move sparse file to existing destination with --force and --sparse'
>     +    We do not have sufficient coverage of moving files outside
>     +    of a sparse-checkout cone. Create new tests covering this
>     +    behavior, keeping in mind that the user can include --sparse
>     +    (or not), move a file or directory, and the destination can
>     +    already exist in the index (in this case user can use --force
>     +    to overwrite existing entry).
>      
>     +    Helped-by: Victoria Dye <vdye@github.com>
>     +    Helped-by: Derrick Stolee <derrickstolee@github.com>
>          Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
>      
>       ## t/t7002-mv-sparse-checkout.sh ##
>     +@@ t/t7002-mv-sparse-checkout.sh: test_description='git mv in sparse working trees'
>     + 
>     + . ./test-lib.sh
>     + 
>     ++setup_sparse_checkout () {
>     ++	mkdir folder1 &&
>     ++	touch folder1/file1 &&
>     ++	git add folder1 &&
>     ++	git sparse-checkout set --cone sub
>     ++}
>     ++
>     ++cleanup_sparse_checkout () {
>     ++	git sparse-checkout disable &&
>     ++	git reset --hard
>     ++}
>     ++
>     + test_expect_success 'setup' "
>     + 	mkdir -p sub/dir sub/dir2 &&
>     + 	touch a b c sub/d sub/dir/e sub/dir2/e &&
>     +@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move files to non-sparse dir' '
>     + '
>     + 
>     + test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
>     ++	test_when_finished "cleanup_sparse_checkout" &&
>     + 	git reset --hard &&
>     + 	git sparse-checkout init --no-cone &&
>     + 	git sparse-checkout set a !/x y/ !x/y/z &&
>      @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
>       	test_cmp expect stderr
>       '
>       
>      +test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
>     -+	git sparse-checkout disable &&
>     -+	git reset --hard &&
>     -+	mkdir folder1 &&
>     -+	touch folder1/file1 &&
>     -+	git add folder1 &&
>     -+	git sparse-checkout init --cone &&
>     -+	git sparse-checkout set sub &&
>     ++	test_when_finished "cleanup_sparse_checkout" &&
>     ++	setup_sparse_checkout &&
>      +
>      +	test_must_fail git mv folder1 sub 2>stderr &&
>      +	cat sparse_error_header >expect &&
>     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
>      +'
>      +
>      +test_expect_failure 'can move out-of-cone directory with --sparse' '
>     -+	git sparse-checkout disable &&
>     -+	git reset --hard &&
>     -+	mkdir folder1 &&
>     -+	touch folder1/file1 &&
>     -+	git add folder1 &&
>     -+	git sparse-checkout init --cone &&
>     -+	git sparse-checkout set sub &&
>     ++	test_when_finished "cleanup_sparse_checkout" &&
>     ++	setup_sparse_checkout &&
>      +
>      +	git mv --sparse folder1 sub 1>actual 2>stderr &&
>      +	test_must_be_empty stderr &&
>     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
>      +'
>      +
>      +test_expect_failure 'refuse to move out-of-cone file without --sparse' '
>     -+	git sparse-checkout disable &&
>     -+	git reset --hard &&
>     -+	mkdir folder1 &&
>     -+	touch folder1/file1 &&
>     -+	git add folder1 &&
>     -+	git sparse-checkout init --cone &&
>     -+	git sparse-checkout set sub &&
>     ++	test_when_finished "cleanup_sparse_checkout" &&
>     ++	setup_sparse_checkout &&
>      +
>      +	test_must_fail git mv folder1/file1 sub 2>stderr &&
>      +	cat sparse_error_header >expect &&
>     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
>      +'
>      +
>      +test_expect_failure 'can move out-of-cone file with --sparse' '
>     -+	git sparse-checkout disable &&
>     -+	git reset --hard &&
>     -+	mkdir folder1 &&
>     -+	touch folder1/file1 &&
>     -+	git add folder1 &&
>     -+	git sparse-checkout init --cone &&
>     -+	git sparse-checkout set sub &&
>     ++	test_when_finished "cleanup_sparse_checkout" &&
>     ++	setup_sparse_checkout &&
>      +
>      +	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
>      +	test_must_be_empty stderr &&
>     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
>      +'
>      +
>      +test_expect_failure 'refuse to move sparse file to existing destination' '
>     -+	git sparse-checkout disable &&
>     -+	git reset --hard &&
>     ++	test_when_finished "cleanup_sparse_checkout" &&
>      +	mkdir folder1 &&
>      +	touch folder1/file1 &&
>      +	touch sub/file1 &&
>      +	git add folder1 sub/file1 &&
>     -+	git sparse-checkout init --cone &&
>     -+	git sparse-checkout set sub &&
>     ++	git sparse-checkout set --cone sub &&
>      +
>      +	test_must_fail git mv --sparse folder1/file1 sub 2>stderr &&
>      +	echo "fatal: destination exists, source=folder1/file1, destination=sub/file1" >expect &&
>     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
>      +'
>      +
>      +test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
>     -+	git sparse-checkout disable &&
>     -+	git reset --hard &&
>     ++	test_when_finished "cleanup_sparse_checkout" &&
>      +	mkdir folder1 &&
>      +	touch folder1/file1 &&
>      +	touch sub/file1 &&
>      +	echo "overwrite" >folder1/file1 &&
>      +	git add folder1 sub/file1 &&
>     -+	git sparse-checkout init --cone &&
>     -+	git sparse-checkout set sub &&
>     ++	git sparse-checkout set --cone sub &&
>      +
>      +	git mv --sparse --force folder1/file1 sub 2>stderr &&
>      +	test_must_be_empty stderr &&
> -:  ---------- > 2:  8065fbc232 mv: decouple if/else-if checks using goto
> 2:  80f485f146 ! 3:  e227fe717b mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
>     @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>       
>       		length = strlen(src);
>       		if (lstat(src, &st) < 0) {
>     -+			/*
>     -+			 * TODO: for now, when you try to overwrite a <destination>
>     -+			 * with your <source> as a sparse file, if you supply a "--sparse"
>     -+			 * flag, then the action will be done without providing "--force"
>     -+			 * and no warning.
>     -+			 *
>     -+			 * This is mainly because the sparse <source>
>     -+			 * is not on-disk, and this if-else chain will be cut off early in
>     -+			 * this check, thus the "--force" check is ignored. Need fix.
>     -+			 */
>     +-			/* only error if existence is expected. */
>     +-			if (modes[i] != SPARSE) {
>     ++			int pos;
>     ++			const struct cache_entry *ce;
>      +
>     -+			int pos = cache_name_pos(src, length);
>     -+			if (pos >= 0) {
>     -+				const struct cache_entry *ce = active_cache[pos];
>     -+
>     -+				if (ce_skip_worktree(ce)) {
>     -+					if (!ignore_sparse)
>     -+						string_list_append(&only_match_skip_worktree, src);
>     -+					else
>     -+						modes[i] = SPARSE;
>     -+				}
>     -+				else
>     ++			pos = cache_name_pos(src, length);
>     ++			if (pos < 0) {
>     ++				/* only error if existence is expected. */
>     ++				if (modes[i] != SPARSE)
>      +					bad = _("bad source");
>     ++				goto act_on_entry;
>      +			}
>     - 			/* only error if existence is expected. */
>     --			if (modes[i] != SPARSE)
>     -+			else if (modes[i] != SPARSE)
>     ++
>     ++			ce = active_cache[pos];
>     ++			if (!ce_skip_worktree(ce)) {
>       				bad = _("bad source");
>     - 		} else if (!strncmp(src, dst, length) &&
>     - 				(dst[length] == 0 || dst[length] == '/')) {
>     + 				goto act_on_entry;
>     + 			}
>     ++
>     ++			if (!ignore_sparse)
>     ++				string_list_append(&only_match_skip_worktree, src);
>     ++			else
>     ++				modes[i] = SPARSE;
>     ++			goto act_on_entry;
>     + 		}
>     + 		if (!strncmp(src, dst, length) &&
>     + 		    (dst[length] == 0 || dst[length] == '/')) {
>      
>       ## t/t7002-mv-sparse-checkout.sh ##
>      @@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'can move out-of-cone directory with --sparse' '
>     @@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'can move out-of-cone directo
>       
>      -test_expect_failure 'refuse to move out-of-cone file without --sparse' '
>      +test_expect_success 'refuse to move out-of-cone file without --sparse' '
>     - 	git sparse-checkout disable &&
>     - 	git reset --hard &&
>     - 	mkdir folder1 &&
>     + 	test_when_finished "cleanup_sparse_checkout" &&
>     + 	setup_sparse_checkout &&
>     + 
>      @@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'refuse to move out-of-cone file without --sparse' '
>       	test_cmp expect stderr
>       '
>       
>      -test_expect_failure 'can move out-of-cone file with --sparse' '
>      +test_expect_success 'can move out-of-cone file with --sparse' '
>     - 	git sparse-checkout disable &&
>     - 	git reset --hard &&
>     - 	mkdir folder1 &&
>     + 	test_when_finished "cleanup_sparse_checkout" &&
>     + 	setup_sparse_checkout &&
>     + 
> 3:  04572e5e6b ! 4:  d0de7678e3 mv: check if <destination> exists in index to handle overwriting
>     @@ Commit message
>      
>       ## builtin/mv.c ##
>      @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - 
>     - 		length = strlen(src);
>     - 		if (lstat(src, &st) < 0) {
>     --			/*
>     --			 * TODO: for now, when you try to overwrite a <destination>
>     --			 * with your <source> as a sparse file, if you supply a "--sparse"
>     --			 * flag, then the action will be done without providing "--force"
>     --			 * and no warning.
>     --			 *
>     --			 * This is mainly because the sparse <source>
>     --			 * is not on-disk, and this if-else chain will be cut off early in
>     --			 * this check, thus the "--force" check is ignored. Need fix.
>     --			 */
>     - 
>     - 			int pos = cache_name_pos(src, length);
>     - 			if (pos >= 0) {
>     -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - 				if (ce_skip_worktree(ce)) {
>     - 					if (!ignore_sparse)
>     - 						string_list_append(&only_match_skip_worktree, src);
>     --					else
>     --						modes[i] = SPARSE;
>     -+					else {
>     -+						/* Check if dst exists in index */
>     -+						if (cache_name_pos(dst, strlen(dst)) >= 0) {
>     -+							if (force)
>     -+								modes[i] = SPARSE;
>     -+							else
>     -+								bad = _("destination exists");
>     -+						}
>     -+						else
>     -+							modes[i] = SPARSE;
>     -+					}
>     - 				}
>     - 				else
>     - 					bad = _("bad source");
>     + 				bad = _("bad source");
>     + 				goto act_on_entry;
>     + 			}
>     +-
>     +-			if (!ignore_sparse)
>     ++			if (!ignore_sparse) {
>     + 				string_list_append(&only_match_skip_worktree, src);
>     +-			else
>     ++				goto act_on_entry;
>     ++			}
>     ++			/* Check if dst exists in index */
>     ++			if (cache_name_pos(dst, strlen(dst)) < 0) {
>     + 				modes[i] = SPARSE;
>     ++				goto act_on_entry;
>     ++			}
>     ++			if (!force) {
>     ++				bad = _("destination exists");
>     ++				goto act_on_entry;
>     ++			}
>     ++			modes[i] = SPARSE;
>     + 			goto act_on_entry;
>     + 		}
>     + 		if (!strncmp(src, dst, length) &&
>      
>       ## t/t7002-mv-sparse-checkout.sh ##
>      @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone file with --sparse' '
>     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone file wi
>       
>      -test_expect_failure 'refuse to move sparse file to existing destination' '
>      +test_expect_success 'refuse to move sparse file to existing destination' '
>     - 	git sparse-checkout disable &&
>     - 	git reset --hard &&
>     + 	test_when_finished "cleanup_sparse_checkout" &&
>       	mkdir folder1 &&
>     + 	touch folder1/file1 &&
> -:  ---------- > 5:  70540957b6 mv: use flags mode for update_mode
> 4:  4eeae40186 ! 6:  f8302f64e0 mv: add check_dir_in_index() and solve general dir check issue
>     @@ Commit message
>          instead of "bad source"; also user now can supply a "--sparse" flag so
>          this operation can be carried out successfully.
>      
>     -    Also, as suggested by Derrick [1],
>     -    move the in-line definition of "enum update_mode" to the top
>     -    of the file and make it use "flags" mode (each state is a different
>     -    bit in the word).
>     -
>     -    [1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
>     -
>          Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
>      
>       ## builtin/mv.c ##
>     -@@ builtin/mv.c: static const char * const builtin_mv_usage[] = {
>     - 	NULL
>     - };
>     - 
>     -+enum update_mode {
>     -+	BOTH = 0,
>     -+	WORKING_DIRECTORY = (1 << 1),
>     -+	INDEX = (1 << 2),
>     -+	SPARSE = (1 << 3),
>     -+	SKIP_WORKTREE_DIR = (1 << 4),
>     -+};
>     -+
>     - #define DUP_BASENAME 1
>     - #define KEEP_TRAILING_SLASH 2
>     - 
>      @@ builtin/mv.c: static int index_range_of_same_dir(const char *src, int length,
>       	return last - first;
>       }
>     @@ builtin/mv.c: static int index_range_of_same_dir(const char *src, int length,
>       {
>       	int i, flags, gitmodules_modified = 0;
>      @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - 		OPT_END(),
>     - 	};
>     - 	const char **source, **destination, **dest_path, **submodule_gitfile;
>     --	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
>     -+	enum update_mode *modes;
>     - 	struct stat st;
>     - 	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
>     - 	struct lock_file lock_file = LOCK_INIT;
>     -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - 		if (lstat(src, &st) < 0) {
>     - 
>     - 			int pos = cache_name_pos(src, length);
>     -+			const char *src_w_slash = add_slash(src);
>     -+
>     - 			if (pos >= 0) {
>     - 				const struct cache_entry *ce = active_cache[pos];
>     + 	/* Checking */
>     + 	for (i = 0; i < argc; i++) {
>     + 		const char *src = source[i], *dst = destination[i];
>     +-		int length, src_is_dir;
>     ++		int length;
>     + 		const char *bad = NULL;
>     + 		int skip_sparse = 0;
>       
>      @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - 				else
>     + 
>     + 			pos = cache_name_pos(src, length);
>     + 			if (pos < 0) {
>     ++				const char *src_w_slash = add_slash(src);
>     ++				if (!check_dir_in_index(src, length) &&
>     ++					!path_in_sparse_checkout(src_w_slash, &the_index)) {
>     ++					modes[i] |= SKIP_WORKTREE_DIR;
>     ++					goto dir_check;
>     ++				}
>     + 				/* only error if existence is expected. */
>     + 				if (!(modes[i] & SPARSE))
>       					bad = _("bad source");
>     + 				goto act_on_entry;
>       			}
>     -+			else if (!check_dir_in_index(src, length) &&
>     -+					 !path_in_sparse_checkout(src_w_slash, &the_index)) {
>     -+				modes[i] = SKIP_WORKTREE_DIR;
>     -+				goto dir_check;
>     -+			}
>     - 			/* only error if existence is expected. */
>     - 			else if (modes[i] != SPARSE)
>     +-
>     + 			ce = active_cache[pos];
>     + 			if (!ce_skip_worktree(ce)) {
>       				bad = _("bad source");
>      @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - 				&& lstat(dst, &st) == 0)
>     + 			bad = _("can not move directory into itself");
>     + 			goto act_on_entry;
>     + 		}
>     +-		if ((src_is_dir = S_ISDIR(st.st_mode))
>     ++		if (S_ISDIR(st.st_mode)
>     + 		    && lstat(dst, &st) == 0) {
>       			bad = _("cannot move directory over file");
>     - 		else if (src_is_dir) {
>     + 			goto act_on_entry;
>     + 		}
>     +-		if (src_is_dir) {
>     ++
>     ++dir_check:
>     ++		if (S_ISDIR(st.st_mode)) {
>     + 			int j, dst_len, n;
>      -			int first = cache_name_pos(src, length), last;
>      +			int first, last;
>     -+dir_check:
>      +			first = cache_name_pos(src, length);
>       
>     - 			if (first >= 0)
>     + 			if (first >= 0) {
>       				prepare_move_submodule(src, first,
>     -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - 			else { /* last - first >= 1 */
>     - 				int j, dst_len, n;
>     - 
>     --				modes[i] = WORKING_DIRECTORY;
>     -+				if (!modes[i])
>     -+					modes[i] |= WORKING_DIRECTORY;
>     - 				n = argc + last - first;
>     - 				REALLOC_ARRAY(source, n);
>     - 				REALLOC_ARRAY(destination, n);
>     -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - 			printf(_("Renaming %s to %s\n"), src, dst);
>     - 		if (show_only)
>     - 			continue;
>     --		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
>     -+		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
>     -+		 	rename(src, dst) < 0) {
>     - 			if (ignore_errors)
>     - 				continue;
>     - 			die_errno(_("renaming '%s' failed"), src);
>     -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - 							      1);
>     - 		}
>     - 
>     --		if (mode == WORKING_DIRECTORY)
>     -+		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
>     - 			continue;
>     - 
>     - 		pos = cache_name_pos(src, strlen(src));
>      
>       ## t/t7002-mv-sparse-checkout.sh ##
>      @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
>     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
>       
>      -test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
>      +test_expect_success 'refuse to move out-of-cone directory without --sparse' '
>     - 	git sparse-checkout disable &&
>     - 	git reset --hard &&
>     - 	mkdir folder1 &&
>     + 	test_when_finished "cleanup_sparse_checkout" &&
>     + 	setup_sparse_checkout &&
>     + 
>      @@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
>       	test_cmp expect stderr
>       '
>       
>      -test_expect_failure 'can move out-of-cone directory with --sparse' '
>      +test_expect_success 'can move out-of-cone directory with --sparse' '
>     - 	git sparse-checkout disable &&
>     - 	git reset --hard &&
>     - 	mkdir folder1 &&
>     + 	test_when_finished "cleanup_sparse_checkout" &&
>     + 	setup_sparse_checkout &&
>     + 
> 5:  a3a296c3ef ! 7:  bc996931e2 mv: use update_sparsity() after touching sparse contents
>     @@ Metadata
>      Author: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
>      
>       ## Commit message ##
>     -    mv: use update_sparsity() after touching sparse contents
>     +    mv: update sparsity after moving from out-of-cone to in-cone
>      
>     -    Originally, "git mv" a sparse file/directory from out/in-cone to
>     -    in/out-cone does not update the sparsity following the sparse-checkout
>     -    patterns.
>     +    Originally, "git mv" a sparse file from out-of-cone to
>     +    in-cone does not update the moved file's sparsity (remove its
>     +    SKIP_WORKTREE bit). And the corresponding cache entry is, unexpectedly,
>     +    not checked out in the working tree.
>      
>     -    Use update_sparsity() after touching sparse contents, so the sparsity
>     -    will be updated after the move.
>     +    Update the behavior so that:
>     +    1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
>     +       corresponding cache entry.
>     +    2. The moved cache entry is checked out in the working tree to reflect
>     +       the updated sparsity.
>      
>          Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
>      
>     @@ builtin/mv.c
>       #include "string-list.h"
>       #include "parse-options.h"
>       #include "submodule.h"
>     -+#include "unpack-trees.h"
>     ++#include "entry.h"
>       
>       static const char * const builtin_mv_usage[] = {
>       	N_("git mv [<options>] <source>... <destination>"),
>     -@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     - {
>     - 	int i, flags, gitmodules_modified = 0;
>     - 	int verbose = 0, show_only = 0, force = 0, ignore_errors = 0, ignore_sparse = 0;
>     -+	int sparse_moved = 0;
>     - 	struct option builtin_mv_options[] = {
>     - 		OPT__VERBOSE(&verbose, N_("be verbose")),
>     - 		OPT__DRY_RUN(&show_only, N_("dry run")),
>      @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>       		const char *src = source[i], *dst = destination[i];
>       		enum update_mode mode = modes[i];
>       		int pos;
>     -+		if (!sparse_moved && mode & (SPARSE | SKIP_WORKTREE_DIR))
>     -+			sparse_moved = 1;
>     ++		struct checkout state = CHECKOUT_INIT;
>     ++		state.istate = &the_index;
>     ++
>     ++		if (force)
>     ++			state.force = 1;
>       		if (show_only || verbose)
>       			printf(_("Renaming %s to %s\n"), src, dst);
>       		if (show_only)
>      @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
>     + 		pos = cache_name_pos(src, strlen(src));
>     + 		assert(pos >= 0);
>       		rename_cache_entry_at(pos, dst);
>     ++
>     ++		if (mode & SPARSE) {
>     ++			if (path_in_sparse_checkout(dst, &the_index)) {
>     ++				int dst_pos;
>     ++
>     ++				dst_pos = cache_name_pos(dst, strlen(dst));
>     ++				active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
>     ++
>     ++				if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
>     ++					die(_("cannot checkout %s"), ce->name);
>     ++			}
>     ++		}
>       	}
>       
>     -+	if (sparse_moved) {
>     -+		struct unpack_trees_options o;
>     -+		memset(&o, 0, sizeof(o));
>     -+		o.verbose_update = isatty(2);
>     -+		o.update = 1;
>     -+		o.head_idx = -1;
>     -+		o.src_index = &the_index;
>     -+		o.dst_index = &the_index;
>     -+		o.skip_sparse_checkout = 0;
>     -+		o.pl = the_index.sparse_checkout_patterns;
>     -+		setup_unpack_trees_porcelain(&o, "mv");
>     -+		update_sparsity(&o);
>     -+		clear_unpack_trees_porcelain(&o);
>     -+	}
>     -+
>       	if (gitmodules_modified)
>     - 		stage_updated_gitmodules(&the_index);
>     - 
>      
>       ## t/t7002-mv-sparse-checkout.sh ##
>     +@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone directory with --sparse' '
>     + 	git mv --sparse folder1 sub 1>actual 2>stderr &&
>     + 	test_must_be_empty stderr &&
>     + 
>     +-	git sparse-checkout reapply &&
>     + 	test_path_is_dir sub/folder1 &&
>     + 	test_path_is_file sub/folder1/file1
>     + '
>     +@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone file with --sparse' '
>     + 	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
>     + 	test_must_be_empty stderr &&
>     + 
>     +-	git sparse-checkout reapply &&
>     + 	! test_path_is_dir sub/folder1 &&
>     + 	test_path_is_file sub/file1
>     + '
>      @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move sparse file to existing destination' '
>       	test_cmp expect stderr
>       '
>       
>     -+# Need fix.
>     -+#
>     -+# The *expected* behavior:
>     -+#
>     -+# Using --sparse to accept a sparse file, --force to overwrite the destination.
>     -+# The folder1/file1 should replace the sub/file1 without error.
>     -+#
>     -+# The *actual* behavior:
>     -+#
>     -+# It emits a warning:
>     -+#
>     -+# warning: Path ' sub/file1
>     -+# ' already present; will not overwrite with sparse update.
>     -+# After fixing the above paths, you may want to run `git sparse-checkout
>     -+# reapply`.
>     -+
>     - test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
>     - 	git sparse-checkout disable &&
>     - 	git reset --hard &&
>     +-test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
>     ++test_expect_success 'move sparse file to existing destination with --force and --sparse' '
>     + 	test_when_finished "cleanup_sparse_checkout" &&
>     + 	mkdir folder1 &&
>     + 	touch folder1/file1 &&
> 
> base-commit: 4f6db706e6ad145a9bf6b26a1ca0970bed27bb72



* Re: [WIP v3 5/7] mv: use flags mode for update_mode
  2022-06-21 22:32     ` Victoria Dye
@ 2022-06-22  9:37       ` Shaoxuan Yuan
  0 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-22  9:37 UTC (permalink / raw)
  To: Victoria Dye; +Cc: derrickstolee, git, gitster, newren


On 6/22/2022 6:32 AM, Victoria Dye wrote:
> Shaoxuan Yuan wrote:
>> As suggested by Derrick [1],
>> move the in-line definition of "enum update_mode" to the top
>> of the file and make it use "flags" mode (each state is a different
>> bit in the word).
>>
> This message doesn't quite cover all of what's done in the commit. In
> addition to moving the enum definition, you introduce a 'SKIP_WORKTREE_DIR'
> flag and change the flag assignments to '|=' (additive) from '=' (single
> assignment). If those changes belong in this commit (not a later one), they
> should be explained in the message here.


Sure! That will make the commit message clearer.


>> [1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
>>
>> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
>> ---
>>   builtin/mv.c | 26 ++++++++++++++++++--------
>>   1 file changed, 18 insertions(+), 8 deletions(-)
>>
>> diff --git a/builtin/mv.c b/builtin/mv.c
>> index abb90d3266..7ce7992d6c 100644
>> --- a/builtin/mv.c
>> +++ b/builtin/mv.c
>> @@ -19,6 +19,14 @@ static const char * const builtin_mv_usage[] = {
>>   	NULL
>>   };
>>   
>> +enum update_mode {
>> +	BOTH = 0,
> I know this comes from the original inline enum, but I don't see 'BOTH' used
> anywhere. The name itself is somewhat confusing (I have no idea what "both"
> is referring to - possibly "both" 'WORKING_DIRECTORY' and 'INDEX'??), so
> would you mind removing it in the next re-roll?


Yep, this BOTH is confusing.

I think it could be renamed to something like UNDECIDED?

If we treat it as UNDECIDED, it could help determine whether an argument
is left untouched throughout the checking; we could then give that
argument a REGULAR flag (this may serve a purpose in my planned in-cone
to out-of-cone move).
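
A minimal sketch of the UNDECIDED idea, not code from this series: the
enum reuses the flag names from the patch, while the bit position of
REGULAR and both helper functions are hypothetical. Because UNDECIDED
is 0, it cannot be tested with '&' and has to be compared with '=='.

```c
#include <assert.h>

/*
 * UNDECIDED and REGULAR are the names floated in this reply; the bit
 * chosen for REGULAR is an assumption made for illustration only.
 */
enum update_mode {
	UNDECIDED = 0,
	WORKING_DIRECTORY = (1 << 1),
	INDEX = (1 << 2),
	SPARSE = (1 << 3),
	REGULAR = (1 << 5),
};

/* An argument is untouched if no check has set any bit on it. */
static int mode_is_undecided(unsigned int mode)
{
	/* "mode & UNDECIDED" is always 0, so compare for equality instead. */
	return mode == UNDECIDED;
}

/* Give untouched arguments the REGULAR flag; leave the others alone. */
static unsigned int finalize_mode(unsigned int mode)
{
	if (mode_is_undecided(mode))
		mode |= REGULAR;
	/* After this point every argument carries at least one flag. */
	assert(!mode_is_undecided(mode));
	return mode;
}
```

With this shape, an argument that already carries SPARSE or INDEX keeps
its bits, and only a mode that is still 0 picks up REGULAR.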


>> +	WORKING_DIRECTORY = (1 << 1),
>> +	INDEX = (1 << 2),
>> +	SPARSE = (1 << 3),
>> +	SKIP_WORKTREE_DIR = (1 << 4),
> You're not introducing any assignment of 'SKIP_WORKTREE_DIR' in this commit
> (looks like that's done in the next one, patch [6/7]), so you should
> probably introduce 'SKIP_WORKTREE_DIR' and its corresponding usage in that patch
> instead of this one.


Right. SKIP_WORKTREE_DIR should go to a different patch.


>> +};
> When the update modes were mutually-exclusive, it made sense for them to be
> represented by an enum. Now that they're flags that can be combined, should
> they instead be pre-processor '#define' values (e.g., like the 'RESET_*'
> modes in 'reset.h' or 'CE_*' flags in 'cache.h')? I don't actually know what
> the standard is, since I also see one or two examples of using enums as
> flags (e.g., 'commit_graph_split_flags' in 'commit-graph.h'). Maybe another
> contributor could clarify?


I'm not sure either. An enum does group the flags together, though, which
seems nice to me, as the flags are naturally distinguished from other
'#define's.
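
To make the two options concrete, here is a rough sketch that is not
part of the patch. The enum values are the ones from the diff; the
MV_* macro names and the two helpers are hypothetical. It also shows
why the checks move from '==' to '&' once bits can be combined.

```c
#include <assert.h>

/* Style 1: an enum whose enumerators are distinct bits (values as in the patch). */
enum update_mode {
	WORKING_DIRECTORY = (1 << 1),
	INDEX = (1 << 2),
	SPARSE = (1 << 3),
};

/* Style 2: macros over an unsigned int; the MV_* names are hypothetical. */
#define MV_WORKING_DIRECTORY (1u << 1)
#define MV_INDEX (1u << 2)
#define MV_SPARSE (1u << 3)

/* Additive assignment: keeps whatever bits were already set. */
static void add_mode(unsigned int *modes, unsigned int flag)
{
	assert(modes);
	*modes |= flag;
}

/* Flag evaluation: '&' asks whether any of the given bits is set. */
static int skips_rename(unsigned int modes)
{
	return (modes & (INDEX | SPARSE)) != 0;
}
```

Both styles produce the same bit patterns, so the choice is mostly about
grouping and readability. Once a mode holds WORKING_DIRECTORY and SPARSE
together, "mode == SPARSE" is false while "mode & SPARSE" is still true.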


>> +
>>   #define DUP_BASENAME 1
>>   #define KEEP_TRAILING_SLASH 2
>>   
>> @@ -129,7 +137,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>   		OPT_END(),
>>   	};
>>   	const char **source, **destination, **dest_path, **submodule_gitfile;
>> -	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
>> +	enum update_mode *modes;
>>   	struct stat st;
>>   	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
>>   	struct lock_file lock_file = LOCK_INIT;
>> @@ -191,7 +199,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>   			pos = cache_name_pos(src, length);
>>   			if (pos < 0) {
>>   				/* only error if existence is expected. */
>> -				if (modes[i] != SPARSE)
>> +				if (!(modes[i] & SPARSE))
>>   					bad = _("bad source");
>>   				goto act_on_entry;
>>   			}
>> @@ -207,14 +215,14 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>   			}
>>   			/* Check if dst exists in index */
>>   			if (cache_name_pos(dst, strlen(dst)) < 0) {
>> -				modes[i] = SPARSE;
>> +				modes[i] |= SPARSE;
>>   				goto act_on_entry;
>>   			}
>>   			if (!force) {
>>   				bad = _("destination exists");
>>   				goto act_on_entry;
>>   			}
>> -			modes[i] = SPARSE;
>> +			modes[i] |= SPARSE;
>>   			goto act_on_entry;
>>   		}
>>   		if (!strncmp(src, dst, length) &&
>> @@ -242,7 +250,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>   			}
>>   
>>   			/* last - first >= 1 */
>> -			modes[i] = WORKING_DIRECTORY;
>> +			modes[i] |= WORKING_DIRECTORY;
>>   			n = argc + last - first;
>>   			REALLOC_ARRAY(source, n);
>>   			REALLOC_ARRAY(destination, n);
>> @@ -258,7 +266,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>   				source[argc + j] = path;
>>   				destination[argc + j] =
>>   					prefix_path(dst, dst_len, path + length + 1);
>> -				modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
>> +				memset(modes + argc + j, 0, sizeof(enum update_mode));
> One benefit of using '#define' values would be that 'modes' would just be an
> array of unsigned ints, so you could just assign '0' rather than using
> memset. In terms of the implementation as-is, though, I think what you have
> is correct.
>
>> +				modes[argc + j] |= ce_skip_worktree(ce) ? SPARSE : INDEX;
>>   				submodule_gitfile[argc + j] = NULL;
>>   			}
>>   			argc += last - first;
>> @@ -355,7 +364,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>   			printf(_("Renaming %s to %s\n"), src, dst);
>>   		if (show_only)
>>   			continue;
>> -		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
>> +		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
>> +			rename(src, dst) < 0) {
> Nit: could you align 'rename' with the line above it (per the highlighted
> section in the CodingGuidelines [1])? As far as I can tell, the "align with
> tabs and spaces" approach is what's *intended* to be used in 'mv.c'
> (although it's admittedly pretty inconsistent).
>
> [1] https://github.com/git/git/blob/master/Documentation/CodingGuidelines#L371-L383


Sure. My 'format-patch' output always has this issue; I will check.


>>   			if (ignore_errors)
>>   				continue;
>>   			die_errno(_("renaming '%s' failed"), src);
>> @@ -369,7 +379,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>>   							      1);
>>   		}
>>   
>> -		if (mode == WORKING_DIRECTORY)
>> +		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
>>   			continue;
>>   
>>   		pos = cache_name_pos(src, strlen(src));


* [PATCH v4 0/7] mv: fix out-of-cone file/directory move logic
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (8 preceding siblings ...)
  2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
@ 2022-06-23 11:41 ` Shaoxuan Yuan
  2022-06-23 11:41   ` [PATCH v4 1/7] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
                     ` (7 more replies)
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
  10 siblings, 8 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-23 11:41 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye

## Changes since WIP v3 ##

1. Fix style-nits.

2. Move "mv: update sparsity after moving from out-of-cone to in-cone" to the
   (2/7) position (it was at the (7/7) position in WIP v3). This lets us
   drop "git sparse-checkout reapply" from the tests, as suggested by
   Victoria [1]. We need to check out the moved cache_entry for all
   out-of-cone to in-cone moves, so this commit works better when it comes
   before the other code changes.

3. Fix the commit message of "mv: use flags mode for update_mode", as suggested
   here [2].

4. Add "Helped-by" and "Suggested-by" trailers.

5. In "mv: add check_dir_in_index() and solve general dir check issue", change
   'check_dir_in_index()' so that it no longer accepts 'namelen' as an
   argument. The original "namelen + 1" logic can be erroneous, for example
   when 'name' already has a trailing slash, so just use strlen() to save
   the trouble.
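
The trailing-slash problem in item 5 can be sketched as follows. This is
a standalone illustration, not the code in builtin/mv.c: both helpers are
hypothetical, and it assumes 'namelen' was the caller's strlen(name).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a helper that guarantees a trailing slash.
 * Returns a newly allocated string that the caller must free.
 */
static char *with_trailing_slash(const char *name)
{
	size_t len = strlen(name);
	int needs_slash = (len == 0 || name[len - 1] != '/');
	char *out = malloc(len + needs_slash + 1);

	assert(out);
	memcpy(out, name, len);
	if (needs_slash)
		out[len++] = '/';
	out[len] = '\0';
	return out;
}

/* Length of the directory prefix, measured after normalizing the name. */
static size_t dir_prefix_len(const char *name)
{
	char *slashed = with_trailing_slash(name);
	size_t len = strlen(slashed);

	free(slashed);
	return len;
}
```

For "folder1" and "folder1/" the normalized prefix has the same length,
whereas adding 1 to the length of "folder1/" would overshoot by one
character. Measuring with strlen() after normalizing avoids that case.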

## Notes ##

As discussed in WIP v3, we can postpone the "in-cone to out-of-cone" move
to a later series. The functional part of this series is therefore done,
so the WIP prefix is removed.

[1] https://lore.kernel.org/git/adb795ba-56ce-8441-0c38-a3e6b0a6e861@github.com/
[2] https://lore.kernel.org/git/01b39c63-5652-4293-0424-ff99b6f9f7d2@github.com/

Shaoxuan Yuan (7):
  t7002: add tests for moving out-of-cone file/directory
  mv: update sparsity after moving from out-of-cone to in-cone
  mv: decouple if/else-if checks using goto
  mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  mv: check if <destination> exists in index to handle overwriting
  mv: use flags mode for update_mode
  mv: add check_dir_in_index() and solve general dir check issue

 builtin/mv.c                  | 239 +++++++++++++++++++++++++---------
 t/t7002-mv-sparse-checkout.sh |  84 ++++++++++++
 2 files changed, 259 insertions(+), 64 deletions(-)

Range-diff against v3:
1:  574cbdfdb4 ! 1:  90c38f479b t7002: add tests for moving out-of-cone file/directory
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +	test_when_finished "cleanup_sparse_checkout" &&
     +	setup_sparse_checkout &&
     +
    -+	git mv --sparse folder1 sub 1>actual 2>stderr &&
    ++	git mv --sparse folder1 sub 2>stderr &&
     +	test_must_be_empty stderr &&
     +
    -+	git sparse-checkout reapply &&
     +	test_path_is_dir sub/folder1 &&
     +	test_path_is_file sub/folder1/file1
     +'
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-s
     +	test_when_finished "cleanup_sparse_checkout" &&
     +	setup_sparse_checkout &&
     +
    -+	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
    ++	git mv --sparse folder1/file1 sub 2>stderr &&
     +	test_must_be_empty stderr &&
     +
    -+	git sparse-checkout reapply &&
    -+	! test_path_is_dir sub/folder1 &&
     +	test_path_is_file sub/file1
     +'
     +
7:  6fa630203b ! 2:  c6fcaf8313 mv: update sparsity after moving from out-of-cone to in-cone
    @@ Commit message
         2. The moved cache entry is checked out in the working tree to reflect
            the updated sparsity.
     
    +    Helped-by: Victoria Dye <vdye@github.com>
         Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
      ## builtin/mv.c ##
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      		assert(pos >= 0);
      		rename_cache_entry_at(pos, dst);
     +
    -+		if (mode & SPARSE) {
    -+			if (path_in_sparse_checkout(dst, &the_index)) {
    -+				int dst_pos;
    ++		if ((mode & SPARSE) &&
    ++		    (path_in_sparse_checkout(dst, &the_index))) {
    ++			int dst_pos;
     +
    -+				dst_pos = cache_name_pos(dst, strlen(dst));
    -+				active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
    ++			dst_pos = cache_name_pos(dst, strlen(dst));
    ++			active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
     +
    -+				if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
    -+					die(_("cannot checkout %s"), ce->name);
    -+			}
    ++			if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
    ++				die(_("cannot checkout %s"), ce->name);
     +		}
      	}
      
      	if (gitmodules_modified)
    -
    - ## t/t7002-mv-sparse-checkout.sh ##
    -@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone directory with --sparse' '
    - 	git mv --sparse folder1 sub 1>actual 2>stderr &&
    - 	test_must_be_empty stderr &&
    - 
    --	git sparse-checkout reapply &&
    - 	test_path_is_dir sub/folder1 &&
    - 	test_path_is_file sub/folder1/file1
    - '
    -@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone file with --sparse' '
    - 	git mv --sparse folder1/file1 sub 1>actual 2>stderr &&
    - 	test_must_be_empty stderr &&
    - 
    --	git sparse-checkout reapply &&
    - 	! test_path_is_dir sub/folder1 &&
    - 	test_path_is_file sub/file1
    - '
    -@@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move sparse file to existing destination' '
    - 	test_cmp expect stderr
    - '
    - 
    --test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
    -+test_expect_success 'move sparse file to existing destination with --force and --sparse' '
    - 	test_when_finished "cleanup_sparse_checkout" &&
    - 	mkdir folder1 &&
    - 	touch folder1/file1 &&
2:  aac267091b ! 3:  aee896a3f2 mv: decouple if/else-if checks using goto
    @@ Commit message
         Refactor to decouple this if/else-if chain by using goto to jump ahead.
     
         Suggested-by: Derrick Stolee <derrickstolee@github.com>
    +    Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
      ## builtin/mv.c ##
     @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
3:  b0e90cfa01 = 4:  82c10486ec mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
4:  c0c9dc9575 ! 5:  59dcf1a55f mv: check if <destination> exists in index to handle overwriting
    @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'can move out-of-cone file wi
      	test_when_finished "cleanup_sparse_checkout" &&
      	mkdir folder1 &&
      	touch folder1/file1 &&
    +@@ t/t7002-mv-sparse-checkout.sh: test_expect_failure 'refuse to move sparse file to existing destination' '
    + 	test_cmp expect stderr
    + '
    + 
    +-test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
    ++test_expect_success 'move sparse file to existing destination with --force and --sparse' '
    + 	test_when_finished "cleanup_sparse_checkout" &&
    + 	mkdir folder1 &&
    + 	touch folder1/file1 &&
5:  3088724e72 ! 6:  b8e3094178 mv: use flags mode for update_mode
    @@ Commit message
         of the file and make it use "flags" mode (each state is a different
         bit in the word).
     
    +    Change the flag assignments from '=' (single assignment) to '|='
    +    (additive). Also change flag evaluation from '==' to '&', etc.
    +
         [1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
     
    +    Helped-by: Victoria Dye <vdye@github.com>
    +    Helped-by: Derrick Stolee <derrickstolee@github.com>
         Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
      ## builtin/mv.c ##
    @@ builtin/mv.c: static const char * const builtin_mv_usage[] = {
     +	WORKING_DIRECTORY = (1 << 1),
     +	INDEX = (1 << 2),
     +	SPARSE = (1 << 3),
    -+	SKIP_WORKTREE_DIR = (1 << 4),
     +};
     +
      #define DUP_BASENAME 1
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      		if (show_only)
      			continue;
     -		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
    -+		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
    -+			rename(src, dst) < 0) {
    ++		if (!(mode & (INDEX | SPARSE)) &&
    ++		    rename(src, dst) < 0) {
      			if (ignore_errors)
      				continue;
      			die_errno(_("renaming '%s' failed"), src);
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      		}
      
     -		if (mode == WORKING_DIRECTORY)
    -+		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
    ++		if (mode & (WORKING_DIRECTORY))
      			continue;
      
      		pos = cache_name_pos(src, strlen(src));
6:  0c7ba28ddc ! 7:  080f34ee01 mv: add check_dir_in_index() and solve general dir check issue
    @@ Commit message
         instead of "bad source"; also user now can supply a "--sparse" flag so
         this operation can be carried out successfully.
     
    +    Helped-by: Victoria Dye <vdye@github.com>
    +    Helped-by: Derrick Stolee <derrickstolee@github.com>
         Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
     
      ## builtin/mv.c ##
    +@@ builtin/mv.c: enum update_mode {
    + 	WORKING_DIRECTORY = (1 << 1),
    + 	INDEX = (1 << 2),
    + 	SPARSE = (1 << 3),
    ++	SKIP_WORKTREE_DIR = (1 << 4),
    + };
    + 
    + #define DUP_BASENAME 1
     @@ builtin/mv.c: static int index_range_of_same_dir(const char *src, int length,
      	return last - first;
      }
    @@ builtin/mv.c: static int index_range_of_same_dir(const char *src, int length,
     + * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
     + * Return 1 otherwise.
     + */
    -+static int check_dir_in_index(const char *name, int namelen)
    ++static int check_dir_in_index(const char *name)
     +{
    -+	int ret = 1;
     +	const char *with_slash = add_slash(name);
    -+	int length = namelen + 1;
    ++	int length = strlen(with_slash);
     +
     +	int pos = cache_name_pos(with_slash, length);
     +	const struct cache_entry *ce;
    @@ builtin/mv.c: static int index_range_of_same_dir(const char *src, int length,
     +	if (pos < 0) {
     +		pos = -pos - 1;
     +		if (pos >= the_index.cache_nr)
    -+			return ret;
    ++			return 1;
     +		ce = active_cache[pos];
     +		if (strncmp(with_slash, ce->name, length))
    -+			return ret;
    ++			return 1;
     +		if (ce_skip_worktree(ce))
    -+			return ret = 0;
    ++			return 0;
     +	}
    -+	return ret;
    ++	return 1;
     +}
     +
      int cmd_mv(int argc, const char **argv, const char *prefix)
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      			pos = cache_name_pos(src, length);
      			if (pos < 0) {
     +				const char *src_w_slash = add_slash(src);
    -+				if (!check_dir_in_index(src, length) &&
    -+					!path_in_sparse_checkout(src_w_slash, &the_index)) {
    ++				if (!path_in_sparse_checkout(src_w_slash, &the_index) &&
    ++				    !check_dir_in_index(src)) {
     +					modes[i] |= SKIP_WORKTREE_DIR;
     +					goto dir_check;
     +				}
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
     +dir_check:
     +		if (S_ISDIR(st.st_mode)) {
      			int j, dst_len, n;
    --			int first = cache_name_pos(src, length), last;
    -+			int first, last;
    -+			first = cache_name_pos(src, length);
    + 			int first = cache_name_pos(src, length), last;
    + 
    +@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    + 			printf(_("Renaming %s to %s\n"), src, dst);
    + 		if (show_only)
    + 			continue;
    +-		if (!(mode & (INDEX | SPARSE)) &&
    ++		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
    + 		    rename(src, dst) < 0) {
    + 			if (ignore_errors)
    + 				continue;
    +@@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
    + 							      1);
    + 		}
    + 
    +-		if (mode & (WORKING_DIRECTORY))
    ++		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
    + 			continue;
      
    - 			if (first >= 0) {
    - 				prepare_move_submodule(src, first,
    + 		pos = cache_name_pos(src, strlen(src));
     
      ## t/t7002-mv-sparse-checkout.sh ##
     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-skip-worktree sparse path' '

base-commit: ddbc07872e86265dc30aefa5f4d881f762120044
-- 
2.35.1


^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH v4 1/7] t7002: add tests for moving out-of-cone file/directory
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
@ 2022-06-23 11:41   ` Shaoxuan Yuan
  2022-06-23 11:41   ` [PATCH v4 2/7] mv: update sparsity after moving from out-of-cone to in-cone Shaoxuan Yuan
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-23 11:41 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye

We do not have sufficient coverage of moving files outside
of a sparse-checkout cone. Create new tests covering this
behavior, keeping in mind that the user can include --sparse
(or not), that the source can be a file or a directory, and that
the destination can already exist in the index (in which case
the user can pass --force to overwrite the existing entry).

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 t/t7002-mv-sparse-checkout.sh | 84 +++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index f0f7cbfcdb..023e657c9e 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -4,6 +4,18 @@ test_description='git mv in sparse working trees'
 
 . ./test-lib.sh
 
+setup_sparse_checkout () {
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout set --cone sub
+}
+
+cleanup_sparse_checkout () {
+	git sparse-checkout disable &&
+	git reset --hard
+}
+
 test_expect_success 'setup' "
 	mkdir -p sub/dir sub/dir2 &&
 	touch a b c sub/d sub/dir/e sub/dir2/e &&
@@ -196,6 +208,7 @@ test_expect_success 'can move files to non-sparse dir' '
 '
 
 test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
+	test_when_finished "cleanup_sparse_checkout" &&
 	git reset --hard &&
 	git sparse-checkout init --no-cone &&
 	git sparse-checkout set a !/x y/ !x/y/z &&
@@ -206,4 +219,75 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
 	test_cmp expect stderr
 '
 
+test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	test_must_fail git mv folder1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'can move out-of-cone directory with --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	git mv --sparse folder1 sub 2>stderr &&
+	test_must_be_empty stderr &&
+
+	test_path_is_dir sub/folder1 &&
+	test_path_is_file sub/folder1/file1
+'
+
+test_expect_failure 'refuse to move out-of-cone file without --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	test_must_fail git mv folder1/file1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'can move out-of-cone file with --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	git mv --sparse folder1/file1 sub 2>stderr &&
+	test_must_be_empty stderr &&
+
+	test_path_is_file sub/file1
+'
+
+test_expect_failure 'refuse to move sparse file to existing destination' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	touch sub/file1 &&
+	git add folder1 sub/file1 &&
+	git sparse-checkout set --cone sub &&
+
+	test_must_fail git mv --sparse folder1/file1 sub 2>stderr &&
+	echo "fatal: destination exists, source=folder1/file1, destination=sub/file1" >expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	touch sub/file1 &&
+	echo "overwrite" >folder1/file1 &&
+	git add folder1 sub/file1 &&
+	git sparse-checkout set --cone sub &&
+
+	git mv --sparse --force folder1/file1 sub 2>stderr &&
+	test_must_be_empty stderr &&
+	echo "overwrite" >expect &&
+	test_cmp expect sub/file1
+'
+
 test_done
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v4 2/7] mv: update sparsity after moving from out-of-cone to in-cone
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
  2022-06-23 11:41   ` [PATCH v4 1/7] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
@ 2022-06-23 11:41   ` Shaoxuan Yuan
  2022-06-23 15:08     ` Derrick Stolee
  2022-06-23 11:41   ` [PATCH v4 3/7] mv: decouple if/else-if checks using goto Shaoxuan Yuan
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-23 11:41 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye

Originally, using "git mv" to move a sparse file from out-of-cone
to in-cone did not update the moved file's sparsity (that is, it
did not remove the SKIP_WORKTREE bit). As a result, the corresponding
cache entry was, unexpectedly, not checked out in the working tree.

Update the behavior so that:
1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
   the corresponding cache entry.
2. The moved cache entry is checked out in the working tree to reflect
   the updated sparsity.

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/builtin/mv.c b/builtin/mv.c
index 83a465ba83..0c0c2b4914 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -13,6 +13,7 @@
 #include "string-list.h"
 #include "parse-options.h"
 #include "submodule.h"
+#include "entry.h"
 
 static const char * const builtin_mv_usage[] = {
 	N_("git mv [<options>] <source>... <destination>"),
@@ -304,6 +305,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
 		int pos;
+		struct checkout state = CHECKOUT_INIT;
+		state.istate = &the_index;
+
+		if (force)
+			state.force = 1;
 		if (show_only || verbose)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
@@ -328,6 +334,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		pos = cache_name_pos(src, strlen(src));
 		assert(pos >= 0);
 		rename_cache_entry_at(pos, dst);
+
+		if ((mode & SPARSE) &&
+		    (path_in_sparse_checkout(dst, &the_index))) {
+			int dst_pos;
+
+			dst_pos = cache_name_pos(dst, strlen(dst));
+			active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
+
+			if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
+				die(_("cannot checkout %s"), active_cache[dst_pos]->name);
+		}
 	}
 
 	if (gitmodules_modified)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v4 3/7] mv: decouple if/else-if checks using goto
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
  2022-06-23 11:41   ` [PATCH v4 1/7] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
  2022-06-23 11:41   ` [PATCH v4 2/7] mv: update sparsity after moving from out-of-cone to in-cone Shaoxuan Yuan
@ 2022-06-23 11:41   ` Shaoxuan Yuan
  2022-06-23 11:41   ` [PATCH v4 4/7] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-23 11:41 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye

The previous if/else-if chain is highly nested and hard to develop/extend.

Refactor to decouple this if/else-if chain by using goto to jump ahead.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c | 139 +++++++++++++++++++++++++++++----------------------
 1 file changed, 80 insertions(+), 59 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 0c0c2b4914..7f8d658028 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -187,53 +187,68 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
 			/* only error if existence is expected. */
-			if (modes[i] != SPARSE)
+			if (modes[i] != SPARSE) {
 				bad = _("bad source");
-		} else if (!strncmp(src, dst, length) &&
-				(dst[length] == 0 || dst[length] == '/')) {
+				goto act_on_entry;
+			}
+		}
+		if (!strncmp(src, dst, length) &&
+		    (dst[length] == 0 || dst[length] == '/')) {
 			bad = _("can not move directory into itself");
-		} else if ((src_is_dir = S_ISDIR(st.st_mode))
-				&& lstat(dst, &st) == 0)
+			goto act_on_entry;
+		}
+		if ((src_is_dir = S_ISDIR(st.st_mode))
+		    && lstat(dst, &st) == 0) {
 			bad = _("cannot move directory over file");
-		else if (src_is_dir) {
+			goto act_on_entry;
+		}
+		if (src_is_dir) {
+			int j, dst_len, n;
 			int first = cache_name_pos(src, length), last;
 
-			if (first >= 0)
+			if (first >= 0) {
 				prepare_move_submodule(src, first,
 						       submodule_gitfile + i);
-			else if (index_range_of_same_dir(src, length,
-							 &first, &last) < 1)
+				goto act_on_entry;
+			} else if (index_range_of_same_dir(src, length,
+							   &first, &last) < 1) {
 				bad = _("source directory is empty");
-			else { /* last - first >= 1 */
-				int j, dst_len, n;
-
-				modes[i] = WORKING_DIRECTORY;
-				n = argc + last - first;
-				REALLOC_ARRAY(source, n);
-				REALLOC_ARRAY(destination, n);
-				REALLOC_ARRAY(modes, n);
-				REALLOC_ARRAY(submodule_gitfile, n);
-
-				dst = add_slash(dst);
-				dst_len = strlen(dst);
-
-				for (j = 0; j < last - first; j++) {
-					const struct cache_entry *ce = active_cache[first + j];
-					const char *path = ce->name;
-					source[argc + j] = path;
-					destination[argc + j] =
-						prefix_path(dst, dst_len, path + length + 1);
-					modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
-					submodule_gitfile[argc + j] = NULL;
-				}
-				argc += last - first;
+				goto act_on_entry;
 			}
-		} else if (!(ce = cache_file_exists(src, length, 0))) {
+
+			/* last - first >= 1 */
+			modes[i] = WORKING_DIRECTORY;
+			n = argc + last - first;
+			REALLOC_ARRAY(source, n);
+			REALLOC_ARRAY(destination, n);
+			REALLOC_ARRAY(modes, n);
+			REALLOC_ARRAY(submodule_gitfile, n);
+
+			dst = add_slash(dst);
+			dst_len = strlen(dst);
+
+			for (j = 0; j < last - first; j++) {
+				const struct cache_entry *ce = active_cache[first + j];
+				const char *path = ce->name;
+				source[argc + j] = path;
+				destination[argc + j] =
+					prefix_path(dst, dst_len, path + length + 1);
+				modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
+				submodule_gitfile[argc + j] = NULL;
+			}
+			argc += last - first;
+			goto act_on_entry;
+		}
+		if (!(ce = cache_file_exists(src, length, 0))) {
 			bad = _("not under version control");
-		} else if (ce_stage(ce)) {
+			goto act_on_entry;
+		}
+		if (ce_stage(ce)) {
 			bad = _("conflicted");
-		} else if (lstat(dst, &st) == 0 &&
-			 (!ignore_case || strcasecmp(src, dst))) {
+			goto act_on_entry;
+		}
+		if (lstat(dst, &st) == 0 &&
+		    (!ignore_case || strcasecmp(src, dst))) {
 			bad = _("destination exists");
 			if (force) {
 				/*
@@ -247,34 +262,40 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				} else
 					bad = _("Cannot overwrite");
 			}
-		} else if (string_list_has_string(&src_for_dst, dst))
+			goto act_on_entry;
+		}
+		if (string_list_has_string(&src_for_dst, dst)) {
 			bad = _("multiple sources for the same target");
-		else if (is_dir_sep(dst[strlen(dst) - 1]))
+			goto act_on_entry;
+		}
+		if (is_dir_sep(dst[strlen(dst) - 1])) {
 			bad = _("destination directory does not exist");
-		else {
-			/*
-			 * We check if the paths are in the sparse-checkout
-			 * definition as a very final check, since that
-			 * allows us to point the user to the --sparse
-			 * option as a way to have a successful run.
-			 */
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(src, &the_index)) {
-				string_list_append(&only_match_skip_worktree, src);
-				skip_sparse = 1;
-			}
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(dst, &the_index)) {
-				string_list_append(&only_match_skip_worktree, dst);
-				skip_sparse = 1;
-			}
-
-			if (skip_sparse)
-				goto remove_entry;
+			goto act_on_entry;
+		}
 
-			string_list_insert(&src_for_dst, dst);
+		/*
+		 * We check if the paths are in the sparse-checkout
+		 * definition as a very final check, since that
+		 * allows us to point the user to the --sparse
+		 * option as a way to have a successful run.
+		 */
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(src, &the_index)) {
+			string_list_append(&only_match_skip_worktree, src);
+			skip_sparse = 1;
+		}
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(dst, &the_index)) {
+			string_list_append(&only_match_skip_worktree, dst);
+			skip_sparse = 1;
 		}
 
+		if (skip_sparse)
+			goto remove_entry;
+
+		string_list_insert(&src_for_dst, dst);
+
+act_on_entry:
 		if (!bad)
 			continue;
 		if (!ignore_errors)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v4 4/7] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
                     ` (2 preceding siblings ...)
  2022-06-23 11:41   ` [PATCH v4 3/7] mv: decouple if/else-if checks using goto Shaoxuan Yuan
@ 2022-06-23 11:41   ` Shaoxuan Yuan
  2022-06-23 11:41   ` [PATCH v4 5/7] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-23 11:41 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye

Originally, moving a <source> file which is not on-disk but exists in
the index as a SKIP_WORKTREE-enabled cache entry made the "git mv"
command error out with "bad source".

Change the checking logic so that such a <source> file makes the
"git mv" command warn via "advise_on_updating_sparse_paths()"
instead of failing with "bad source". The user can now also supply
the "--sparse" flag so that the operation is carried out successfully.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 21 +++++++++++++++++++--
 t/t7002-mv-sparse-checkout.sh |  4 ++--
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 7f8d658028..d1b3229be6 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -186,11 +186,28 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
-			/* only error if existence is expected. */
-			if (modes[i] != SPARSE) {
+			int pos;
+			const struct cache_entry *ce;
+
+			pos = cache_name_pos(src, length);
+			if (pos < 0) {
+				/* only error if existence is expected. */
+				if (modes[i] != SPARSE)
+					bad = _("bad source");
+				goto act_on_entry;
+			}
+
+			ce = active_cache[pos];
+			if (!ce_skip_worktree(ce)) {
 				bad = _("bad source");
 				goto act_on_entry;
 			}
+
+			if (!ignore_sparse)
+				string_list_append(&only_match_skip_worktree, src);
+			else
+				modes[i] = SPARSE;
+			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
 		    (dst[length] == 0 || dst[length] == '/')) {
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 023e657c9e..1510b5ed6a 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -241,7 +241,7 @@ test_expect_failure 'can move out-of-cone directory with --sparse' '
 	test_path_is_file sub/folder1/file1
 '
 
-test_expect_failure 'refuse to move out-of-cone file without --sparse' '
+test_expect_success 'refuse to move out-of-cone file without --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
@@ -252,7 +252,7 @@ test_expect_failure 'refuse to move out-of-cone file without --sparse' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'can move out-of-cone file with --sparse' '
+test_expect_success 'can move out-of-cone file with --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v4 5/7] mv: check if <destination> exists in index to handle overwriting
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
                     ` (3 preceding siblings ...)
  2022-06-23 11:41   ` [PATCH v4 4/7] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
@ 2022-06-23 11:41   ` Shaoxuan Yuan
  2022-06-23 11:41   ` [PATCH v4 6/7] mv: use flags mode for update_mode Shaoxuan Yuan
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-23 11:41 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye

Originally, moving a sparse file into the cone could silently
overwrite an existing entry. The expected behavior is that if the
<destination> already exists in the index, the user must supply
[-f|--force] to carry out the operation; otherwise the operation
should fail.

Add a check to enforce that.
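
As a rough illustration of the decision described above (not the code in
builtin/mv.c), the handling of a <source> that is missing on disk but
present in the index can be sketched as a small standalone function. The
enum, the function name and the parameter names are hypothetical;
"ignore_sparse" stands for "--sparse was given" and "force" for "--force
was given".

```c
#include <assert.h>

/* Hypothetical outcome labels, for illustration only. */
enum outcome {
	BAD_SOURCE,      /* entry is not sparse: report "bad source" */
	ADVISE_SPARSE,   /* sparse entry, no --sparse: advise the user */
	DEST_EXISTS,     /* destination in index, no --force: refuse */
	MOVE_AS_SPARSE   /* proceed and treat the entry as sparse */
};

/*
 * Assumes the source path was not found on disk but was found in the
 * index. All arguments are plain booleans (0 or 1).
 */
enum outcome classify_missing_source(int src_skip_worktree,
				     int ignore_sparse,
				     int dst_in_index,
				     int force)
{
	if (!src_skip_worktree)
		return BAD_SOURCE;
	if (!ignore_sparse)
		return ADVISE_SPARSE;
	if (dst_in_index && !force)
		return DEST_EXISTS;
	return MOVE_AS_SPARSE;
}

/* Small usage example exercising the overwrite check. */
void classify_missing_source_demo(void)
{
	/* --sparse given, destination exists, no --force: refuse. */
	assert(classify_missing_source(1, 1, 1, 0) == DEST_EXISTS);
	/* Same, but with --force: the move goes ahead. */
	assert(classify_missing_source(1, 1, 1, 1) == MOVE_AS_SPARSE);
}
```

The order of the checks follows the order of the early exits in the patch:
the sparse advice takes precedence over the "destination exists" error.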

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 15 ++++++++++++---
 t/t7002-mv-sparse-checkout.sh |  4 ++--
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index d1b3229be6..40a3a5c5ff 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -202,11 +202,20 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				bad = _("bad source");
 				goto act_on_entry;
 			}
-
-			if (!ignore_sparse)
+			if (!ignore_sparse) {
 				string_list_append(&only_match_skip_worktree, src);
-			else
+				goto act_on_entry;
+			}
+			/* Check if dst exists in index */
+			if (cache_name_pos(dst, strlen(dst)) < 0) {
 				modes[i] = SPARSE;
+				goto act_on_entry;
+			}
+			if (!force) {
+				bad = _("destination exists");
+				goto act_on_entry;
+			}
+			modes[i] = SPARSE;
 			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 1510b5ed6a..6d2fb4f8d2 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -262,7 +262,7 @@ test_expect_success 'can move out-of-cone file with --sparse' '
 	test_path_is_file sub/file1
 '
 
-test_expect_failure 'refuse to move sparse file to existing destination' '
+test_expect_success 'refuse to move sparse file to existing destination' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	mkdir folder1 &&
 	touch folder1/file1 &&
@@ -275,7 +275,7 @@ test_expect_failure 'refuse to move sparse file to existing destination' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
+test_expect_success 'move sparse file to existing destination with --force and --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	mkdir folder1 &&
 	touch folder1/file1 &&
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v4 6/7] mv: use flags mode for update_mode
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
                     ` (4 preceding siblings ...)
  2022-06-23 11:41   ` [PATCH v4 5/7] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
@ 2022-06-23 11:41   ` Shaoxuan Yuan
  2022-06-23 15:10     ` Derrick Stolee
  2022-06-23 11:41   ` [PATCH v4 7/7] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
  2022-06-23 15:16   ` [PATCH v4 0/7] mv: fix out-of-cone file/directory move logic Derrick Stolee
  7 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-23 11:41 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye

As suggested by Derrick [1],
move the in-line definition of "enum update_mode" to the top
of the file and make it use "flags" mode (each state is a different
bit in the word).

Change the flag assignments from '=' (single assignment) to '|='
(additive). Also change flag evaluation from '==' to '&', etc.

[1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
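
To illustrate the flag semantics described above, here is a minimal
standalone sketch. The enum values are copied from this patch; the helper
function names are hypothetical and do not exist in builtin/mv.c, they
only restate the two conditions that the second loop of cmd_mv() tests.

```c
#include <assert.h>

/* Same layout as in the patch: each state is its own bit. */
enum update_mode {
	BOTH = 0,
	WORKING_DIRECTORY = (1 << 1),
	INDEX = (1 << 2),
	SPARSE = (1 << 3),
};

/* Hypothetical helper: nonzero if the on-disk rename is skipped. */
int skips_disk_rename(unsigned int mode)
{
	return (mode & (INDEX | SPARSE)) != 0;
}

/* Hypothetical helper: nonzero if the index update is skipped. */
int skips_index_update(unsigned int mode)
{
	return (mode & WORKING_DIRECTORY) != 0;
}

/* Usage example: flags accumulate with '|=' and are tested with '&'. */
void update_mode_demo(void)
{
	unsigned int mode = BOTH;

	/* BOTH has no bits set, so nothing is skipped. */
	assert(!skips_disk_rename(mode));
	assert(!skips_index_update(mode));

	/* Adding SPARSE keeps any bits that were already set. */
	mode |= SPARSE;
	assert(skips_disk_rename(mode));
	assert(!skips_index_update(mode));
}
```

With plain '=' and '==' a mode could only ever be in one state; with
'|=' and '&' a later patch can add another bit without disturbing the
existing checks.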

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 40a3a5c5ff..aa29da4337 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -20,6 +20,13 @@ static const char * const builtin_mv_usage[] = {
 	NULL
 };
 
+enum update_mode {
+	BOTH = 0,
+	WORKING_DIRECTORY = (1 << 1),
+	INDEX = (1 << 2),
+	SPARSE = (1 << 3),
+};
+
 #define DUP_BASENAME 1
 #define KEEP_TRAILING_SLASH 2
 
@@ -130,7 +137,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		OPT_END(),
 	};
 	const char **source, **destination, **dest_path, **submodule_gitfile;
-	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
+	enum update_mode *modes;
 	struct stat st;
 	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
 	struct lock_file lock_file = LOCK_INIT;
@@ -192,7 +199,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			pos = cache_name_pos(src, length);
 			if (pos < 0) {
 				/* only error if existence is expected. */
-				if (modes[i] != SPARSE)
+				if (!(modes[i] & SPARSE))
 					bad = _("bad source");
 				goto act_on_entry;
 			}
@@ -208,14 +215,14 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			}
 			/* Check if dst exists in index */
 			if (cache_name_pos(dst, strlen(dst)) < 0) {
-				modes[i] = SPARSE;
+				modes[i] |= SPARSE;
 				goto act_on_entry;
 			}
 			if (!force) {
 				bad = _("destination exists");
 				goto act_on_entry;
 			}
-			modes[i] = SPARSE;
+			modes[i] |= SPARSE;
 			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
@@ -243,7 +250,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			}
 
 			/* last - first >= 1 */
-			modes[i] = WORKING_DIRECTORY;
+			modes[i] |= WORKING_DIRECTORY;
 			n = argc + last - first;
 			REALLOC_ARRAY(source, n);
 			REALLOC_ARRAY(destination, n);
@@ -259,7 +266,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				source[argc + j] = path;
 				destination[argc + j] =
 					prefix_path(dst, dst_len, path + length + 1);
-				modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
+				memset(modes + argc + j, 0, sizeof(enum update_mode));
+				modes[argc + j] |= ce_skip_worktree(ce) ? SPARSE : INDEX;
 				submodule_gitfile[argc + j] = NULL;
 			}
 			argc += last - first;
@@ -361,7 +369,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
 			continue;
-		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
+		if (!(mode & (INDEX | SPARSE)) &&
+		    rename(src, dst) < 0) {
 			if (ignore_errors)
 				continue;
 			die_errno(_("renaming '%s' failed"), src);
@@ -375,7 +384,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 							      1);
 		}
 
-		if (mode == WORKING_DIRECTORY)
+		if (mode & (WORKING_DIRECTORY))
 			continue;
 
 		pos = cache_name_pos(src, strlen(src));
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v4 7/7] mv: add check_dir_in_index() and solve general dir check issue
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
                     ` (5 preceding siblings ...)
  2022-06-23 11:41   ` [PATCH v4 6/7] mv: use flags mode for update_mode Shaoxuan Yuan
@ 2022-06-23 11:41   ` Shaoxuan Yuan
  2022-06-23 15:14     ` Derrick Stolee
  2022-06-23 15:16   ` [PATCH v4 0/7] mv: fix out-of-cone file/directory move logic Derrick Stolee
  7 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-23 11:41 UTC (permalink / raw)
  To: shaoxuan.yuan02; +Cc: derrickstolee, git, gitster, vdye

Originally, moving a <source> directory which is not on-disk because
it lies outside of the sparse-checkout cone made the "git mv" command
error out with "bad source".

Add a helper function, check_dir_in_index(), to see if a directory
name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
such directories.

Change the checking logic so that such a <source> directory makes the
"git mv" command warn via "advise_on_updating_sparse_paths()"
instead of failing with "bad source". The user can now also supply
the "--sparse" flag so that the operation is carried out successfully.
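
The lookup that check_dir_in_index() relies on can be sketched outside of
git as follows. This is an illustration under stated assumptions, not the
real index API: "struct entry", name_pos() and dir_is_sparse_only() are
hypothetical stand-ins, and the entries are assumed to be sorted by name
in strcmp() order. Like the helper in this patch, the sketch inspects only
the first entry found under the directory.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a cache entry. */
struct entry {
	const char *name;
	int skip_worktree;
};

/*
 * Binary search over entries sorted by name. Returns the position if
 * the name is found, otherwise -(insertion point) - 1.
 */
int name_pos(const struct entry *e, int nr, const char *name)
{
	int lo = 0, hi = nr;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(name, e[mid].name);

		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -lo - 1;
}

/*
 * Returns 0 if the first entry under "dir/" is marked skip-worktree,
 * 1 otherwise. The argument must already end with a slash.
 */
int dir_is_sparse_only(const struct entry *e, int nr,
		       const char *dir_with_slash)
{
	size_t len = strlen(dir_with_slash);
	int pos = name_pos(e, nr, dir_with_slash);

	if (pos >= 0)
		return 1; /* an entry with exactly this name exists */
	pos = -pos - 1; /* where "dir/" would be inserted */
	if (pos >= nr)
		return 1; /* nothing sorts after "dir/" */
	if (strncmp(dir_with_slash, e[pos].name, len))
		return 1; /* next entry is not under "dir/" */
	return e[pos].skip_worktree ? 0 : 1;
}
```

Because "dir/" sorts immediately before every path that starts with
"dir/", the insertion point of a failed search is the first entry inside
that directory, if there is one.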

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 50 ++++++++++++++++++++++++++++++-----
 t/t7002-mv-sparse-checkout.sh |  4 +--
 2 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index aa29da4337..b5d0d8ef4f 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -25,6 +25,7 @@ enum update_mode {
 	WORKING_DIRECTORY = (1 << 1),
 	INDEX = (1 << 2),
 	SPARSE = (1 << 3),
+	SKIP_WORKTREE_DIR = (1 << 4),
 };
 
 #define DUP_BASENAME 1
@@ -123,6 +124,36 @@ static int index_range_of_same_dir(const char *src, int length,
 	return last - first;
 }
 
+/*
+ * Check if an out-of-cone directory should be in the index. Imagine the case
+ * where all the files under a directory are marked with the 'CE_SKIP_WORKTREE'
+ * bit and thus the directory is sparsified.
+ *
+ * Return 0 if such a directory exists (i.e. if any of its contained files were
+ * not marked with CE_SKIP_WORKTREE, the directory would be present in the
+ * working tree). Return 1 otherwise.
+ */
+static int check_dir_in_index(const char *name)
+{
+	const char *with_slash = add_slash(name);
+	int length = strlen(with_slash);
+
+	int pos = cache_name_pos(with_slash, length);
+	const struct cache_entry *ce;
+
+	if (pos < 0) {
+		pos = -pos - 1;
+		if (pos >= the_index.cache_nr)
+			return 1;
+		ce = active_cache[pos];
+		if (strncmp(with_slash, ce->name, length))
+			return 1;
+		if (ce_skip_worktree(ce))
+			return 0;
+	}
+	return 1;
+}
+
 int cmd_mv(int argc, const char **argv, const char *prefix)
 {
 	int i, flags, gitmodules_modified = 0;
@@ -184,7 +215,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	/* Checking */
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
-		int length, src_is_dir;
+		int length;
 		const char *bad = NULL;
 		int skip_sparse = 0;
 
@@ -198,12 +229,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 			pos = cache_name_pos(src, length);
 			if (pos < 0) {
+				const char *src_w_slash = add_slash(src);
+				if (!path_in_sparse_checkout(src_w_slash, &the_index) &&
+				    !check_dir_in_index(src)) {
+					modes[i] |= SKIP_WORKTREE_DIR;
+					goto dir_check;
+				}
 				/* only error if existence is expected. */
 				if (!(modes[i] & SPARSE))
 					bad = _("bad source");
 				goto act_on_entry;
 			}
-
 			ce = active_cache[pos];
 			if (!ce_skip_worktree(ce)) {
 				bad = _("bad source");
@@ -230,12 +266,14 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			bad = _("can not move directory into itself");
 			goto act_on_entry;
 		}
-		if ((src_is_dir = S_ISDIR(st.st_mode))
+		if (S_ISDIR(st.st_mode)
 		    && lstat(dst, &st) == 0) {
 			bad = _("cannot move directory over file");
 			goto act_on_entry;
 		}
-		if (src_is_dir) {
+
+dir_check:
+		if (S_ISDIR(st.st_mode)) {
 			int j, dst_len, n;
 			int first = cache_name_pos(src, length), last;
 
@@ -369,7 +407,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
 			continue;
-		if (!(mode & (INDEX | SPARSE)) &&
+		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
 		    rename(src, dst) < 0) {
 			if (ignore_errors)
 				continue;
@@ -384,7 +422,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 							      1);
 		}
 
-		if (mode & (WORKING_DIRECTORY))
+		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
 			continue;
 
 		pos = cache_name_pos(src, strlen(src));
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 6d2fb4f8d2..71fe29690f 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -219,7 +219,7 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
+test_expect_success 'refuse to move out-of-cone directory without --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
@@ -230,7 +230,7 @@ test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'can move out-of-cone directory with --sparse' '
+test_expect_success 'can move out-of-cone directory with --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* Re: [WIP v3 0/7] mv: fix out-of-cone file/directory move logic
  2022-06-21 23:30   ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Victoria Dye
@ 2022-06-23 15:06     ` Derrick Stolee
  2022-06-23 16:19       ` Junio C Hamano
  0 siblings, 1 reply; 95+ messages in thread
From: Derrick Stolee @ 2022-06-23 15:06 UTC (permalink / raw)
  To: Victoria Dye, Shaoxuan Yuan; +Cc: git, gitster, newren

On 6/21/2022 7:30 PM, Victoria Dye wrote:
> Shaoxuan Yuan wrote:
>> But I think it worth discuss if we should implement in-cone to 
>> out-of-cone move, since it will be nice (naturally) to have it working.
>>
>> However, I noticed this from the mv man page:
>>
>> "In the second form, the last argument has to be an existing directory; 
>> the given sources will be moved into this directory."
>>
>> I think trying to move out-of-cone, the last argument has to be an non-existent
>> directory? I'm a bit confused: should we update some of mv basic logic to 
>> accomplish this?
>>
> 
> I suspect this requirement is related to the POSIX 'mv' [1] (and
> corresponding 'rename()', used in 'git mv'), which also requires that the
> destination directory exists. I personally don't think this requirement
> needs to apply to 'git mv' at all, but note that changing the behavior would
> require first creating the necessary directories before calling 'rename()'. 
> 
> As a more conservative solution, you could do the parent directory creation
> *only* in the case of moving to a sparse contents-only directory (using
> something like the 'check_dir_in_index()' function you introduced to
> identify).
> 
> I'm also interested in hearing what others have to say, especially regarding
> historical context/use cases of 'git mv'.
> 
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html

I wanted to reply here to maybe get more attention on this point.

My personal opinion is that `git mv` should move to the location requested,
even if it requires adding parent directories. Changing that behavior might
need to come as its own topic, before doing the in-cone-to-out-of-cone work.
Knowing if this behavior can change (or must stay the same) informs how that
sparse case will work.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH v4 2/7] mv: update sparsity after moving from out-of-cone to in-cone
  2022-06-23 11:41   ` [PATCH v4 2/7] mv: update sparsity after moving from out-of-cone to in-cone Shaoxuan Yuan
@ 2022-06-23 15:08     ` Derrick Stolee
  2022-06-24  8:04       ` Shaoxuan Yuan
  0 siblings, 1 reply; 95+ messages in thread
From: Derrick Stolee @ 2022-06-23 15:08 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, gitster, vdye

On 6/23/2022 7:41 AM, Shaoxuan Yuan wrote:
> Originally, "git mv" a sparse file from out-of-cone to
> in-cone does not update the moved file's sparsity (remove its
> SKIP_WORKTREE bit). And the corresponding cache entry is, unexpectedly,
> not checked out in the working tree.
> 
> Update the behavior so that:
> 1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
>    corresponding cache entry.
> 2. The moved cache entry is checked out in the working tree to reflect
>    the updated sparsity.

Since this is a behavior change, can we test it? It would be good
to verify that the new path exists in the worktree after 'git mv'
succeeds.
 
Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH v4 6/7] mv: use flags mode for update_mode
  2022-06-23 11:41   ` [PATCH v4 6/7] mv: use flags mode for update_mode Shaoxuan Yuan
@ 2022-06-23 15:10     ` Derrick Stolee
  0 siblings, 0 replies; 95+ messages in thread
From: Derrick Stolee @ 2022-06-23 15:10 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, gitster, vdye

On 6/23/2022 7:41 AM, Shaoxuan Yuan wrote:
> As suggested by Derrick [1],
> move the in-line definition of "enum update_mode" to the top
> of the file and make it use "flags" mode (each state is a different
> bit in the word).

nit: strange line-wrapping here.

> Change the flag assignments from '=' (single assignment) to '|='
> (additive). Also change flag evaluation from '==' to '&', etc.

Code looks good.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH v4 7/7] mv: add check_dir_in_index() and solve general dir check issue
  2022-06-23 11:41   ` [PATCH v4 7/7] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
@ 2022-06-23 15:14     ` Derrick Stolee
  2022-06-24  7:57       ` Shaoxuan Yuan
  0 siblings, 1 reply; 95+ messages in thread
From: Derrick Stolee @ 2022-06-23 15:14 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, gitster, vdye

On 6/23/2022 7:41 AM, Shaoxuan Yuan wrote:
> Originally, moving a <source> directory which is not on-disk due
> to its existence outside of sparse-checkout cone, "git mv" command
> errors out with "bad source".
> 
> Add a helper check_dir_in_index() function to see if a directory
> name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
> such directories.
> 
> Change the checking logic, so that such <source> directory makes
> "git mv" command warns with "advise_on_updating_sparse_paths()"
> instead of "bad source"; also user now can supply a "--sparse" flag so
> this operation can be carried out successfully.
> 
> Helped-by: Victoria Dye <vdye@github.com>
> Helped-by: Derrick Stolee <derrickstolee@github.com>
> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
> ---
>  builtin/mv.c                  | 50 ++++++++++++++++++++++++++++++-----
>  t/t7002-mv-sparse-checkout.sh |  4 +--
>  2 files changed, 46 insertions(+), 8 deletions(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index aa29da4337..b5d0d8ef4f 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -25,6 +25,7 @@ enum update_mode {
>  	WORKING_DIRECTORY = (1 << 1),
>  	INDEX = (1 << 2),
>  	SPARSE = (1 << 3),
> +	SKIP_WORKTREE_DIR = (1 << 4),
>  };
>  
>  #define DUP_BASENAME 1
> @@ -123,6 +124,36 @@ static int index_range_of_same_dir(const char *src, int length,
>  	return last - first;
>  }
>  
> +/*
> + * Check if an out-of-cone directory should be in the index. Imagine this case
> + * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
> + * and thus the directory is sparsified.
> + *
> + * Return 0 if such directory exist (i.e. with any of its contained files not
> + * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
> + * Return 1 otherwise.
> + */

This description and the implementation seem like they will work
even if the path exists as a sparse directory in a sparse index.

It would be good to consider testing this kind of move for a
directory on the sparse boundary (where it would be a sparse
directory in a sparse index) _and_ if it is deeper than the
boundary (so the sparse index would expand in the cache_name_pos()
method). These tests can be written now for correctness, but later
the first case can be updated to use the 'ensure_not_expanded'
helper in t1092.

> +static int check_dir_in_index(const char *name)

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH v4 0/7] mv: fix out-of-cone file/directory move logic
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
                     ` (6 preceding siblings ...)
  2022-06-23 11:41   ` [PATCH v4 7/7] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
@ 2022-06-23 15:16   ` Derrick Stolee
  2022-06-23 18:05     ` Junio C Hamano
  7 siblings, 1 reply; 95+ messages in thread
From: Derrick Stolee @ 2022-06-23 15:16 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, gitster, vdye

On 6/23/2022 7:41 AM, Shaoxuan Yuan wrote:
> ## Changes since WIP v3 ##

It's good to keep the main cover letter body around, even as the version
updates are included. Sometimes reviewers come to a topic late and want
to see the latest version be completely self-contained.
 
After reading all the patches with fresh eyes, I think the code looks
very good. I have a nitpick in a commit message and some recommendations
for additional tests, but otherwise I'm pretty happy with this version.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v3 0/7] mv: fix out-of-cone file/directory move logic
  2022-06-23 15:06     ` Derrick Stolee
@ 2022-06-23 16:19       ` Junio C Hamano
  2022-06-24  8:26         ` Shaoxuan Yuan
  0 siblings, 1 reply; 95+ messages in thread
From: Junio C Hamano @ 2022-06-23 16:19 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Victoria Dye, Shaoxuan Yuan, git, newren

Derrick Stolee <derrickstolee@github.com> writes:

> On 6/21/2022 7:30 PM, Victoria Dye wrote:
>> Shaoxuan Yuan wrote:
>>> But I think it worth discuss if we should implement in-cone to 
>>> out-of-cone move, since it will be nice (naturally) to have it working.
>>>
>>> However, I noticed this from the mv man page:
>>>
>>> "In the second form, the last argument has to be an existing directory; 
>>> the given sources will be moved into this directory."
>>>
>>> I think trying to move out-of-cone, the last argument has to be an non-existent
>>> directory? I'm a bit confused: should we update some of mv basic logic to 
>>> accomplish this?
>>>
>> 
>> I suspect this requirement is related to the POSIX 'mv' [1] (and
>> corresponding 'rename()', used in 'git mv'), which also requires that the
>> destination directory exists. I personally don't think this requirement
>> needs to apply to 'git mv' at all, but note that changing the behavior would
>> require first creating the necessary directories before calling 'rename()'. 
>> 
>> As a more conservative solution, you could do the parent directory creation
>> *only* in the case of moving to a sparse contents-only directory (using
>> something like the 'check_dir_in_index()' function you introduced to
>> identify).
>> 
>> I'm also interested in hearing what others have to say, especially regarding
>> historical context/use cases of 'git mv'.
>> 
>> [1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html
>
> I wanted to reply here to maybe get more attention on this point.
>
> My personal opinion is that `git mv` should move to the location requested,
> even if it requires adding parent directories. Changing that behavior might
> need to come as its own topic, before doing the in-cone-to-out-of-cone work.
> Knowing if this behavior can change (or must stay the same) informs how that
> sparse case will work.

When a particular checkout excludes a directory, say, Documentation/,
from its cone of interest, we may not have that directory in the
working tree.  In such a scenario, if you did 

    $ git mv new.txt Documentation/technical/

the index may not even know if "Documentation/" (which most likely
is represented as a sparse "tree entry in the index") has the
"technical" subdirectory in it, so it may have to expand it
on-demand.  I do not have an objection against making it easier for
users to do this.

As part of its implementation, you may have to do an equivalent of
"mkdir -p Documentation/technical/" before you can even materialize
the "new.txt" file in the directory.  I do not think that breaks the
parallel to POSIX "mv" in that case, as Documentation/technical/ is
*NOT* really a destination directory that does not exist.
Conceptually, the directory (and all the directories in the full
tree) exists---it is just that the sparse-checkout is hiding it from
view.

A corollary to the above is what should happen when you did

    $ git mv new.txt Documentation/no-such-directory/

i.e. you try to do a move that would fail even if you weren't using
the sparse-checkout feature.  I think that *SHOULD* fail, if we
wanted to be parallel to what POSIX "mv" does.

Thanks.

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH v4 0/7] mv: fix out-of-cone file/directory move logic
  2022-06-23 15:16   ` [PATCH v4 0/7] mv: fix out-of-cone file/directory move logic Derrick Stolee
@ 2022-06-23 18:05     ` Junio C Hamano
  0 siblings, 0 replies; 95+ messages in thread
From: Junio C Hamano @ 2022-06-23 18:05 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Shaoxuan Yuan, git, vdye

Derrick Stolee <derrickstolee@github.com> writes:

> On 6/23/2022 7:41 AM, Shaoxuan Yuan wrote:
>> ## Changes since WIP v3 ##
>
> It's good to keep the main cover letter body around, even as the version
> updates are included. Sometimes reviewers come to a topic late and want
> to see the latest version be completely self-contained.

Yup.  Thanks for stressing it.

> After reading all the patches with fresh eyes, I think the code looks
> very good. I have a nitpick in a commit message and some recommendations
> for additional tests, but otherwise I'm pretty happy with this version.

Thanks for ushering the topic forward.



^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH v4 7/7] mv: add check_dir_in_index() and solve general dir check issue
  2022-06-23 15:14     ` Derrick Stolee
@ 2022-06-24  7:57       ` Shaoxuan Yuan
  2022-06-27 13:59         ` Derrick Stolee
  0 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-24  7:57 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, gitster, vdye

On 6/23/2022 11:14 PM, Derrick Stolee wrote:
 > On 6/23/2022 7:41 AM, Shaoxuan Yuan wrote:
 >> Originally, moving a <source> directory which is not on-disk due
 >> to its existence outside of sparse-checkout cone, "git mv" command
 >> errors out with "bad source".
 >>
 >> Add a helper check_dir_in_index() function to see if a directory
 >> name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
 >> such directories.
 >>
 >> Change the checking logic, so that such <source> directory makes
 >> "git mv" command warns with "advise_on_updating_sparse_paths()"
 >> instead of "bad source"; also user now can supply a "--sparse" flag so
 >> this operation can be carried out successfully.
 >>
 >> Helped-by: Victoria Dye <vdye@github.com>
 >> Helped-by: Derrick Stolee <derrickstolee@github.com>
 >> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
 >> ---
 >>  builtin/mv.c                  | 50 ++++++++++++++++++++++++++++++-----
 >>  t/t7002-mv-sparse-checkout.sh |  4 +--
 >>  2 files changed, 46 insertions(+), 8 deletions(-)
 >>
 >> diff --git a/builtin/mv.c b/builtin/mv.c
 >> index aa29da4337..b5d0d8ef4f 100644
 >> --- a/builtin/mv.c
 >> +++ b/builtin/mv.c
 >> @@ -25,6 +25,7 @@ enum update_mode {
 >>      WORKING_DIRECTORY = (1 << 1),
 >>      INDEX = (1 << 2),
 >>      SPARSE = (1 << 3),
 >> +    SKIP_WORKTREE_DIR = (1 << 4),
 >>  };
 >>
 >>  #define DUP_BASENAME 1
 >> @@ -123,6 +124,36 @@ static int index_range_of_same_dir(const char *src, int length,
 >>      return last - first;
 >>  }
 >>
 >> +/*
 >> + * Check if an out-of-cone directory should be in the index. Imagine this case
 >> + * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
 >> + * and thus the directory is sparsified.
 >> + *
 >> + * Return 0 if such directory exist (i.e. with any of its contained files not
 >> + * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
 >> + * Return 1 otherwise.
 >> + */
 >
 > This description and the implementation seems like it will work
 > even if the path exists as a sparse directory in a sparse index.
 >
 > It would be good to consider testing this kind of move for a
 > directory on the sparse boundary (where it would be a sparse
 > directory in a sparse index) _and_ if it is deeper than the
 > boundary (so the sparse index would expand in the cache_name_pos()
 > method). These tests can be written now for correctness, but later
 > the first case can be updated to use the 'ensure_not_expanded'
 > helper in t1092.

I'm a bit confused here. Shouldn't we turn on the sparse-index
feature for 'mv' before adding sparse-index related tests? Since this
series does not go into sparse-index, I'm not sure how the tests can
pass. Perhaps we can test this in the future sparse-index
integration series, no?

Thanks & Regards,
Shaoxuan


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH v4 2/7] mv: update sparsity after moving from out-of-cone to in-cone
  2022-06-23 15:08     ` Derrick Stolee
@ 2022-06-24  8:04       ` Shaoxuan Yuan
  2022-06-27 13:55         ` Derrick Stolee
  0 siblings, 1 reply; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-24  8:04 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, gitster, vdye


On 6/23/2022 11:08 PM, Derrick Stolee wrote:
 > On 6/23/2022 7:41 AM, Shaoxuan Yuan wrote:
 >> Originally, "git mv" a sparse file from out-of-cone to
 >> in-cone does not update the moved file's sparsity (remove its
 >> SKIP_WORKTREE bit). And the corresponding cache entry is, unexpectedly,
 >> not checked out in the working tree.
 >>
 >> Update the behavior so that:
 >> 1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
 >>    corresponding cache entry.
 >> 2. The moved cache entry is checked out in the working tree to reflect
 >>    the updated sparsity.
 >
 > Since this is a behavior change, can we test it? It would be good
 > to verify that the new path exists in the worktree after 'git mv'
 > succeeds.

I don't think we can effectively test this change on its own.
It prepares the correct behavior for the next few commits, so
I'd say it is tested along with them (i.e. the ones that move
"sparse" files/directories)?

Thanks & Regards,
Shaoxuan


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [WIP v3 0/7] mv: fix out-of-cone file/directory move logic
  2022-06-23 16:19       ` Junio C Hamano
@ 2022-06-24  8:26         ` Shaoxuan Yuan
  0 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-24  8:26 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee; +Cc: Victoria Dye, git, newren


On 6/24/2022 12:19 AM, Junio C Hamano wrote:
 > Derrick Stolee <derrickstolee@github.com> writes:
 >
 >> On 6/21/2022 7:30 PM, Victoria Dye wrote:
 >>> Shaoxuan Yuan wrote:
 >>>> But I think it worth discuss if we should implement in-cone to
 >>>> out-of-cone move, since it will be nice (naturally) to have it working.
 >>>>
 >>>> However, I noticed this from the mv man page:
 >>>>
 >>>> "In the second form, the last argument has to be an existing directory;
 >>>> the given sources will be moved into this directory."
 >>>>
 >>>> I think trying to move out-of-cone, the last argument has to be an non-existent
 >>>> directory? I'm a bit confused: should we update some of mv basic logic to
 >>>> accomplish this?
 >>>>
 >>>
 >>> I suspect this requirement is related to the POSIX 'mv' [1] (and
 >>> corresponding 'rename()', used in 'git mv'), which also requires that the
 >>> destination directory exists. I personally don't think this requirement
 >>> needs to apply to 'git mv' at all, but note that changing the behavior would
 >>> require first creating the necessary directories before calling 'rename()'.
 >>>
 >>> As a more conservative solution, you could do the parent directory creation
 >>> *only* in the case of moving to a sparse contents-only directory (using
 >>> something like the 'check_dir_in_index()' function you introduced to
 >>> identify).
 >>>
 >>> I'm also interested in hearing what others have to say, especially regarding
 >>> historical context/use cases of 'git mv'.
 >>>
 >>> [1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html
 >>
 >> I wanted to reply here to maybe get more attention on this point.
 >>
 >> My personal opinion is that `git mv` should move to the location requested,
 >> even if it requires adding parent directories. Changing that behavior might
 >> need to come as its own topic, before doing the in-cone-to-out-of-cone work.
 >> Knowing if this behavior can change (or must stay the same) informs how that
 >> sparse case will work.
 >
 > When a particular checkout excludes directory, say, Documentation/,
 > from its cone of interest, we may not have that directory in the
 > working tree.  In such a scenario, if you did
 >
 >     $ git mv new.txt Documentation/technical/
 >
 > the index may not even know if "Documentation/" (which most likely
 > is represented as a sparse "tree entry in the index") has the
 > "technical" subdirectory in it, so it may have to expand it
 > on-demand.  I do not have an objection against making it easier for
 > users to do this.
 >
 > As part of its implementation, you may have to do an equivalent of
 > "mkdir -p Documentation/technical/" before you can even materialize
 > the "new.txt" file in the directory.  I do not think that breaks the
 > parallel to POSIX "mv" in that case, as Documentation/technical/ is
 > *NOT* really a destination directory that does not exist.
 > Conceptually, the directory (and all the directories in the full
 > tree) exists---it is just the sparse-checkout is hiding it from the
 > view.
 >
 > A corollary to the above is what should happen when you did
 >
 >     $ git mv new.txt Documentation/no-such-directory/
 >
 > i.e. you try to do a move that would fail even if you weren't using
 > the sparse-checkout feature.  I think that *SHOULD* fail, if we
 > wanted to be parallel to what POSIX "mv" does.
 >
 > Thanks.

Agree.

Thanks & Regards,
Shaoxuan


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH v4 2/7] mv: update sparsity after moving from out-of-cone to in-cone
  2022-06-24  8:04       ` Shaoxuan Yuan
@ 2022-06-27 13:55         ` Derrick Stolee
  0 siblings, 0 replies; 95+ messages in thread
From: Derrick Stolee @ 2022-06-27 13:55 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, gitster, vdye

On 6/24/2022 4:04 AM, Shaoxuan Yuan wrote:
> 
> On 6/23/2022 11:08 PM, Derrick Stolee wrote:
>> On 6/23/2022 7:41 AM, Shaoxuan Yuan wrote:
>>> Originally, "git mv" a sparse file from out-of-cone to
>>> in-cone does not update the moved file's sparsity (remove its
>>> SKIP_WORKTREE bit). And the corresponding cache entry is, unexpectedly,
>>> not checked out in the working tree.
>>>
>>> Update the behavior so that:
>>> 1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
>>>    corresponding cache entry.
>>> 2. The moved cache entry is checked out in the working tree to reflect
>>>    the updated sparsity.
>>
>> Since this is a behavior change, can we test it? It would be good
>> to verify that the new path exists in the worktree after 'git mv'
>> succeeds.
> 
> I don't think we can effectively test this based on the change per se.
> This change is preparing a correct behavior for the next few
> commits, so I'll say it's tested along with the next few commits
> (i.e. move "sparse" file/directory ones)?

Ah, right. There are other reasons why moving from out-of-cone to
in-cone is blocked at this point in time.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH v4 7/7] mv: add check_dir_in_index() and solve general dir check issue
  2022-06-24  7:57       ` Shaoxuan Yuan
@ 2022-06-27 13:59         ` Derrick Stolee
  0 siblings, 0 replies; 95+ messages in thread
From: Derrick Stolee @ 2022-06-27 13:59 UTC (permalink / raw)
  To: Shaoxuan Yuan; +Cc: git, gitster, vdye

On 6/24/2022 3:57 AM, Shaoxuan Yuan wrote:
> On 6/23/2022 11:14 PM, Derrick Stolee wrote:
>> On 6/23/2022 7:41 AM, Shaoxuan Yuan wrote:
>>> +/*
>>> + * Check if an out-of-cone directory should be in the index. Imagine this case
>>> + * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
>>> + * and thus the directory is sparsified.
>>> + *
>>> + * Return 0 if such directory exist (i.e. with any of its contained files not
>>> + * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
>>> + * Return 1 otherwise.
>>> + */
>>
>> This description and the implementation seems like it will work
>> even if the path exists as a sparse directory in a sparse index.
>>
>> It would be good to consider testing this kind of move for a
>> directory on the sparse boundary (where it would be a sparse
>> directory in a sparse index) _and_ if it is deeper than the
>> boundary (so the sparse index would expand in the cache_name_pos()
>> method). These tests can be written now for correctness, but later
>> the first case can be updated to use the 'ensure_not_expanded'
>> helper in t1092.
> 
> I'm a bit confused here. Shouldn't we turn on the sparse-index
> feature for 'mv' before adding sparse-index related tests? Since this
> series does not go into sparse-index, I'm not sure how the tests can
> pass. Perhaps we can test about this in the future sparse-index
> integration series, no?

I mean that you are making a change right now that might lead to
different behavior _when you enable the sparse index later_. Since
we are looking at this behavior now, it might be interesting to add
the extra test coverage now while we are thinking about this data
shape.

We can add tests to t1092-sparse-checkout-compatibility.sh even
though the sparse index is expanded on every 'git mv' instance, but
it can help when you eventually update 'git mv' to not expand on a
sparse index.

The bit about ensure_not_expanded can't be added until you integrate
with the sparse index, but it is easier to add that when you already
have tests that check these boundary cases.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH v5 0/8] mv: fix out-of-cone file/directory move logic
  2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
                   ` (9 preceding siblings ...)
  2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
@ 2022-06-30  2:37 ` Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 1/8] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
                     ` (8 more replies)
  10 siblings, 9 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-30  2:37 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, vdye, gitster, Shaoxuan Yuan

## Changes since PATCH v4 ##

1. Fix style-nits.

2. Add t1092 tests (2/8) for "mv: add check_dir_in_index() and solve 
   general dir check issue" (8/8).

## Changes since WIP v3 ##

1. Fix style-nits.

2. Move "mv: update sparsity after moving from out-of-cone to in-cone" to the
   (2/7) position (it was at the (7/7) position in WIP v3), so that we can
   drop "git sparse-checkout reapply" from the tests, as suggested by
   Victoria [1]. We need to check out the moved cache_entry for all
   out-of-cone to in-cone moves, so this commit works better when it comes
   first.

3. Fix the commit message of "mv: use flags mode for update_mode", as suggested
   here [2].

4. Add "Helped-by" and "Suggested-by" trailers.

5. In "mv: add check_dir_in_index() and solve general dir check issue", change
   'check_dir_in_index()' to no longer accept 'namelen' as an argument. The
   original "namelen + 1" logic can be erroneous, for example when 'name'
   already has a trailing slash, so just use strlen() to save the trouble.

[1] https://lore.kernel.org/git/adb795ba-56ce-8441-0c38-a3e6b0a6e861@github.com/
[2] https://lore.kernel.org/git/01b39c63-5652-4293-0424-ff99b6f9f7d2@github.com/
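
To make item 5 above concrete, here is a small sketch of why "namelen + 1"
and strlen() on the slash-terminated name can disagree. It is a toy model
only: toy_add_slash(), assumed_length() and measured_length() are made-up
names for illustration, and toy_add_slash() merely approximates what a
helper that appends a missing '/' would do.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Toy helper: return "path" itself when it is empty or already ends in
 * '/', otherwise a freshly allocated copy with '/' appended.
 */
static const char *toy_add_slash(const char *path)
{
	size_t len = strlen(path);
	char *with_slash;

	if (!len || path[len - 1] == '/')
		return path;
	with_slash = malloc(len + 2);
	if (!with_slash)
		abort();
	memcpy(with_slash, path, len);
	with_slash[len] = '/';
	with_slash[len + 1] = '\0';
	return with_slash;
}

/* Length a caller gets when it assumes a slash was always appended. */
static size_t assumed_length(const char *name)
{
	return strlen(name) + 1;
}

/* Length obtained by measuring the slash-terminated name directly. */
static size_t measured_length(const char *name)
{
	const char *with_slash = toy_add_slash(name);
	size_t len = strlen(with_slash);

	/* Only free the copy, never the caller's string. */
	if (with_slash != name)
		free((char *)with_slash);
	return len;
}
```

For "folder1" both functions give 8, but for "folder1/" the assumed
length is 9 while the measured length is 8, so a lookup using the assumed
length would be off by one.
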

## Changes since WIP v2 ##

1. Write helper functions for t7002 to reuse some code.

2. Refactor/decouple the if/else-if checking chain.

3. Separate out the 'update_mode' refactor into a single commit.

4. Stop using update_sparsity() and instead update the SKIP_WORKTREE
   bit for each cache_entry and check it out to the working tree.

## Changes since WIP v1 ##

1. Move t7002 tests to the front and turn corresponding tests to 
   test_expect_success along with corresponding commits.

2. Add two tests to t7002.

3. Update check_dir_in_index() and add corresponding documentation.

4. Turn update_mode into enum flags.

5. Use update_sparsity() to replace advise*() function after touching
   sparse contents (this change is INCOMPLETE, NEED FIX).

6. Fix some format issues.
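
As a sketch of what item 4 above ("Turn update_mode into enum flags")
buys us: once each state is its own bit, several states can be recorded
for one entry with '|=' and tested together with '&'. The enum values
below are the ones shown in the v4 diff; skips_rename() and mark() are
hypothetical helpers for illustration, not functions in builtin/mv.c.

```c
/* Bit values as they appear in the v4 patch. */
enum update_mode {
	WORKING_DIRECTORY = (1 << 1),
	INDEX = (1 << 2),
	SPARSE = (1 << 3),
	SKIP_WORKTREE_DIR = (1 << 4),
};

/* Hypothetical helper: additive assignment, the '|=' style. */
static unsigned int mark(unsigned int mode, enum update_mode flag)
{
	return mode | flag;
}

/*
 * Hypothetical helper: nonzero when any of the bits that make the
 * on-disk rename() unnecessary is set, the '&' style of evaluation.
 */
static int skips_rename(unsigned int mode)
{
	return !!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR));
}
```

For example, skips_rename(WORKING_DIRECTORY) is 0, while
skips_rename(mark(WORKING_DIRECTORY, SKIP_WORKTREE_DIR)) is 1. A plain
'==' comparison against SPARSE would stop matching as soon as a second
bit is set alongside it, which is why the evaluation moves to '&'.
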

## Starting WIP v1 ##

Before integrating 'mv' with sparse-index, I still find some possibly buggy
UX when 'mv' is interacting with 'sparse-checkout'. 

So I kept sparse-index off in order to sort things out without a sparse index.
We can proceed to integrate with sparse-index once these changes are solid.

Note that this patch is tentative and still has known glitches, but it
illustrates the general approach that I intend to take to harmonize 'mv'
with 'sparse-checkout'.

## Pre WIP v1 discussion ##

RFC patch about integrating sparse index with 'mv' [3]

[3] https://lore.kernel.org/git/20220315100145.214054-1-shaoxuan.yuan02@gmail.com/

Shaoxuan Yuan (8):
  t7002: add tests for moving out-of-cone file/directory
  t1092: mv directory from out-of-cone to in-cone
  mv: update sparsity after moving from out-of-cone to in-cone
  mv: decouple if/else-if checks using goto
  mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  mv: check if <destination> exists in index to handle overwriting
  mv: use flags mode for update_mode
  mv: add check_dir_in_index() and solve general dir check issue

 builtin/mv.c                             | 239 +++++++++++++++++------
 t/t1092-sparse-checkout-compatibility.sh |  25 +++
 t/t7002-mv-sparse-checkout.sh            |  84 ++++++++
 3 files changed, 284 insertions(+), 64 deletions(-)

Range-diff against v4:
1:  6ec0fee96c = 1:  c2715ffd84 t7002: add tests for moving out-of-cone file/directory
-:  ---------- > 2:  7069755abd t1092: mv directory from out-of-cone to in-cone
2:  0887fbbc82 ! 3:  33188b9f1c mv: update sparsity after moving from out-of-cone to in-cone
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
     +			active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
     +
     +			if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
    -+				die(_("cannot checkout %s"), ce->name);
    ++				die(_("cannot checkout %s"), active_cache[dst_pos]->name);
     +		}
      	}
      
3:  e79785487e = 4:  6100d6ca40 mv: decouple if/else-if checks using goto
4:  bdbfd90843 = 5:  c34e2871d1 mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
5:  807de09f19 = 6:  e4a0ba40df mv: check if <destination> exists in index to handle overwriting
6:  af84df7dd0 ! 7:  7bc5cddcf4 mv: use flags mode for update_mode
    @@ Metadata
      ## Commit message ##
         mv: use flags mode for update_mode
     
    -    As suggested by Derrick [1],
    -    move the in-line definition of "enum update_mode" to the top
    -    of the file and make it use "flags" mode (each state is a different
    -    bit in the word).
    +    As suggested by Derrick [1], move the in-line definition of
    +    "enum update_mode" to the top of the file and make it use "flags"
    +    mode (each state is a different bit in the word).
     
         Change the flag assignments from '=' (single assignment) to '|='
         (additive). Also change flag evaluation from '==' to '&', etc.
7:  d9fd1c452c ! 8:  9f89faa557 mv: add check_dir_in_index() and solve general dir check issue
    @@ builtin/mv.c: int cmd_mv(int argc, const char **argv, const char *prefix)
      
      		pos = cache_name_pos(src, strlen(src));
     
    + ## t/t1092-sparse-checkout-compatibility.sh ##
    +@@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'checkout behaves oddly with df-conflict-2' '
    + 	test_cmp full-checkout-err sparse-index-err
    + '
    + 
    +-test_expect_failure 'mv directory from out-of-cone to in-cone' '
    ++test_expect_success 'mv directory from out-of-cone to in-cone' '
    + 	init_repos &&
    + 
    + 	# <source> as a sparse directory (or SKIP_WORKTREE_DIR without enabling
    +
      ## t/t7002-mv-sparse-checkout.sh ##
     @@ t/t7002-mv-sparse-checkout.sh: test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
      	test_cmp expect stderr

base-commit: e4a4b31577c7419497ac30cebe30d755b97752c5
-- 
2.35.1


^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH v5 1/8] t7002: add tests for moving out-of-cone file/directory
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
@ 2022-06-30  2:37   ` Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 2/8] t1092: mv directory from out-of-cone to in-cone Shaoxuan Yuan
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-30  2:37 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, vdye, gitster, Shaoxuan Yuan

We do not have sufficient coverage of moving files outside
of a sparse-checkout cone. Create new tests covering this
behavior, keeping in mind that the user can include --sparse
(or not), move a file or directory, and the destination can
already exist in the index (in which case the user can use
--force to overwrite the existing entry).

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 t/t7002-mv-sparse-checkout.sh | 84 +++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index f0f7cbfcdb..023e657c9e 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -4,6 +4,18 @@ test_description='git mv in sparse working trees'
 
 . ./test-lib.sh
 
+setup_sparse_checkout () {
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	git add folder1 &&
+	git sparse-checkout set --cone sub
+}
+
+cleanup_sparse_checkout () {
+	git sparse-checkout disable &&
+	git reset --hard
+}
+
 test_expect_success 'setup' "
 	mkdir -p sub/dir sub/dir2 &&
 	touch a b c sub/d sub/dir/e sub/dir2/e &&
@@ -196,6 +208,7 @@ test_expect_success 'can move files to non-sparse dir' '
 '
 
 test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
+	test_when_finished "cleanup_sparse_checkout" &&
 	git reset --hard &&
 	git sparse-checkout init --no-cone &&
 	git sparse-checkout set a !/x y/ !x/y/z &&
@@ -206,4 +219,75 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
 	test_cmp expect stderr
 '
 
+test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	test_must_fail git mv folder1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'can move out-of-cone directory with --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	git mv --sparse folder1 sub 2>stderr &&
+	test_must_be_empty stderr &&
+
+	test_path_is_dir sub/folder1 &&
+	test_path_is_file sub/folder1/file1
+'
+
+test_expect_failure 'refuse to move out-of-cone file without --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	test_must_fail git mv folder1/file1 sub 2>stderr &&
+	cat sparse_error_header >expect &&
+	echo folder1/file1 >>expect &&
+	cat sparse_hint >>expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'can move out-of-cone file with --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	setup_sparse_checkout &&
+
+	git mv --sparse folder1/file1 sub 2>stderr &&
+	test_must_be_empty stderr &&
+
+	test_path_is_file sub/file1
+'
+
+test_expect_failure 'refuse to move sparse file to existing destination' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	touch sub/file1 &&
+	git add folder1 sub/file1 &&
+	git sparse-checkout set --cone sub &&
+
+	test_must_fail git mv --sparse folder1/file1 sub 2>stderr &&
+	echo "fatal: destination exists, source=folder1/file1, destination=sub/file1" >expect &&
+	test_cmp expect stderr
+'
+
+test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
+	test_when_finished "cleanup_sparse_checkout" &&
+	mkdir folder1 &&
+	touch folder1/file1 &&
+	touch sub/file1 &&
+	echo "overwrite" >folder1/file1 &&
+	git add folder1 sub/file1 &&
+	git sparse-checkout set --cone sub &&
+
+	git mv --sparse --force folder1/file1 sub 2>stderr &&
+	test_must_be_empty stderr &&
+	echo "overwrite" >expect &&
+	test_cmp expect sub/file1
+'
+
 test_done
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v5 2/8] t1092: mv directory from out-of-cone to in-cone
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 1/8] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
@ 2022-06-30  2:37   ` Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 3/8] mv: update sparsity after moving " Shaoxuan Yuan
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-30  2:37 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, vdye, gitster, Shaoxuan Yuan

Add a test for "mv: add check_dir_in_index() and solve general dir check
issue" in this series.

This change tests the following:

1. mv <source> as a directory on the sparse index boundary (where it
   would be a sparse directory in a sparse index).
2. mv <source> as a directory which is deeper than the boundary (so
   the sparse index would expand in the cache_name_pos() method).

These tests can be written now for correctness, but later the first case
can be updated to use the 'ensure_not_expanded' helper in t1092.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 25 ++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index f9f8c988bb..5eef799e25 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1828,4 +1828,29 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' '
 	test_cmp full-checkout-err sparse-index-err
 '
 
+test_expect_failure 'mv directory from out-of-cone to in-cone' '
+	init_repos &&
+
+	# <source> as a sparse directory (or SKIP_WORKTREE_DIR without enabling
+	# sparse index).
+	test_all_match git mv --sparse folder1 deep &&
+	test_all_match git status --porcelain=v2 &&
+	test_sparse_match git ls-files -t &&
+	git -C sparse-checkout ls-files -t >actual &&
+	grep -e "H deep/folder1/0/0/0" actual &&
+	grep -e "H deep/folder1/0/1" actual &&
+	grep -e "H deep/folder1/a" actual &&
+
+	test_all_match git reset --hard &&
+
+	# <source> as a directory deeper than sparse index boundary (where
+	# sparse index will expand).
+	test_sparse_match git mv --sparse folder1/0 deep &&
+	test_sparse_match git status --porcelain=v2 &&
+	test_sparse_match git ls-files -t &&
+	git -C sparse-checkout ls-files -t >actual &&
+	grep -e "H deep/0/0/0" actual &&
+	grep -e "H deep/0/1" actual
+'
+
 test_done
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v5 3/8] mv: update sparsity after moving from out-of-cone to in-cone
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 1/8] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 2/8] t1092: mv directory from out-of-cone to in-cone Shaoxuan Yuan
@ 2022-06-30  2:37   ` Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 4/8] mv: decouple if/else-if checks using goto Shaoxuan Yuan
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-30  2:37 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, vdye, gitster, Shaoxuan Yuan

Originally, "git mv" of a sparse file from out-of-cone to in-cone
does not update the moved file's sparsity (that is, it does not remove
its SKIP_WORKTREE bit), and the corresponding cache entry is,
unexpectedly, not checked out in the working tree.

Update the behavior so that:
1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
   the corresponding cache entry.
2. The moved cache entry is checked out in the working tree to reflect
   the updated sparsity.

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/builtin/mv.c b/builtin/mv.c
index 83a465ba83..5f4eeadd5d 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -13,6 +13,7 @@
 #include "string-list.h"
 #include "parse-options.h"
 #include "submodule.h"
+#include "entry.h"
 
 static const char * const builtin_mv_usage[] = {
 	N_("git mv [<options>] <source>... <destination>"),
@@ -304,6 +305,11 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
 		int pos;
+		struct checkout state = CHECKOUT_INIT;
+		state.istate = &the_index;
+
+		if (force)
+			state.force = 1;
 		if (show_only || verbose)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
@@ -328,6 +334,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		pos = cache_name_pos(src, strlen(src));
 		assert(pos >= 0);
 		rename_cache_entry_at(pos, dst);
+
+		if ((mode & SPARSE) &&
+		    (path_in_sparse_checkout(dst, &the_index))) {
+			int dst_pos;
+
+			dst_pos = cache_name_pos(dst, strlen(dst));
+			active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
+
+			if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
+				die(_("cannot checkout %s"), active_cache[dst_pos]->name);
+		}
 	}
 
 	if (gitmodules_modified)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v5 4/8] mv: decouple if/else-if checks using goto
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
                     ` (2 preceding siblings ...)
  2022-06-30  2:37   ` [PATCH v5 3/8] mv: update sparsity after moving " Shaoxuan Yuan
@ 2022-06-30  2:37   ` Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 5/8] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-30  2:37 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, vdye, gitster, Shaoxuan Yuan

The previous if/else-if chain is highly nested and hard to develop/extend.

Refactor to decouple this if/else-if chain by using goto to jump ahead.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
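The shape of the refactor can be sketched outside of builtin/mv.c. This
is only an illustration under simplifying assumptions: check_move(), its
parameters and the "done" label are hypothetical names, and only three
of the checks are modelled. Each check sets a reason and jumps to a
single exit label instead of extending an else-if chain.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical validator, not the code in builtin/mv.c.
 * Returns NULL when the move looks fine, otherwise a reason string.
 */
static const char *check_move(const char *src, const char *dst,
			      int src_tracked)
{
	const char *bad = NULL;
	size_t length = strlen(src);

	/* dst equals src, or dst lives underneath src */
	if (!strncmp(src, dst, length) &&
	    (dst[length] == '\0' || dst[length] == '/')) {
		bad = "can not move directory into itself";
		goto done;
	}
	/* src_tracked stands in for an index lookup */
	if (!src_tracked) {
		bad = "not under version control";
		goto done;
	}
	/* a trailing slash means a directory that was expected to exist */
	if (*dst && dst[strlen(dst) - 1] == '/') {
		bad = "destination directory does not exist";
		goto done;
	}

done:
	return bad;
}
```

Adding a new check in this style means adding one more independent
block before the label, without touching the blocks around it.
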
 builtin/mv.c | 139 +++++++++++++++++++++++++++++----------------------
 1 file changed, 80 insertions(+), 59 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 5f4eeadd5d..e800da3ab8 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -187,53 +187,68 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
 			/* only error if existence is expected. */
-			if (modes[i] != SPARSE)
+			if (modes[i] != SPARSE) {
 				bad = _("bad source");
-		} else if (!strncmp(src, dst, length) &&
-				(dst[length] == 0 || dst[length] == '/')) {
+				goto act_on_entry;
+			}
+		}
+		if (!strncmp(src, dst, length) &&
+		    (dst[length] == 0 || dst[length] == '/')) {
 			bad = _("can not move directory into itself");
-		} else if ((src_is_dir = S_ISDIR(st.st_mode))
-				&& lstat(dst, &st) == 0)
+			goto act_on_entry;
+		}
+		if ((src_is_dir = S_ISDIR(st.st_mode))
+		    && lstat(dst, &st) == 0) {
 			bad = _("cannot move directory over file");
-		else if (src_is_dir) {
+			goto act_on_entry;
+		}
+		if (src_is_dir) {
+			int j, dst_len, n;
 			int first = cache_name_pos(src, length), last;
 
-			if (first >= 0)
+			if (first >= 0) {
 				prepare_move_submodule(src, first,
 						       submodule_gitfile + i);
-			else if (index_range_of_same_dir(src, length,
-							 &first, &last) < 1)
+				goto act_on_entry;
+			} else if (index_range_of_same_dir(src, length,
+							   &first, &last) < 1) {
 				bad = _("source directory is empty");
-			else { /* last - first >= 1 */
-				int j, dst_len, n;
-
-				modes[i] = WORKING_DIRECTORY;
-				n = argc + last - first;
-				REALLOC_ARRAY(source, n);
-				REALLOC_ARRAY(destination, n);
-				REALLOC_ARRAY(modes, n);
-				REALLOC_ARRAY(submodule_gitfile, n);
-
-				dst = add_slash(dst);
-				dst_len = strlen(dst);
-
-				for (j = 0; j < last - first; j++) {
-					const struct cache_entry *ce = active_cache[first + j];
-					const char *path = ce->name;
-					source[argc + j] = path;
-					destination[argc + j] =
-						prefix_path(dst, dst_len, path + length + 1);
-					modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
-					submodule_gitfile[argc + j] = NULL;
-				}
-				argc += last - first;
+				goto act_on_entry;
 			}
-		} else if (!(ce = cache_file_exists(src, length, 0))) {
+
+			/* last - first >= 1 */
+			modes[i] = WORKING_DIRECTORY;
+			n = argc + last - first;
+			REALLOC_ARRAY(source, n);
+			REALLOC_ARRAY(destination, n);
+			REALLOC_ARRAY(modes, n);
+			REALLOC_ARRAY(submodule_gitfile, n);
+
+			dst = add_slash(dst);
+			dst_len = strlen(dst);
+
+			for (j = 0; j < last - first; j++) {
+				const struct cache_entry *ce = active_cache[first + j];
+				const char *path = ce->name;
+				source[argc + j] = path;
+				destination[argc + j] =
+					prefix_path(dst, dst_len, path + length + 1);
+				modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
+				submodule_gitfile[argc + j] = NULL;
+			}
+			argc += last - first;
+			goto act_on_entry;
+		}
+		if (!(ce = cache_file_exists(src, length, 0))) {
 			bad = _("not under version control");
-		} else if (ce_stage(ce)) {
+			goto act_on_entry;
+		}
+		if (ce_stage(ce)) {
 			bad = _("conflicted");
-		} else if (lstat(dst, &st) == 0 &&
-			 (!ignore_case || strcasecmp(src, dst))) {
+			goto act_on_entry;
+		}
+		if (lstat(dst, &st) == 0 &&
+		    (!ignore_case || strcasecmp(src, dst))) {
 			bad = _("destination exists");
 			if (force) {
 				/*
@@ -247,34 +262,40 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				} else
 					bad = _("Cannot overwrite");
 			}
-		} else if (string_list_has_string(&src_for_dst, dst))
+			goto act_on_entry;
+		}
+		if (string_list_has_string(&src_for_dst, dst)) {
 			bad = _("multiple sources for the same target");
-		else if (is_dir_sep(dst[strlen(dst) - 1]))
+			goto act_on_entry;
+		}
+		if (is_dir_sep(dst[strlen(dst) - 1])) {
 			bad = _("destination directory does not exist");
-		else {
-			/*
-			 * We check if the paths are in the sparse-checkout
-			 * definition as a very final check, since that
-			 * allows us to point the user to the --sparse
-			 * option as a way to have a successful run.
-			 */
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(src, &the_index)) {
-				string_list_append(&only_match_skip_worktree, src);
-				skip_sparse = 1;
-			}
-			if (!ignore_sparse &&
-			    !path_in_sparse_checkout(dst, &the_index)) {
-				string_list_append(&only_match_skip_worktree, dst);
-				skip_sparse = 1;
-			}
-
-			if (skip_sparse)
-				goto remove_entry;
+			goto act_on_entry;
+		}
 
-			string_list_insert(&src_for_dst, dst);
+		/*
+		 * We check if the paths are in the sparse-checkout
+		 * definition as a very final check, since that
+		 * allows us to point the user to the --sparse
+		 * option as a way to have a successful run.
+		 */
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(src, &the_index)) {
+			string_list_append(&only_match_skip_worktree, src);
+			skip_sparse = 1;
+		}
+		if (!ignore_sparse &&
+		    !path_in_sparse_checkout(dst, &the_index)) {
+			string_list_append(&only_match_skip_worktree, dst);
+			skip_sparse = 1;
 		}
 
+		if (skip_sparse)
+			goto remove_entry;
+
+		string_list_insert(&src_for_dst, dst);
+
+act_on_entry:
 		if (!bad)
 			continue;
 		if (!ignore_errors)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v5 5/8] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
                     ` (3 preceding siblings ...)
  2022-06-30  2:37   ` [PATCH v5 4/8] mv: decouple if/else-if checks using goto Shaoxuan Yuan
@ 2022-06-30  2:37   ` Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 6/8] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-30  2:37 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, vdye, gitster, Shaoxuan Yuan

Originally, when moving a <source> file which is not on-disk but exists
in the index as a SKIP_WORKTREE enabled cache entry, the "git mv" command
errors out with "bad source".

Change the checking logic, so that such a <source> file makes the
"git mv" command warn with "advise_on_updating_sparse_paths()" instead
of "bad source"; the user can now also supply a "--sparse" flag so that
this operation can be carried out successfully.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c                  | 21 +++++++++++++++++++--
 t/t7002-mv-sparse-checkout.sh |  4 ++--
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index e800da3ab8..520be85774 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -186,11 +186,28 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		length = strlen(src);
 		if (lstat(src, &st) < 0) {
-			/* only error if existence is expected. */
-			if (modes[i] != SPARSE) {
+			int pos;
+			const struct cache_entry *ce;
+
+			pos = cache_name_pos(src, length);
+			if (pos < 0) {
+				/* only error if existence is expected. */
+				if (modes[i] != SPARSE)
+					bad = _("bad source");
+				goto act_on_entry;
+			}
+
+			ce = active_cache[pos];
+			if (!ce_skip_worktree(ce)) {
 				bad = _("bad source");
 				goto act_on_entry;
 			}
+
+			if (!ignore_sparse)
+				string_list_append(&only_match_skip_worktree, src);
+			else
+				modes[i] = SPARSE;
+			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
 		    (dst[length] == 0 || dst[length] == '/')) {
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 023e657c9e..1510b5ed6a 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -241,7 +241,7 @@ test_expect_failure 'can move out-of-cone directory with --sparse' '
 	test_path_is_file sub/folder1/file1
 '
 
-test_expect_failure 'refuse to move out-of-cone file without --sparse' '
+test_expect_success 'refuse to move out-of-cone file without --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
@@ -252,7 +252,7 @@ test_expect_failure 'refuse to move out-of-cone file without --sparse' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'can move out-of-cone file with --sparse' '
+test_expect_success 'can move out-of-cone file with --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v5 6/8] mv: check if <destination> exists in index to handle overwriting
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
                     ` (4 preceding siblings ...)
  2022-06-30  2:37   ` [PATCH v5 5/8] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
@ 2022-06-30  2:37   ` Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 7/8] mv: use flags mode for update_mode Shaoxuan Yuan
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-30  2:37 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, vdye, gitster, Shaoxuan Yuan

Originally, moving a sparse file into the cone can silently overwrite
an existing entry. The expected behavior is that if the <destination>
exists in the index, the user should be prompted to supply
[-f|--force] to carry out the operation, or the operation should
fail.

Add a check mechanism to do that.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
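For readers following the control flow, the branch for a <source> that
is missing on disk but present in the index with SKIP_WORKTREE can be
summarized as a small decision function. This is a sketch only: the
enum and classify_sparse_move() are hypothetical names that do not
exist in builtin/mv.c, and the three int parameters stand in for the
"--sparse" option, the index lookup of <destination>, and "--force".

```c
/* Hypothetical outcomes of the check for a sparse <source>. */
enum sparse_move_result {
	MOVE_AS_SPARSE,		/* modes[i] gets SPARSE, move proceeds */
	ADVISE_SPARSE,		/* source is listed for the --sparse advice */
	BAD_DESTINATION_EXISTS	/* bad = "destination exists" */
};

/*
 * ignore_sparse: non-zero when --sparse was given
 * dst_in_index:  non-zero when <destination> is already an index entry
 * force:         non-zero when --force was given
 */
static enum sparse_move_result
classify_sparse_move(int ignore_sparse, int dst_in_index, int force)
{
	if (!ignore_sparse)
		return ADVISE_SPARSE;
	if (!dst_in_index)
		return MOVE_AS_SPARSE;
	if (!force)
		return BAD_DESTINATION_EXISTS;
	return MOVE_AS_SPARSE;
}
```

Only the combination of --sparse, an existing <destination> and no
--force ends in the new "destination exists" error.
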
 builtin/mv.c                  | 15 ++++++++++++---
 t/t7002-mv-sparse-checkout.sh |  4 ++--
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 520be85774..7d9627938a 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -202,11 +202,20 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				bad = _("bad source");
 				goto act_on_entry;
 			}
-
-			if (!ignore_sparse)
+			if (!ignore_sparse) {
 				string_list_append(&only_match_skip_worktree, src);
-			else
+				goto act_on_entry;
+			}
+			/* Check if dst exists in index */
+			if (cache_name_pos(dst, strlen(dst)) < 0) {
 				modes[i] = SPARSE;
+				goto act_on_entry;
+			}
+			if (!force) {
+				bad = _("destination exists");
+				goto act_on_entry;
+			}
+			modes[i] = SPARSE;
 			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 1510b5ed6a..6d2fb4f8d2 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -262,7 +262,7 @@ test_expect_success 'can move out-of-cone file with --sparse' '
 	test_path_is_file sub/file1
 '
 
-test_expect_failure 'refuse to move sparse file to existing destination' '
+test_expect_success 'refuse to move sparse file to existing destination' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	mkdir folder1 &&
 	touch folder1/file1 &&
@@ -275,7 +275,7 @@ test_expect_failure 'refuse to move sparse file to existing destination' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'move sparse file to existing destination with --force and --sparse' '
+test_expect_success 'move sparse file to existing destination with --force and --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	mkdir folder1 &&
 	touch folder1/file1 &&
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v5 7/8] mv: use flags mode for update_mode
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
                     ` (5 preceding siblings ...)
  2022-06-30  2:37   ` [PATCH v5 6/8] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
@ 2022-06-30  2:37   ` Shaoxuan Yuan
  2022-06-30  2:37   ` [PATCH v5 8/8] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
  2022-07-01 19:43   ` [PATCH v5 0/8] mv: fix out-of-cone file/directory move logic Derrick Stolee
  8 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-30  2:37 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, vdye, gitster, Shaoxuan Yuan

As suggested by Derrick [1], move the in-line definition of
"enum update_mode" to the top of the file and make it use "flags"
mode (each state is a different bit in the word).

Change the flag assignments from '=' (single assignment) to '|='
(additive). Also change flag evaluation from '==' to '&', etc.

[1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
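A standalone sketch of the "flags" style may help when reading the
diff. The enum values match the ones this patch adds; mode_has() and
needs_worktree_rename() are hypothetical helpers written for this
illustration and are not part of builtin/mv.c.

```c
#include <assert.h>

/* Same bit layout as the enum added by this patch. */
enum update_mode {
	BOTH = 0,
	WORKING_DIRECTORY = (1 << 1),
	INDEX = (1 << 2),
	SPARSE = (1 << 3),
};

/* Hypothetical helper: non-zero if 'flag' is set in 'mode'. */
static int mode_has(enum update_mode mode, enum update_mode flag)
{
	/* BOTH is 0, so it cannot be tested with '&' */
	assert(flag != BOTH);
	return (mode & flag) != 0;
}

/*
 * Hypothetical helper mirroring the rename() guard in the patch:
 * the working tree is touched only when neither INDEX nor SPARSE
 * is set.
 */
static int needs_worktree_rename(enum update_mode mode)
{
	return !(mode & (INDEX | SPARSE));
}
```

With '|=' a mode can carry several states at once, for example SPARSE
together with WORKING_DIRECTORY, which a plain '=' would overwrite.
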
 builtin/mv.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 7d9627938a..b805a0d0f6 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -20,6 +20,13 @@ static const char * const builtin_mv_usage[] = {
 	NULL
 };
 
+enum update_mode {
+	BOTH = 0,
+	WORKING_DIRECTORY = (1 << 1),
+	INDEX = (1 << 2),
+	SPARSE = (1 << 3),
+};
+
 #define DUP_BASENAME 1
 #define KEEP_TRAILING_SLASH 2
 
@@ -130,7 +137,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		OPT_END(),
 	};
 	const char **source, **destination, **dest_path, **submodule_gitfile;
-	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX, SPARSE } *modes;
+	enum update_mode *modes;
 	struct stat st;
 	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
 	struct lock_file lock_file = LOCK_INIT;
@@ -192,7 +199,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			pos = cache_name_pos(src, length);
 			if (pos < 0) {
 				/* only error if existence is expected. */
-				if (modes[i] != SPARSE)
+				if (!(modes[i] & SPARSE))
 					bad = _("bad source");
 				goto act_on_entry;
 			}
@@ -208,14 +215,14 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			}
 			/* Check if dst exists in index */
 			if (cache_name_pos(dst, strlen(dst)) < 0) {
-				modes[i] = SPARSE;
+				modes[i] |= SPARSE;
 				goto act_on_entry;
 			}
 			if (!force) {
 				bad = _("destination exists");
 				goto act_on_entry;
 			}
-			modes[i] = SPARSE;
+			modes[i] |= SPARSE;
 			goto act_on_entry;
 		}
 		if (!strncmp(src, dst, length) &&
@@ -243,7 +250,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			}
 
 			/* last - first >= 1 */
-			modes[i] = WORKING_DIRECTORY;
+			modes[i] |= WORKING_DIRECTORY;
 			n = argc + last - first;
 			REALLOC_ARRAY(source, n);
 			REALLOC_ARRAY(destination, n);
@@ -259,7 +266,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				source[argc + j] = path;
 				destination[argc + j] =
 					prefix_path(dst, dst_len, path + length + 1);
-				modes[argc + j] = ce_skip_worktree(ce) ? SPARSE : INDEX;
+				memset(modes + argc + j, 0, sizeof(enum update_mode));
+				modes[argc + j] |= ce_skip_worktree(ce) ? SPARSE : INDEX;
 				submodule_gitfile[argc + j] = NULL;
 			}
 			argc += last - first;
@@ -361,7 +369,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
 			continue;
-		if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
+		if (!(mode & (INDEX | SPARSE)) &&
+		    rename(src, dst) < 0) {
 			if (ignore_errors)
 				continue;
 			die_errno(_("renaming '%s' failed"), src);
@@ -375,7 +384,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 							      1);
 		}
 
-		if (mode == WORKING_DIRECTORY)
+		if (mode & (WORKING_DIRECTORY))
 			continue;
 
 		pos = cache_name_pos(src, strlen(src));
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 95+ messages in thread

* [PATCH v5 8/8] mv: add check_dir_in_index() and solve general dir check issue
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
                     ` (6 preceding siblings ...)
  2022-06-30  2:37   ` [PATCH v5 7/8] mv: use flags mode for update_mode Shaoxuan Yuan
@ 2022-06-30  2:37   ` Shaoxuan Yuan
  2022-07-01 19:43   ` [PATCH v5 0/8] mv: fix out-of-cone file/directory move logic Derrick Stolee
  8 siblings, 0 replies; 95+ messages in thread
From: Shaoxuan Yuan @ 2022-06-30  2:37 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, vdye, gitster, Shaoxuan Yuan

Originally, when moving a <source> directory which is not on-disk
because it lives outside of the sparse-checkout cone, the "git mv"
command errors out with "bad source".

Add a helper check_dir_in_index() function to see if a directory
name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
such directories.

Change the checking logic, so that such a <source> directory makes the
"git mv" command warn with "advise_on_updating_sparse_paths()"
instead of "bad source"; the user can now also supply a "--sparse" flag
so that this operation can be carried out successfully.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
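The lookup idea behind check_dir_in_index() can be modelled without
the index API. This is a sketch under stated assumptions: struct entry,
lower_bound() and model_check_dir_in_index() are hypothetical stand-ins,
the array is assumed to be sorted in strcmp() order, and, like the
patch, only the first entry under "dir/" is inspected. The return
convention follows the patch: 0 when that entry is skip-worktree,
1 otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cache entry. */
struct entry {
	const char *name;
	int skip_worktree;
};

/* Index of the first entry whose name is >= key (binary search). */
static size_t lower_bound(const struct entry *entries, size_t nr,
			  const char *key)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(entries[mid].name, key) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Return 0 if the first entry under dir_with_slash is skip-worktree. */
static int model_check_dir_in_index(const struct entry *entries,
				    size_t nr, const char *dir_with_slash)
{
	size_t len = strlen(dir_with_slash);
	size_t pos;

	/* the caller is expected to pass "dir/", not "dir" */
	assert(len > 0 && dir_with_slash[len - 1] == '/');

	pos = lower_bound(entries, nr, dir_with_slash);
	if (pos >= nr)
		return 1;
	if (strncmp(dir_with_slash, entries[pos].name, len))
		return 1;
	return entries[pos].skip_worktree ? 0 : 1;
}
```

A directory name with a trailing slash is never an entry by itself in
this model, so the position where it would be inserted is exactly
where the first path under that directory would sit.
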
 builtin/mv.c                             | 50 +++++++++++++++++++++---
 t/t1092-sparse-checkout-compatibility.sh |  2 +-
 t/t7002-mv-sparse-checkout.sh            |  4 +-
 3 files changed, 47 insertions(+), 9 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index b805a0d0f6..2a38e2af46 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -25,6 +25,7 @@ enum update_mode {
 	WORKING_DIRECTORY = (1 << 1),
 	INDEX = (1 << 2),
 	SPARSE = (1 << 3),
+	SKIP_WORKTREE_DIR = (1 << 4),
 };
 
 #define DUP_BASENAME 1
@@ -123,6 +124,36 @@ static int index_range_of_same_dir(const char *src, int length,
 	return last - first;
 }
 
+/*
+ * Check if an out-of-cone directory should be in the index. Imagine the case
+ * where all the files under a directory are marked with the 'CE_SKIP_WORKTREE'
+ * bit and thus the directory is sparsified.
+ *
+ * Return 0 if such a directory exists (i.e. if any of its contained files were
+ * not marked with CE_SKIP_WORKTREE, the directory would be present in the
+ * working tree). Return 1 otherwise.
+ */
+static int check_dir_in_index(const char *name)
+{
+	const char *with_slash = add_slash(name);
+	int length = strlen(with_slash);
+
+	int pos = cache_name_pos(with_slash, length);
+	const struct cache_entry *ce;
+
+	if (pos < 0) {
+		pos = -pos - 1;
+		if (pos >= the_index.cache_nr)
+			return 1;
+		ce = active_cache[pos];
+		if (strncmp(with_slash, ce->name, length))
+			return 1;
+		if (ce_skip_worktree(ce))
+			return 0;
+	}
+	return 1;
+}
+
 int cmd_mv(int argc, const char **argv, const char *prefix)
 {
 	int i, flags, gitmodules_modified = 0;
@@ -184,7 +215,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	/* Checking */
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
-		int length, src_is_dir;
+		int length;
 		const char *bad = NULL;
 		int skip_sparse = 0;
 
@@ -198,12 +229,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 			pos = cache_name_pos(src, length);
 			if (pos < 0) {
+				const char *src_w_slash = add_slash(src);
+				if (!path_in_sparse_checkout(src_w_slash, &the_index) &&
+				    !check_dir_in_index(src)) {
+					modes[i] |= SKIP_WORKTREE_DIR;
+					goto dir_check;
+				}
 				/* only error if existence is expected. */
 				if (!(modes[i] & SPARSE))
 					bad = _("bad source");
 				goto act_on_entry;
 			}
-
 			ce = active_cache[pos];
 			if (!ce_skip_worktree(ce)) {
 				bad = _("bad source");
@@ -230,12 +266,14 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			bad = _("can not move directory into itself");
 			goto act_on_entry;
 		}
-		if ((src_is_dir = S_ISDIR(st.st_mode))
+		if (S_ISDIR(st.st_mode)
 		    && lstat(dst, &st) == 0) {
 			bad = _("cannot move directory over file");
 			goto act_on_entry;
 		}
-		if (src_is_dir) {
+
+dir_check:
+		if (S_ISDIR(st.st_mode)) {
 			int j, dst_len, n;
 			int first = cache_name_pos(src, length), last;
 
@@ -369,7 +407,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 			printf(_("Renaming %s to %s\n"), src, dst);
 		if (show_only)
 			continue;
-		if (!(mode & (INDEX | SPARSE)) &&
+		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
 		    rename(src, dst) < 0) {
 			if (ignore_errors)
 				continue;
@@ -384,7 +422,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 							      1);
 		}
 
-		if (mode & (WORKING_DIRECTORY))
+		if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
 			continue;
 
 		pos = cache_name_pos(src, strlen(src));
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 5eef799e25..763c6cc684 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1828,7 +1828,7 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' '
 	test_cmp full-checkout-err sparse-index-err
 '
 
-test_expect_failure 'mv directory from out-of-cone to in-cone' '
+test_expect_success 'mv directory from out-of-cone to in-cone' '
 	init_repos &&
 
 	# <source> as a sparse directory (or SKIP_WORKTREE_DIR without enabling
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index 6d2fb4f8d2..71fe29690f 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -219,7 +219,7 @@ test_expect_success 'refuse to move file to non-skip-worktree sparse path' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
+test_expect_success 'refuse to move out-of-cone directory without --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
@@ -230,7 +230,7 @@ test_expect_failure 'refuse to move out-of-cone directory without --sparse' '
 	test_cmp expect stderr
 '
 
-test_expect_failure 'can move out-of-cone directory with --sparse' '
+test_expect_success 'can move out-of-cone directory with --sparse' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
-- 
2.35.1



* Re: [PATCH v5 0/8] mv: fix out-of-cone file/directory move logic
  2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
                     ` (7 preceding siblings ...)
  2022-06-30  2:37   ` [PATCH v5 8/8] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
@ 2022-07-01 19:43   ` Derrick Stolee
  2022-07-01 21:50     ` Junio C Hamano
  8 siblings, 1 reply; 95+ messages in thread
From: Derrick Stolee @ 2022-07-01 19:43 UTC (permalink / raw)
  To: Shaoxuan Yuan, git; +Cc: vdye, gitster

On 6/29/2022 10:37 PM, Shaoxuan Yuan wrote:
> ## Changes since PATCH v4 ##
> 
> 1. Fix style-nits.
> 
> 2. Add t1092 tests (2/8) for "mv: add check_dir_in_index() and solve 
>    general dir check issue" (8/8).

Thank you for these updates. I just took a quick
re-read and I'm happy with this version.

Thanks,
-Stolee


* Re: [PATCH v5 0/8] mv: fix out-of-cone file/directory move logic
  2022-07-01 19:43   ` [PATCH v5 0/8] mv: fix out-of-cone file/directory move logic Derrick Stolee
@ 2022-07-01 21:50     ` Junio C Hamano
  0 siblings, 0 replies; 95+ messages in thread
From: Junio C Hamano @ 2022-07-01 21:50 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Shaoxuan Yuan, git, vdye

Derrick Stolee <derrickstolee@github.com> writes:

> On 6/29/2022 10:37 PM, Shaoxuan Yuan wrote:
>> ## Changes since PATCH v4 ##
>> 
>> 1. Fix style-nits.
>> 
>> 2. Add t1092 tests (2/8) for "mv: add check_dir_in_index() and solve 
>>    general dir check issue" (8/8).
>
> Thank you for these updates. I just took a quick
> re-read and I'm happy with this version.
>
> Thanks,
> -Stolee

Thanks, all.  Will queue.


Thread overview: 95+ messages
2022-03-31  9:17 [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
2022-03-31  9:17 ` [WIP v1 1/4] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
2022-03-31 16:39   ` Victoria Dye
2022-04-01 14:30     ` Derrick Stolee
2022-03-31  9:17 ` [WIP v1 2/4] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
2022-03-31 10:25   ` Ævar Arnfjörð Bjarmason
2022-04-01  3:51     ` Shaoxuan Yuan
2022-03-31 21:28   ` Victoria Dye
2022-04-01 12:49     ` Shaoxuan Yuan
2022-04-01 14:49       ` Derrick Stolee
2022-04-04  7:25         ` Shaoxuan Yuan
2022-04-04  7:49           ` Shaoxuan Yuan
2022-04-04 12:43             ` Derrick Stolee
2022-03-31  9:17 ` [WIP v1 3/4] mv: add advise_to_reapply hint for moving file into cone Shaoxuan Yuan
2022-03-31 10:30   ` Ævar Arnfjörð Bjarmason
2022-04-01  4:00     ` Shaoxuan Yuan
2022-04-01  8:02       ` Ævar Arnfjörð Bjarmason
2022-04-03  2:01         ` Eric Sunshine
2022-03-31 21:56   ` Victoria Dye
2022-04-01 14:55   ` Derrick Stolee
2022-03-31  9:17 ` [WIP v1 4/4] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
2022-03-31 10:33   ` Ævar Arnfjörð Bjarmason
2022-03-31 22:11   ` Victoria Dye
2022-03-31  9:28 ` [WIP v1 0/4] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
2022-03-31 22:21 ` Victoria Dye
2022-04-01 12:18   ` Shaoxuan Yuan
2022-04-08 12:22 ` Shaoxuan Yuan
2022-05-27 10:07 ` [WIP v2 0/5] " Shaoxuan Yuan
2022-05-27 10:08   ` [WIP v2 1/5] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
2022-05-27 12:07     ` Ævar Arnfjörð Bjarmason
2022-05-27 14:48     ` Derrick Stolee
2022-05-27 15:51     ` Victoria Dye
2022-05-27 10:08   ` [WIP v2 2/5] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
2022-05-27 15:13     ` Derrick Stolee
2022-05-27 22:38       ` Victoria Dye
2022-05-31  8:06       ` Shaoxuan Yuan
2022-05-27 10:08   ` [WIP v2 3/5] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
2022-05-27 22:04     ` Victoria Dye
2022-05-27 10:08   ` [WIP v2 4/5] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
2022-05-27 15:27     ` Derrick Stolee
2022-05-31  9:56       ` Shaoxuan Yuan
2022-05-31 15:49         ` Derrick Stolee
2022-05-27 10:08   ` [WIP v2 5/5] mv: use update_sparsity() after touching sparse contents Shaoxuan Yuan
2022-05-27 12:10     ` Ævar Arnfjörð Bjarmason
2022-05-27 19:36     ` Victoria Dye
2022-05-27 19:59       ` Junio C Hamano
2022-05-27 21:24         ` Victoria Dye
2022-06-16 13:51           ` Shaoxuan Yuan
2022-06-16 16:42             ` Victoria Dye
2022-06-17  2:15               ` Shaoxuan Yuan
2022-06-19  3:25 ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Shaoxuan Yuan
2022-06-19  3:25   ` [WIP v3 1/7] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
2022-06-21 21:23     ` Victoria Dye
2022-06-19  3:25   ` [WIP v3 2/7] mv: decouple if/else-if checks using goto Shaoxuan Yuan
2022-06-19  3:25   ` [WIP v3 3/7] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
2022-06-19  3:25   ` [WIP v3 4/7] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
2022-06-19  3:25   ` [WIP v3 5/7] mv: use flags mode for update_mode Shaoxuan Yuan
2022-06-21 22:32     ` Victoria Dye
2022-06-22  9:37       ` Shaoxuan Yuan
2022-06-19  3:25   ` [WIP v3 6/7] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
2022-06-21 22:55     ` Victoria Dye
2022-06-19  3:25   ` [WIP v3 7/7] mv: update sparsity after moving from out-of-cone to in-cone Shaoxuan Yuan
2022-06-21 23:11     ` Victoria Dye
2022-06-21 23:30   ` [WIP v3 0/7] mv: fix out-of-cone file/directory move logic Victoria Dye
2022-06-23 15:06     ` Derrick Stolee
2022-06-23 16:19       ` Junio C Hamano
2022-06-24  8:26         ` Shaoxuan Yuan
2022-06-23 11:41 ` [PATCH v4 " Shaoxuan Yuan
2022-06-23 11:41   ` [PATCH v4 1/7] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
2022-06-23 11:41   ` [PATCH v4 2/7] mv: update sparsity after moving from out-of-cone to in-cone Shaoxuan Yuan
2022-06-23 15:08     ` Derrick Stolee
2022-06-24  8:04       ` Shaoxuan Yuan
2022-06-27 13:55         ` Derrick Stolee
2022-06-23 11:41   ` [PATCH v4 3/7] mv: decouple if/else-if checks using goto Shaoxuan Yuan
2022-06-23 11:41   ` [PATCH v4 4/7] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
2022-06-23 11:41   ` [PATCH v4 5/7] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
2022-06-23 11:41   ` [PATCH v4 6/7] mv: use flags mode for update_mode Shaoxuan Yuan
2022-06-23 15:10     ` Derrick Stolee
2022-06-23 11:41   ` [PATCH v4 7/7] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
2022-06-23 15:14     ` Derrick Stolee
2022-06-24  7:57       ` Shaoxuan Yuan
2022-06-27 13:59         ` Derrick Stolee
2022-06-23 15:16   ` [PATCH v4 0/7] mv: fix out-of-cone file/directory move logic Derrick Stolee
2022-06-23 18:05     ` Junio C Hamano
2022-06-30  2:37 ` [PATCH v5 0/8] " Shaoxuan Yuan
2022-06-30  2:37   ` [PATCH v5 1/8] t7002: add tests for moving out-of-cone file/directory Shaoxuan Yuan
2022-06-30  2:37   ` [PATCH v5 2/8] t1092: mv directory from out-of-cone to in-cone Shaoxuan Yuan
2022-06-30  2:37   ` [PATCH v5 3/8] mv: update sparsity after moving " Shaoxuan Yuan
2022-06-30  2:37   ` [PATCH v5 4/8] mv: decouple if/else-if checks using goto Shaoxuan Yuan
2022-06-30  2:37   ` [PATCH v5 5/8] mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit Shaoxuan Yuan
2022-06-30  2:37   ` [PATCH v5 6/8] mv: check if <destination> exists in index to handle overwriting Shaoxuan Yuan
2022-06-30  2:37   ` [PATCH v5 7/8] mv: use flags mode for update_mode Shaoxuan Yuan
2022-06-30  2:37   ` [PATCH v5 8/8] mv: add check_dir_in_index() and solve general dir check issue Shaoxuan Yuan
2022-07-01 19:43   ` [PATCH v5 0/8] mv: fix out-of-cone file/directory move logic Derrick Stolee
2022-07-01 21:50     ` Junio C Hamano
