git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/5] Sparse index: integrate with commit and checkout
@ 2021-06-29  2:13 Derrick Stolee via GitGitGadget
  2021-06-29  2:13 ` [PATCH 1/5] p2000: add 'git checkout -' test and decrease depth Derrick Stolee via GitGitGadget
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-06-29  2:13 UTC (permalink / raw)
  To: git; +Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee

This series extends our integration of sparse-index to 'git commit' and 'git
checkout'.

This is based on ds/status-with-sparse-index (v7) and v2.32.0. The hard work
was already done in that topic, so these changes are simple.

Recall that we have delayed our integration with 'git add' until we can work
out the concerns about how to deal with pathspecs outside of the
sparse-checkout definition. Those concerns might have some overlap with how
'git commit' takes a pathspec, but this seems like a rare enough case to
handle here and we can be more careful with the behavior change in the next
series which will integrate with git add.

In addition to the tests that already exist in t1092, I have integrated
these changes in microsoft/git and tested them against the Scalar functional
tests, which go through quite a few complicated scenarios, verifying that
things work the same across the full index and sparse-index cases.

Thanks, -Stolee

Derrick Stolee (5):
  p2000: add 'git checkout -' test and decrease depth
  p2000: compress repo names
  commit: integrate with sparse-index
  sparse-index: recompute cache-tree
  checkout: stop expanding sparse indexes

 builtin/checkout.c                       |  8 ++--
 builtin/commit.c                         |  3 ++
 cache-tree.c                             |  2 -
 sparse-index.c                           |  2 +
 t/perf/p2000-sparse-operations.sh        | 47 ++++++++++++--------
 t/t1092-sparse-checkout-compatibility.sh | 55 ++++++++++++++++++++++--
 6 files changed, 89 insertions(+), 28 deletions(-)


base-commit: 1d744848ee6b58ccaf3a30f20abe9797ed5d2ce7
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-973%2Fderrickstolee%2Fsparse-index%2Fcommit-and-checkout-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-973/derrickstolee/sparse-index/commit-and-checkout-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/973
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/5] p2000: add 'git checkout -' test and decrease depth
  2021-06-29  2:13 [PATCH 0/5] Sparse index: integrate with commit and checkout Derrick Stolee via GitGitGadget
@ 2021-06-29  2:13 ` Derrick Stolee via GitGitGadget
  2021-06-29  2:13 ` [PATCH 2/5] p2000: compress repo names Derrick Stolee via GitGitGadget
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-06-29  2:13 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

As we increase our list of commands to test in
p2000-sparse-operations.sh, we will want to have a slightly smaller test
repository. Reduce the size by a factor of four by reducing the depth of
the step that creates a big index around a moderately-sized repository.

Also add a step to run 'git checkout -' on repeat. This requires having
a previous location in the reflog, so add that to the initialization
steps.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/perf/p2000-sparse-operations.sh | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 94513c97748..f7f8c012103 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -6,7 +6,7 @@ test_description="test performance of Git operations using the index"
 
 test_perf_default_repo
 
-SPARSE_CONE=f2/f4/f1
+SPARSE_CONE=f2/f4
 
 test_expect_success 'setup repo and indexes' '
 	git reset --hard HEAD &&
@@ -27,7 +27,7 @@ test_expect_success 'setup repo and indexes' '
 	OLD_COMMIT=$(git rev-parse HEAD) &&
 	OLD_TREE=$(git rev-parse HEAD^{tree}) &&
 
-	for i in $(test_seq 1 4)
+	for i in $(test_seq 1 3)
 	do
 		cat >in <<-EOF &&
 			100755 blob $BLOB	a
@@ -43,14 +43,23 @@ test_expect_success 'setup repo and indexes' '
 	done &&
 
 	git sparse-checkout init --cone &&
-	git branch -f wide $OLD_COMMIT &&
+	git sparse-checkout set $SPARSE_CONE &&
+	git checkout -b wide $OLD_COMMIT &&
+
+	for l2 in f1 f2 f3 f4
+	do
+		echo more bogus >>$SPARSE_CONE/$l2/a &&
+		git commit -a -m "edit $SPARSE_CONE/$l2/a" || return 1
+	done &&
+
 	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v3 &&
 	(
 		cd full-index-v3 &&
 		git sparse-checkout init --cone &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 3 &&
-		git update-index --index-version=3
+		git update-index --index-version=3 &&
+		git checkout HEAD~4
 	) &&
 	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v4 &&
 	(
@@ -58,7 +67,8 @@ test_expect_success 'setup repo and indexes' '
 		git sparse-checkout init --cone &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
-		git update-index --index-version=4
+		git update-index --index-version=4 &&
+		git checkout HEAD~4
 	) &&
 	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 &&
 	(
@@ -66,7 +76,8 @@ test_expect_success 'setup repo and indexes' '
 		git sparse-checkout init --cone --sparse-index &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 3 &&
-		git update-index --index-version=3
+		git update-index --index-version=3 &&
+		git checkout HEAD~4
 	) &&
 	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 &&
 	(
@@ -74,7 +85,8 @@ test_expect_success 'setup repo and indexes' '
 		git sparse-checkout init --cone --sparse-index &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
-		git update-index --index-version=4
+		git update-index --index-version=4 &&
+		git checkout HEAD~4
 	)
 '
 
@@ -97,5 +109,6 @@ test_perf_on_all git status
 test_perf_on_all git add -A
 test_perf_on_all git add .
 test_perf_on_all git commit -a -m A
+test_perf_on_all git checkout -f -
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/5] p2000: compress repo names
  2021-06-29  2:13 [PATCH 0/5] Sparse index: integrate with commit and checkout Derrick Stolee via GitGitGadget
  2021-06-29  2:13 ` [PATCH 1/5] p2000: add 'git checkout -' test and decrease depth Derrick Stolee via GitGitGadget
@ 2021-06-29  2:13 ` Derrick Stolee via GitGitGadget
  2021-06-29  2:13 ` [PATCH 3/5] commit: integrate with sparse-index Derrick Stolee via GitGitGadget
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-06-29  2:13 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

By using shorter names for the test repos, we will get a slightly more
compressed performance summary without comprimising clarity.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/perf/p2000-sparse-operations.sh | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index f7f8c012103..597626276fb 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -52,36 +52,36 @@ test_expect_success 'setup repo and indexes' '
 		git commit -a -m "edit $SPARSE_CONE/$l2/a" || return 1
 	done &&
 
-	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v3 &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-v3 &&
 	(
-		cd full-index-v3 &&
+		cd full-v3 &&
 		git sparse-checkout init --cone &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 3 &&
 		git update-index --index-version=3 &&
 		git checkout HEAD~4
 	) &&
-	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v4 &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-v4 &&
 	(
-		cd full-index-v4 &&
+		cd full-v4 &&
 		git sparse-checkout init --cone &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
 		git update-index --index-version=4 &&
 		git checkout HEAD~4
 	) &&
-	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-v3 &&
 	(
-		cd sparse-index-v3 &&
+		cd sparse-v3 &&
 		git sparse-checkout init --cone --sparse-index &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 3 &&
 		git update-index --index-version=3 &&
 		git checkout HEAD~4
 	) &&
-	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-v4 &&
 	(
-		cd sparse-index-v4 &&
+		cd sparse-v4 &&
 		git sparse-checkout init --cone --sparse-index &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
@@ -92,8 +92,8 @@ test_expect_success 'setup repo and indexes' '
 
 test_perf_on_all () {
 	command="$@"
-	for repo in full-index-v3 full-index-v4 \
-		    sparse-index-v3 sparse-index-v4
+	for repo in full-v3 full-v4 \
+		    sparse-v3 sparse-v4
 	do
 		test_perf "$command ($repo)" "
 			(
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/5] commit: integrate with sparse-index
  2021-06-29  2:13 [PATCH 0/5] Sparse index: integrate with commit and checkout Derrick Stolee via GitGitGadget
  2021-06-29  2:13 ` [PATCH 1/5] p2000: add 'git checkout -' test and decrease depth Derrick Stolee via GitGitGadget
  2021-06-29  2:13 ` [PATCH 2/5] p2000: compress repo names Derrick Stolee via GitGitGadget
@ 2021-06-29  2:13 ` Derrick Stolee via GitGitGadget
  2021-06-29  2:13 ` [PATCH 4/5] sparse-index: recompute cache-tree Derrick Stolee via GitGitGadget
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-06-29  2:13 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Update 'git commit' to allow using the sparse-index in memory without
expanding to a full one. The only place that had an ensure_full_index()
call was in cache_tree_update(). The recursive algorithm for
update_one() was already updated in 2de37c536 (cache-tree: integrate
with sparse directory entries, 2021-03-03) to handle sparse directory
entries in the index.

Most of this change involves testing different command-line options that
allow specifying which on-disk changes should be included in the commit.
This includes no options (only take currently-staged changes), -a (take
all tracked changes), and --include (take a list of specific changes).
To simplify testing that these options do not expand the index, update
the test that previously verified that 'git status' does not expand the
index with a helper method, ensure_not_expanded().

This allows 'git commit' to operate much faster when the sparse-checkout
cone is much smaller than the full list of files at HEAD.

Here are the relevant lines from p2000-sparse-operations.sh:

Test                                      HEAD~1           HEAD
----------------------------------------------------------------------------------
2000.14: git commit -a -m A (full-v3)     0.35(0.26+0.06)  0.36(0.28+0.07) +2.9%
2000.15: git commit -a -m A (full-v4)     0.32(0.26+0.05)  0.34(0.28+0.06) +6.3%
2000.16: git commit -a -m A (sparse-v3)   0.63(0.59+0.06)  0.04(0.05+0.05) -93.7%
2000.17: git commit -a -m A (sparse-v4)   0.64(0.59+0.08)  0.04(0.04+0.04) -93.8%

It is important to compare the full-index case to the sparse-index case,
so the improvement for index version v4 is actually an 88% improvement in
this synthetic example.

In a real repository with over two million files at HEAD and 60,000
files in the sparse-checkout definition, the time for 'git commit -a'
went from 2.61 seconds to 134ms. I compared this to the result if the
index only contained the paths in the sparse-checkout definition and
found the theoretical optimum to be 120ms, so the out-of-cone paths only
add a 12% overhead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/commit.c                         |  3 ++
 cache-tree.c                             |  2 -
 t/t1092-sparse-checkout-compatibility.sh | 47 ++++++++++++++++++++++--
 3 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 12f51db158a..0bc64892505 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1682,6 +1682,9 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_commit_usage, builtin_commit_options);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	status_init_config(&s, git_commit_config);
 	s.commit_template = 1;
 	status_format = STATUS_FORMAT_NONE; /* Ignore status.short */
diff --git a/cache-tree.c b/cache-tree.c
index 45e58666afc..577b18d8811 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -461,8 +461,6 @@ int cache_tree_update(struct index_state *istate, int flags)
 	if (i)
 		return i;
 
-	ensure_full_index(istate);
-
 	if (!istate->cache_tree)
 		istate->cache_tree = cache_tree();
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index df217a2d10b..11c6dc8ec1f 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -275,6 +275,34 @@ test_expect_success 'add, commit, checkout' '
 	test_all_match git checkout -
 '
 
+test_expect_success 'commit including unstaged changes' '
+	init_repos &&
+
+	write_script edit-file <<-\EOF &&
+	echo $1 >$2
+	EOF
+
+	run_on_all ../edit-file 1 a &&
+	run_on_all ../edit-file 1 deep/a &&
+
+	test_all_match git commit -m "-a" -a &&
+	test_all_match git status --porcelain=v2 &&
+
+	run_on_all ../edit-file 2 a &&
+	run_on_all ../edit-file 2 deep/a &&
+
+	test_all_match git commit -m "--include" --include deep/a &&
+	test_all_match git status --porcelain=v2 &&
+	test_all_match git commit -m "--include" --include a &&
+	test_all_match git status --porcelain=v2 &&
+
+	run_on_all ../edit-file 3 a &&
+	run_on_all ../edit-file 3 deep/a &&
+
+	test_all_match git commit -m "--amend" -a --amend &&
+	test_all_match git status --porcelain=v2
+'
+
 test_expect_success 'status/add: outside sparse cone' '
 	init_repos &&
 
@@ -535,14 +563,25 @@ test_expect_success 'sparse-index is expanded and converted back' '
 	test_region index ensure_full_index trace2.txt
 '
 
-test_expect_success 'sparse-index is not expanded' '
-	init_repos &&
-
+ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
 	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-		git -C sparse-index status &&
+		git -C sparse-index "$@" &&
 	test_region ! index ensure_full_index trace2.txt
+}
+
+test_expect_success 'sparse-index is not expanded' '
+	init_repos &&
+
+	ensure_not_expanded status &&
+	ensure_not_expanded commit --allow-empty -m empty &&
+	echo >>sparse-index/a &&
+	ensure_not_expanded commit -a -m a &&
+	echo >>sparse-index/a &&
+	ensure_not_expanded commit --include a -m a &&
+	echo >>sparse-index/deep/deeper1/a &&
+	ensure_not_expanded commit --include deep/deeper1/a -m deeper
 '
 
 test_expect_success 'reset mixed and checkout orphan' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/5] sparse-index: recompute cache-tree
  2021-06-29  2:13 [PATCH 0/5] Sparse index: integrate with commit and checkout Derrick Stolee via GitGitGadget
                   ` (2 preceding siblings ...)
  2021-06-29  2:13 ` [PATCH 3/5] commit: integrate with sparse-index Derrick Stolee via GitGitGadget
@ 2021-06-29  2:13 ` Derrick Stolee via GitGitGadget
  2021-06-29  2:13 ` [PATCH 5/5] checkout: stop expanding sparse indexes Derrick Stolee via GitGitGadget
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-06-29  2:13 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When some commands run with command_requires_full_index=1, then the
index can get in a state where the in-memory cache tree is actually
equal to the sparse index's cache tree instead of the full one.

This results in incorrect entry_count values. By clearing the cache
tree before converting to sparse, we avoid this issue.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 sparse-index.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sparse-index.c b/sparse-index.c
index 53c8f711ccc..c6b4feec413 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -170,6 +170,8 @@ int convert_to_sparse(struct index_state *istate)
 	if (index_has_unmerged_entries(istate))
 		return 0;
 
+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
 	if (cache_tree_update(istate, 0)) {
 		warning(_("unable to update cache-tree, staying full"));
 		return -1;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 5/5] checkout: stop expanding sparse indexes
  2021-06-29  2:13 [PATCH 0/5] Sparse index: integrate with commit and checkout Derrick Stolee via GitGitGadget
                   ` (3 preceding siblings ...)
  2021-06-29  2:13 ` [PATCH 4/5] sparse-index: recompute cache-tree Derrick Stolee via GitGitGadget
@ 2021-06-29  2:13 ` Derrick Stolee via GitGitGadget
  2021-07-09 21:26 ` [PATCH 0/5] Sparse index: integrate with commit and checkout Elijah Newren
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
  6 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-06-29  2:13 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Previous changes did the necessary improvements to unpack-trees.c and
diff-lib.c in order to modify a sparse index based on its comparision
with a tree. The only remaining work is to remove some
ensure_full_index() calls and add tests that verify that the index is
not expanded in our interesting cases. Include 'switch' and 'restore' in
these tests, as they share a base implementation with 'checkout'.

Here are the relevant performance results from
p2000-sparse-operations.sh:

Test                                     HEAD~1           HEAD
--------------------------------------------------------------------------------
2000.18: git checkout -f - (full-v3)     0.49(0.43+0.03)  0.47(0.39+0.05) -4.1%
2000.19: git checkout -f - (full-v4)     0.45(0.37+0.06)  0.42(0.37+0.05) -6.7%
2000.20: git checkout -f - (sparse-v3)   0.76(0.71+0.07)  0.04(0.03+0.04) -94.7%
2000.21: git checkout -f - (sparse-v4)   0.75(0.72+0.04)  0.05(0.06+0.04) -93.3%

It is important to compare the full index case to the sparse index case,
as the previous results for the sparse index were inflated by the index
expansion. For index v4, this is an 88% improvement.

On an internal repository with over two million paths at HEAD and a
sparse-checkout definition containing ~60,000 of those paths, 'git
checkout' went from 3.5s to 297ms with this change. The theoretical
optimum where only those ~60,000 paths exist was 275ms, so the extra
sparse directory entries contribute a 22ms overhead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/checkout.c                       |  8 +++-----
 t/t1092-sparse-checkout-compatibility.sh | 10 +++++++++-
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index f4cd7747d35..b5d477919a7 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -378,9 +378,6 @@ static int checkout_worktree(const struct checkout_opts *opts,
 	if (pc_workers > 1)
 		init_parallel_checkout();
 
-	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
-
 	for (pos = 0; pos < active_nr; pos++) {
 		struct cache_entry *ce = active_cache[pos];
 		if (ce->ce_flags & CE_MATCHED) {
@@ -530,8 +527,6 @@ static int checkout_paths(const struct checkout_opts *opts,
 	 * Make sure all pathspecs participated in locating the paths
 	 * to be checked out.
 	 */
-	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
 	for (pos = 0; pos < active_nr; pos++)
 		if (opts->overlay_mode)
 			mark_ce_for_checkout_overlay(active_cache[pos],
@@ -1593,6 +1588,9 @@ static int checkout_main(int argc, const char **argv, const char *prefix,
 
 	git_config(git_checkout_config, opts);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	opts->track = BRANCH_TRACK_UNSPECIFIED;
 
 	if (!opts->accept_pathspec && !opts->accept_ref)
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 11c6dc8ec1f..1d7dd735d7d 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -581,7 +581,15 @@ test_expect_success 'sparse-index is not expanded' '
 	echo >>sparse-index/a &&
 	ensure_not_expanded commit --include a -m a &&
 	echo >>sparse-index/deep/deeper1/a &&
-	ensure_not_expanded commit --include deep/deeper1/a -m deeper
+	ensure_not_expanded commit --include deep/deeper1/a -m deeper &&
+	ensure_not_expanded checkout rename-out-to-out &&
+	ensure_not_expanded checkout - &&
+	ensure_not_expanded switch rename-out-to-out &&
+	ensure_not_expanded switch - &&
+	git -C sparse-index reset --hard &&
+	ensure_not_expanded checkout rename-out-to-out -- deep/deeper1 &&
+	git -C sparse-index reset --hard &&
+	ensure_not_expanded restore -s rename-out-to-out -- deep/deeper1
 '
 
 test_expect_success 'reset mixed and checkout orphan' '
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/5] Sparse index: integrate with commit and checkout
  2021-06-29  2:13 [PATCH 0/5] Sparse index: integrate with commit and checkout Derrick Stolee via GitGitGadget
                   ` (4 preceding siblings ...)
  2021-06-29  2:13 ` [PATCH 5/5] checkout: stop expanding sparse indexes Derrick Stolee via GitGitGadget
@ 2021-07-09 21:26 ` Elijah Newren
  2021-07-12 18:46   ` Derrick Stolee
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
  6 siblings, 1 reply; 21+ messages in thread
From: Elijah Newren @ 2021-07-09 21:26 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Matheus Tavares Bernardino,
	Derrick Stolee, Derrick Stolee

On Mon, Jun 28, 2021 at 7:13 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This series extends our integration of sparse-index to 'git commit' and 'git
> checkout'.
>
> This is based on ds/status-with-sparse-index (v7) and v2.32.0. The hard work
> was already done in that topic, so these changes are simple.
>
> Recall that we have delayed our integration with 'git add' until we can work
> out the concerns about how to deal with pathspecs outside of the
> sparse-checkout definition. Those concerns might have some overlap with how
> 'git commit' takes a pathspec, but this seems like a rare enough case to
> handle here and we can be more careful with the behavior change in the next
> series which will integrate with git add.
>
> In addition to the tests that already exist in t1092, I have integrated
> these changes in microsoft/git and tested them against the Scalar functional
> tests, which go through quite a few complicated scenarios, verifying that
> things work the same across the full index and sparse-index cases.
>
> Thanks, -Stolee
>
> Derrick Stolee (5):
>   p2000: add 'git checkout -' test and decrease depth
>   p2000: compress repo names
>   commit: integrate with sparse-index
>   sparse-index: recompute cache-tree
>   checkout: stop expanding sparse indexes
>
>  builtin/checkout.c                       |  8 ++--
>  builtin/commit.c                         |  3 ++
>  cache-tree.c                             |  2 -
>  sparse-index.c                           |  2 +
>  t/perf/p2000-sparse-operations.sh        | 47 ++++++++++++--------
>  t/t1092-sparse-checkout-compatibility.sh | 55 ++++++++++++++++++++++--
>  6 files changed, 89 insertions(+), 28 deletions(-)
>
>
> base-commit: 1d744848ee6b58ccaf3a30f20abe9797ed5d2ce7
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-973%2Fderrickstolee%2Fsparse-index%2Fcommit-and-checkout-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-973/derrickstolee/sparse-index/commit-and-checkout-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/973

I've read over these patches and didn't find any problems in them in
my reading; however, since it builds upon ds/status-with-sparse-index
(v7)...

I decided to retry some of my ideas and testing on Patch 10/16 of v7,
over at https://lore.kernel.org/git/CABPp-BHwTAKwFiWQ0-2P=_g+7HLK5FfOAz-uujRjLou1fXT3zw@mail.gmail.com/

It turns out that the block you added there is now triggered by t1092
after this series, and the testcase won't pass without that block.  It
might be clearer to move that code fragment, or perhaps the whole
patch, into this series...though the code fragment as is has
introduced a bug.  If you take t1092 test 12 ("diff with
directory/file conflicts") and modify it so that before the

    git checkout df-conflict

invocation from sparse-index, you first run:

    $ git sparse-checkout disable
    $ echo more stuff >>folder1/edited-content
    $ git add -u
    $ git diff HEAD   # note the changes
    $ git sparse-checkout init --cone --sparse-index
    $ git diff HEAD   # note the changes are still there
    $ git checkout df-conflict  # no error??  What about the
conflicting changes?
    $ git diff HEAD

then the last command will show that the staged changes from before
the commit have simply been discarded.  In short, this makes the
series behave like --force was passed with sparse directory entries,
when --force wasn't passed.

So we've still got some directory/file conflict issues.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/5] Sparse index: integrate with commit and checkout
  2021-07-09 21:26 ` [PATCH 0/5] Sparse index: integrate with commit and checkout Elijah Newren
@ 2021-07-12 18:46   ` Derrick Stolee
  2021-07-16 13:59     ` Derrick Stolee
  0 siblings, 1 reply; 21+ messages in thread
From: Derrick Stolee @ 2021-07-12 18:46 UTC (permalink / raw)
  To: Elijah Newren, Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Matheus Tavares Bernardino,
	Derrick Stolee

On 7/9/2021 5:26 PM, Elijah Newren wrote:
> On Mon, Jun 28, 2021 at 7:13 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
...
> I've read over these patches and didn't find any problems in them in
> my reading; however, since it builds upon ds/status-with-sparse-index
> (v7)...
> 
> I decided to retry some of my ideas and testing on Patch 10/16 of v7,
> over at https://lore.kernel.org/git/CABPp-BHwTAKwFiWQ0-2P=_g+7HLK5FfOAz-uujRjLou1fXT3zw@mail.gmail.com/
> 
> It turns out that the block you added there is now triggered by t1092
> after this series, and the testcase won't pass without that block.  It
> might be clearer to move that code fragment, or perhaps the whole
> patch, into this series...though the code fragment as is has
> introduced a bug.  If you take t1092 test 12 ("diff with
> directory/file conflicts") and modify it so that before the
> 
>     git checkout df-conflict
> 
> invocation from sparse-index, you first run:
> 
>     $ git sparse-checkout disable
>     $ echo more stuff >>folder1/edited-content
>     $ git add -u
>     $ git diff HEAD   # note the changes
>     $ git sparse-checkout init --cone --sparse-index
>     $ git diff HEAD   # note the changes are still there
>     $ git checkout df-conflict  # no error??  What about the
> conflicting changes?
>     $ git diff HEAD
> 
> then the last command will show that the staged changes from before
> the commit have simply been discarded.  In short, this makes the
> series behave like --force was passed with sparse directory entries,
> when --force wasn't passed.
> 
> So we've still got some directory/file conflict issues.

You are absolutely right that this seems strange. In fact, there
is a behavior change during 'git checkout' for sparse-checkouts
in general, but also my sparse-index change creates an additional
change in this case.

Here is a test that demonstrates the issue:

test_expect_success 'staged directory/file conflict' '
	init_repos &&

	test_sparse_match git sparse-checkout disable &&
	write_script edit-contents <<-\EOF &&
	echo text >>folder1/edited-content
	EOF
	run_on_all ../edit-contents &&

	test_all_match git add folder1/edited-content &&
	test_all_match git diff HEAD &&
	git -C sparse-checkout sparse-checkout init --cone --no-sparse-index &&
	git -C sparse-index sparse-checkout init --cone --sparse-index &&
	test_all_match git diff HEAD &&

	# Sparse-checkouts handle this conflict differently than
	# full checkouts, as they consider the file "folder1" to
	# be deleted in favor of the staged file
	# "folder1/edited-content".
	test_sparse_match git checkout df-conflict &&
	test_sparse_match git diff HEAD
'

The sparse-index case drops all staged changes during the
'git checkout df-conflict' command, so the test fails on
that line.

That final diff looks like this in the sparse-checkout
repo (no sparse index):

diff --git a/folder1 b/folder1
deleted file mode 100644
index d95f3ad..0000000
--- a/folder1
+++ /dev/null
@@ -1 +0,0 @@
-content
diff --git a/folder1/edited-content b/folder1/edited-content
new file mode 100644
index 0000000..8e27be7
--- /dev/null
+++ b/folder1/edited-content
@@ -0,0 +1 @@
+text

This is a strange case in that we have a staged tree that is
outside of the sparse-checkout cone. When running the 'git
checkout df-conflict' command, the twoway_merge() method
receives the following values:

 current: "folder1/" (tree OID)
 oldtree: "" (NULL OID)
 newtree: "folder1" (blob OID)

Is this value for 'oldtree' correct? It seems strange to me,
so I'll look further into it.

Clearly, the resolution that was presented in the previous
patch was incorrect so I will try to understand this
situation better.

Further, I expect it to be simpler to modify the behavior
here to match the full checkout case than to make the
sparse-index case match the normal sparse-checkout case.
The "natural" thing would be to keep the staged "folder1/"
directory, but that would present as adding all contained
content, not just the single staged entry.

Thanks,
-Stolee

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/5] Sparse index: integrate with commit and checkout
  2021-07-12 18:46   ` Derrick Stolee
@ 2021-07-16 13:59     ` Derrick Stolee
  2021-07-17 15:37       ` Elijah Newren
  0 siblings, 1 reply; 21+ messages in thread
From: Derrick Stolee @ 2021-07-16 13:59 UTC (permalink / raw)
  To: Elijah Newren, Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Matheus Tavares Bernardino,
	Derrick Stolee

On 7/12/2021 2:46 PM, Derrick Stolee wrote:
> On 7/9/2021 5:26 PM, Elijah Newren wrote:
>> On Mon, Jun 28, 2021 at 7:13 PM Derrick Stolee via GitGitGadget
>> <gitgitgadget@gmail.com> wrote:
>>>
...
> This is a strange case in that we have a staged tree that is
> outside of the sparse-checkout cone. When running the 'git
> checkout df-conflict' command, the twoway_merge() method
> receives the following values:
> 
>  current: "folder1/" (tree OID)
>  oldtree: "" (NULL OID)
>  newtree: "folder1" (blob OID)
> 
> Is this value for 'oldtree' correct? It seems strange to me,
> so I'll look further into it.

This is correct. This 'oldtree' entry is actually the o->df_conflict
placeholder and is set to NULL inside the method.

> Further, I expect it to be simpler to modify the behavior
> here to match the full checkout case than to make the
> sparse-index case match the normal sparse-checkout case.
> The "natural" thing would be to keep the staged "folder1/"
> directory, but that would present as adding all contained
> content, not just the single staged entry.
Taking a closer look at the full checkout case, I discovered that the
'git checkout df-conflict' command succeeds in the full checkout case if I
apply it directly to the 'master' branch. In that situation, it completely
removes the staged change to folder1/edited-content! This seems like
incorrect behavior, and has nothing to do with the sparse-checkout feature.

It just happens that a sparse-checkout will have a _different_ kind of
incorrect behavior!

However, when adding the test on top of the ds/status-with-sparse-index
branch, the full checkout case matches the sparse-checkout! I bisected
this to the additions of files adjacent to folder1/ (folder1. folder1-,
etc) in e669ffb (t1092: expand repository data shape, 2021-07-14). If I
switch the test to conflict on folder2, then I get the strange behavior
that I was noticing on 'master'.

Some very subtle things are going on here, and they don't necessarily
involve the sparse index. Adding the sparse index to the mix creates a
third incorrect behavior to this already-broken case.

If we agree that the correct thing to do here is to reject the merge and
fail the command, then I can start working on making that change in
isolation (because _none_ of the existing behaviors are correct).

That leaves a question as to whether we should hold up this series for
that reason, or if I should pursue a fix to this kind of conflict as a
forward fix on top of it. What do you think, Elijah and Junio?

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/5] Sparse index: integrate with commit and checkout
  2021-07-16 13:59     ` Derrick Stolee
@ 2021-07-17 15:37       ` Elijah Newren
  2021-07-19 14:05         ` Derrick Stolee
  0 siblings, 1 reply; 21+ messages in thread
From: Elijah Newren @ 2021-07-17 15:37 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, Git Mailing List, Junio C Hamano,
	Matheus Tavares Bernardino, Derrick Stolee

On Fri, Jul 16, 2021 at 6:59 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 7/12/2021 2:46 PM, Derrick Stolee wrote:
> > On 7/9/2021 5:26 PM, Elijah Newren wrote:
> >> On Mon, Jun 28, 2021 at 7:13 PM Derrick Stolee via GitGitGadget
> >> <gitgitgadget@gmail.com> wrote:
> >>>
> ...
> > Further, I expect it to be simpler to modify the behavior
> > here to match the full checkout case than to make the
> > sparse-index case match the normal sparse-checkout case.
> > The "natural" thing would be to keep the staged "folder1/"
> > directory, but that would present as adding all contained
> > content, not just the single staged entry.
> Taking a closer look at the full checkout case, I discovered that the
> 'git checkout df-conflict' command succeeds in the full checkout case if I
> apply it directly to the 'master' branch. In that situation, it completely
> removes the staged change to folder1/edited-content! This seems like
> incorrect behavior, and has nothing to do with the sparse-checkout feature.

I was not able to reproduce.  Do you have other modifications to git,
or is there some other special setup required to trigger the bug that
I am missing in reading the paragraph above?  Here's what I see:

<Add an "exit 1 &&" right after "init_repos &&" in the 'diff with
directory/file conflicts' test, run until first failure, then:

$ cd trash directory.t1092-sparse-checkout-compatibility/full-checkout
$ git reset --hard
$ git checkout rename-in-to-out
$ echo more stuff >>folder1/edited-content
$ git add -u
$ git checkout df-conflict
error: Your local changes to the following files would be overwritten
by checkout:
folder1/edited-content
Please commit your changes or stash them before you switch branches.
Aborting

This looks like the expected behavior to me, and is what I'd also
expect from the sparse-checkout and sparse-index cases.

> It just happens that a sparse-checkout will have a _different_ kind of
> incorrect behavior!
>
> However, when adding the test on top of the ds/status-with-sparse-index
> branch, the full checkout case matches the sparse-checkout! I bisected
> this to the additions of files adjacent to folder1/ (folder1. folder1-,
> etc) in e669ffb (t1092: expand repository data shape, 2021-07-14). If I
> switch the test to conflict on folder2, then I get the strange behavior
> that I was noticing on 'master'.
>
> Some very subtle things are going on here, and they don't necessarily
> involve the sparse index. Adding the sparse index to the mix creates a
> third incorrect behavior to this already-broken case.
>
> If we agree that the correct thing to do here is to reject the merge and
> fail the command, then I can start working on making that change in
> isolation (because _none_ of the existing behaviors are correct).

Yes, rejecting the merge is the correct behavior.  This is implied by
the existing documentation for both the --merge and --force options to
checkout.

> That leaves a question as to whether we should hold up this series for
> that reason, or if I should pursue a fix to this kind of conflict as a
> forward fix on top of it. What do you think, Elijah and Junio?

I only dug in and found the sparse-checkout/sparse-index bugs because
the D/F changes you made to twoway_merge() looked clearly wrong to me
and I was trying to find a case that would demonstrate it and make it
easier for you to fix up.  I still think the patch is wrong and that
it adds a bug.  If you can drop that patch, and still get correct
behavior in your tests, then I think we can ignore other bugs in this
area, but I'm not happy with that particular patch.  If you need that
patch, then it needs to be corrected, which probably means figuring
out all these bugs.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/5] Sparse index: integrate with commit and checkout
  2021-07-17 15:37       ` Elijah Newren
@ 2021-07-19 14:05         ` Derrick Stolee
  0 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee @ 2021-07-19 14:05 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Derrick Stolee via GitGitGadget, Git Mailing List, Junio C Hamano,
	Matheus Tavares Bernardino, Derrick Stolee

On 7/17/2021 11:37 AM, Elijah Newren wrote:
> On Fri, Jul 16, 2021 at 6:59 AM Derrick Stolee <stolee@gmail.com> wrote:
>>
>> On 7/12/2021 2:46 PM, Derrick Stolee wrote:
>>> On 7/9/2021 5:26 PM, Elijah Newren wrote:
>>>> On Mon, Jun 28, 2021 at 7:13 PM Derrick Stolee via GitGitGadget
>>>> <gitgitgadget@gmail.com> wrote:
>>>>>
>> ...
>>> Further, I expect it to be simpler to modify the behavior
>>> here to match the full checkout case than to make the
>>> sparse-index case match the normal sparse-checkout case.
>>> The "natural" thing would be to keep the staged "folder1/"
>>> directory, but that would present as adding all contained
>>> content, not just the single staged entry.
>> Taking a closer look at the full checkout case, I discovered that the
>> 'git checkout df-conflict' command succeeds in the full checkout case if I
>> apply it directly to the 'master' branch. In that situation, it completely
>> removes the staged change to folder1/edited-content! This seems like
>> incorrect behavior, and has nothing to do with the sparse-checkout feature.
> 
> I was not able to reproduce.  Do you have other modifications to git,
> or is there some other special setup required to trigger the bug that
> I am missing in reading the paragraph above?  Here's what I see:
> 
> <Add an "exit 1 &&" right after "init_repos &&" in the 'diff with
> directory/file conflicts' test, run until first failure, then:
> 
> $ cd trash directory.t1092-sparse-checkout-compatibility/full-checkout
> $ git reset --hard
> $ git checkout rename-in-to-out
> $ echo more stuff >>folder1/edited-content
> $ git add -u
> $ git checkout df-conflict
> error: Your local changes to the following files would be overwritten
> by checkout:
> folder1/edited-content
> Please commit your changes or stash them before you switch branches.
> Aborting
> 
> This looks like the expected behavior to me, and is what I'd also
> expect from the sparse-checkout and sparse-index cases.

It is fragile to the data shape in my test, so I'll be sure to
include one in the next series version that demonstrates the change.

>> It just happens that a sparse-checkout will have a _different_ kind of
>> incorrect behavior!
>>
>> However, when adding the test on top of the ds/status-with-sparse-index
>> branch, the full checkout case matches the sparse-checkout! I bisected
>> this to the additions of files adjacent to folder1/ (folder1. folder1-,
>> etc) in e669ffb (t1092: expand repository data shape, 2021-07-14). If I
>> switch the test to conflict on folder2, then I get the strange behavior
>> that I was noticing on 'master'.
>>
>> Some very subtle things are going on here, and they don't necessarily
>> involve the sparse index. Adding the sparse index to the mix creates a
>> third incorrect behavior to this already-broken case.
>>
>> If we agree that the correct thing to do here is to reject the merge and
>> fail the command, then I can start working on making that change in
>> isolation (because _none_ of the existing behaviors are correct).
> 
> Yes, rejecting the merge is the correct behavior.  This is implied by
> the existing documentation for both the --merge and --force options to
> checkout.
> 
>> That leaves a question as to whether we should hold up this series for
>> that reason, or if I should pursue a fix to this kind of conflict as a
>> forward fix on top of it. What do you think, Elijah and Junio?
> 
> I only dug in and found the sparse-checkout/sparse-index bugs because
> the D/F changes you made to twoway_merge() looked clearly wrong to me
> and I was trying to find a case that would demonstrate it and make it
> easier for you to fix up.  I still think the patch is wrong and that
> it adds a bug.  If you can drop that patch, and still get correct
> behavior in your tests, then I think we can ignore other bugs in this
> area, but I'm not happy with that particular patch.  If you need that
> patch, then it needs to be corrected, which probably means figuring
> out all these bugs.

That's a good point. I reverted the patch and re-ran the test and
found that actually the patch is necessary in order to match the
_incorrect_ behavior. Without the patch, the sparse-index case
(correctly) refuses to complete the checkout.

I'll replace this patch with a test change that demonstrates these
subtleties and marks them as NEEDSWORK.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 0/7] Sparse index: integrate with commit and checkout
  2021-06-29  2:13 [PATCH 0/5] Sparse index: integrate with commit and checkout Derrick Stolee via GitGitGadget
                   ` (5 preceding siblings ...)
  2021-07-09 21:26 ` [PATCH 0/5] Sparse index: integrate with commit and checkout Elijah Newren
@ 2021-07-20 20:14 ` Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 1/7] p2000: add 'git checkout -' test and decrease depth Derrick Stolee via GitGitGadget
                     ` (7 more replies)
  6 siblings, 8 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-07-20 20:14 UTC (permalink / raw)
  To: git; +Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee

This series extends our integration of sparse-index to 'git commit' and 'git
checkout'.

This is based on ds/status-with-sparse-index (v7) and v2.32.0. The hard work
was already done in that topic, so these changes are simple.

Recall that we have delayed our integration with 'git add' until we can work
out the concerns about how to deal with pathspecs outside of the
sparse-checkout definition. Those concerns might have some overlap with how
'git commit' takes a pathspec, but this seems like a rare enough case to
handle here and we can be more careful with the behavior change in the next
series which will integrate with git add.

In addition to the tests that already exist in t1092, I have integrated
these changes in microsoft/git and tested them against the Scalar functional
tests, which go through quite a few complicated scenarios, verifying that
things work the same across the full index and sparse-index cases.


Update in V2
============

 * There is no change to the code, but it is presented in a slightly
   different order.
 * We've been discussing some complicated directory/file conflict cases, in
   particular with a staged change inside the directory. These tests are
   added and described as documenting incorrect behavior that should be
   changed.
 * After those tests are in place, we can motivate the change to
   twoway_merge() as necessary for a more-common situation (still rare) but
   still incorrect in an already-broken situation. Hopefully that balance is
   sufficient for now, until we can do the bigger work of fixing the bad
   behavior.

Thanks, -Stolee

Derrick Stolee (7):
  p2000: add 'git checkout -' test and decrease depth
  p2000: compress repo names
  commit: integrate with sparse-index
  sparse-index: recompute cache-tree
  checkout: stop expanding sparse indexes
  t1092: document bad 'git checkout' behavior
  unpack-trees: resolve sparse-directory/file conflicts

 builtin/checkout.c                       |   8 +-
 builtin/commit.c                         |   3 +
 cache-tree.c                             |   2 -
 sparse-index.c                           |   2 +
 t/perf/p2000-sparse-operations.sh        |  47 ++++--
 t/t1092-sparse-checkout-compatibility.sh | 197 ++++++++++++++++++++++-
 unpack-trees.c                           |  11 ++
 7 files changed, 240 insertions(+), 30 deletions(-)


base-commit: e5ca291076a8a936283bb2c57433c4393d3f80c2
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-973%2Fderrickstolee%2Fsparse-index%2Fcommit-and-checkout-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-973/derrickstolee/sparse-index/commit-and-checkout-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/973

Range-diff vs v1:

 1:  bb3dd1fdd48 = 1:  6e74958f590 p2000: add 'git checkout -' test and decrease depth
 2:  eb15bf37685 = 2:  3e1d03c41be p2000: compress repo names
 3:  413babe6e77 ! 3:  cd94f820052 commit: integrate with sparse-index
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is e
      +	ensure_not_expanded commit --include deep/deeper1/a -m deeper
       '
       
     - test_expect_success 'reset mixed and checkout orphan' '
     + # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 4:  ffe8473caab = 4:  65e79b8037c sparse-index: recompute cache-tree
 5:  8710fee36b7 ! 5:  e9a9981477e checkout: stop expanding sparse indexes
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
      +	ensure_not_expanded restore -s rename-out-to-out -- deep/deeper1
       '
       
     - test_expect_success 'reset mixed and checkout orphan' '
     + # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 -:  ----------- > 6:  4b801c854fb t1092: document bad 'git checkout' behavior
 -:  ----------- > 7:  71e301501c8 unpack-trees: resolve sparse-directory/file conflicts

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 1/7] p2000: add 'git checkout -' test and decrease depth
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
@ 2021-07-20 20:14   ` Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 2/7] p2000: compress repo names Derrick Stolee via GitGitGadget
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-07-20 20:14 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

As we increase our list of commands to test in
p2000-sparse-operations.sh, we will want to have a slightly smaller test
repository. Reduce the size by a factor of four by reducing the depth of
the step that creates a big index around a moderately-sized repository.

Also add a step to run 'git checkout -' on repeat. This requires having
a previous location in the reflog, so add that to the initialization
steps.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/perf/p2000-sparse-operations.sh | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 94513c97748..f7f8c012103 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -6,7 +6,7 @@ test_description="test performance of Git operations using the index"
 
 test_perf_default_repo
 
-SPARSE_CONE=f2/f4/f1
+SPARSE_CONE=f2/f4
 
 test_expect_success 'setup repo and indexes' '
 	git reset --hard HEAD &&
@@ -27,7 +27,7 @@ test_expect_success 'setup repo and indexes' '
 	OLD_COMMIT=$(git rev-parse HEAD) &&
 	OLD_TREE=$(git rev-parse HEAD^{tree}) &&
 
-	for i in $(test_seq 1 4)
+	for i in $(test_seq 1 3)
 	do
 		cat >in <<-EOF &&
 			100755 blob $BLOB	a
@@ -43,14 +43,23 @@ test_expect_success 'setup repo and indexes' '
 	done &&
 
 	git sparse-checkout init --cone &&
-	git branch -f wide $OLD_COMMIT &&
+	git sparse-checkout set $SPARSE_CONE &&
+	git checkout -b wide $OLD_COMMIT &&
+
+	for l2 in f1 f2 f3 f4
+	do
+		echo more bogus >>$SPARSE_CONE/$l2/a &&
+		git commit -a -m "edit $SPARSE_CONE/$l2/a" || return 1
+	done &&
+
 	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v3 &&
 	(
 		cd full-index-v3 &&
 		git sparse-checkout init --cone &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 3 &&
-		git update-index --index-version=3
+		git update-index --index-version=3 &&
+		git checkout HEAD~4
 	) &&
 	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v4 &&
 	(
@@ -58,7 +67,8 @@ test_expect_success 'setup repo and indexes' '
 		git sparse-checkout init --cone &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
-		git update-index --index-version=4
+		git update-index --index-version=4 &&
+		git checkout HEAD~4
 	) &&
 	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 &&
 	(
@@ -66,7 +76,8 @@ test_expect_success 'setup repo and indexes' '
 		git sparse-checkout init --cone --sparse-index &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 3 &&
-		git update-index --index-version=3
+		git update-index --index-version=3 &&
+		git checkout HEAD~4
 	) &&
 	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 &&
 	(
@@ -74,7 +85,8 @@ test_expect_success 'setup repo and indexes' '
 		git sparse-checkout init --cone --sparse-index &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
-		git update-index --index-version=4
+		git update-index --index-version=4 &&
+		git checkout HEAD~4
 	)
 '
 
@@ -97,5 +109,6 @@ test_perf_on_all git status
 test_perf_on_all git add -A
 test_perf_on_all git add .
 test_perf_on_all git commit -a -m A
+test_perf_on_all git checkout -f -
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 2/7] p2000: compress repo names
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 1/7] p2000: add 'git checkout -' test and decrease depth Derrick Stolee via GitGitGadget
@ 2021-07-20 20:14   ` Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 3/7] commit: integrate with sparse-index Derrick Stolee via GitGitGadget
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-07-20 20:14 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

By using shorter names for the test repos, we will get a slightly more
compressed performance summary without comprimising clarity.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/perf/p2000-sparse-operations.sh | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index f7f8c012103..597626276fb 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -52,36 +52,36 @@ test_expect_success 'setup repo and indexes' '
 		git commit -a -m "edit $SPARSE_CONE/$l2/a" || return 1
 	done &&
 
-	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v3 &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-v3 &&
 	(
-		cd full-index-v3 &&
+		cd full-v3 &&
 		git sparse-checkout init --cone &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 3 &&
 		git update-index --index-version=3 &&
 		git checkout HEAD~4
 	) &&
-	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v4 &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-v4 &&
 	(
-		cd full-index-v4 &&
+		cd full-v4 &&
 		git sparse-checkout init --cone &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
 		git update-index --index-version=4 &&
 		git checkout HEAD~4
 	) &&
-	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-v3 &&
 	(
-		cd sparse-index-v3 &&
+		cd sparse-v3 &&
 		git sparse-checkout init --cone --sparse-index &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 3 &&
 		git update-index --index-version=3 &&
 		git checkout HEAD~4
 	) &&
-	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-v4 &&
 	(
-		cd sparse-index-v4 &&
+		cd sparse-v4 &&
 		git sparse-checkout init --cone --sparse-index &&
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
@@ -92,8 +92,8 @@ test_expect_success 'setup repo and indexes' '
 
 test_perf_on_all () {
 	command="$@"
-	for repo in full-index-v3 full-index-v4 \
-		    sparse-index-v3 sparse-index-v4
+	for repo in full-v3 full-v4 \
+		    sparse-v3 sparse-v4
 	do
 		test_perf "$command ($repo)" "
 			(
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 3/7] commit: integrate with sparse-index
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 1/7] p2000: add 'git checkout -' test and decrease depth Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 2/7] p2000: compress repo names Derrick Stolee via GitGitGadget
@ 2021-07-20 20:14   ` Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 4/7] sparse-index: recompute cache-tree Derrick Stolee via GitGitGadget
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-07-20 20:14 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Update 'git commit' to allow using the sparse-index in memory without
expanding to a full one. The only place that had an ensure_full_index()
call was in cache_tree_update(). The recursive algorithm for
update_one() was already updated in 2de37c536 (cache-tree: integrate
with sparse directory entries, 2021-03-03) to handle sparse directory
entries in the index.

Most of this change involves testing different command-line options that
allow specifying which on-disk changes should be included in the commit.
This includes no options (only take currently-staged changes), -a (take
all tracked changes), and --include (take a list of specific changes).
To simplify testing that these options do not expand the index, update
the test that previously verified that 'git status' does not expand the
index with a helper method, ensure_not_expanded().

This allows 'git commit' to operate much faster when the sparse-checkout
cone is much smaller than the full list of files at HEAD.

Here are the relevant lines from p2000-sparse-operations.sh:

Test                                      HEAD~1           HEAD
----------------------------------------------------------------------------------
2000.14: git commit -a -m A (full-v3)     0.35(0.26+0.06)  0.36(0.28+0.07) +2.9%
2000.15: git commit -a -m A (full-v4)     0.32(0.26+0.05)  0.34(0.28+0.06) +6.3%
2000.16: git commit -a -m A (sparse-v3)   0.63(0.59+0.06)  0.04(0.05+0.05) -93.7%
2000.17: git commit -a -m A (sparse-v4)   0.64(0.59+0.08)  0.04(0.04+0.04) -93.8%

It is important to compare the full-index case to the sparse-index case,
so the improvement for index version v4 is actually an 88% improvement in
this synthetic example.

In a real repository with over two million files at HEAD and 60,000
files in the sparse-checkout definition, the time for 'git commit -a'
went from 2.61 seconds to 134ms. I compared this to the result if the
index only contained the paths in the sparse-checkout definition and
found the theoretical optimum to be 120ms, so the out-of-cone paths only
add a 12% overhead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/commit.c                         |  3 ++
 cache-tree.c                             |  2 -
 t/t1092-sparse-checkout-compatibility.sh | 47 ++++++++++++++++++++++--
 3 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 12f51db158a..0bc64892505 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1682,6 +1682,9 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_commit_usage, builtin_commit_options);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	status_init_config(&s, git_commit_config);
 	s.commit_template = 1;
 	status_format = STATUS_FORMAT_NONE; /* Ignore status.short */
diff --git a/cache-tree.c b/cache-tree.c
index 45e58666afc..577b18d8811 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -461,8 +461,6 @@ int cache_tree_update(struct index_state *istate, int flags)
 	if (i)
 		return i;
 
-	ensure_full_index(istate);
-
 	if (!istate->cache_tree)
 		istate->cache_tree = cache_tree();
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index cabbd42e339..d3e34d0acac 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -262,6 +262,34 @@ test_expect_success 'add, commit, checkout' '
 	test_all_match git checkout -
 '
 
+test_expect_success 'commit including unstaged changes' '
+	init_repos &&
+
+	write_script edit-file <<-\EOF &&
+	echo $1 >$2
+	EOF
+
+	run_on_all ../edit-file 1 a &&
+	run_on_all ../edit-file 1 deep/a &&
+
+	test_all_match git commit -m "-a" -a &&
+	test_all_match git status --porcelain=v2 &&
+
+	run_on_all ../edit-file 2 a &&
+	run_on_all ../edit-file 2 deep/a &&
+
+	test_all_match git commit -m "--include" --include deep/a &&
+	test_all_match git status --porcelain=v2 &&
+	test_all_match git commit -m "--include" --include a &&
+	test_all_match git status --porcelain=v2 &&
+
+	run_on_all ../edit-file 3 a &&
+	run_on_all ../edit-file 3 deep/a &&
+
+	test_all_match git commit -m "--amend" -a --amend &&
+	test_all_match git status --porcelain=v2
+'
+
 test_expect_success 'status/add: outside sparse cone' '
 	init_repos &&
 
@@ -514,14 +542,25 @@ test_expect_success 'sparse-index is expanded and converted back' '
 	test_region index ensure_full_index trace2.txt
 '
 
-test_expect_success 'sparse-index is not expanded' '
-	init_repos &&
-
+ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
 	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-		git -C sparse-index status &&
+		git -C sparse-index "$@" &&
 	test_region ! index ensure_full_index trace2.txt
+}
+
+test_expect_success 'sparse-index is not expanded' '
+	init_repos &&
+
+	ensure_not_expanded status &&
+	ensure_not_expanded commit --allow-empty -m empty &&
+	echo >>sparse-index/a &&
+	ensure_not_expanded commit -a -m a &&
+	echo >>sparse-index/a &&
+	ensure_not_expanded commit --include a -m a &&
+	echo >>sparse-index/deep/deeper1/a &&
+	ensure_not_expanded commit --include deep/deeper1/a -m deeper
 '
 
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 4/7] sparse-index: recompute cache-tree
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
                     ` (2 preceding siblings ...)
  2021-07-20 20:14   ` [PATCH v2 3/7] commit: integrate with sparse-index Derrick Stolee via GitGitGadget
@ 2021-07-20 20:14   ` Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 5/7] checkout: stop expanding sparse indexes Derrick Stolee via GitGitGadget
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-07-20 20:14 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When some commands run with command_requires_full_index=1, then the
index can get in a state where the in-memory cache tree is actually
equal to the sparse index's cache tree instead of the full one.

This results in incorrect entry_count values. By clearing the cache
tree before converting to sparse, we avoid this issue.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 sparse-index.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sparse-index.c b/sparse-index.c
index 53c8f711ccc..c6b4feec413 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -170,6 +170,8 @@ int convert_to_sparse(struct index_state *istate)
 	if (index_has_unmerged_entries(istate))
 		return 0;
 
+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
 	if (cache_tree_update(istate, 0)) {
 		warning(_("unable to update cache-tree, staying full"));
 		return -1;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 5/7] checkout: stop expanding sparse indexes
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
                     ` (3 preceding siblings ...)
  2021-07-20 20:14   ` [PATCH v2 4/7] sparse-index: recompute cache-tree Derrick Stolee via GitGitGadget
@ 2021-07-20 20:14   ` Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 6/7] t1092: document bad 'git checkout' behavior Derrick Stolee via GitGitGadget
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-07-20 20:14 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Previous changes did the necessary improvements to unpack-trees.c and
diff-lib.c in order to modify a sparse index based on its comparision
with a tree. The only remaining work is to remove some
ensure_full_index() calls and add tests that verify that the index is
not expanded in our interesting cases. Include 'switch' and 'restore' in
these tests, as they share a base implementation with 'checkout'.

Here are the relevant performance results from
p2000-sparse-operations.sh:

Test                                     HEAD~1           HEAD
--------------------------------------------------------------------------------
2000.18: git checkout -f - (full-v3)     0.49(0.43+0.03)  0.47(0.39+0.05) -4.1%
2000.19: git checkout -f - (full-v4)     0.45(0.37+0.06)  0.42(0.37+0.05) -6.7%
2000.20: git checkout -f - (sparse-v3)   0.76(0.71+0.07)  0.04(0.03+0.04) -94.7%
2000.21: git checkout -f - (sparse-v4)   0.75(0.72+0.04)  0.05(0.06+0.04) -93.3%

It is important to compare the full index case to the sparse index case,
as the previous results for the sparse index were inflated by the index
expansion. For index v4, this is an 88% improvement.

On an internal repository with over two million paths at HEAD and a
sparse-checkout definition containing ~60,000 of those paths, 'git
checkout' went from 3.5s to 297ms with this change. The theoretical
optimum where only those ~60,000 paths exist was 275ms, so the extra
sparse directory entries contribute a 22ms overhead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/checkout.c                       |  8 +++-----
 t/t1092-sparse-checkout-compatibility.sh | 10 +++++++++-
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index f4cd7747d35..b5d477919a7 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -378,9 +378,6 @@ static int checkout_worktree(const struct checkout_opts *opts,
 	if (pc_workers > 1)
 		init_parallel_checkout();
 
-	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
-
 	for (pos = 0; pos < active_nr; pos++) {
 		struct cache_entry *ce = active_cache[pos];
 		if (ce->ce_flags & CE_MATCHED) {
@@ -530,8 +527,6 @@ static int checkout_paths(const struct checkout_opts *opts,
 	 * Make sure all pathspecs participated in locating the paths
 	 * to be checked out.
 	 */
-	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
 	for (pos = 0; pos < active_nr; pos++)
 		if (opts->overlay_mode)
 			mark_ce_for_checkout_overlay(active_cache[pos],
@@ -1593,6 +1588,9 @@ static int checkout_main(int argc, const char **argv, const char *prefix,
 
 	git_config(git_checkout_config, opts);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	opts->track = BRANCH_TRACK_UNSPECIFIED;
 
 	if (!opts->accept_pathspec && !opts->accept_ref)
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index d3e34d0acac..fde3b41aba8 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -560,7 +560,15 @@ test_expect_success 'sparse-index is not expanded' '
 	echo >>sparse-index/a &&
 	ensure_not_expanded commit --include a -m a &&
 	echo >>sparse-index/deep/deeper1/a &&
-	ensure_not_expanded commit --include deep/deeper1/a -m deeper
+	ensure_not_expanded commit --include deep/deeper1/a -m deeper &&
+	ensure_not_expanded checkout rename-out-to-out &&
+	ensure_not_expanded checkout - &&
+	ensure_not_expanded switch rename-out-to-out &&
+	ensure_not_expanded switch - &&
+	git -C sparse-index reset --hard &&
+	ensure_not_expanded checkout rename-out-to-out -- deep/deeper1 &&
+	git -C sparse-index reset --hard &&
+	ensure_not_expanded restore -s rename-out-to-out -- deep/deeper1
 '
 
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 6/7] t1092: document bad 'git checkout' behavior
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
                     ` (4 preceding siblings ...)
  2021-07-20 20:14   ` [PATCH v2 5/7] checkout: stop expanding sparse indexes Derrick Stolee via GitGitGadget
@ 2021-07-20 20:14   ` Derrick Stolee via GitGitGadget
  2021-07-20 20:14   ` [PATCH v2 7/7] unpack-trees: resolve sparse-directory/file conflicts Derrick Stolee via GitGitGadget
  2021-07-22  4:22   ` [PATCH v2 0/7] Sparse index: integrate with commit and checkout Elijah Newren
  7 siblings, 0 replies; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-07-20 20:14 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.

We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.

Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.

Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.

At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 142 ++++++++++++++++++++++-
 1 file changed, 140 insertions(+), 2 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index fde3b41aba8..79b4a8ce199 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -95,6 +95,25 @@ test_expect_success 'setup' '
 		git add . &&
 		git commit -m "rename deep/deeper1/... to folder1/..." &&
 
+		git checkout -b df-conflict-1 base &&
+		rm -rf folder1 &&
+		echo content >folder1 &&
+		git add . &&
+		git commit -m "dir to file" &&
+
+		git checkout -b df-conflict-2 base &&
+		rm -rf folder2 &&
+		echo content >folder2 &&
+		git add . &&
+		git commit -m "dir to file" &&
+
+		git checkout -b fd-conflict base &&
+		rm a &&
+		mkdir a &&
+		echo content >a/a &&
+		git add . &&
+		git commit -m "file to dir" &&
+
 		git checkout -b deepest base &&
 		echo "updated deepest" >deep/deeper1/deepest/a &&
 		git commit -a -m "update deepest" &&
@@ -358,10 +377,16 @@ test_expect_success 'diff --staged' '
 	test_all_match git diff --staged
 '
 
+# NEEDSWORK: sparse-checkout behaves differently from full-checkout when
+# running this test with 'df-conflict-2' after 'df-conflict-1'.
 test_expect_success 'diff with renames and conflicts' '
 	init_repos &&
 
-	for branch in rename-out-to-out rename-out-to-in rename-in-to-out
+	for branch in rename-out-to-out \
+		      rename-out-to-in \
+		      rename-in-to-out \
+		      df-conflict-1 \
+		      fd-conflict
 	do
 		test_all_match git checkout rename-base &&
 		test_all_match git checkout $branch -- . &&
@@ -371,10 +396,15 @@ test_expect_success 'diff with renames and conflicts' '
 	done
 '
 
+# NEEDSWORK: the sparse-index fails to move HEAD across a directory/file
+# conflict such as when checking out df-conflict-1 and df-conflict2.
 test_expect_success 'diff with directory/file conflicts' '
 	init_repos &&
 
-	for branch in rename-out-to-out rename-out-to-in rename-in-to-out
+	for branch in rename-out-to-out \
+		      rename-out-to-in \
+		      rename-in-to-out \
+		      fd-conflict
 	do
 		git -C full-checkout reset --hard &&
 		test_sparse_match git reset --hard &&
@@ -606,4 +636,112 @@ test_expect_success 'add everything with deep new file' '
 	test_all_match git status --porcelain=v2
 '
 
+# NEEDSWORK: 'git checkout' behaves incorrectly in the case of
+# directory/file conflicts, even without sparse-checkout. Use this
+# test only as a documentation of the incorrect behavior, not a
+# measure of how it _should_ behave.
+test_expect_success 'checkout behaves oddly with df-conflict-1' '
+	init_repos &&
+
+	test_sparse_match git sparse-checkout disable &&
+
+	write_script edit-content <<-\EOF &&
+	echo content >>folder1/larger-content
+	git add folder1
+	EOF
+
+	run_on_all ../edit-content &&
+	test_all_match git status --porcelain=v2 &&
+
+	git -C sparse-checkout sparse-checkout init --cone &&
+	git -C sparse-index sparse-checkout init --cone --sparse-index &&
+
+	test_all_match git status --porcelain=v2 &&
+
+	# This checkout command should fail, because we have a staged
+	# change to folder1/larger-content, but the destination changes
+	# folder1 to a file.
+	git -C full-checkout checkout df-conflict-1 \
+		1>full-checkout-out \
+		2>full-checkout-err &&
+	git -C sparse-checkout checkout df-conflict-1 \
+		1>sparse-checkout-out \
+		2>sparse-checkout-err &&
+
+	# NEEDSWORK: the sparse-index case refuses to change HEAD here,
+	# but for the wrong reason.
+	test_must_fail git -C sparse-index checkout df-conflict-1 \
+		1>sparse-index-out \
+		2>sparse-index-err &&
+
+	# Instead, the checkout deletes the folder1 file and adds the
+	# folder1/larger-content file, leaving all other paths that were
+	# in folder1/ as deleted (without any warning).
+	cat >expect <<-EOF &&
+	D	folder1
+	A	folder1/larger-content
+	EOF
+	test_cmp expect full-checkout-out &&
+	test_cmp expect sparse-checkout-out &&
+
+	# stderr: Switched to branch df-conflict-1
+	test_cmp full-checkout-err sparse-checkout-err
+'
+
+# NEEDSWORK: 'git checkout' behaves incorrectly in the case of
+# directory/file conflicts, even without sparse-checkout. Use this
+# test only as a documentation of the incorrect behavior, not a
+# measure of how it _should_ behave.
+test_expect_success 'checkout behaves oddly with df-conflict-2' '
+	init_repos &&
+
+	test_sparse_match git sparse-checkout disable &&
+
+	write_script edit-content <<-\EOF &&
+	echo content >>folder2/larger-content
+	git add folder2
+	EOF
+
+	run_on_all ../edit-content &&
+	test_all_match git status --porcelain=v2 &&
+
+	git -C sparse-checkout sparse-checkout init --cone &&
+	git -C sparse-index sparse-checkout init --cone --sparse-index &&
+
+	test_all_match git status --porcelain=v2 &&
+
+	# This checkout command should fail, because we have a staged
+	# change to folder1/larger-content, but the destination changes
+	# folder1 to a file.
+	git -C full-checkout checkout df-conflict-2 \
+		1>full-checkout-out \
+		2>full-checkout-err &&
+	git -C sparse-checkout checkout df-conflict-2 \
+		1>sparse-checkout-out \
+		2>sparse-checkout-err &&
+
+	# NEEDSWORK: the sparse-index case refuses to change HEAD
+	# here, but for the wrong reason.
+	test_must_fail git -C sparse-index checkout df-conflict-2 \
+		1>sparse-index-out \
+		2>sparse-index-err &&
+
+	# The full checkout deviates from the df-conflict-1 case here!
+	# It drops the change to folder1/larger-content and leaves the
+	# folder1 path as-is on disk.
+	test_must_be_empty full-checkout-out &&
+
+	# In the sparse-checkout case, the checkout deletes the folder1
+	# file and adds the folder1/larger-content file, leaving all other
+	# paths that were in folder1/ as deleted (without any warning).
+	cat >expect <<-EOF &&
+	D	folder2
+	A	folder2/larger-content
+	EOF
+	test_cmp expect sparse-checkout-out &&
+
+	# Switched to branch df-conflict-1
+	test_cmp full-checkout-err sparse-checkout-err
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 7/7] unpack-trees: resolve sparse-directory/file conflicts
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
                     ` (5 preceding siblings ...)
  2021-07-20 20:14   ` [PATCH v2 6/7] t1092: document bad 'git checkout' behavior Derrick Stolee via GitGitGadget
@ 2021-07-20 20:14   ` Derrick Stolee via GitGitGadget
  2021-07-22  4:19     ` Elijah Newren
  2021-07-22  4:22   ` [PATCH v2 0/7] Sparse index: integrate with commit and checkout Elijah Newren
  7 siblings, 1 reply; 21+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-07-20 20:14 UTC (permalink / raw)
  To: git
  Cc: gitster, newren, matheus.bernardino, stolee, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.

There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.

When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.

However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().

That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 24 ++++++++++++------------
 unpack-trees.c                           | 11 +++++++++++
 2 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 79b4a8ce199..91e30d6ec22 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -396,14 +396,14 @@ test_expect_success 'diff with renames and conflicts' '
 	done
 '
 
-# NEEDSWORK: the sparse-index fails to move HEAD across a directory/file
-# conflict such as when checking out df-conflict-1 and df-conflict2.
 test_expect_success 'diff with directory/file conflicts' '
 	init_repos &&
 
 	for branch in rename-out-to-out \
 		      rename-out-to-in \
 		      rename-in-to-out \
+		      df-conflict-1 \
+		      df-conflict-2 \
 		      fd-conflict
 	do
 		git -C full-checkout reset --hard &&
@@ -667,10 +667,7 @@ test_expect_success 'checkout behaves oddly with df-conflict-1' '
 	git -C sparse-checkout checkout df-conflict-1 \
 		1>sparse-checkout-out \
 		2>sparse-checkout-err &&
-
-	# NEEDSWORK: the sparse-index case refuses to change HEAD here,
-	# but for the wrong reason.
-	test_must_fail git -C sparse-index checkout df-conflict-1 \
+	git -C sparse-index checkout df-conflict-1 \
 		1>sparse-index-out \
 		2>sparse-index-err &&
 
@@ -684,7 +681,11 @@ test_expect_success 'checkout behaves oddly with df-conflict-1' '
 	test_cmp expect full-checkout-out &&
 	test_cmp expect sparse-checkout-out &&
 
+	# The sparse-index reports no output
+	test_must_be_empty sparse-index-out &&
+
 	# stderr: Switched to branch df-conflict-1
+	test_cmp full-checkout-err sparse-checkout-err &&
 	test_cmp full-checkout-err sparse-checkout-err
 '
 
@@ -719,17 +720,15 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' '
 	git -C sparse-checkout checkout df-conflict-2 \
 		1>sparse-checkout-out \
 		2>sparse-checkout-err &&
-
-	# NEEDSWORK: the sparse-index case refuses to change HEAD
-	# here, but for the wrong reason.
-	test_must_fail git -C sparse-index checkout df-conflict-2 \
+	git -C sparse-index checkout df-conflict-2 \
 		1>sparse-index-out \
 		2>sparse-index-err &&
 
 	# The full checkout deviates from the df-conflict-1 case here!
 	# It drops the change to folder1/larger-content and leaves the
-	# folder1 path as-is on disk.
+	# folder1 path as-is on disk. The sparse-index behaves the same.
 	test_must_be_empty full-checkout-out &&
+	test_must_be_empty sparse-index-out &&
 
 	# In the sparse-checkout case, the checkout deletes the folder1
 	# file and adds the folder1/larger-content file, leaving all other
@@ -741,7 +740,8 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' '
 	test_cmp expect sparse-checkout-out &&
 
 	# Switched to branch df-conflict-1
-	test_cmp full-checkout-err sparse-checkout-err
+	test_cmp full-checkout-err sparse-checkout-err &&
+	test_cmp full-checkout-err sparse-index-err
 '
 
 test_done
diff --git a/unpack-trees.c b/unpack-trees.c
index 0a5135ab397..309c1352f5d 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2619,6 +2619,17 @@ int twoway_merge(const struct cache_entry * const *src,
 			 same(current, oldtree) && !same(current, newtree)) {
 			/* 20 or 21 */
 			return merged_entry(newtree, current, o);
+		} else if (current && !oldtree && newtree &&
+			   S_ISSPARSEDIR(current->ce_mode) != S_ISSPARSEDIR(newtree->ce_mode) &&
+			   ce_stage(current) == 0) {
+			/*
+			 * This case is a directory/file conflict across the sparse-index
+			 * boundary. When we are changing from one path to another via
+			 * 'git checkout', then we want to replace one entry with another
+			 * via merged_entry(). If there are staged changes, then we should
+			 * reject the merge instead.
+			 */
+			return merged_entry(newtree, current, o);
 		} else
 			return reject_merge(current, o);
 	}
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 7/7] unpack-trees: resolve sparse-directory/file conflicts
  2021-07-20 20:14   ` [PATCH v2 7/7] unpack-trees: resolve sparse-directory/file conflicts Derrick Stolee via GitGitGadget
@ 2021-07-22  4:19     ` Elijah Newren
  0 siblings, 0 replies; 21+ messages in thread
From: Elijah Newren @ 2021-07-22  4:19 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Matheus Tavares Bernardino,
	Derrick Stolee, Derrick Stolee, Derrick Stolee

On Tue, Jul 20, 2021 at 1:14 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> When running unpack_trees() with a sparse index, we attempt to operate
> on the index without expanding the sparse directory entries. Thus, we
> operate by manipulating entire directories and passing them to the
> unpack function. In the case of the 'git checkout' command, this is the
> twoway_merge() function.
>
> There are several cases in twoway_merge() that handle different
> situations. One new one to add is the case of a directory/file conflict
> where the directory is sparse. Before the sparse index, such a conflict
> would appear as a list of file additions and deletions. Now,
> twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
> src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
> equal to the df_conflict_entry. The way to determine that we have a
> directory/file conflict is to test that 'current' and 'newtree' disagree
> on being sparse directory entries.
>
> When we are in this case, we want to resolve the situation by calling
> merged_entry(). This allows replacing the 'current' entry with the
> 'newtree' entry. This is important for cases where we want to run 'git
> checkout' across the conflict and have the new HEAD represent the new
> file type at that path. The first NEEDSWORK comment dropped in t1092
> demonstrates this necessary behavior.
>
> However, we still are in a confusing state when 'current' corresponds to
> a staged change within a sparse directory that is not present at HEAD.
> This should be atypical, because it requires adding a change outside of
> the sparse-checkout cone, but it is possible. Since we are unable to
> determine that this is a staged change within twoway_merge(), we cannot
> add a case to reject the merge at this point. I believe this is due to
> the use of df_conflict_entry in the place of 'oldtree' instead of using
> the valud at HEAD, which would provide some perspective to this
> decision. Any change that would allow this differentiation for staged
> entries would need to involve information further up in unpack_trees().
>
> That work should be done, sometime, because we are further confusing the
> behavior of a directory/file conflict when staging a change in the
> directory. The two cases 'checkout behaves oddly with df-conflict-?' in
> t1092 demonstrate that even without a sparse-checkout, Git is not
> consistent in its behavior. Neither of the two options seems correct,
> either. This change makes the sparse-index behave differently than the
> typcial sparse-checkout case, but it does match the full checkout
> behavior in the df-conflict-2 case.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  t/t1092-sparse-checkout-compatibility.sh | 24 ++++++++++++------------
>  unpack-trees.c                           | 11 +++++++++++
>  2 files changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 79b4a8ce199..91e30d6ec22 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -396,14 +396,14 @@ test_expect_success 'diff with renames and conflicts' '
>         done
>  '
>
> -# NEEDSWORK: the sparse-index fails to move HEAD across a directory/file
> -# conflict such as when checking out df-conflict-1 and df-conflict2.
>  test_expect_success 'diff with directory/file conflicts' '
>         init_repos &&
>
>         for branch in rename-out-to-out \
>                       rename-out-to-in \
>                       rename-in-to-out \
> +                     df-conflict-1 \
> +                     df-conflict-2 \
>                       fd-conflict
>         do
>                 git -C full-checkout reset --hard &&
> @@ -667,10 +667,7 @@ test_expect_success 'checkout behaves oddly with df-conflict-1' '
>         git -C sparse-checkout checkout df-conflict-1 \
>                 1>sparse-checkout-out \
>                 2>sparse-checkout-err &&
> -
> -       # NEEDSWORK: the sparse-index case refuses to change HEAD here,
> -       # but for the wrong reason.
> -       test_must_fail git -C sparse-index checkout df-conflict-1 \
> +       git -C sparse-index checkout df-conflict-1 \
>                 1>sparse-index-out \
>                 2>sparse-index-err &&
>
> @@ -684,7 +681,11 @@ test_expect_success 'checkout behaves oddly with df-conflict-1' '
>         test_cmp expect full-checkout-out &&
>         test_cmp expect sparse-checkout-out &&
>
> +       # The sparse-index reports no output
> +       test_must_be_empty sparse-index-out &&
> +
>         # stderr: Switched to branch df-conflict-1
> +       test_cmp full-checkout-err sparse-checkout-err &&
>         test_cmp full-checkout-err sparse-checkout-err
>  '
>
> @@ -719,17 +720,15 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' '
>         git -C sparse-checkout checkout df-conflict-2 \
>                 1>sparse-checkout-out \
>                 2>sparse-checkout-err &&
> -
> -       # NEEDSWORK: the sparse-index case refuses to change HEAD
> -       # here, but for the wrong reason.
> -       test_must_fail git -C sparse-index checkout df-conflict-2 \
> +       git -C sparse-index checkout df-conflict-2 \
>                 1>sparse-index-out \
>                 2>sparse-index-err &&
>
>         # The full checkout deviates from the df-conflict-1 case here!
>         # It drops the change to folder1/larger-content and leaves the
> -       # folder1 path as-is on disk.
> +       # folder1 path as-is on disk. The sparse-index behaves the same.
>         test_must_be_empty full-checkout-out &&
> +       test_must_be_empty sparse-index-out &&
>
>         # In the sparse-checkout case, the checkout deletes the folder1
>         # file and adds the folder1/larger-content file, leaving all other
> @@ -741,7 +740,8 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' '
>         test_cmp expect sparse-checkout-out &&
>
>         # Switched to branch df-conflict-1
> -       test_cmp full-checkout-err sparse-checkout-err
> +       test_cmp full-checkout-err sparse-checkout-err &&
> +       test_cmp full-checkout-err sparse-index-err
>  '
>
>  test_done
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 0a5135ab397..309c1352f5d 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -2619,6 +2619,17 @@ int twoway_merge(const struct cache_entry * const *src,
>                          same(current, oldtree) && !same(current, newtree)) {
>                         /* 20 or 21 */
>                         return merged_entry(newtree, current, o);
> +               } else if (current && !oldtree && newtree &&
> +                          S_ISSPARSEDIR(current->ce_mode) != S_ISSPARSEDIR(newtree->ce_mode) &&
> +                          ce_stage(current) == 0) {
> +                       /*
> +                        * This case is a directory/file conflict across the sparse-index
> +                        * boundary. When we are changing from one path to another via
> +                        * 'git checkout', then we want to replace one entry with another
> +                        * via merged_entry(). If there are staged changes, then we should
> +                        * reject the merge instead.
> +                        */
> +                       return merged_entry(newtree, current, o);
>                 } else
>                         return reject_merge(current, o);
>         }
> --

I'm still a bit unhappy with the unpack-trees.c change (I wonder if
having "path/" vs "path" is going to make D/F conflicts hard to handle
and whether we need to make unpack_trees do something special to make
both paths be considered at the same time with one call to
twoway_merge() instead of two in order to fix this), BUT I think
you've done a really good job of documenting it and pointing out that
unpack_trees() messes up even without sparse checkouts on D/F
conflicts (though in a different way).  I think you've documented it
well enough, and argued about the likelihood of issues well enough,
that it makes sense to proceed and circle back and fix this up later.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/7] Sparse index: integrate with commit and checkout
  2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
                     ` (6 preceding siblings ...)
  2021-07-20 20:14   ` [PATCH v2 7/7] unpack-trees: resolve sparse-directory/file conflicts Derrick Stolee via GitGitGadget
@ 2021-07-22  4:22   ` Elijah Newren
  7 siblings, 0 replies; 21+ messages in thread
From: Elijah Newren @ 2021-07-22  4:22 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Matheus Tavares Bernardino,
	Derrick Stolee, Derrick Stolee

On Tue, Jul 20, 2021 at 1:14 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This series extends our integration of sparse-index to 'git commit' and 'git
> checkout'.
>
> This is based on ds/status-with-sparse-index (v7) and v2.32.0. The hard work
> was already done in that topic, so these changes are simple.
>
> Recall that we have delayed our integration with 'git add' until we can work
> out the concerns about how to deal with pathspecs outside of the
> sparse-checkout definition. Those concerns might have some overlap with how
> 'git commit' takes a pathspec, but this seems like a rare enough case to
> handle here and we can be more careful with the behavior change in the next
> series which will integrate with git add.
>
> In addition to the tests that already exist in t1092, I have integrated
> these changes in microsoft/git and tested them against the Scalar functional
> tests, which go through quite a few complicated scenarios, verifying that
> things work the same across the full index and sparse-index cases.
>
>
> Update in V2
> ============
>
>  * There is no change to the code, but it is presented in a slightly
>    different order.
>  * We've been discussing some complicated directory/file conflict cases, in
>    particular with a staged change inside the directory. These tests are
>    added and described as documenting incorrect behavior that should be
>    changed.
>  * After those tests are in place, we can motivate the change to
>    twoway_merge() as necessary for a more-common situation (still rare) but
>    still incorrect in an already-broken situation. Hopefully that balance is
>    sufficient for now, until we can do the bigger work of fixing the bad
>    behavior.

I read the first five patches previously.  The tiny changes there in
the range-diff still look good to me.

I very much appreciate the new patch 6.

As noted in 7/7, I'm a little unhappy with the patch to
twoway_merge(), BUT you've clearly documented the shortcomings in very
good detail and pointed out how git has (likely for decades) messed up
in related ways for non-sparse checkouts with D/F conflicts.  You've
documented it well enough and argued well enough about the relative
merits, that I have to agree with you that this is a good step
forward.  I do hope we circle back and tie up the loose ends at some
point.

So, the whole series is:

Reviewed-by: Elijah Newren <newren@gmail.com>

>
> Thanks, -Stolee
>
> Derrick Stolee (7):
>   p2000: add 'git checkout -' test and decrease depth
>   p2000: compress repo names
>   commit: integrate with sparse-index
>   sparse-index: recompute cache-tree
>   checkout: stop expanding sparse indexes
>   t1092: document bad 'git checkout' behavior
>   unpack-trees: resolve sparse-directory/file conflicts
>
>  builtin/checkout.c                       |   8 +-
>  builtin/commit.c                         |   3 +
>  cache-tree.c                             |   2 -
>  sparse-index.c                           |   2 +
>  t/perf/p2000-sparse-operations.sh        |  47 ++++--
>  t/t1092-sparse-checkout-compatibility.sh | 197 ++++++++++++++++++++++-
>  unpack-trees.c                           |  11 ++
>  7 files changed, 240 insertions(+), 30 deletions(-)
>
>
> base-commit: e5ca291076a8a936283bb2c57433c4393d3f80c2
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-973%2Fderrickstolee%2Fsparse-index%2Fcommit-and-checkout-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-973/derrickstolee/sparse-index/commit-and-checkout-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/973
>
> Range-diff vs v1:
>
>  1:  bb3dd1fdd48 = 1:  6e74958f590 p2000: add 'git checkout -' test and decrease depth
>  2:  eb15bf37685 = 2:  3e1d03c41be p2000: compress repo names
>  3:  413babe6e77 ! 3:  cd94f820052 commit: integrate with sparse-index
>      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is e
>       + ensure_not_expanded commit --include deep/deeper1/a -m deeper
>        '
>
>      - test_expect_success 'reset mixed and checkout orphan' '
>      + # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
>  4:  ffe8473caab = 4:  65e79b8037c sparse-index: recompute cache-tree
>  5:  8710fee36b7 ! 5:  e9a9981477e checkout: stop expanding sparse indexes
>      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
>       + ensure_not_expanded restore -s rename-out-to-out -- deep/deeper1
>        '
>
>      - test_expect_success 'reset mixed and checkout orphan' '
>      + # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
>  -:  ----------- > 6:  4b801c854fb t1092: document bad 'git checkout' behavior
>  -:  ----------- > 7:  71e301501c8 unpack-trees: resolve sparse-directory/file conflicts
>
> --
> gitgitgadget

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-07-22  4:22 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-29  2:13 [PATCH 0/5] Sparse index: integrate with commit and checkout Derrick Stolee via GitGitGadget
2021-06-29  2:13 ` [PATCH 1/5] p2000: add 'git checkout -' test and decrease depth Derrick Stolee via GitGitGadget
2021-06-29  2:13 ` [PATCH 2/5] p2000: compress repo names Derrick Stolee via GitGitGadget
2021-06-29  2:13 ` [PATCH 3/5] commit: integrate with sparse-index Derrick Stolee via GitGitGadget
2021-06-29  2:13 ` [PATCH 4/5] sparse-index: recompute cache-tree Derrick Stolee via GitGitGadget
2021-06-29  2:13 ` [PATCH 5/5] checkout: stop expanding sparse indexes Derrick Stolee via GitGitGadget
2021-07-09 21:26 ` [PATCH 0/5] Sparse index: integrate with commit and checkout Elijah Newren
2021-07-12 18:46   ` Derrick Stolee
2021-07-16 13:59     ` Derrick Stolee
2021-07-17 15:37       ` Elijah Newren
2021-07-19 14:05         ` Derrick Stolee
2021-07-20 20:14 ` [PATCH v2 0/7] " Derrick Stolee via GitGitGadget
2021-07-20 20:14   ` [PATCH v2 1/7] p2000: add 'git checkout -' test and decrease depth Derrick Stolee via GitGitGadget
2021-07-20 20:14   ` [PATCH v2 2/7] p2000: compress repo names Derrick Stolee via GitGitGadget
2021-07-20 20:14   ` [PATCH v2 3/7] commit: integrate with sparse-index Derrick Stolee via GitGitGadget
2021-07-20 20:14   ` [PATCH v2 4/7] sparse-index: recompute cache-tree Derrick Stolee via GitGitGadget
2021-07-20 20:14   ` [PATCH v2 5/7] checkout: stop expanding sparse indexes Derrick Stolee via GitGitGadget
2021-07-20 20:14   ` [PATCH v2 6/7] t1092: document bad 'git checkout' behavior Derrick Stolee via GitGitGadget
2021-07-20 20:14   ` [PATCH v2 7/7] unpack-trees: resolve sparse-directory/file conflicts Derrick Stolee via GitGitGadget
2021-07-22  4:19     ` Elijah Newren
2021-07-22  4:22   ` [PATCH v2 0/7] Sparse index: integrate with commit and checkout Elijah Newren

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).