git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/7] Sparse index: integrate with 'git stash'
@ 2022-04-25 17:49 Victoria Dye via GitGitGadget
  2022-04-25 17:49 ` [PATCH 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
                   ` (8 more replies)
  0 siblings, 9 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-25 17:49 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye

This series, in combination with the sparse index integrations of reset [1],
update-index [2], checkout-index [2], clean [2], and read-tree [3], allows
most subcommands of 'git stash' to use the sparse index end-to-end without
index expansion.

Like the earlier series, this series starts with new tests ensuring
compatibility of the sparse index with non-sparse index full and sparse
checkouts [1/7]. Next, sparse index is trivially enabled [2/7].
Functionally, sparse index-enabled sparse-checkouts remain compatible with
non-sparse index sparse-checkouts, but there are still some cases where the
index (or a temporary index) is expanded unnecessarily. These cases are
fixed in three parts:

 * First, 'git stash -u' is made sparse index-compatible by ensuring the
   "temporary" index holding the stashed, untracked files is created as a
   sparse index whenever possible (per repo settings &
   'is_sparse_index_allowed()'). Patch [3/7] exposes
   'is_sparse_index_allowed()' to files outside of 'sparse-index.c', then
   patch [4/7] uses that function to mark the temporary index sparse when
   appropriate.
 * Next, 'git stash (apply|pop)' are made sparse index-compatible by
   changing their internal "merge" function (executed via
   'merge_recursive_generic()') from 'merge_recursive()' to
   'merge_ort_recursive()'. This requires first allowing
   'merge_recursive_generic()' to accept a merge function as an input
   (rather than hardcoding use of 'merge_recursive()') in patch [5/7], then
   changing the call in 'stash.c' to specify 'merge_ort_recursive()' in
   patch [6/7]. See note [4] for possible alternate implementations.
 * Finally, while patches 5 & 6 avoid index expansion for most cases of 'git
   stash (apply|pop)', applying a stash that includes untracked files still
   expands the index. This is a result of an internal 'read-tree' execution
   (specifically in its 'unpack_trees' call) creating a result index that is
   never sparse in-core, thus forcing the index to be unnecessarily
   collapsed and re-expanded in 'do_write_locked_index()'. In patch [7/7],
   'unpack_trees' is updated to set the default sparsity of the resultant
   index to "sparse" if allowed by repo settings and
   'is_sparse_index_allowed()' (similar to the change in patch 4).

Performance results (from the 'p2000' tests):

(git stash &&
 git stash pop)              master            this series
---------------------------------------------------------------------
full-v3                      4.07(2.42+1.34)   3.98(2.42+1.32) -2.2%
full-v4                      4.05(2.46+1.31)   4.00(2.49+1.29) -1.2%
sparse-v3                    7.48(4.81+2.57)   1.53(0.26+1.61) -79.5%
sparse-v4                    7.35(4.74+2.54)   1.59(0.27+1.63) -78.4%

(echo >>new &&
 git stash -u &&
 git stash pop)              master            this series
---------------------------------------------------------------------
full-v3                      4.21(2.62+1.45)   4.11(2.55+1.44) -2.4%
full-v4                      4.11(2.51+1.41)   4.02(2.49+1.41) -2.2%
sparse-v3                    7.35(4.64+2.66)   1.70(0.32+1.64) -76.9%
sparse-v4                    7.74(4.87+2.83)   1.70(0.32+1.66) -78.0%


[1]
https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@gmail.com/
[2]
https://lore.kernel.org/git/pull.1109.v2.git.1641924306.gitgitgadget@gmail.com/
[3]
https://lore.kernel.org/git/pull.1157.v3.git.1646166271.gitgitgadget@gmail.com/
[4] I went with changing 'stash' to always use 'merge-ort' in
'merge_recursive_generic()' as a sort of "middle ground" between "replace
'merge_recursive()' with 'merge_ort_recursive()' in all of its hardcoded
internal usage" and "only use 'merge-ort' if using a sparse index in 'git
stash', otherwise 'merge-recursive'". The former would extend the use of
'merge-ort' to 'git am' and 'git merge-recursive', whereas the latter is a
more cautious/narrowly-focused option. If anyone has any other thoughts, I'm
interested in hearing them.

Thanks!

-Victoria

Victoria Dye (7):
  stash: expand sparse-checkout compatibility testing
  stash: integrate with sparse index
  sparse-index: expose 'is_sparse_index_allowed()'
  read-cache: set sparsity when index is new
  merge-recursive: add merge function arg to 'merge_recursive_generic'
  stash: merge applied stash with merge-ort
  unpack-trees: preserve index sparsity

 builtin/am.c                             |  2 +-
 builtin/merge-recursive.c                |  2 +-
 builtin/stash.c                          |  6 +-
 merge-ort.c                              |  3 +-
 merge-recursive.c                        |  4 +-
 merge-recursive.h                        |  9 ++-
 read-cache.c                             | 18 ++++-
 sparse-index.c                           |  2 +-
 sparse-index.h                           |  1 +
 t/perf/p2000-sparse-operations.sh        |  2 +
 t/t1092-sparse-checkout-compatibility.sh | 94 +++++++++++++++++++++---
 unpack-trees.c                           |  6 ++
 12 files changed, 131 insertions(+), 18 deletions(-)


base-commit: 6cd33dceed60949e2dbc32e3f0f5e67c4c882e1e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1171%2Fvdye%2Fsparse%2Fstash-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1171/vdye/sparse/stash-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1171
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 1/7] stash: expand sparse-checkout compatibility testing
  2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
@ 2022-04-25 17:49 ` Victoria Dye via GitGitGadget
  2022-04-25 17:49 ` [PATCH 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-25 17:49 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Add tests verifying expected 'git stash' behavior in
't1092-sparse-checkout-compatibility'. These cases establish the expected
behavior of 'git stash' in a sparse-checkout and verify consistency both
with and without a sparse index. Although no sparse index compatibility has
been integrated into 'git stash' yet, the tests are all 'expect_success' -
we don't want the cone-mode sparse-checkout behavior to change depending on
whether it is using a sparse index or not. Therefore, we expect these tests
to continue passing once sparse index is integrated with 'git stash'.

Additionally, add performance test cases for 'git stash' both with and
without untracked files. Note that, unlike the other tests in
'p2000-sparse-operations.sh', the tests added for 'stash' are combination
operations. This is done to ensure the stash/unstash is not blocked by the
modification of '$SPARSE_CONE/a' performed as part of 'test_perf_on_all'.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/perf/p2000-sparse-operations.sh        |  2 +
 t/t1092-sparse-checkout-compatibility.sh | 49 ++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 382716cfca9..76710cbef35 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -106,6 +106,8 @@ test_perf_on_all () {
 }
 
 test_perf_on_all git status
+test_perf_on_all 'git stash && git stash pop'
+test_perf_on_all 'echo >>new && git stash -u && git stash pop'
 test_perf_on_all git add -A
 test_perf_on_all git add .
 test_perf_on_all git commit -a -m A
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 236ab530284..baf95906729 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1151,6 +1151,55 @@ test_expect_success 'clean' '
 	test_sparse_match test_path_is_dir folder1
 '
 
+test_expect_success 'stash' '
+	init_repos &&
+
+	write_script edit-contents <<-\EOF &&
+	echo text >>$1
+	EOF
+
+	# Stash a sparse directory (folder1)
+	test_all_match git checkout -b test-branch rename-base &&
+	test_all_match git reset --soft rename-out-to-out &&
+	test_all_match git stash &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Apply the sparse directory stash without reinstating the index
+	test_all_match git stash apply -q &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Reset to state where stash can be applied
+	test_sparse_match git sparse-checkout reapply &&
+	test_all_match git reset --hard rename-out-to-out &&
+
+	# Apply the sparse directory stash *with* reinstating the index
+	test_all_match git stash apply --index -q &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Reset to state where we will get a conflict applying the stash
+	test_sparse_match git sparse-checkout reapply &&
+	test_all_match git reset --hard update-folder1 &&
+
+	# Apply the sparse directory stash with conflicts
+	test_all_match test_must_fail git stash apply --index -q &&
+	test_all_match test_must_fail git stash apply -q &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Reset to base branch
+	test_sparse_match git sparse-checkout reapply &&
+	test_all_match git reset --hard base &&
+
+	# Stash & unstash an untracked file outside of the sparse checkout
+	# definition.
+	run_on_sparse mkdir -p folder1 &&
+	run_on_all ../edit-contents folder1/new &&
+	test_all_match git stash -u &&
+	test_all_match git status --porcelain=v2 &&
+
+	test_all_match git stash pop -q &&
+	test_all_match git status --porcelain=v2
+'
+
 test_expect_success 'submodule handling' '
 	init_repos &&
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 2/7] stash: integrate with sparse index
  2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
  2022-04-25 17:49 ` [PATCH 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
@ 2022-04-25 17:49 ` Victoria Dye via GitGitGadget
  2022-04-25 21:34   ` Junio C Hamano
  2022-04-26 12:53   ` Derrick Stolee
  2022-04-25 17:49 ` [PATCH 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-25 17:49 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.

With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.

Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 builtin/stash.c                          |  3 ++
 t/t1092-sparse-checkout-compatibility.sh | 45 +++++++++++++++++++-----
 2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index 0c7b6a95882..1bfba532044 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1770,6 +1770,9 @@ int cmd_stash(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, git_stash_usage,
 			     PARSE_OPT_KEEP_UNKNOWN | PARSE_OPT_KEEP_DASHDASH);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	index_file = get_index_file();
 	strbuf_addf(&stash_index_path, "%s.stash.%" PRIuMAX, index_file,
 		    (uintmax_t)pid);
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index baf95906729..b00c65c7770 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1271,7 +1271,10 @@ test_expect_success 'index.sparse disabled inline uses full index' '
 
 ensure_not_expanded () {
 	rm -f trace2.txt &&
-	echo >>sparse-index/untracked.txt &&
+	if test -z $WITHOUT_UNTRACKED_TXT
+	then
+		echo >>sparse-index/untracked.txt
+	fi &&
 
 	if test "$1" = "!"
 	then
@@ -1314,14 +1317,6 @@ test_expect_success 'sparse-index is not expanded' '
 	echo >>sparse-index/untracked.txt &&
 	ensure_not_expanded add . &&
 
-	ensure_not_expanded checkout-index -f a &&
-	ensure_not_expanded checkout-index -f --all &&
-	for ref in update-deep update-folder1 update-folder2 update-deep
-	do
-		echo >>sparse-index/README.md &&
-		ensure_not_expanded reset --hard $ref || return 1
-	done &&
-
 	ensure_not_expanded reset --mixed base &&
 	ensure_not_expanded reset --hard update-deep &&
 	ensure_not_expanded reset --keep base &&
@@ -1375,6 +1370,38 @@ test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
 	)
 '
 
+test_expect_success 'sparse-index is not expanded: stash' '
+	init_repos &&
+
+	echo >>sparse-index/a &&
+	ensure_not_expanded stash &&
+	ensure_not_expanded stash list &&
+	ensure_not_expanded stash show stash@{0} &&
+	! ensure_not_expanded stash apply stash@{0} &&
+	ensure_not_expanded stash drop stash@{0} &&
+
+	echo >>sparse-index/deep/new &&
+	! ensure_not_expanded stash -u &&
+	(
+		WITHOUT_UNTRACKED_TXT=1 &&
+		! ensure_not_expanded stash pop
+	) &&
+
+	ensure_not_expanded stash create &&
+	oid=$(git -C sparse-index stash create) &&
+	ensure_not_expanded stash store -m "test" $oid &&
+	ensure_not_expanded reset --hard &&
+	! ensure_not_expanded stash pop &&
+
+	ensure_not_expanded checkout-index -f a &&
+	ensure_not_expanded checkout-index -f --all &&
+	for ref in update-deep update-folder1 update-folder2 update-deep
+	do
+		echo >>sparse-index/README.md &&
+		ensure_not_expanded reset --hard $ref || return 1
+	done
+'
+
 test_expect_success 'sparse index is not expanded: diff' '
 	init_repos &&
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 3/7] sparse-index: expose 'is_sparse_index_allowed()'
  2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
  2022-04-25 17:49 ` [PATCH 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
  2022-04-25 17:49 ` [PATCH 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
@ 2022-04-25 17:49 ` Victoria Dye via GitGitGadget
  2022-04-25 17:49 ` [PATCH 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-25 17:49 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Expose 'is_sparse_index_allowed()' publicly so that it may be used by
callers outside of 'sparse-index.c'. While no such callers exist yet, it
will be used in a subsequent commit.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 sparse-index.c | 2 +-
 sparse-index.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/sparse-index.c b/sparse-index.c
index 8636af72de5..ffbab7d35f1 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -118,7 +118,7 @@ static int index_has_unmerged_entries(struct index_state *istate)
 	return 0;
 }
 
-static int is_sparse_index_allowed(struct index_state *istate, int flags)
+int is_sparse_index_allowed(struct index_state *istate, int flags)
 {
 	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
diff --git a/sparse-index.h b/sparse-index.h
index 633d4fb7e31..f57c65d972f 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -3,6 +3,7 @@
 
 struct index_state;
 #define SPARSE_INDEX_MEMORY_ONLY (1 << 0)
+int is_sparse_index_allowed(struct index_state *istate, int flags);
 int convert_to_sparse(struct index_state *istate, int flags);
 void ensure_correct_sparsity(struct index_state *istate);
 void clear_skip_worktree_from_present_files(struct index_state *istate);
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 4/7] read-cache: set sparsity when index is new
  2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-04-25 17:49 ` [PATCH 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
@ 2022-04-25 17:49 ` Victoria Dye via GitGitGadget
  2022-04-25 21:35   ` Junio C Hamano
  2022-04-25 17:49 ` [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-25 17:49 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

When the index read in 'do_read_index()' does not exist on-disk, mark the
index "sparse" if the executing command does not require a full index and
sparse index is otherwise enabled.

Some commands (such as 'git stash -u') implicitly create a new index (when
the 'GIT_INDEX_FILE' variable points to a non-existent file) and perform
some operation on it. However, when this index is created, it isn't created
with the same sparsity settings as the repo index. As a result, while these
indexes may be sparse during the operation, they are always expanded before
being written to disk. We can avoid that expansion by defaulting the index
to "sparse", in which case it will only be expanded if the full index is
needed.

Note that the function 'set_new_index_sparsity()' is created despite having
only a single caller because additional callers will be added in a
subsequent patch.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 read-cache.c                             | 18 +++++++++++++++++-
 t/t1092-sparse-checkout-compatibility.sh |  2 +-
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 4df97e185e9..60355f5ad6a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2260,6 +2260,20 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con
 	return consumed;
 }
 
+static void set_new_index_sparsity(struct index_state *istate)
+{
+	/*
+	 * If the index's repo exists, mark it sparse according to
+	 * repo settings.
+	 */
+	if (istate->repo) {
+		prepare_repo_settings(istate->repo);
+		if (!istate->repo->settings.command_requires_full_index &&
+		    is_sparse_index_allowed(istate, 0))
+			istate->sparse_index = 1;
+	}
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int do_read_index(struct index_state *istate, const char *path, int must_exist)
 {
@@ -2281,8 +2295,10 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	istate->timestamp.nsec = 0;
 	fd = open(path, O_RDONLY);
 	if (fd < 0) {
-		if (!must_exist && errno == ENOENT)
+		if (!must_exist && errno == ENOENT) {
+			set_new_index_sparsity(istate);
 			return 0;
+		}
 		die_errno(_("%s: index file open failed"), path);
 	}
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index b00c65c7770..a8c1c345ab0 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1381,7 +1381,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	ensure_not_expanded stash drop stash@{0} &&
 
 	echo >>sparse-index/deep/new &&
-	! ensure_not_expanded stash -u &&
+	ensure_not_expanded stash -u &&
 	(
 		WITHOUT_UNTRACKED_TXT=1 &&
 		! ensure_not_expanded stash pop
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic'
  2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
                   ` (3 preceding siblings ...)
  2022-04-25 17:49 ` [PATCH 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
@ 2022-04-25 17:49 ` Victoria Dye via GitGitGadget
  2022-04-25 21:38   ` Junio C Hamano
  2022-04-25 17:49 ` [PATCH 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-25 17:49 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Replace the hardcoded 'merge_recursive()' function used by the
'merge_recursive_generic()' with a caller-specific merge function. This will
allow us to use 'merge_ort_recursive()' (and therefore avoid the index
expansion of 'merge_recursive()') in commands that perform merges with
'merge_recursive_generic()', such as 'git stash pop'.

Note that this patch is strictly a refactor; all callers still use
'merge_recursive()', and any changing to 'merge_ort_recursive()' will be
done in a later commit.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 builtin/am.c              | 2 +-
 builtin/merge-recursive.c | 2 +-
 builtin/stash.c           | 2 +-
 merge-ort.c               | 3 ++-
 merge-recursive.c         | 4 ++--
 merge-recursive.h         | 9 ++++++++-
 6 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 0f4111bafa0..6d01185d122 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1614,7 +1614,7 @@ static int fall_back_threeway(const struct am_state *state, const char *index_pa
 	if (state->quiet)
 		o.verbosity = 0;
 
-	if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, &result)) {
+	if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, merge_recursive, &result)) {
 		repo_rerere(the_repository, state->allow_rerere_autoupdate);
 		free(their_tree_name);
 		return error(_("Failed to merge in the changes."));
diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index b9acbf5d342..687ed1e527b 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -81,7 +81,7 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
 
-	failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, &result);
+	failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, merge_recursive, &result);
 
 	free(better1);
 	free(better2);
diff --git a/builtin/stash.c b/builtin/stash.c
index 1bfba532044..16171eb1dab 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -554,7 +554,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
 	bases[0] = &info->b_tree;
 
 	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
-				      &result);
+				      merge_recursive, &result);
 	if (ret) {
 		rerere(0);
 
diff --git a/merge-ort.c b/merge-ort.c
index 8545354dafd..4bccdfcf355 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4737,7 +4737,8 @@ void merge_incore_recursive(struct merge_options *opt,
 	trace2_region_enter("merge", "incore_recursive", opt->repo);
 
 	/* We set the ancestor label based on the merge_bases */
-	assert(opt->ancestor == NULL);
+	assert(opt->ancestor == NULL ||
+	       !strcmp(opt->ancestor, "constructed merge base"));
 
 	trace2_region_enter("merge", "merge_start", opt->repo);
 	merge_start(opt, result);
diff --git a/merge-recursive.c b/merge-recursive.c
index 1ee6364e8b1..2088f5c5fb3 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3806,6 +3806,7 @@ int merge_recursive_generic(struct merge_options *opt,
 			    const struct object_id *merge,
 			    int num_merge_bases,
 			    const struct object_id **merge_bases,
+			    recursive_merge_fn_t merge_fn,
 			    struct commit **result)
 {
 	int clean;
@@ -3829,8 +3830,7 @@ int merge_recursive_generic(struct merge_options *opt,
 	}
 
 	repo_hold_locked_index(opt->repo, &lock, LOCK_DIE_ON_ERROR);
-	clean = merge_recursive(opt, head_commit, next_commit, ca,
-				result);
+	clean = merge_fn(opt, head_commit, next_commit, ca, result);
 	if (clean < 0) {
 		rollback_lock_file(&lock);
 		return clean;
diff --git a/merge-recursive.h b/merge-recursive.h
index b88000e3c25..6a21f2da538 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -53,6 +53,12 @@ struct merge_options {
 	struct merge_options_internal *priv;
 };
 
+typedef int (*recursive_merge_fn_t)(struct merge_options *opt,
+				    struct commit *h1,
+				    struct commit *h2,
+				    struct commit_list *merge_bases,
+				    struct commit **result);
+
 void init_merge_options(struct merge_options *opt, struct repository *repo);
 
 /* parse the option in s and update the relevant field of opt */
@@ -105,7 +111,7 @@ int merge_recursive(struct merge_options *opt,
 
 /*
  * merge_recursive_generic can operate on trees instead of commits, by
- * wrapping the trees into virtual commits, and calling merge_recursive().
+ * wrapping the trees into virtual commits, and calling the provided merge_fn.
  * It also writes out the in-memory index to disk if the merge is successful.
  *
  * Outputs:
@@ -120,6 +126,7 @@ int merge_recursive_generic(struct merge_options *opt,
 			    const struct object_id *merge,
 			    int num_merge_bases,
 			    const struct object_id **merge_bases,
+			    recursive_merge_fn_t merge_fn,
 			    struct commit **result);
 
 #endif
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 6/7] stash: merge applied stash with merge-ort
  2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
                   ` (4 preceding siblings ...)
  2022-04-25 17:49 ` [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
@ 2022-04-25 17:49 ` Victoria Dye via GitGitGadget
  2022-04-26 13:02   ` Derrick Stolee
  2022-04-25 17:49 ` [PATCH 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-25 17:49 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Change the merge function used in 'do_apply_stash()' from 'merge_recursive'
to 'merge_ort_recursive'. In addition to aligning with the default merge
strategy used by 'git merge' (6a5fb96672 (Change default merge backend from
recursive to ort, 2021-08-04)), this allows 'git stash <apply|pop>' to
operate without expanding the index by default. Update tests in 't1092'
verifying index expansion for 'git stash' accordingly.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 builtin/stash.c                          | 3 ++-
 t/t1092-sparse-checkout-compatibility.sh | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index 16171eb1dab..cd77d448546 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -7,6 +7,7 @@
 #include "cache-tree.h"
 #include "unpack-trees.h"
 #include "merge-recursive.h"
+#include "merge-ort-wrappers.h"
 #include "strvec.h"
 #include "run-command.h"
 #include "dir.h"
@@ -554,7 +555,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
 	bases[0] = &info->b_tree;
 
 	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
-				      merge_recursive, &result);
+				      merge_ort_recursive, &result);
 	if (ret) {
 		rerere(0);
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index a8c1c345ab0..8545a865e04 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1377,7 +1377,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	ensure_not_expanded stash &&
 	ensure_not_expanded stash list &&
 	ensure_not_expanded stash show stash@{0} &&
-	! ensure_not_expanded stash apply stash@{0} &&
+	ensure_not_expanded stash apply stash@{0} &&
 	ensure_not_expanded stash drop stash@{0} &&
 
 	echo >>sparse-index/deep/new &&
@@ -1391,7 +1391,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	oid=$(git -C sparse-index stash create) &&
 	ensure_not_expanded stash store -m "test" $oid &&
 	ensure_not_expanded reset --hard &&
-	! ensure_not_expanded stash pop &&
+	ensure_not_expanded stash pop &&
 
 	ensure_not_expanded checkout-index -f a &&
 	ensure_not_expanded checkout-index -f --all &&
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 7/7] unpack-trees: preserve index sparsity
  2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
                   ` (5 preceding siblings ...)
  2022-04-25 17:49 ` [PATCH 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
@ 2022-04-25 17:49 ` Victoria Dye via GitGitGadget
  2022-04-26 12:49 ` [PATCH 0/7] Sparse index: integrate with 'git stash' Derrick Stolee
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
  8 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-25 17:49 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

When unpacking trees, set the default sparsity of the resultant index based
on repo settings and 'is_sparse_index_allowed()'.

Normally, when executing 'unpack_trees', the output index is marked sparse
when (and only when) it unpacks a sparse directory. However, an index may be
"sparse" even if it contains no sparse directories - when all files fall
inside the sparse-checkout definition or otherwise have SKIP_WORKTREE
disabled. Therefore, the output index may be marked "full" even when it is
"sparse", resulting in unnecessary 'ensure_full_index' calls when writing to
disk. Avoid this by setting the "default" index sparsity to match what is
expected for the repository.

As a consequence of this fix, the (non-merge) 'read-tree' performed when
applying a stash with untracked files no longer expands the index. Update
the corresponding test in 't1092'.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 2 +-
 unpack-trees.c                           | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 8545a865e04..8443c7e65ae 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1384,7 +1384,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	ensure_not_expanded stash -u &&
 	(
 		WITHOUT_UNTRACKED_TXT=1 &&
-		! ensure_not_expanded stash pop
+		ensure_not_expanded stash pop
 	) &&
 
 	ensure_not_expanded stash create &&
diff --git a/unpack-trees.c b/unpack-trees.c
index 7f528d35cc2..a1d0ff3a4d3 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -11,6 +11,7 @@
 #include "refs.h"
 #include "attr.h"
 #include "split-index.h"
+#include "sparse-index.h"
 #include "submodule.h"
 #include "submodule-config.h"
 #include "fsmonitor.h"
@@ -1839,6 +1840,11 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->result.fsmonitor_last_update =
 		xstrdup_or_null(o->src_index->fsmonitor_last_update);
 
+	if (!o->src_index->initialized &&
+	    !repo->settings.command_requires_full_index &&
+	    is_sparse_index_allowed(&o->result, 0))
+		o->result.sparse_index = 1;
+
 	/*
 	 * Sparse checkout loop #1: set NEW_SKIP_WORKTREE on existing entries
 	 */
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 2/7] stash: integrate with sparse index
  2022-04-25 17:49 ` [PATCH 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
@ 2022-04-25 21:34   ` Junio C Hamano
  2022-04-26 12:53   ` Derrick Stolee
  1 sibling, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2022-04-25 21:34 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget; +Cc: git, derrickstolee, newren, Victoria Dye

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Victoria Dye <vdye@github.com>
>
> Enable sparse index in 'git stash' by disabling
> 'command_requires_full_index'.
>
> With sparse index enabled, some subcommands of 'stash' work without
> expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
> etc. Others ensure the index is expanded either directly (as in the case of
> 'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
> 'do_apply_stash()' triggers the expansion), or in a command called
> internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
> addition to enabling sparse index, add tests to 't1092' demonstrating which
> variants of 'git stash' expand the index, and which do not.

As always, it is suprising that the change id deceptively easy, but
it is only true because various components like the merge machinery
used by the code have been taught to expand the sparse index entries
as needed.  Looking good so far.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 4/7] read-cache: set sparsity when index is new
  2022-04-25 17:49 ` [PATCH 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
@ 2022-04-25 21:35   ` Junio C Hamano
  0 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2022-04-25 21:35 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget; +Cc: git, derrickstolee, newren, Victoria Dye

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Victoria Dye <vdye@github.com>
>
> When the index read in 'do_read_index()' does not exist on-disk, mark the
> index "sparse" if the executing command does not require a full index and
> sparse index is otherwise enabled.
>
> Some commands (such as 'git stash -u') implicitly create a new index (when
> the 'GIT_INDEX_FILE' variable points to a non-existent file) and perform
> some operation on it. However, when this index is created, it isn't created
> with the same sparsity settings as the repo index. As a result, while these
> indexes may be sparse during the operation, they are always expanded before
> being written to disk. We can avoid that expansion by defaulting the index
> to "sparse", in which case it will only be expanded if the full index is
> needed.

Makes sense.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic'
  2022-04-25 17:49 ` [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
@ 2022-04-25 21:38   ` Junio C Hamano
  2022-04-26 12:57     ` Derrick Stolee
  0 siblings, 1 reply; 42+ messages in thread
From: Junio C Hamano @ 2022-04-25 21:38 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget; +Cc: git, derrickstolee, newren, Victoria Dye

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> -	if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, &result)) {
> +	if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, merge_recursive, &result)) {

This one does need input from Elijah to properly evaluate.  Off the
top of my head, I wonder if the choice of the merge function should
be part of "struct merge_options".


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/7] Sparse index: integrate with 'git stash'
  2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
                   ` (6 preceding siblings ...)
  2022-04-25 17:49 ` [PATCH 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
@ 2022-04-26 12:49 ` Derrick Stolee
  2022-04-26 13:09   ` Derrick Stolee
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
  8 siblings, 1 reply; 42+ messages in thread
From: Derrick Stolee @ 2022-04-26 12:49 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget, git; +Cc: newren, gitster, Victoria Dye

On 4/25/2022 1:49 PM, Victoria Dye via GitGitGadget wrote:
> This series, in combination with the sparse index integrations of reset [1],
> update-index [2], checkout-index [2], clean [2], and read-tree [3], allows
> most subcommands of 'git stash' to use the sparse index end-to-end without
> index expansion.

I'm really excited to see this culmination of your work in this area!

> Performance results (from the 'p2000' tests):
> 
> (git stash &&
>  git stash pop)              master            this series
> ---------------------------------------------------------------------
> full-v3                      4.07(2.42+1.34)   3.98(2.42+1.32) -2.2%
> full-v4                      4.05(2.46+1.31)   4.00(2.49+1.29) -1.2%
> sparse-v3                    7.48(4.81+2.57)   1.53(0.26+1.61) -79.5%
> sparse-v4                    7.35(4.74+2.54)   1.59(0.27+1.63) -78.4%
> 
> (echo >>new &&
>  git stash -u &&
>  git stash pop)              master            this series
> ---------------------------------------------------------------------
> full-v3                      4.21(2.62+1.45)   4.11(2.55+1.44) -2.4%
> full-v4                      4.11(2.51+1.41)   4.02(2.49+1.41) -2.2%
> sparse-v3                    7.35(4.64+2.66)   1.70(0.32+1.64) -76.9%
> sparse-v4                    7.74(4.87+2.83)   1.70(0.32+1.66) -78.0%

I wanted to add some performance results from real-world users of
an internal monorepo. These are numbers for all runs of "git stash",
so it is not broken down by subcommand:

Index Type         P50     P80     P99
---------------------------------------
Full Index       13.3s   26.3s   47.5s
Sparse Index      2.1s    5.8s    9.4s

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 2/7] stash: integrate with sparse index
  2022-04-25 17:49 ` [PATCH 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
  2022-04-25 21:34   ` Junio C Hamano
@ 2022-04-26 12:53   ` Derrick Stolee
  2022-04-26 15:26     ` Victoria Dye
  2022-04-26 16:21     ` Junio C Hamano
  1 sibling, 2 replies; 42+ messages in thread
From: Derrick Stolee @ 2022-04-26 12:53 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget, git; +Cc: newren, gitster, Victoria Dye

On 4/25/2022 1:49 PM, Victoria Dye via GitGitGadget wrote:
> From: Victoria Dye <vdye@github.com>
> 
> Enable sparse index in 'git stash' by disabling
> 'command_requires_full_index'.

>  ensure_not_expanded () {
>  	rm -f trace2.txt &&
> -	echo >>sparse-index/untracked.txt &&
> +	if test -z $WITHOUT_UNTRACKED_TXT

Do we need quotes around "$WITHOUT_UNTRACKED_TXT"?

I mean, I suppose we don't since the tests pass when this variable
is unset, but I think it is a good practice to be careful about
empty variables. Or am I wrong?

> +	then
> +		echo >>sparse-index/untracked.txt
> +	fi &&


> -	ensure_not_expanded checkout-index -f a &&
> -	ensure_not_expanded checkout-index -f --all &&
> -	for ref in update-deep update-folder1 update-folder2 update-deep
> -	do
> -		echo >>sparse-index/README.md &&
> -		ensure_not_expanded reset --hard $ref || return 1
> -	done &&
> -

Moving these to be within the stash tests is interesting.

> +test_expect_success 'sparse-index is not expanded: stash' '
> +	init_repos &&
> +
> +	echo >>sparse-index/a &&
> +	ensure_not_expanded stash &&
> +	ensure_not_expanded stash list &&
> +	ensure_not_expanded stash show stash@{0} &&
> +	! ensure_not_expanded stash apply stash@{0} &&
> +	ensure_not_expanded stash drop stash@{0} &&
> +
> +	echo >>sparse-index/deep/new &&
> +	! ensure_not_expanded stash -u &&
> +	(
> +		WITHOUT_UNTRACKED_TXT=1 &&
> +		! ensure_not_expanded stash pop
> +	) &&
> +
> +	ensure_not_expanded stash create &&
> +	oid=$(git -C sparse-index stash create) &&
> +	ensure_not_expanded stash store -m "test" $oid &&
> +	ensure_not_expanded reset --hard &&
> +	! ensure_not_expanded stash pop &&
> +
> +	ensure_not_expanded checkout-index -f a &&
> +	ensure_not_expanded checkout-index -f --all &&
> +	for ref in update-deep update-folder1 update-folder2 update-deep
> +	do
> +		echo >>sparse-index/README.md &&
> +		ensure_not_expanded reset --hard $ref || return 1
> +	done

It is not obvious why that's necessary here. Perhaps a later
change will build upon these checkout-index commands within
this test?

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic'
  2022-04-25 21:38   ` Junio C Hamano
@ 2022-04-26 12:57     ` Derrick Stolee
  0 siblings, 0 replies; 42+ messages in thread
From: Derrick Stolee @ 2022-04-26 12:57 UTC (permalink / raw)
  To: Junio C Hamano, Victoria Dye via GitGitGadget; +Cc: git, newren, Victoria Dye

On 4/25/2022 5:38 PM, Junio C Hamano wrote:
> "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> -	if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, &result)) {
>> +	if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, merge_recursive, &result)) {
> 
> This one does need input from Elijah to properly evaluate.  Off the
> top of my head, I wonder if the choice of the merge function should
> be part of "struct merge_options".
 
I agree that when we have an options struct, it is better to add a
member to the struct than to update the prototype of the function,
if only for the simplicity of not updating all callers.

I am interested in the effect that moving from recursive to ORT for
all 'git stash' commands might have, regardless of sparse index. I
expect 'git stash' to improve on a number of dimensions with that
change.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 6/7] stash: merge applied stash with merge-ort
  2022-04-25 17:49 ` [PATCH 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
@ 2022-04-26 13:02   ` Derrick Stolee
  0 siblings, 0 replies; 42+ messages in thread
From: Derrick Stolee @ 2022-04-26 13:02 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget, git; +Cc: newren, gitster, Victoria Dye

On 4/25/2022 1:49 PM, Victoria Dye via GitGitGadget wrote:
> From: Victoria Dye <vdye@github.com>
> 
> Change the merge function used in 'do_apply_stash()' from 'merge_recursive'
> to 'merge_ort_recursive'. In addition to aligning with the default merge
> strategy used by 'git merge' (6a5fb96672 (Change default merge backend from
> recursive to ort, 2021-08-04)), this allows 'git stash <apply|pop>' to
> operate without expanding the index by default. Update tests in 't1092'
> verifying index expansion for 'git stash' accordingly.

This is an interesting change, not just in the sparse index sense. Yes,
using ORT by default is a good idea to align with our default merge
strategy _and_ allowing us to avoid expanding the sparse index.

But: should we allow this algorithm to change via our pull.twoHead
config value? By default, it will be fast with the sparse index. Would
there be value in allowing a user to change back to the recursive
strategy if they want?

This also seems like we should have a measurable performance impact
for 'git stash <apply|pop>' when the stash contains a lot of renames.
It would be interesting to have that documented here. Elijah might
have some thoughts on how to measure this performance impact.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 0/7] Sparse index: integrate with 'git stash'
  2022-04-26 12:49 ` [PATCH 0/7] Sparse index: integrate with 'git stash' Derrick Stolee
@ 2022-04-26 13:09   ` Derrick Stolee
  0 siblings, 0 replies; 42+ messages in thread
From: Derrick Stolee @ 2022-04-26 13:09 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget, git; +Cc: newren, gitster, Victoria Dye

On 4/26/2022 8:49 AM, Derrick Stolee wrote:
> On 4/25/2022 1:49 PM, Victoria Dye via GitGitGadget wrote:
>> This series, in combination with the sparse index integrations of reset [1],
>> update-index [2], checkout-index [2], clean [2], and read-tree [3], allows
>> most subcommands of 'git stash' to use the sparse index end-to-end without
>> index expansion.
> 
> I'm really excited to see this culmination of your work in this area!

After taking a careful read of this series, I'm really pleased with how
simple the actual code changes are at this point. This is a testament
to the hard work you've done on the child processes and also some luck
in how we've added protections to different portions of the lower-level
APIs.

The biggest interesting points from my review and Junio's comments are:

1. The merge_recursive_generic() change could maybe be done by modifying
   the options struct instead of changing the prototype.

2. Using merge ORT for applying stashes could maybe be changed by config,
   allowing users to select recursive if they want. This change could
   also use some testing to show the merits of ORT even without a sparse
   index.

And then I had some nitpicks:

* A script variable could use quoting, probably.

* You moved the checkout-index tests to be next to the stash tests, but
  I don't see the value of that. (Perhaps I just need to be enlightened.)

Thanks!
-Stolee

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 2/7] stash: integrate with sparse index
  2022-04-26 12:53   ` Derrick Stolee
@ 2022-04-26 15:26     ` Victoria Dye
  2022-04-26 16:21     ` Junio C Hamano
  1 sibling, 0 replies; 42+ messages in thread
From: Victoria Dye @ 2022-04-26 15:26 UTC (permalink / raw)
  To: Derrick Stolee, Victoria Dye via GitGitGadget, git; +Cc: newren, gitster

Derrick Stolee wrote:
> On 4/25/2022 1:49 PM, Victoria Dye via GitGitGadget wrote:
>> From: Victoria Dye <vdye@github.com>
>>
>> Enable sparse index in 'git stash' by disabling
>> 'command_requires_full_index'.
> 
>>  ensure_not_expanded () {
>>  	rm -f trace2.txt &&
>> -	echo >>sparse-index/untracked.txt &&
>> +	if test -z $WITHOUT_UNTRACKED_TXT
> 
> Do we need quotes around "$WITHOUT_UNTRACKED_TXT"?
> 
> I mean, I suppose we don't since the tests pass when this variable
> is unset, but I think it is a good practice to be careful about
> empty variables. Or am I wrong?
> 
>> +	then
>> +		echo >>sparse-index/untracked.txt
>> +	fi &&
> 
> 
>> -	ensure_not_expanded checkout-index -f a &&
>> -	ensure_not_expanded checkout-index -f --all &&
>> -	for ref in update-deep update-folder1 update-folder2 update-deep
>> -	do
>> -		echo >>sparse-index/README.md &&
>> -		ensure_not_expanded reset --hard $ref || return 1
>> -	done &&
>> -
> 
> Moving these to be within the stash tests is interesting.
> 

That was unintentional. I originally had the 'stash' cases in the big
'sparse-index is not expanded' test, but decided to move them into their own
test for more appropriate granularity. In doing that, I accidentally took
some of the 'checkout-index' tests with it. I'll move them back in V2.

Thanks for the catch!

>> +test_expect_success 'sparse-index is not expanded: stash' '
>> +	init_repos &&
>> +
>> +	echo >>sparse-index/a &&
>> +	ensure_not_expanded stash &&
>> +	ensure_not_expanded stash list &&
>> +	ensure_not_expanded stash show stash@{0} &&
>> +	! ensure_not_expanded stash apply stash@{0} &&
>> +	ensure_not_expanded stash drop stash@{0} &&
>> +
>> +	echo >>sparse-index/deep/new &&
>> +	! ensure_not_expanded stash -u &&
>> +	(
>> +		WITHOUT_UNTRACKED_TXT=1 &&
>> +		! ensure_not_expanded stash pop
>> +	) &&
>> +
>> +	ensure_not_expanded stash create &&
>> +	oid=$(git -C sparse-index stash create) &&
>> +	ensure_not_expanded stash store -m "test" $oid &&
>> +	ensure_not_expanded reset --hard &&
>> +	! ensure_not_expanded stash pop &&
>> +
>> +	ensure_not_expanded checkout-index -f a &&
>> +	ensure_not_expanded checkout-index -f --all &&
>> +	for ref in update-deep update-folder1 update-folder2 update-deep
>> +	do
>> +		echo >>sparse-index/README.md &&
>> +		ensure_not_expanded reset --hard $ref || return 1
>> +	done
> 
> It is not obvious why that's necessary here. Perhaps a later
> change will build upon these checkout-index commands within
> this test?
> 
> Thanks,
> -Stolee


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 2/7] stash: integrate with sparse index
  2022-04-26 12:53   ` Derrick Stolee
  2022-04-26 15:26     ` Victoria Dye
@ 2022-04-26 16:21     ` Junio C Hamano
  1 sibling, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2022-04-26 16:21 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Victoria Dye via GitGitGadget, git, newren, Victoria Dye

Derrick Stolee <derrickstolee@github.com> writes:

> On 4/25/2022 1:49 PM, Victoria Dye via GitGitGadget wrote:
>> From: Victoria Dye <vdye@github.com>
>> 
>> Enable sparse index in 'git stash' by disabling
>> 'command_requires_full_index'.
>
>>  ensure_not_expanded () {
>>  	rm -f trace2.txt &&
>> -	echo >>sparse-index/untracked.txt &&
>> +	if test -z $WITHOUT_UNTRACKED_TXT
>
> Do we need quotes around "$WITHOUT_UNTRACKED_TXT"?
>
> I mean, I suppose we don't since the tests pass when this variable
> is unset, but I think it is a good practice to be careful about
> empty variables. Or am I wrong?

Interesting.  I do not think the reason why

	test -z && echo true

says "true" is "-z" noticed an empty string---it is "test $string"
that says "true" when "$string" is not a recognised command and is
not an empty string, no?

You are absolutely right that the above must be quoted.

Thanks for carefully reading.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 0/7] Sparse index: integrate with 'git stash'
  2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
                   ` (7 preceding siblings ...)
  2022-04-26 12:49 ` [PATCH 0/7] Sparse index: integrate with 'git stash' Derrick Stolee
@ 2022-04-27 18:16 ` Victoria Dye via GitGitGadget
  2022-04-27 18:16   ` [PATCH v2 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
                     ` (8 more replies)
  8 siblings, 9 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-27 18:16 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye

This series, in combination with the sparse index integrations of reset [1],
update-index [2], checkout-index [2], clean [2], and read-tree [3], allows
most subcommands of 'git stash' to use the sparse index end-to-end without
index expansion.

Like the earlier series, this series starts with new tests ensuring
compatibility of the sparse index with non-sparse index full and sparse
checkouts [1/7]. Next, sparse index is trivially enabled [2/7].
Functionally, sparse index-enabled sparse-checkouts remain compatible with
non-sparse index sparse-checkouts, but there are still some cases where the
index (or a temporary index) is expanded unnecessarily. These cases are
fixed in three parts:

 * First, 'git stash -u' is made sparse index-compatible by ensuring the
   "temporary" index holding the stashed, untracked files is created as a
   sparse index whenever possible (per repo settings &
   'is_sparse_index_allowed()'). Patch [3/7] exposes
   'is_sparse_index_allowed()' to files outside of 'sparse-index.c', then
   patch [4/7] uses that function to mark the temporary index sparse when
   appropriate.
 * Next, 'git stash (apply|pop)' are made sparse index-compatible by
   changing their internal "merge" function (executed via
   'merge_recursive_generic()') from 'merge_recursive()' to
   'merge_ort_recursive()'. This requires first allowing
   'merge_recursive_generic()' to accept a merge function as an input
   (rather than hardcoding use of 'merge_recursive()') in patch [5/7], then
   changing the call in 'stash.c' to specify 'merge_ort_recursive()' in
   patch [6/7]. See note [4] for possible alternate implementations.
 * Finally, while patches 5 & 6 avoid index expansion for most cases of 'git
   stash (apply|pop)', applying a stash that includes untracked files still
   expands the index. This is a result of an internal 'read-tree' execution
   (specifically in its 'unpack_trees' call) creating a result index that is
   never sparse in-core, thus forcing the index to be unnecessarily
   collapsed and re-expanded in 'do_write_locked_index()'. In patch [7/7],
   'unpack_trees' is updated to set the default sparsity of the resultant
   index to "sparse" if allowed by repo settings and
   'is_sparse_index_allowed()' (similar to the change in patch 4).

Performance results (from the 'p2000' tests):

(git stash &&
 git stash pop)              master            this series
---------------------------------------------------------------------
full-v3                      4.07(2.42+1.34)   3.98(2.42+1.32) -2.2%
full-v4                      4.05(2.46+1.31)   4.00(2.49+1.29) -1.2%
sparse-v3                    7.48(4.81+2.57)   1.53(0.26+1.61) -79.5%
sparse-v4                    7.35(4.74+2.54)   1.59(0.27+1.63) -78.4%

(echo >>new &&
 git stash -u &&
 git stash pop)              master            this series
---------------------------------------------------------------------
full-v3                      4.21(2.62+1.45)   4.11(2.55+1.44) -2.4%
full-v4                      4.11(2.51+1.41)   4.02(2.49+1.41) -2.2%
sparse-v3                    7.35(4.64+2.66)   1.70(0.32+1.64) -76.9%
sparse-v4                    7.74(4.87+2.83)   1.70(0.32+1.66) -78.0%



Changes since V1
================

 * Added quotes to the "$WITHOUT_UNTRACKED_TXT" when testing for it in
   'ensure_not_expanded' (in 't/t1092-sparse-checkout-compatibility.sh')
 * Moved the 'stash' test in 't1092' elsewhere in the file, so that it
   doesn't conflict (even trivially) with the also-in-flight 'git show'
   integration
 * Moved the 'ensure_not_expended' tests for 'checkout-index' back to
   original location

[1]
https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@gmail.com/
[2]
https://lore.kernel.org/git/pull.1109.v2.git.1641924306.gitgitgadget@gmail.com/
[3]
https://lore.kernel.org/git/pull.1157.v3.git.1646166271.gitgitgadget@gmail.com/
[4] I went with changing 'stash' to always use 'merge-ort' in
'merge_recursive_generic()' as a sort of "middle ground" between "replace
'merge_recursive()' with 'merge_ort_recursive()' in all of its hardcoded
internal usage" and "only use 'merge-ort' if using a sparse index in 'git
stash', otherwise 'merge-recursive'". The former would extend the use of
'merge-ort' to 'git am' and 'git merge-recursive', whereas the latter is a
more cautious/narrowly-focused option. If anyone has any other thoughts, I'm
interested in hearing them.

Thanks!

-Victoria

Victoria Dye (7):
  stash: expand sparse-checkout compatibility testing
  stash: integrate with sparse index
  sparse-index: expose 'is_sparse_index_allowed()'
  read-cache: set sparsity when index is new
  merge-recursive: add merge function arg to 'merge_recursive_generic'
  stash: merge applied stash with merge-ort
  unpack-trees: preserve index sparsity

 builtin/am.c                             |  2 +-
 builtin/merge-recursive.c                |  2 +-
 builtin/stash.c                          |  6 +-
 merge-ort.c                              |  3 +-
 merge-recursive.c                        |  4 +-
 merge-recursive.h                        |  9 ++-
 read-cache.c                             | 18 +++++-
 sparse-index.c                           |  2 +-
 sparse-index.h                           |  1 +
 t/perf/p2000-sparse-operations.sh        |  2 +
 t/t1092-sparse-checkout-compatibility.sh | 78 +++++++++++++++++++++++-
 unpack-trees.c                           |  6 ++
 12 files changed, 123 insertions(+), 10 deletions(-)


base-commit: 6cd33dceed60949e2dbc32e3f0f5e67c4c882e1e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1171%2Fvdye%2Fsparse%2Fstash-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1171/vdye/sparse/stash-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1171

Range-diff vs v1:

 1:  994864852a0 ! 1:  8ea986cb249 stash: expand sparse-checkout compatibility testing
     @@ t/perf/p2000-sparse-operations.sh: test_perf_on_all () {
       test_perf_on_all git commit -a -m A
      
       ## t/t1092-sparse-checkout-compatibility.sh ##
     -@@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'clean' '
     - 	test_sparse_match test_path_is_dir folder1
     +@@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'cherry-pick with conflicts' '
     + 	test_all_match test_must_fail git cherry-pick to-cherry-pick
       '
       
      +test_expect_success 'stash' '
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'clean' '
      +	test_all_match git status --porcelain=v2
      +'
      +
     - test_expect_success 'submodule handling' '
     + test_expect_success 'checkout-index inside sparse definition' '
       	init_repos &&
       
 2:  f6cf05a5bee ! 2:  b3e3f0298fb stash: integrate with sparse index
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'index.sparse disa
       ensure_not_expanded () {
       	rm -f trace2.txt &&
      -	echo >>sparse-index/untracked.txt &&
     -+	if test -z $WITHOUT_UNTRACKED_TXT
     ++	if test -z "$WITHOUT_UNTRACKED_TXT"
      +	then
      +		echo >>sparse-index/untracked.txt
      +	fi &&
       
       	if test "$1" = "!"
       	then
     -@@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is not expanded' '
     - 	echo >>sparse-index/untracked.txt &&
     - 	ensure_not_expanded add . &&
     - 
     --	ensure_not_expanded checkout-index -f a &&
     --	ensure_not_expanded checkout-index -f --all &&
     --	for ref in update-deep update-folder1 update-folder2 update-deep
     --	do
     --		echo >>sparse-index/README.md &&
     --		ensure_not_expanded reset --hard $ref || return 1
     --	done &&
     --
     - 	ensure_not_expanded reset --mixed base &&
     - 	ensure_not_expanded reset --hard update-deep &&
     - 	ensure_not_expanded reset --keep base &&
      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
       	)
       '
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
      +	oid=$(git -C sparse-index stash create) &&
      +	ensure_not_expanded stash store -m "test" $oid &&
      +	ensure_not_expanded reset --hard &&
     -+	! ensure_not_expanded stash pop &&
     -+
     -+	ensure_not_expanded checkout-index -f a &&
     -+	ensure_not_expanded checkout-index -f --all &&
     -+	for ref in update-deep update-folder1 update-folder2 update-deep
     -+	do
     -+		echo >>sparse-index/README.md &&
     -+		ensure_not_expanded reset --hard $ref || return 1
     -+	done
     ++	! ensure_not_expanded stash pop
      +'
      +
       test_expect_success 'sparse index is not expanded: diff' '
 3:  27d9920366f = 3:  73f04e95400 sparse-index: expose 'is_sparse_index_allowed()'
 4:  64edbed0f95 = 4:  42550f39a75 read-cache: set sparsity when index is new
 5:  80c25c75874 = 5:  4537d473b93 merge-recursive: add merge function arg to 'merge_recursive_generic'
 6:  76f2a9e8722 ! 6:  22fee0732ad stash: merge applied stash with merge-ort
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
       	oid=$(git -C sparse-index stash create) &&
       	ensure_not_expanded stash store -m "test" $oid &&
       	ensure_not_expanded reset --hard &&
     --	! ensure_not_expanded stash pop &&
     -+	ensure_not_expanded stash pop &&
     +-	! ensure_not_expanded stash pop
     ++	ensure_not_expanded stash pop
     + '
       
     - 	ensure_not_expanded checkout-index -f a &&
     - 	ensure_not_expanded checkout-index -f --all &&
     + test_expect_success 'sparse index is not expanded: diff' '
 7:  1daecbe45c1 = 7:  3179018a8cb unpack-trees: preserve index sparsity

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 1/7] stash: expand sparse-checkout compatibility testing
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
@ 2022-04-27 18:16   ` Victoria Dye via GitGitGadget
  2022-04-27 18:16   ` [PATCH v2 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-27 18:16 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Add tests verifying expected 'git stash' behavior in
't1092-sparse-checkout-compatibility'. These cases establish the expected
behavior of 'git stash' in a sparse-checkout and verify consistency both
with and without a sparse index. Although no sparse index compatibility has
been integrated into 'git stash' yet, the tests are all 'expect_success' -
we don't want the cone-mode sparse-checkout behavior to change depending on
whether it is using a sparse index or not. Therefore, we expect these tests
to continue passing once sparse index is integrated with 'git stash'.

Additionally, add performance test cases for 'git stash' both with and
without untracked files. Note that, unlike the other tests in
'p2000-sparse-operations.sh', the tests added for 'stash' are combination
operations. This is done to ensure the stash/unstash is not blocked by the
modification of '$SPARSE_CONE/a' performed as part of 'test_perf_on_all'.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/perf/p2000-sparse-operations.sh        |  2 +
 t/t1092-sparse-checkout-compatibility.sh | 49 ++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 382716cfca9..76710cbef35 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -106,6 +106,8 @@ test_perf_on_all () {
 }
 
 test_perf_on_all git status
+test_perf_on_all 'git stash && git stash pop'
+test_perf_on_all 'echo >>new && git stash -u && git stash pop'
 test_perf_on_all git add -A
 test_perf_on_all git add .
 test_perf_on_all git commit -a -m A
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 236ab530284..86312b30444 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1034,6 +1034,55 @@ test_expect_success 'cherry-pick with conflicts' '
 	test_all_match test_must_fail git cherry-pick to-cherry-pick
 '
 
+test_expect_success 'stash' '
+	init_repos &&
+
+	write_script edit-contents <<-\EOF &&
+	echo text >>$1
+	EOF
+
+	# Stash a sparse directory (folder1)
+	test_all_match git checkout -b test-branch rename-base &&
+	test_all_match git reset --soft rename-out-to-out &&
+	test_all_match git stash &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Apply the sparse directory stash without reinstating the index
+	test_all_match git stash apply -q &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Reset to state where stash can be applied
+	test_sparse_match git sparse-checkout reapply &&
+	test_all_match git reset --hard rename-out-to-out &&
+
+	# Apply the sparse directory stash *with* reinstating the index
+	test_all_match git stash apply --index -q &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Reset to state where we will get a conflict applying the stash
+	test_sparse_match git sparse-checkout reapply &&
+	test_all_match git reset --hard update-folder1 &&
+
+	# Apply the sparse directory stash with conflicts
+	test_all_match test_must_fail git stash apply --index -q &&
+	test_all_match test_must_fail git stash apply -q &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Reset to base branch
+	test_sparse_match git sparse-checkout reapply &&
+	test_all_match git reset --hard base &&
+
+	# Stash & unstash an untracked file outside of the sparse checkout
+	# definition.
+	run_on_sparse mkdir -p folder1 &&
+	run_on_all ../edit-contents folder1/new &&
+	test_all_match git stash -u &&
+	test_all_match git status --porcelain=v2 &&
+
+	test_all_match git stash pop -q &&
+	test_all_match git status --porcelain=v2
+'
+
 test_expect_success 'checkout-index inside sparse definition' '
 	init_repos &&
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 2/7] stash: integrate with sparse index
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
  2022-04-27 18:16   ` [PATCH v2 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
@ 2022-04-27 18:16   ` Victoria Dye via GitGitGadget
  2022-04-27 18:16   ` [PATCH v2 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-27 18:16 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.

With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.

Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 builtin/stash.c                          |  3 +++
 t/t1092-sparse-checkout-compatibility.sh | 29 +++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index 0c7b6a95882..1bfba532044 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1770,6 +1770,9 @@ int cmd_stash(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, git_stash_usage,
 			     PARSE_OPT_KEEP_UNKNOWN | PARSE_OPT_KEEP_DASHDASH);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	index_file = get_index_file();
 	strbuf_addf(&stash_index_path, "%s.stash.%" PRIuMAX, index_file,
 		    (uintmax_t)pid);
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 86312b30444..75d844cd71d 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1271,7 +1271,10 @@ test_expect_success 'index.sparse disabled inline uses full index' '
 
 ensure_not_expanded () {
 	rm -f trace2.txt &&
-	echo >>sparse-index/untracked.txt &&
+	if test -z "$WITHOUT_UNTRACKED_TXT"
+	then
+		echo >>sparse-index/untracked.txt
+	fi &&
 
 	if test "$1" = "!"
 	then
@@ -1375,6 +1378,30 @@ test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
 	)
 '
 
+test_expect_success 'sparse-index is not expanded: stash' '
+	init_repos &&
+
+	echo >>sparse-index/a &&
+	ensure_not_expanded stash &&
+	ensure_not_expanded stash list &&
+	ensure_not_expanded stash show stash@{0} &&
+	! ensure_not_expanded stash apply stash@{0} &&
+	ensure_not_expanded stash drop stash@{0} &&
+
+	echo >>sparse-index/deep/new &&
+	! ensure_not_expanded stash -u &&
+	(
+		WITHOUT_UNTRACKED_TXT=1 &&
+		! ensure_not_expanded stash pop
+	) &&
+
+	ensure_not_expanded stash create &&
+	oid=$(git -C sparse-index stash create) &&
+	ensure_not_expanded stash store -m "test" $oid &&
+	ensure_not_expanded reset --hard &&
+	! ensure_not_expanded stash pop
+'
+
 test_expect_success 'sparse index is not expanded: diff' '
 	init_repos &&
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 3/7] sparse-index: expose 'is_sparse_index_allowed()'
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
  2022-04-27 18:16   ` [PATCH v2 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
  2022-04-27 18:16   ` [PATCH v2 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
@ 2022-04-27 18:16   ` Victoria Dye via GitGitGadget
  2022-04-27 18:16   ` [PATCH v2 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-27 18:16 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Expose 'is_sparse_index_allowed()' publicly so that it may be used by
callers outside of 'sparse-index.c'. While no such callers exist yet, it
will be used in a subsequent commit.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 sparse-index.c | 2 +-
 sparse-index.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/sparse-index.c b/sparse-index.c
index 8636af72de5..ffbab7d35f1 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -118,7 +118,7 @@ static int index_has_unmerged_entries(struct index_state *istate)
 	return 0;
 }
 
-static int is_sparse_index_allowed(struct index_state *istate, int flags)
+int is_sparse_index_allowed(struct index_state *istate, int flags)
 {
 	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
diff --git a/sparse-index.h b/sparse-index.h
index 633d4fb7e31..f57c65d972f 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -3,6 +3,7 @@
 
 struct index_state;
 #define SPARSE_INDEX_MEMORY_ONLY (1 << 0)
+int is_sparse_index_allowed(struct index_state *istate, int flags);
 int convert_to_sparse(struct index_state *istate, int flags);
 void ensure_correct_sparsity(struct index_state *istate);
 void clear_skip_worktree_from_present_files(struct index_state *istate);
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 4/7] read-cache: set sparsity when index is new
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-04-27 18:16   ` [PATCH v2 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
@ 2022-04-27 18:16   ` Victoria Dye via GitGitGadget
  2022-04-27 18:16   ` [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-27 18:16 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

When the index read in 'do_read_index()' does not exist on-disk, mark the
index "sparse" if the executing command does not require a full index and
sparse index is otherwise enabled.

Some commands (such as 'git stash -u') implicitly create a new index (when
the 'GIT_INDEX_FILE' variable points to a non-existent file) and perform
some operation on it. However, when this index is created, it isn't created
with the same sparsity settings as the repo index. As a result, while these
indexes may be sparse during the operation, they are always expanded before
being written to disk. We can avoid that expansion by defaulting the index
to "sparse", in which case it will only be expanded if the full index is
needed.

Note that the function 'set_new_index_sparsity()' is created despite having
only a single caller because additional callers will be added in a
subsequent patch.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 read-cache.c                             | 18 +++++++++++++++++-
 t/t1092-sparse-checkout-compatibility.sh |  2 +-
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 4df97e185e9..60355f5ad6a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2260,6 +2260,20 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con
 	return consumed;
 }
 
+static void set_new_index_sparsity(struct index_state *istate)
+{
+	/*
+	 * If the index's repo exists, mark it sparse according to
+	 * repo settings.
+	 */
+	if (istate->repo) {
+		prepare_repo_settings(istate->repo);
+		if (!istate->repo->settings.command_requires_full_index &&
+		    is_sparse_index_allowed(istate, 0))
+			istate->sparse_index = 1;
+	}
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int do_read_index(struct index_state *istate, const char *path, int must_exist)
 {
@@ -2281,8 +2295,10 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	istate->timestamp.nsec = 0;
 	fd = open(path, O_RDONLY);
 	if (fd < 0) {
-		if (!must_exist && errno == ENOENT)
+		if (!must_exist && errno == ENOENT) {
+			set_new_index_sparsity(istate);
 			return 0;
+		}
 		die_errno(_("%s: index file open failed"), path);
 	}
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 75d844cd71d..85c6a56f1b7 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1389,7 +1389,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	ensure_not_expanded stash drop stash@{0} &&
 
 	echo >>sparse-index/deep/new &&
-	! ensure_not_expanded stash -u &&
+	ensure_not_expanded stash -u &&
 	(
 		WITHOUT_UNTRACKED_TXT=1 &&
 		! ensure_not_expanded stash pop
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic'
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
                     ` (3 preceding siblings ...)
  2022-04-27 18:16   ` [PATCH v2 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
@ 2022-04-27 18:16   ` Victoria Dye via GitGitGadget
  2022-05-06  7:23     ` Elijah Newren
  2022-04-27 18:16   ` [PATCH v2 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
                     ` (3 subsequent siblings)
  8 siblings, 1 reply; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-27 18:16 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Replace the hardcoded 'merge_recursive()' function used by the
'merge_recursive_generic()' with a caller-specific merge function. This will
allow us to use 'merge_ort_recursive()' (and therefore avoid the index
expansion of 'merge_recursive()') in commands that perform merges with
'merge_recursive_generic()', such as 'git stash pop'.

Note that this patch is strictly a refactor; all callers still use
'merge_recursive()', and any changing to 'merge_ort_recursive()' will be
done in a later commit.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 builtin/am.c              | 2 +-
 builtin/merge-recursive.c | 2 +-
 builtin/stash.c           | 2 +-
 merge-ort.c               | 3 ++-
 merge-recursive.c         | 4 ++--
 merge-recursive.h         | 9 ++++++++-
 6 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 0f4111bafa0..6d01185d122 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1614,7 +1614,7 @@ static int fall_back_threeway(const struct am_state *state, const char *index_pa
 	if (state->quiet)
 		o.verbosity = 0;
 
-	if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, &result)) {
+	if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, merge_recursive, &result)) {
 		repo_rerere(the_repository, state->allow_rerere_autoupdate);
 		free(their_tree_name);
 		return error(_("Failed to merge in the changes."));
diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index b9acbf5d342..687ed1e527b 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -81,7 +81,7 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
 
-	failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, &result);
+	failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, merge_recursive, &result);
 
 	free(better1);
 	free(better2);
diff --git a/builtin/stash.c b/builtin/stash.c
index 1bfba532044..16171eb1dab 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -554,7 +554,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
 	bases[0] = &info->b_tree;
 
 	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
-				      &result);
+				      merge_recursive, &result);
 	if (ret) {
 		rerere(0);
 
diff --git a/merge-ort.c b/merge-ort.c
index 8545354dafd..4bccdfcf355 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4737,7 +4737,8 @@ void merge_incore_recursive(struct merge_options *opt,
 	trace2_region_enter("merge", "incore_recursive", opt->repo);
 
 	/* We set the ancestor label based on the merge_bases */
-	assert(opt->ancestor == NULL);
+	assert(opt->ancestor == NULL ||
+	       !strcmp(opt->ancestor, "constructed merge base"));
 
 	trace2_region_enter("merge", "merge_start", opt->repo);
 	merge_start(opt, result);
diff --git a/merge-recursive.c b/merge-recursive.c
index 1ee6364e8b1..2088f5c5fb3 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3806,6 +3806,7 @@ int merge_recursive_generic(struct merge_options *opt,
 			    const struct object_id *merge,
 			    int num_merge_bases,
 			    const struct object_id **merge_bases,
+			    recursive_merge_fn_t merge_fn,
 			    struct commit **result)
 {
 	int clean;
@@ -3829,8 +3830,7 @@ int merge_recursive_generic(struct merge_options *opt,
 	}
 
 	repo_hold_locked_index(opt->repo, &lock, LOCK_DIE_ON_ERROR);
-	clean = merge_recursive(opt, head_commit, next_commit, ca,
-				result);
+	clean = merge_fn(opt, head_commit, next_commit, ca, result);
 	if (clean < 0) {
 		rollback_lock_file(&lock);
 		return clean;
diff --git a/merge-recursive.h b/merge-recursive.h
index b88000e3c25..6a21f2da538 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -53,6 +53,12 @@ struct merge_options {
 	struct merge_options_internal *priv;
 };
 
+typedef int (*recursive_merge_fn_t)(struct merge_options *opt,
+				    struct commit *h1,
+				    struct commit *h2,
+				    struct commit_list *merge_bases,
+				    struct commit **result);
+
 void init_merge_options(struct merge_options *opt, struct repository *repo);
 
 /* parse the option in s and update the relevant field of opt */
@@ -105,7 +111,7 @@ int merge_recursive(struct merge_options *opt,
 
 /*
  * merge_recursive_generic can operate on trees instead of commits, by
- * wrapping the trees into virtual commits, and calling merge_recursive().
+ * wrapping the trees into virtual commits, and calling the provided merge_fn.
  * It also writes out the in-memory index to disk if the merge is successful.
  *
  * Outputs:
@@ -120,6 +126,7 @@ int merge_recursive_generic(struct merge_options *opt,
 			    const struct object_id *merge,
 			    int num_merge_bases,
 			    const struct object_id **merge_bases,
+			    recursive_merge_fn_t merge_fn,
 			    struct commit **result);
 
 #endif
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 6/7] stash: merge applied stash with merge-ort
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
                     ` (4 preceding siblings ...)
  2022-04-27 18:16   ` [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
@ 2022-04-27 18:16   ` Victoria Dye via GitGitGadget
  2022-04-27 18:16   ` [PATCH v2 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-27 18:16 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Change the merge function used in 'do_apply_stash()' from 'merge_recursive'
to 'merge_ort_recursive'. In addition to aligning with the default merge
strategy used by 'git merge' (6a5fb96672 (Change default merge backend from
recursive to ort, 2021-08-04)), this allows 'git stash <apply|pop>' to
operate without expanding the index by default. Update tests in 't1092'
verifying index expansion for 'git stash' accordingly.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 builtin/stash.c                          | 3 ++-
 t/t1092-sparse-checkout-compatibility.sh | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index 16171eb1dab..cd77d448546 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -7,6 +7,7 @@
 #include "cache-tree.h"
 #include "unpack-trees.h"
 #include "merge-recursive.h"
+#include "merge-ort-wrappers.h"
 #include "strvec.h"
 #include "run-command.h"
 #include "dir.h"
@@ -554,7 +555,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
 	bases[0] = &info->b_tree;
 
 	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
-				      merge_recursive, &result);
+				      merge_ort_recursive, &result);
 	if (ret) {
 		rerere(0);
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 85c6a56f1b7..aaf4d880dbc 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1385,7 +1385,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	ensure_not_expanded stash &&
 	ensure_not_expanded stash list &&
 	ensure_not_expanded stash show stash@{0} &&
-	! ensure_not_expanded stash apply stash@{0} &&
+	ensure_not_expanded stash apply stash@{0} &&
 	ensure_not_expanded stash drop stash@{0} &&
 
 	echo >>sparse-index/deep/new &&
@@ -1399,7 +1399,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	oid=$(git -C sparse-index stash create) &&
 	ensure_not_expanded stash store -m "test" $oid &&
 	ensure_not_expanded reset --hard &&
-	! ensure_not_expanded stash pop
+	ensure_not_expanded stash pop
 '
 
 test_expect_success 'sparse index is not expanded: diff' '
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 7/7] unpack-trees: preserve index sparsity
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
                     ` (5 preceding siblings ...)
  2022-04-27 18:16   ` [PATCH v2 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
@ 2022-04-27 18:16   ` Victoria Dye via GitGitGadget
  2022-05-06  7:46   ` [PATCH v2 0/7] Sparse index: integrate with 'git stash' Elijah Newren
  2022-05-10 23:32   ` [PATCH v3 0/6] " Victoria Dye via GitGitGadget
  8 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-04-27 18:16 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

When unpacking trees, set the default sparsity of the resultant index based
on repo settings and 'is_sparse_index_allowed()'.

Normally, when executing 'unpack_trees', the output index is marked sparse
when (and only when) it unpacks a sparse directory. However, an index may be
"sparse" even if it contains no sparse directories - when all files fall
inside the sparse-checkout definition or otherwise have SKIP_WORKTREE
disabled. Therefore, the output index may be marked "full" even when it is
"sparse", resulting in unnecessary 'ensure_full_index' calls when writing to
disk. Avoid this by setting the "default" index sparsity to match what is
expected for the repository.

As a consequence of this fix, the (non-merge) 'read-tree' performed when
applying a stash with untracked files no longer expands the index. Update
the corresponding test in 't1092'.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 2 +-
 unpack-trees.c                           | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index aaf4d880dbc..19221c14225 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1392,7 +1392,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	ensure_not_expanded stash -u &&
 	(
 		WITHOUT_UNTRACKED_TXT=1 &&
-		! ensure_not_expanded stash pop
+		ensure_not_expanded stash pop
 	) &&
 
 	ensure_not_expanded stash create &&
diff --git a/unpack-trees.c b/unpack-trees.c
index 7f528d35cc2..a1d0ff3a4d3 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -11,6 +11,7 @@
 #include "refs.h"
 #include "attr.h"
 #include "split-index.h"
+#include "sparse-index.h"
 #include "submodule.h"
 #include "submodule-config.h"
 #include "fsmonitor.h"
@@ -1839,6 +1840,11 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->result.fsmonitor_last_update =
 		xstrdup_or_null(o->src_index->fsmonitor_last_update);
 
+	if (!o->src_index->initialized &&
+	    !repo->settings.command_requires_full_index &&
+	    is_sparse_index_allowed(&o->result, 0))
+		o->result.sparse_index = 1;
+
 	/*
 	 * Sparse checkout loop #1: set NEW_SKIP_WORKTREE on existing entries
 	 */
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic'
  2022-04-27 18:16   ` [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
@ 2022-05-06  7:23     ` Elijah Newren
  2022-05-09 19:24       ` Victoria Dye
  0 siblings, 1 reply; 42+ messages in thread
From: Elijah Newren @ 2022-05-06  7:23 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Victoria Dye

On Wed, Apr 27, 2022 at 11:16 AM Victoria Dye via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Victoria Dye <vdye@github.com>
>
> Replace the hardcoded 'merge_recursive()' function used by the
> 'merge_recursive_generic()' with a caller-specific merge function. This will
> allow us to use 'merge_ort_recursive()' (and therefore avoid the index
> expansion of 'merge_recursive()') in commands that perform merges with
> 'merge_recursive_generic()', such as 'git stash pop'.
>
> Note that this patch is strictly a refactor; all callers still use
> 'merge_recursive()', and any changing to 'merge_ort_recursive()' will be
> done in a later commit.

I'm not sure if we can gut merge_recursive_generic(), but I don't
think stash should use it...

> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
>  builtin/am.c              | 2 +-
>  builtin/merge-recursive.c | 2 +-
>  builtin/stash.c           | 2 +-
>  merge-ort.c               | 3 ++-
>  merge-recursive.c         | 4 ++--
>  merge-recursive.h         | 9 ++++++++-
>  6 files changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/builtin/am.c b/builtin/am.c
> index 0f4111bafa0..6d01185d122 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1614,7 +1614,7 @@ static int fall_back_threeway(const struct am_state *state, const char *index_pa
>         if (state->quiet)
>                 o.verbosity = 0;
>
> -       if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, &result)) {
> +       if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, merge_recursive, &result)) {
>                 repo_rerere(the_repository, state->allow_rerere_autoupdate);
>                 free(their_tree_name);
>                 return error(_("Failed to merge in the changes."));
> diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
> index b9acbf5d342..687ed1e527b 100644
> --- a/builtin/merge-recursive.c
> +++ b/builtin/merge-recursive.c
> @@ -81,7 +81,7 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
>         if (o.verbosity >= 3)
>                 printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
>
> -       failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, &result);
> +       failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, merge_recursive, &result);
>
>         free(better1);
>         free(better2);
> diff --git a/builtin/stash.c b/builtin/stash.c
> index 1bfba532044..16171eb1dab 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -554,7 +554,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
>         bases[0] = &info->b_tree;
>
>         ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
> -                                     &result);
> +                                     merge_recursive, &result);
>         if (ret) {
>                 rerere(0);
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 8545354dafd..4bccdfcf355 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -4737,7 +4737,8 @@ void merge_incore_recursive(struct merge_options *opt,
>         trace2_region_enter("merge", "incore_recursive", opt->repo);
>
>         /* We set the ancestor label based on the merge_bases */
> -       assert(opt->ancestor == NULL);
> +       assert(opt->ancestor == NULL ||
> +              !strcmp(opt->ancestor, "constructed merge base"));

...and here's one of the reasons why.  The fact that
merge_recursive_generic() uses this string when exactly one merge base
is passed is something that is only correct for git-am; it is wrong
and actively misleading for git-stash since it has a real merge base
that is not internally constructed by the operation using the merge
machinery.  (The merge base it uses is something like $STASH^1, IIRC.)

In fact, this was half the coin around why merge_recursive_generic()
wasn't converted when merge-ort was written; see
https://lore.kernel.org/git/CABPp-BHW61zA+MefvWK47iVZKY97rxc2XZ-NjXzuJxEhgSLqUw@mail.gmail.com/
and https://lore.kernel.org/git/CABPp-BFr=1iVY739cfh-1Hp82x-Mny-k4y0f3zZ_YuP3PxiGfQ@mail.gmail.com/
for more details.

The use of merge_recursive_generic() by stash is also a bit weird;
most of the time, stash is going to have actual commits instead of
just trees.  But stash dereferences those commits to trees, passes
them to merge_recursive_generic(), and then merge_recursive_generic()
has to create fake commits containing those trees, because the merge
machinery wants commits.  It feels a bit like a Rube Goldberg machine.
Also, stash also always calls merge_recursive_generic() with exactly
one merge base, which together with having real commits both kind of
defeat the need for "generic".    I think stash should just use
merge_trees()/merge_incore_nonrecursive() directly (much as
sequencer.c does).  The only special case to worry about with stash is
when c_tree != HEAD^{tree}, i.e. when applying changes on top of
already present changes instead of just on top of HEAD.  But in that
case, I think stash should be the thing to create a fake commit rather
than invoking some wrapper that will create fake commits for all three
trees.

>         trace2_region_enter("merge", "merge_start", opt->repo);
>         merge_start(opt, result);
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 1ee6364e8b1..2088f5c5fb3 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -3806,6 +3806,7 @@ int merge_recursive_generic(struct merge_options *opt,
>                             const struct object_id *merge,
>                             int num_merge_bases,
>                             const struct object_id **merge_bases,
> +                           recursive_merge_fn_t merge_fn,
>                             struct commit **result)
>  {
>         int clean;
> @@ -3829,8 +3830,7 @@ int merge_recursive_generic(struct merge_options *opt,
>         }
>
>         repo_hold_locked_index(opt->repo, &lock, LOCK_DIE_ON_ERROR);
> -       clean = merge_recursive(opt, head_commit, next_commit, ca,
> -                               result);
> +       clean = merge_fn(opt, head_commit, next_commit, ca, result);
>         if (clean < 0) {
>                 rollback_lock_file(&lock);
>                 return clean;
> diff --git a/merge-recursive.h b/merge-recursive.h
> index b88000e3c25..6a21f2da538 100644
> --- a/merge-recursive.h
> +++ b/merge-recursive.h
> @@ -53,6 +53,12 @@ struct merge_options {
>         struct merge_options_internal *priv;
>  };
>
> +typedef int (*recursive_merge_fn_t)(struct merge_options *opt,
> +                                   struct commit *h1,
> +                                   struct commit *h2,
> +                                   struct commit_list *merge_bases,
> +                                   struct commit **result);
> +
>  void init_merge_options(struct merge_options *opt, struct repository *repo);
>
>  /* parse the option in s and update the relevant field of opt */
> @@ -105,7 +111,7 @@ int merge_recursive(struct merge_options *opt,
>
>  /*
>   * merge_recursive_generic can operate on trees instead of commits, by
> - * wrapping the trees into virtual commits, and calling merge_recursive().
> + * wrapping the trees into virtual commits, and calling the provided merge_fn.
>   * It also writes out the in-memory index to disk if the merge is successful.
>   *
>   * Outputs:
> @@ -120,6 +126,7 @@ int merge_recursive_generic(struct merge_options *opt,
>                             const struct object_id *merge,
>                             int num_merge_bases,
>                             const struct object_id **merge_bases,
> +                           recursive_merge_fn_t merge_fn,
>                             struct commit **result);
>
>  #endif
> --
> gitgitgadget

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 0/7] Sparse index: integrate with 'git stash'
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
                     ` (6 preceding siblings ...)
  2022-04-27 18:16   ` [PATCH v2 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
@ 2022-05-06  7:46   ` Elijah Newren
  2022-05-10 23:32   ` [PATCH v3 0/6] " Victoria Dye via GitGitGadget
  8 siblings, 0 replies; 42+ messages in thread
From: Elijah Newren @ 2022-05-06  7:46 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Victoria Dye

On Wed, Apr 27, 2022 at 11:16 AM Victoria Dye via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This series, in combination with the sparse index integrations of reset [1],
> update-index [2], checkout-index [2], clean [2], and read-tree [3], allows
> most subcommands of 'git stash' to use the sparse index end-to-end without
> index expansion.

I've read through the series.  Like Stolee, I'm pleased with how
simple some of the code changes are (much due to prior hard work in
the area) and how nicely you describe the changes.

At first when I was reading through, I was again a bit disappointed
that we're just converting the subcommands stash uses instead of
fixing stash to stop forking subprocesses, but...then I realized this
served as a forcing function of sorts to make more of Git sparse-index
compatible, so I think it's actually the better route.  We can always
make the orthogonal fixes to stash's implementation design later.  :-)

However, there is one small issue around the way merging is handled...

> Like the earlier series, this series starts with new tests ensuring
> compatibility of the sparse index with non-sparse index full and sparse
> checkouts [1/7]. Next, sparse index is trivially enabled [2/7].
> Functionally, sparse index-enabled sparse-checkouts remain compatible with
> non-sparse index sparse-checkouts, but there are still some cases where the
> index (or a temporary index) is expanded unnecessarily. These cases are
> fixed in three parts:
>
>  * First, 'git stash -u' is made sparse index-compatible by ensuring the
>    "temporary" index holding the stashed, untracked files is created as a
>    sparse index whenever possible (per repo settings &
>    'is_sparse_index_allowed()'). Patch [3/7] exposes
>    'is_sparse_index_allowed()' to files outside of 'sparse-index.c', then
>    patch [4/7] uses that function to mark the temporary index sparse when
>    appropriate.
>  * Next, 'git stash (apply|pop)' are made sparse index-compatible by
>    changing their internal "merge" function (executed via
>    'merge_recursive_generic()') from 'merge_recursive()' to
>    'merge_ort_recursive()'. This requires first allowing
>    'merge_recursive_generic()' to accept a merge function as an input
>    (rather than hardcoding use of 'merge_recursive()') in patch [5/7], then
>    changing the call in 'stash.c' to specify 'merge_ort_recursive()' in
>    patch [6/7]. See note [4] for possible alternate implementations.
>  * Finally, while patches 5 & 6 avoid index expansion for most cases of 'git
>    stash (apply|pop)', applying a stash that includes untracked files still
>    expands the index. This is a result of an internal 'read-tree' execution
>    (specifically in its 'unpack_trees' call) creating a result index that is
>    never sparse in-core, thus forcing the index to be unnecessarily
>    collapsed and re-expanded in 'do_write_locked_index()'. In patch [7/7],
>    'unpack_trees' is updated to set the default sparsity of the resultant
>    index to "sparse" if allowed by repo settings and
>    'is_sparse_index_allowed()' (similar to the change in patch 4).
>
> Performance results (from the 'p2000' tests):
>
> (git stash &&
>  git stash pop)              master            this series
> ---------------------------------------------------------------------
> full-v3                      4.07(2.42+1.34)   3.98(2.42+1.32) -2.2%
> full-v4                      4.05(2.46+1.31)   4.00(2.49+1.29) -1.2%
> sparse-v3                    7.48(4.81+2.57)   1.53(0.26+1.61) -79.5%
> sparse-v4                    7.35(4.74+2.54)   1.59(0.27+1.63) -78.4%
>
> (echo >>new &&
>  git stash -u &&
>  git stash pop)              master            this series
> ---------------------------------------------------------------------
> full-v3                      4.21(2.62+1.45)   4.11(2.55+1.44) -2.4%
> full-v4                      4.11(2.51+1.41)   4.02(2.49+1.41) -2.2%
> sparse-v3                    7.35(4.64+2.66)   1.70(0.32+1.64) -76.9%
> sparse-v4                    7.74(4.87+2.83)   1.70(0.32+1.66) -78.0%
>
>
>
> Changes since V1
> ================
>
>  * Added quotes to the "$WITHOUT_UNTRACKED_TXT" when testing for it in
>    'ensure_not_expanded' (in 't/t1092-sparse-checkout-compatibility.sh')
>  * Moved the 'stash' test in 't1092' elsewhere in the file, so that it
>    doesn't conflict (even trivially) with the also-in-flight 'git show'
>    integration
>  * Moved the 'ensure_not_expended' tests for 'checkout-index' back to
>    original location
>
> [1]
> https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@gmail.com/
> [2]
> https://lore.kernel.org/git/pull.1109.v2.git.1641924306.gitgitgadget@gmail.com/
> [3]
> https://lore.kernel.org/git/pull.1157.v3.git.1646166271.gitgitgadget@gmail.com/
> [4] I went with changing 'stash' to always use 'merge-ort' in
> 'merge_recursive_generic()' as a sort of "middle ground" between "replace
> 'merge_recursive()' with 'merge_ort_recursive()' in all of its hardcoded
> internal usage" and "only use 'merge-ort' if using a sparse index in 'git
> stash', otherwise 'merge-recursive'". The former would extend the use of
> 'merge-ort' to 'git am' and 'git merge-recursive', whereas the latter is a
> more cautious/narrowly-focused option. If anyone has any other thoughts, I'm
> interested in hearing them.

I understand that you'd want to modify and use
merge_recursive_generic() in order to make the smallest code change
possible, but merge_recursive_generic() has no equivalent in ort
because during the review of ort, we discovered that porting over
merge_recursive_generic() meant porting bugs.  We need to fix those
bugs.  However, in the case of stash, I think we can just sidestep
those bugs. stash probably only uses merge_recursive_generic() because
it was the easiest transliteration of shell (which originally invoked
git-merge-recursive), as opposed to the best conversion.  The best
conversion at the time would have been to call merge_trees() instead
(stash doesn't need recursiveness, and tends to have commits available
rather than just trees).  I'd like to see us modify stash to call
merge_incore_nonrecursive() directly (and/or merge_trees() if we're
supporting using both merge-ort and merge-recursive backends).

No need to worry about git-am or git-merge-recursive for now; we can
replace/fix/update their calls to merge_recursive_generic() later.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic'
  2022-05-06  7:23     ` Elijah Newren
@ 2022-05-09 19:24       ` Victoria Dye
  2022-05-10  7:06         ` Elijah Newren
  0 siblings, 1 reply; 42+ messages in thread
From: Victoria Dye @ 2022-05-09 19:24 UTC (permalink / raw)
  To: Elijah Newren, Victoria Dye via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano

Elijah Newren wrote:
> On Wed, Apr 27, 2022 at 11:16 AM Victoria Dye via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Victoria Dye <vdye@github.com>
>>
>> Replace the hardcoded 'merge_recursive()' function used by the
>> 'merge_recursive_generic()' with a caller-specific merge function. This will
>> allow us to use 'merge_ort_recursive()' (and therefore avoid the index
>> expansion of 'merge_recursive()') in commands that perform merges with
>> 'merge_recursive_generic()', such as 'git stash pop'.
>>
>> Note that this patch is strictly a refactor; all callers still use
>> 'merge_recursive()', and any changing to 'merge_ort_recursive()' will be
>> done in a later commit.
> 
> I'm not sure if we can gut merge_recursive_generic(), but I don't
> think stash should use it...
> 
>> Signed-off-by: Victoria Dye <vdye@github.com>
>> ---
>>  builtin/am.c              | 2 +-
>>  builtin/merge-recursive.c | 2 +-
>>  builtin/stash.c           | 2 +-
>>  merge-ort.c               | 3 ++-
>>  merge-recursive.c         | 4 ++--
>>  merge-recursive.h         | 9 ++++++++-
>>  6 files changed, 15 insertions(+), 7 deletions(-)
>>
>> diff --git a/builtin/am.c b/builtin/am.c
>> index 0f4111bafa0..6d01185d122 100644
>> --- a/builtin/am.c
>> +++ b/builtin/am.c
>> @@ -1614,7 +1614,7 @@ static int fall_back_threeway(const struct am_state *state, const char *index_pa
>>         if (state->quiet)
>>                 o.verbosity = 0;
>>
>> -       if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, &result)) {
>> +       if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, merge_recursive, &result)) {
>>                 repo_rerere(the_repository, state->allow_rerere_autoupdate);
>>                 free(their_tree_name);
>>                 return error(_("Failed to merge in the changes."));
>> diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
>> index b9acbf5d342..687ed1e527b 100644
>> --- a/builtin/merge-recursive.c
>> +++ b/builtin/merge-recursive.c
>> @@ -81,7 +81,7 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
>>         if (o.verbosity >= 3)
>>                 printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
>>
>> -       failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, &result);
>> +       failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, merge_recursive, &result);
>>
>>         free(better1);
>>         free(better2);
>> diff --git a/builtin/stash.c b/builtin/stash.c
>> index 1bfba532044..16171eb1dab 100644
>> --- a/builtin/stash.c
>> +++ b/builtin/stash.c
>> @@ -554,7 +554,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
>>         bases[0] = &info->b_tree;
>>
>>         ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
>> -                                     &result);
>> +                                     merge_recursive, &result);
>>         if (ret) {
>>                 rerere(0);
>>
>> diff --git a/merge-ort.c b/merge-ort.c
>> index 8545354dafd..4bccdfcf355 100644
>> --- a/merge-ort.c
>> +++ b/merge-ort.c
>> @@ -4737,7 +4737,8 @@ void merge_incore_recursive(struct merge_options *opt,
>>         trace2_region_enter("merge", "incore_recursive", opt->repo);
>>
>>         /* We set the ancestor label based on the merge_bases */
>> -       assert(opt->ancestor == NULL);
>> +       assert(opt->ancestor == NULL ||
>> +              !strcmp(opt->ancestor, "constructed merge base"));
> 
> ...and here's one of the reasons why.  The fact that
> merge_recursive_generic() uses this string when exactly one merge base
> is passed is something that is only correct for git-am; it is wrong
> and actively misleading for git-stash since it has a real merge base
> that is not internally constructed by the operation using the merge
> machinery.  (The merge base it uses is something like $STASH^1, IIRC.)
> 
> In fact, this was half the coin around why merge_recursive_generic()
> wasn't converted when merge-ort was written; see
> https://lore.kernel.org/git/CABPp-BHW61zA+MefvWK47iVZKY97rxc2XZ-NjXzuJxEhgSLqUw@mail.gmail.com/
> and https://lore.kernel.org/git/CABPp-BFr=1iVY739cfh-1Hp82x-Mny-k4y0f3zZ_YuP3PxiGfQ@mail.gmail.com/
> for more details.
> 

All of that makes sense, thanks for the context!

> The use of merge_recursive_generic() by stash is also a bit weird;
> most of the time, stash is going to have actual commits instead of
> just trees.  But stash dereferences those commits to trees, passes
> them to merge_recursive_generic(), and then merge_recursive_generic()
> has to create fake commits containing those trees, because the merge
> machinery wants commits.  It feels a bit like a Rube Goldberg machine.
> Also, stash also always calls merge_recursive_generic() with exactly
> one merge base, which together with having real commits both kind of
> defeat the need for "generic".    I think stash should just use
> merge_trees()/merge_incore_nonrecursive() directly (much as
> sequencer.c does).  The only special case to worry about with stash is
> when c_tree != HEAD^{tree}, i.e. when applying changes on top of
> already present changes instead of just on top of HEAD.  But in that
> case, I think stash should be the thing to create a fake commit rather
> than invoking some wrapper that will create fake commits for all three
> trees.
> 

I'm a bit confused about this. The non-recursive merge functions
('merge_trees()' & 'merge_ort_nonrecursive()' or the lower-level
'merge_incore_nonrecursive()') merge trees, not commits, so performing a
non-recursive merge requires dereferencing commits to trees anyway. I think
I agree with your other message [1] that the 'stash' merge doesn't need to
merge recursively, but that would mean it also doesn't use the commits
*directly* (i.e., as arguments in the merge). 

Apologies if I'm missing something obvious, but I want to make sure I
understand your suggestion.

[1] https://lore.kernel.org/git/CABPp-BFANwZn73w8zrVySB7mh0bQQBdGJjBuSJy50UOeyYT6aA@mail.gmail.com/ 


>>         trace2_region_enter("merge", "merge_start", opt->repo);
>>         merge_start(opt, result);
>> diff --git a/merge-recursive.c b/merge-recursive.c
>> index 1ee6364e8b1..2088f5c5fb3 100644
>> --- a/merge-recursive.c
>> +++ b/merge-recursive.c
>> @@ -3806,6 +3806,7 @@ int merge_recursive_generic(struct merge_options *opt,
>>                             const struct object_id *merge,
>>                             int num_merge_bases,
>>                             const struct object_id **merge_bases,
>> +                           recursive_merge_fn_t merge_fn,
>>                             struct commit **result)
>>  {
>>         int clean;
>> @@ -3829,8 +3830,7 @@ int merge_recursive_generic(struct merge_options *opt,
>>         }
>>
>>         repo_hold_locked_index(opt->repo, &lock, LOCK_DIE_ON_ERROR);
>> -       clean = merge_recursive(opt, head_commit, next_commit, ca,
>> -                               result);
>> +       clean = merge_fn(opt, head_commit, next_commit, ca, result);
>>         if (clean < 0) {
>>                 rollback_lock_file(&lock);
>>                 return clean;
>> diff --git a/merge-recursive.h b/merge-recursive.h
>> index b88000e3c25..6a21f2da538 100644
>> --- a/merge-recursive.h
>> +++ b/merge-recursive.h
>> @@ -53,6 +53,12 @@ struct merge_options {
>>         struct merge_options_internal *priv;
>>  };
>>
>> +typedef int (*recursive_merge_fn_t)(struct merge_options *opt,
>> +                                   struct commit *h1,
>> +                                   struct commit *h2,
>> +                                   struct commit_list *merge_bases,
>> +                                   struct commit **result);
>> +
>>  void init_merge_options(struct merge_options *opt, struct repository *repo);
>>
>>  /* parse the option in s and update the relevant field of opt */
>> @@ -105,7 +111,7 @@ int merge_recursive(struct merge_options *opt,
>>
>>  /*
>>   * merge_recursive_generic can operate on trees instead of commits, by
>> - * wrapping the trees into virtual commits, and calling merge_recursive().
>> + * wrapping the trees into virtual commits, and calling the provided merge_fn.
>>   * It also writes out the in-memory index to disk if the merge is successful.
>>   *
>>   * Outputs:
>> @@ -120,6 +126,7 @@ int merge_recursive_generic(struct merge_options *opt,
>>                             const struct object_id *merge,
>>                             int num_merge_bases,
>>                             const struct object_id **merge_bases,
>> +                           recursive_merge_fn_t merge_fn,
>>                             struct commit **result);
>>
>>  #endif
>> --
>> gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic'
  2022-05-09 19:24       ` Victoria Dye
@ 2022-05-10  7:06         ` Elijah Newren
  0 siblings, 0 replies; 42+ messages in thread
From: Elijah Newren @ 2022-05-10  7:06 UTC (permalink / raw)
  To: Victoria Dye
  Cc: Victoria Dye via GitGitGadget, Git Mailing List, Derrick Stolee,
	Junio C Hamano

On Mon, May 9, 2022 at 12:24 PM Victoria Dye <vdye@github.com> wrote:
>
> Elijah Newren wrote:
> > On Wed, Apr 27, 2022 at 11:16 AM Victoria Dye via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: Victoria Dye <vdye@github.com>
> >>
> >> Replace the hardcoded 'merge_recursive()' function used by the
> >> 'merge_recursive_generic()' with a caller-specific merge function. This will
> >> allow us to use 'merge_ort_recursive()' (and therefore avoid the index
> >> expansion of 'merge_recursive()') in commands that perform merges with
> >> 'merge_recursive_generic()', such as 'git stash pop'.
> >>
> >> Note that this patch is strictly a refactor; all callers still use
> >> 'merge_recursive()', and any changing to 'merge_ort_recursive()' will be
> >> done in a later commit.
> >
> > I'm not sure if we can gut merge_recursive_generic(), but I don't
> > think stash should use it...
> >
> >> Signed-off-by: Victoria Dye <vdye@github.com>
> >> ---
> >>  builtin/am.c              | 2 +-
> >>  builtin/merge-recursive.c | 2 +-
> >>  builtin/stash.c           | 2 +-
> >>  merge-ort.c               | 3 ++-
> >>  merge-recursive.c         | 4 ++--
> >>  merge-recursive.h         | 9 ++++++++-
> >>  6 files changed, 15 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/builtin/am.c b/builtin/am.c
> >> index 0f4111bafa0..6d01185d122 100644
> >> --- a/builtin/am.c
> >> +++ b/builtin/am.c
> >> @@ -1614,7 +1614,7 @@ static int fall_back_threeway(const struct am_state *state, const char *index_pa
> >>         if (state->quiet)
> >>                 o.verbosity = 0;
> >>
> >> -       if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, &result)) {
> >> +       if (merge_recursive_generic(&o, &our_tree, &their_tree, 1, bases, merge_recursive, &result)) {
> >>                 repo_rerere(the_repository, state->allow_rerere_autoupdate);
> >>                 free(their_tree_name);
> >>                 return error(_("Failed to merge in the changes."));
> >> diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
> >> index b9acbf5d342..687ed1e527b 100644
> >> --- a/builtin/merge-recursive.c
> >> +++ b/builtin/merge-recursive.c
> >> @@ -81,7 +81,7 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
> >>         if (o.verbosity >= 3)
> >>                 printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
> >>
> >> -       failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, &result);
> >> +       failed = merge_recursive_generic(&o, &h1, &h2, bases_count, bases, merge_recursive, &result);
> >>
> >>         free(better1);
> >>         free(better2);
> >> diff --git a/builtin/stash.c b/builtin/stash.c
> >> index 1bfba532044..16171eb1dab 100644
> >> --- a/builtin/stash.c
> >> +++ b/builtin/stash.c
> >> @@ -554,7 +554,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
> >>         bases[0] = &info->b_tree;
> >>
> >>         ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
> >> -                                     &result);
> >> +                                     merge_recursive, &result);
> >>         if (ret) {
> >>                 rerere(0);
> >>
> >> diff --git a/merge-ort.c b/merge-ort.c
> >> index 8545354dafd..4bccdfcf355 100644
> >> --- a/merge-ort.c
> >> +++ b/merge-ort.c
> >> @@ -4737,7 +4737,8 @@ void merge_incore_recursive(struct merge_options *opt,
> >>         trace2_region_enter("merge", "incore_recursive", opt->repo);
> >>
> >>         /* We set the ancestor label based on the merge_bases */
> >> -       assert(opt->ancestor == NULL);
> >> +       assert(opt->ancestor == NULL ||
> >> +              !strcmp(opt->ancestor, "constructed merge base"));
> >
> > ...and here's one of the reasons why.  The fact that
> > merge_recursive_generic() uses this string when exactly one merge base
> > is passed is something that is only correct for git-am; it is wrong
> > and actively misleading for git-stash since it has a real merge base
> > that is not internally constructed by the operation using the merge
> > machinery.  (The merge base it uses is something like $STASH^1, IIRC.)
> >
> > In fact, this was half the coin around why merge_recursive_generic()
> > wasn't converted when merge-ort was written; see
> > https://lore.kernel.org/git/CABPp-BHW61zA+MefvWK47iVZKY97rxc2XZ-NjXzuJxEhgSLqUw@mail.gmail.com/
> > and https://lore.kernel.org/git/CABPp-BFr=1iVY739cfh-1Hp82x-Mny-k4y0f3zZ_YuP3PxiGfQ@mail.gmail.com/
> > for more details.
> >
>
> All of that makes sense, thanks for the context!
>
> > The use of merge_recursive_generic() by stash is also a bit weird;
> > most of the time, stash is going to have actual commits instead of
> > just trees.  But stash dereferences those commits to trees, passes
> > them to merge_recursive_generic(), and then merge_recursive_generic()
> > has to create fake commits containing those trees, because the merge
> > machinery wants commits.  It feels a bit like a Rube Goldberg machine.
> > Also, stash also always calls merge_recursive_generic() with exactly
> > one merge base, which together with having real commits both kind of
> > defeat the need for "generic".    I think stash should just use
> > merge_trees()/merge_incore_nonrecursive() directly (much as
> > sequencer.c does).  The only special case to worry about with stash is
> > when c_tree != HEAD^{tree}, i.e. when applying changes on top of
> > already present changes instead of just on top of HEAD.  But in that
> > case, I think stash should be the thing to create a fake commit rather
> > than invoking some wrapper that will create fake commits for all three
> > trees.
> >
>
> I'm a bit confused about this. The non-recursive merge functions
> ('merge_trees()' & 'merge_ort_nonrecursive()' or the lower-level
> 'merge_incore_nonrecursive()') merge trees, not commits, so performing a
> non-recursive merge requires dereferencing commits to trees anyway. I think
> I agree with your other message [1] that the 'stash' merge doesn't need to
> merge recursively, but that would mean it also doesn't use the commits
> *directly* (i.e., as arguments in the merge).
>
> Apologies if I'm missing something obvious, but I want to make sure I
> understand your suggestion.
>
> [1] https://lore.kernel.org/git/CABPp-BFANwZn73w8zrVySB7mh0bQQBdGJjBuSJy50UOeyYT6aA@mail.gmail.com/

Oh, right, it's only the recursive merge that needs commits (so that
it can find ancestors and ancestors of ancestors, etc.).  So, ignore
my comments about making fake commits; that's not needed.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 0/6] Sparse index: integrate with 'git stash'
  2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
                     ` (7 preceding siblings ...)
  2022-05-06  7:46   ` [PATCH v2 0/7] Sparse index: integrate with 'git stash' Elijah Newren
@ 2022-05-10 23:32   ` Victoria Dye via GitGitGadget
  2022-05-10 23:32     ` [PATCH v3 1/6] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
                       ` (5 more replies)
  8 siblings, 6 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-05-10 23:32 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye

This series, in combination with the sparse index integrations of reset [1],
update-index [2], checkout-index [2], clean [2], and read-tree [3], allows
most subcommands of 'git stash' to use the sparse index end-to-end without
index expansion.

Like the earlier series, this series starts with new tests ensuring
compatibility of the sparse index with non-sparse index full and sparse
checkouts [1/6]. Next, sparse index is trivially enabled [2/6].
Functionally, sparse index-enabled sparse-checkouts remain compatible with
non-sparse index sparse-checkouts, but there are still some cases where the
index (or a temporary index) is expanded unnecessarily. These cases are
fixed in three parts:

 * First, 'git stash -u' is made sparse index-compatible by ensuring the
   "temporary" index holding the stashed, untracked files is created as a
   sparse index whenever possible (per repo settings &
   'is_sparse_index_allowed()'). Patch [3/6] exposes
   'is_sparse_index_allowed()' to files outside of 'sparse-index.c', then
   patch [4/6] uses that function to mark the temporary index sparse when
   appropriate.
 * Next, 'git stash (apply|pop)' are made sparse index-compatible by
   changing their internal merge function from 'merge_recursive_generic()'
   (which constructs "fake" commits from the working tree, stash, and stash
   base) to 'merge_ort_nonrecursive()' (which operates on the trees
   directly) in patch [5/6]. The use of the non-recursive 'merge-ort' helps
   us to avoid sparse index expansion associated with 'merge-recursive', as
   well as avoid the unused/unnecessary complexity of a recursive merge.
 * Finally, while patch 5 skips index expansion for most cases of 'git stash
   (apply|pop)', applying a stash that includes untracked files still
   expands the index. This is a result of an internal 'read-tree' execution
   (specifically in its 'unpack_trees' call) creating a result index that is
   never sparse in-core, thus forcing the index to be unnecessarily
   collapsed and re-expanded in 'do_write_locked_index()'. In patch [6/6],
   'unpack_trees' is updated to set the default sparsity of the resultant
   index to "sparse" if allowed by repo settings and
   'is_sparse_index_allowed()' (similar to the change in patch 4).

Performance results (from the 'p2000' tests):

(git stash &&
 git stash pop)              master            this series
---------------------------------------------------------------------
full-v3                      4.07(2.42+1.34)   3.98(2.42+1.32) -2.2%
full-v4                      4.05(2.46+1.31)   4.00(2.49+1.29) -1.2%
sparse-v3                    7.48(4.81+2.57)   1.53(0.26+1.61) -79.5%
sparse-v4                    7.35(4.74+2.54)   1.59(0.27+1.63) -78.4%

(echo >>new &&
 git stash -u &&
 git stash pop)              master            this series
---------------------------------------------------------------------
full-v3                      4.21(2.62+1.45)   4.11(2.55+1.44) -2.4%
full-v4                      4.11(2.51+1.41)   4.02(2.49+1.41) -2.2%
sparse-v3                    7.35(4.64+2.66)   1.70(0.32+1.64) -76.9%
sparse-v4                    7.74(4.87+2.83)   1.70(0.32+1.66) -78.0%



Changes since V2
================

 * Replaced use of 'merge_recursive_generic' with 'merge_ort_nonrecursive'
   in 'do_apply_stash()'
 * Rebased on top of 'master'


Changes since V1
================

 * Added quotes to the "$WITHOUT_UNTRACKED_TXT" when testing for it in
   'ensure_not_expanded' (in 't/t1092-sparse-checkout-compatibility.sh')
 * Moved the 'stash' test in 't1092' elsewhere in the file, so that it
   doesn't conflict (even trivially) with the also-in-flight 'git show'
   integration
 * Moved the 'ensure_not_expended' tests for 'checkout-index' back to
   original location

[1]
https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@gmail.com/
[2]
https://lore.kernel.org/git/pull.1109.v2.git.1641924306.gitgitgadget@gmail.com/
[3]
https://lore.kernel.org/git/pull.1157.v3.git.1646166271.gitgitgadget@gmail.com/

Thanks!

-Victoria

Victoria Dye (6):
  stash: expand sparse-checkout compatibility testing
  stash: integrate with sparse index
  sparse-index: expose 'is_sparse_index_allowed()'
  read-cache: set sparsity when index is new
  stash: apply stash using 'merge_ort_nonrecursive()'
  unpack-trees: preserve index sparsity

 builtin/stash.c                          | 33 ++++++++--
 read-cache.c                             | 18 +++++-
 sparse-index.c                           |  2 +-
 sparse-index.h                           |  1 +
 t/perf/p2000-sparse-operations.sh        |  2 +
 t/t1092-sparse-checkout-compatibility.sh | 78 +++++++++++++++++++++++-
 unpack-trees.c                           |  6 ++
 7 files changed, 131 insertions(+), 9 deletions(-)


base-commit: 0f828332d5ac36fc63b7d8202652efa152809856
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1171%2Fvdye%2Fsparse%2Fstash-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1171/vdye/sparse/stash-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1171

Range-diff vs v2:

 1:  8ea986cb249 = 1:  4e0a09f7a3c stash: expand sparse-checkout compatibility testing
 2:  b3e3f0298fb = 2:  7be484a8c0b stash: integrate with sparse index
 3:  73f04e95400 = 3:  6f00fca9267 sparse-index: expose 'is_sparse_index_allowed()'
 4:  42550f39a75 = 4:  bb092c075f4 read-cache: set sparsity when index is new
 5:  4537d473b93 < -:  ----------- merge-recursive: add merge function arg to 'merge_recursive_generic'
 6:  22fee0732ad ! 5:  e733c2fd9f4 stash: merge applied stash with merge-ort
     @@ Metadata
      Author: Victoria Dye <vdye@github.com>
      
       ## Commit message ##
     -    stash: merge applied stash with merge-ort
     +    stash: apply stash using 'merge_ort_nonrecursive()'
      
     -    Change the merge function used in 'do_apply_stash()' from 'merge_recursive'
     -    to 'merge_ort_recursive'. In addition to aligning with the default merge
     -    strategy used by 'git merge' (6a5fb96672 (Change default merge backend from
     -    recursive to ort, 2021-08-04)), this allows 'git stash <apply|pop>' to
     -    operate without expanding the index by default. Update tests in 't1092'
     -    verifying index expansion for 'git stash' accordingly.
     +    Update 'stash' to use 'merge_ort_nonrecursive()' to apply a stash to the
     +    current working tree. When 'git stash apply' was converted from its shell
     +    script implementation to a builtin in 8a0fc8d19d (stash: convert apply to
     +    builtin, 2019-02-25), 'merge_recursive_generic()' was used to merge a stash
     +    into the working tree as part of 'git stash (apply|pop)'. However, with the
     +    single merge base used in 'do_apply_stash()', the commit wrapping done by
     +    'merge_recursive_generic()' is not only unnecessary, but misleading (the
     +    *real* merge base is labeled "constructed merge base"). Therefore, a
     +    non-recursive merge of the working tree, stashed tree, and stash base tree
     +    is more appropriate.
     +
     +    There are two options for a non-recursive merge-then-update-worktree
     +    function: 'merge_trees()' and 'merge_ort_nonrecursive()'. Use
     +    'merge_ort_nonrecursive()' to align with the default merge strategy used by
     +    'git merge' (6a5fb96672 (Change default merge backend from recursive to ort,
     +    2021-08-04)) and, because merge-ort does not operate in-place on the index,
     +    avoid unnecessary index expansion. Update tests in 't1092' verifying index
     +    expansion for 'git stash' accordingly.
      
          Signed-off-by: Victoria Dye <vdye@github.com>
      
     @@ builtin/stash.c
       #include "strvec.h"
       #include "run-command.h"
       #include "dir.h"
     +@@ builtin/stash.c: static void unstage_changes_unless_new(struct object_id *orig_tree)
     + static int do_apply_stash(const char *prefix, struct stash_info *info,
     + 			  int index, int quiet)
     + {
     +-	int ret;
     ++	int clean, ret;
     + 	int has_index = index;
     + 	struct merge_options o;
     + 	struct object_id c_tree;
     + 	struct object_id index_tree;
     +-	struct commit *result;
     +-	const struct object_id *bases[1];
     ++	struct tree *head, *merge, *merge_base;
     ++	struct lock_file lock = LOCK_INIT;
     + 
     + 	read_cache_preload(NULL);
     + 	if (refresh_and_write_cache(REFRESH_QUIET, 0, 0))
     +@@ builtin/stash.c: static int do_apply_stash(const char *prefix, struct stash_info *info,
     + 
     + 	o.branch1 = "Updated upstream";
     + 	o.branch2 = "Stashed changes";
     ++	o.ancestor = "Stash base";
     + 
     + 	if (oideq(&info->b_tree, &c_tree))
     + 		o.branch1 = "Version stash was based on";
      @@ builtin/stash.c: static int do_apply_stash(const char *prefix, struct stash_info *info,
     - 	bases[0] = &info->b_tree;
     + 	if (o.verbosity >= 3)
     + 		printf_ln(_("Merging %s with %s"), o.branch1, o.branch2);
     + 
     +-	bases[0] = &info->b_tree;
     ++	head = lookup_tree(o.repo, &c_tree);
     ++	merge = lookup_tree(o.repo, &info->w_tree);
     ++	merge_base = lookup_tree(o.repo, &info->b_tree);
     ++
     ++	repo_hold_locked_index(o.repo, &lock, LOCK_DIE_ON_ERROR);
     ++	clean = merge_ort_nonrecursive(&o, head, merge, merge_base);
     ++
     ++	/*
     ++	 * If 'clean' >= 0, reverse the value for 'ret' so 'ret' is 0 when the
     ++	 * merge was clean, and nonzero if the merge was unclean or encountered
     ++	 * an error.
     ++	 */
     ++	ret = clean >= 0 ? !clean : clean;
     ++
     ++	if (ret < 0)
     ++		rollback_lock_file(&lock);
     ++	else if (write_locked_index(o.repo->index, &lock,
     ++				      COMMIT_LOCK | SKIP_IF_UNCHANGED))
     ++		ret = error(_("could not write index"));
       
     - 	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
     --				      merge_recursive, &result);
     -+				      merge_ort_recursive, &result);
     +-	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
     +-				      &result);
       	if (ret) {
       		rerere(0);
       
 7:  3179018a8cb = 6:  4b4c38fcc03 unpack-trees: preserve index sparsity

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 1/6] stash: expand sparse-checkout compatibility testing
  2022-05-10 23:32   ` [PATCH v3 0/6] " Victoria Dye via GitGitGadget
@ 2022-05-10 23:32     ` Victoria Dye via GitGitGadget
  2022-05-10 23:32     ` [PATCH v3 2/6] stash: integrate with sparse index Victoria Dye via GitGitGadget
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-05-10 23:32 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Add tests verifying expected 'git stash' behavior in
't1092-sparse-checkout-compatibility'. These cases establish the expected
behavior of 'git stash' in a sparse-checkout and verify consistency both
with and without a sparse index. Although no sparse index compatibility has
been integrated into 'git stash' yet, the tests are all 'expect_success' -
we don't want the cone-mode sparse-checkout behavior to change depending on
whether it is using a sparse index or not. Therefore, we expect these tests
to continue passing once sparse index is integrated with 'git stash'.

Additionally, add performance test cases for 'git stash' both with and
without untracked files. Note that, unlike the other tests in
'p2000-sparse-operations.sh', the tests added for 'stash' are combination
operations. This is done to ensure the stash/unstash is not blocked by the
modification of '$SPARSE_CONE/a' performed as part of 'test_perf_on_all'.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/perf/p2000-sparse-operations.sh        |  2 +
 t/t1092-sparse-checkout-compatibility.sh | 49 ++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 382716cfca9..76710cbef35 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -106,6 +106,8 @@ test_perf_on_all () {
 }
 
 test_perf_on_all git status
+test_perf_on_all 'git stash && git stash pop'
+test_perf_on_all 'echo >>new && git stash -u && git stash pop'
 test_perf_on_all git add -A
 test_perf_on_all git add .
 test_perf_on_all git commit -a -m A
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 236ab530284..86312b30444 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1034,6 +1034,55 @@ test_expect_success 'cherry-pick with conflicts' '
 	test_all_match test_must_fail git cherry-pick to-cherry-pick
 '
 
+test_expect_success 'stash' '
+	init_repos &&
+
+	write_script edit-contents <<-\EOF &&
+	echo text >>$1
+	EOF
+
+	# Stash a sparse directory (folder1)
+	test_all_match git checkout -b test-branch rename-base &&
+	test_all_match git reset --soft rename-out-to-out &&
+	test_all_match git stash &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Apply the sparse directory stash without reinstating the index
+	test_all_match git stash apply -q &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Reset to state where stash can be applied
+	test_sparse_match git sparse-checkout reapply &&
+	test_all_match git reset --hard rename-out-to-out &&
+
+	# Apply the sparse directory stash *with* reinstating the index
+	test_all_match git stash apply --index -q &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Reset to state where we will get a conflict applying the stash
+	test_sparse_match git sparse-checkout reapply &&
+	test_all_match git reset --hard update-folder1 &&
+
+	# Apply the sparse directory stash with conflicts
+	test_all_match test_must_fail git stash apply --index -q &&
+	test_all_match test_must_fail git stash apply -q &&
+	test_all_match git status --porcelain=v2 &&
+
+	# Reset to base branch
+	test_sparse_match git sparse-checkout reapply &&
+	test_all_match git reset --hard base &&
+
+	# Stash & unstash an untracked file outside of the sparse checkout
+	# definition.
+	run_on_sparse mkdir -p folder1 &&
+	run_on_all ../edit-contents folder1/new &&
+	test_all_match git stash -u &&
+	test_all_match git status --porcelain=v2 &&
+
+	test_all_match git stash pop -q &&
+	test_all_match git status --porcelain=v2
+'
+
 test_expect_success 'checkout-index inside sparse definition' '
 	init_repos &&
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 2/6] stash: integrate with sparse index
  2022-05-10 23:32   ` [PATCH v3 0/6] " Victoria Dye via GitGitGadget
  2022-05-10 23:32     ` [PATCH v3 1/6] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
@ 2022-05-10 23:32     ` Victoria Dye via GitGitGadget
  2022-05-10 23:32     ` [PATCH v3 3/6] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-05-10 23:32 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.

With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.

Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 builtin/stash.c                          |  3 +++
 t/t1092-sparse-checkout-compatibility.sh | 29 +++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index 0c7b6a95882..1bfba532044 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1770,6 +1770,9 @@ int cmd_stash(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, git_stash_usage,
 			     PARSE_OPT_KEEP_UNKNOWN | PARSE_OPT_KEEP_DASHDASH);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	index_file = get_index_file();
 	strbuf_addf(&stash_index_path, "%s.stash.%" PRIuMAX, index_file,
 		    (uintmax_t)pid);
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 86312b30444..75d844cd71d 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1271,7 +1271,10 @@ test_expect_success 'index.sparse disabled inline uses full index' '
 
 ensure_not_expanded () {
 	rm -f trace2.txt &&
-	echo >>sparse-index/untracked.txt &&
+	if test -z "$WITHOUT_UNTRACKED_TXT"
+	then
+		echo >>sparse-index/untracked.txt
+	fi &&
 
 	if test "$1" = "!"
 	then
@@ -1375,6 +1378,30 @@ test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
 	)
 '
 
+test_expect_success 'sparse-index is not expanded: stash' '
+	init_repos &&
+
+	echo >>sparse-index/a &&
+	ensure_not_expanded stash &&
+	ensure_not_expanded stash list &&
+	ensure_not_expanded stash show stash@{0} &&
+	! ensure_not_expanded stash apply stash@{0} &&
+	ensure_not_expanded stash drop stash@{0} &&
+
+	echo >>sparse-index/deep/new &&
+	! ensure_not_expanded stash -u &&
+	(
+		WITHOUT_UNTRACKED_TXT=1 &&
+		! ensure_not_expanded stash pop
+	) &&
+
+	ensure_not_expanded stash create &&
+	oid=$(git -C sparse-index stash create) &&
+	ensure_not_expanded stash store -m "test" $oid &&
+	ensure_not_expanded reset --hard &&
+	! ensure_not_expanded stash pop
+'
+
 test_expect_success 'sparse index is not expanded: diff' '
 	init_repos &&
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 3/6] sparse-index: expose 'is_sparse_index_allowed()'
  2022-05-10 23:32   ` [PATCH v3 0/6] " Victoria Dye via GitGitGadget
  2022-05-10 23:32     ` [PATCH v3 1/6] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
  2022-05-10 23:32     ` [PATCH v3 2/6] stash: integrate with sparse index Victoria Dye via GitGitGadget
@ 2022-05-10 23:32     ` Victoria Dye via GitGitGadget
  2022-05-10 23:32     ` [PATCH v3 4/6] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-05-10 23:32 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Expose 'is_sparse_index_allowed()' publicly so that it may be used by
callers outside of 'sparse-index.c'. While no such callers exist yet, it
will be used in a subsequent commit.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 sparse-index.c | 2 +-
 sparse-index.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/sparse-index.c b/sparse-index.c
index 8636af72de5..ffbab7d35f1 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -118,7 +118,7 @@ static int index_has_unmerged_entries(struct index_state *istate)
 	return 0;
 }
 
-static int is_sparse_index_allowed(struct index_state *istate, int flags)
+int is_sparse_index_allowed(struct index_state *istate, int flags)
 {
 	if (!core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
diff --git a/sparse-index.h b/sparse-index.h
index 633d4fb7e31..f57c65d972f 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -3,6 +3,7 @@
 
 struct index_state;
 #define SPARSE_INDEX_MEMORY_ONLY (1 << 0)
+int is_sparse_index_allowed(struct index_state *istate, int flags);
 int convert_to_sparse(struct index_state *istate, int flags);
 void ensure_correct_sparsity(struct index_state *istate);
 void clear_skip_worktree_from_present_files(struct index_state *istate);
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 4/6] read-cache: set sparsity when index is new
  2022-05-10 23:32   ` [PATCH v3 0/6] " Victoria Dye via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-05-10 23:32     ` [PATCH v3 3/6] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
@ 2022-05-10 23:32     ` Victoria Dye via GitGitGadget
  2022-05-10 23:32     ` [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()' Victoria Dye via GitGitGadget
  2022-05-10 23:32     ` [PATCH v3 6/6] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
  5 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-05-10 23:32 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

When the index read in 'do_read_index()' does not exist on-disk, mark the
index "sparse" if the executing command does not require a full index and
sparse index is otherwise enabled.

Some commands (such as 'git stash -u') implicitly create a new index (when
the 'GIT_INDEX_FILE' variable points to a non-existent file) and perform
some operation on it. However, when this index is created, it isn't created
with the same sparsity settings as the repo index. As a result, while these
indexes may be sparse during the operation, they are always expanded before
being written to disk. We can avoid that expansion by defaulting the index
to "sparse", in which case it will only be expanded if the full index is
needed.

Note that the function 'set_new_index_sparsity()' is created despite having
only a single caller because additional callers will be added in a
subsequent patch.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 read-cache.c                             | 18 +++++++++++++++++-
 t/t1092-sparse-checkout-compatibility.sh |  2 +-
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 4df97e185e9..60355f5ad6a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2260,6 +2260,20 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con
 	return consumed;
 }
 
+static void set_new_index_sparsity(struct index_state *istate)
+{
+	/*
+	 * If the index's repo exists, mark it sparse according to
+	 * repo settings.
+	 */
+	if (istate->repo) {
+		prepare_repo_settings(istate->repo);
+		if (!istate->repo->settings.command_requires_full_index &&
+		    is_sparse_index_allowed(istate, 0))
+			istate->sparse_index = 1;
+	}
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int do_read_index(struct index_state *istate, const char *path, int must_exist)
 {
@@ -2281,8 +2295,10 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	istate->timestamp.nsec = 0;
 	fd = open(path, O_RDONLY);
 	if (fd < 0) {
-		if (!must_exist && errno == ENOENT)
+		if (!must_exist && errno == ENOENT) {
+			set_new_index_sparsity(istate);
 			return 0;
+		}
 		die_errno(_("%s: index file open failed"), path);
 	}
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 75d844cd71d..85c6a56f1b7 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1389,7 +1389,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	ensure_not_expanded stash drop stash@{0} &&
 
 	echo >>sparse-index/deep/new &&
-	! ensure_not_expanded stash -u &&
+	ensure_not_expanded stash -u &&
 	(
 		WITHOUT_UNTRACKED_TXT=1 &&
 		! ensure_not_expanded stash pop
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()'
  2022-05-10 23:32   ` [PATCH v3 0/6] " Victoria Dye via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-05-10 23:32     ` [PATCH v3 4/6] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
@ 2022-05-10 23:32     ` Victoria Dye via GitGitGadget
  2022-05-11  0:26       ` Junio C Hamano
                         ` (2 more replies)
  2022-05-10 23:32     ` [PATCH v3 6/6] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
  5 siblings, 3 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-05-10 23:32 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

Update 'stash' to use 'merge_ort_nonrecursive()' to apply a stash to the
current working tree. When 'git stash apply' was converted from its shell
script implementation to a builtin in 8a0fc8d19d (stash: convert apply to
builtin, 2019-02-25), 'merge_recursive_generic()' was used to merge a stash
into the working tree as part of 'git stash (apply|pop)'. However, with the
single merge base used in 'do_apply_stash()', the commit wrapping done by
'merge_recursive_generic()' is not only unnecessary, but misleading (the
*real* merge base is labeled "constructed merge base"). Therefore, a
non-recursive merge of the working tree, stashed tree, and stash base tree
is more appropriate.

There are two options for a non-recursive merge-then-update-worktree
function: 'merge_trees()' and 'merge_ort_nonrecursive()'. Use
'merge_ort_nonrecursive()' to align with the default merge strategy used by
'git merge' (6a5fb96672 (Change default merge backend from recursive to ort,
2021-08-04)) and, because merge-ort does not operate in-place on the index,
avoid unnecessary index expansion. Update tests in 't1092' verifying index
expansion for 'git stash' accordingly.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 builtin/stash.c                          | 30 +++++++++++++++++++-----
 t/t1092-sparse-checkout-compatibility.sh |  4 ++--
 2 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index 1bfba532044..3fe549f7d3c 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -7,6 +7,7 @@
 #include "cache-tree.h"
 #include "unpack-trees.h"
 #include "merge-recursive.h"
+#include "merge-ort-wrappers.h"
 #include "strvec.h"
 #include "run-command.h"
 #include "dir.h"
@@ -492,13 +493,13 @@ static void unstage_changes_unless_new(struct object_id *orig_tree)
 static int do_apply_stash(const char *prefix, struct stash_info *info,
 			  int index, int quiet)
 {
-	int ret;
+	int clean, ret;
 	int has_index = index;
 	struct merge_options o;
 	struct object_id c_tree;
 	struct object_id index_tree;
-	struct commit *result;
-	const struct object_id *bases[1];
+	struct tree *head, *merge, *merge_base;
+	struct lock_file lock = LOCK_INIT;
 
 	read_cache_preload(NULL);
 	if (refresh_and_write_cache(REFRESH_QUIET, 0, 0))
@@ -541,6 +542,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
 
 	o.branch1 = "Updated upstream";
 	o.branch2 = "Stashed changes";
+	o.ancestor = "Stash base";
 
 	if (oideq(&info->b_tree, &c_tree))
 		o.branch1 = "Version stash was based on";
@@ -551,10 +553,26 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
 	if (o.verbosity >= 3)
 		printf_ln(_("Merging %s with %s"), o.branch1, o.branch2);
 
-	bases[0] = &info->b_tree;
+	head = lookup_tree(o.repo, &c_tree);
+	merge = lookup_tree(o.repo, &info->w_tree);
+	merge_base = lookup_tree(o.repo, &info->b_tree);
+
+	repo_hold_locked_index(o.repo, &lock, LOCK_DIE_ON_ERROR);
+	clean = merge_ort_nonrecursive(&o, head, merge, merge_base);
+
+	/*
+	 * If 'clean' >= 0, reverse the value for 'ret' so 'ret' is 0 when the
+	 * merge was clean, and nonzero if the merge was unclean or encountered
+	 * an error.
+	 */
+	ret = clean >= 0 ? !clean : clean;
+
+	if (ret < 0)
+		rollback_lock_file(&lock);
+	else if (write_locked_index(o.repo->index, &lock,
+				      COMMIT_LOCK | SKIP_IF_UNCHANGED))
+		ret = error(_("could not write index"));
 
-	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
-				      &result);
 	if (ret) {
 		rerere(0);
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 85c6a56f1b7..aaf4d880dbc 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1385,7 +1385,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	ensure_not_expanded stash &&
 	ensure_not_expanded stash list &&
 	ensure_not_expanded stash show stash@{0} &&
-	! ensure_not_expanded stash apply stash@{0} &&
+	ensure_not_expanded stash apply stash@{0} &&
 	ensure_not_expanded stash drop stash@{0} &&
 
 	echo >>sparse-index/deep/new &&
@@ -1399,7 +1399,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	oid=$(git -C sparse-index stash create) &&
 	ensure_not_expanded stash store -m "test" $oid &&
 	ensure_not_expanded reset --hard &&
-	! ensure_not_expanded stash pop
+	ensure_not_expanded stash pop
 '
 
 test_expect_success 'sparse index is not expanded: diff' '
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 6/6] unpack-trees: preserve index sparsity
  2022-05-10 23:32   ` [PATCH v3 0/6] " Victoria Dye via GitGitGadget
                       ` (4 preceding siblings ...)
  2022-05-10 23:32     ` [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()' Victoria Dye via GitGitGadget
@ 2022-05-10 23:32     ` Victoria Dye via GitGitGadget
  5 siblings, 0 replies; 42+ messages in thread
From: Victoria Dye via GitGitGadget @ 2022-05-10 23:32 UTC (permalink / raw)
  To: git; +Cc: derrickstolee, newren, gitster, Victoria Dye, Victoria Dye

From: Victoria Dye <vdye@github.com>

When unpacking trees, set the default sparsity of the resultant index based
on repo settings and 'is_sparse_index_allowed()'.

Normally, when executing 'unpack_trees', the output index is marked sparse
when (and only when) it unpacks a sparse directory. However, an index may be
"sparse" even if it contains no sparse directories - when all files fall
inside the sparse-checkout definition or otherwise have SKIP_WORKTREE
disabled. Therefore, the output index may be marked "full" even when it is
"sparse", resulting in unnecessary 'ensure_full_index' calls when writing to
disk. Avoid this by setting the "default" index sparsity to match what is
expected for the repository.

As a consequence of this fix, the (non-merge) 'read-tree' performed when
applying a stash with untracked files no longer expands the index. Update
the corresponding test in 't1092'.

Signed-off-by: Victoria Dye <vdye@github.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 2 +-
 unpack-trees.c                           | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index aaf4d880dbc..19221c14225 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1392,7 +1392,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
 	ensure_not_expanded stash -u &&
 	(
 		WITHOUT_UNTRACKED_TXT=1 &&
-		! ensure_not_expanded stash pop
+		ensure_not_expanded stash pop
 	) &&
 
 	ensure_not_expanded stash create &&
diff --git a/unpack-trees.c b/unpack-trees.c
index 7f528d35cc2..a1d0ff3a4d3 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -11,6 +11,7 @@
 #include "refs.h"
 #include "attr.h"
 #include "split-index.h"
+#include "sparse-index.h"
 #include "submodule.h"
 #include "submodule-config.h"
 #include "fsmonitor.h"
@@ -1839,6 +1840,11 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->result.fsmonitor_last_update =
 		xstrdup_or_null(o->src_index->fsmonitor_last_update);
 
+	if (!o->src_index->initialized &&
+	    !repo->settings.command_requires_full_index &&
+	    is_sparse_index_allowed(&o->result, 0))
+		o->result.sparse_index = 1;
+
 	/*
 	 * Sparse checkout loop #1: set NEW_SKIP_WORKTREE on existing entries
 	 */
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()'
  2022-05-10 23:32     ` [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()' Victoria Dye via GitGitGadget
@ 2022-05-11  0:26       ` Junio C Hamano
  2022-05-12  1:01       ` Jonathan Tan
  2022-05-12 14:51       ` Elijah Newren
  2 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2022-05-11  0:26 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget; +Cc: git, derrickstolee, newren, Victoria Dye

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/builtin/stash.c b/builtin/stash.c
> index 1bfba532044..3fe549f7d3c 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -492,13 +493,13 @@ static void unstage_changes_unless_new(struct object_id *orig_tree)
>  static int do_apply_stash(const char *prefix, struct stash_info *info,
>  			  int index, int quiet)
>  {
> -	int ret;
> +	int clean, ret;
>  	int has_index = index;
>  	struct merge_options o;
>  	struct object_id c_tree;
>  	struct object_id index_tree;
> -	struct commit *result;
> -	const struct object_id *bases[1];
> +	struct tree *head, *merge, *merge_base;
> +	struct lock_file lock = LOCK_INIT;
>  
>  	read_cache_preload(NULL);
>  	if (refresh_and_write_cache(REFRESH_QUIET, 0, 0))
> @@ -541,6 +542,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
>  
>  	o.branch1 = "Updated upstream";
>  	o.branch2 = "Stashed changes";
> +	o.ancestor = "Stash base";
>  
>  	if (oideq(&info->b_tree, &c_tree))
>  		o.branch1 = "Version stash was based on";
> @@ -551,10 +553,26 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
>  	if (o.verbosity >= 3)
>  		printf_ln(_("Merging %s with %s"), o.branch1, o.branch2);
>  
> -	bases[0] = &info->b_tree;
> +	head = lookup_tree(o.repo, &c_tree);
> +	merge = lookup_tree(o.repo, &info->w_tree);
> +	merge_base = lookup_tree(o.repo, &info->b_tree);
> +
> +	repo_hold_locked_index(o.repo, &lock, LOCK_DIE_ON_ERROR);
> +	clean = merge_ort_nonrecursive(&o, head, merge, merge_base);
> +
> +	/*
> +	 * If 'clean' >= 0, reverse the value for 'ret' so 'ret' is 0 when the
> +	 * merge was clean, and nonzero if the merge was unclean or encountered
> +	 * an error.
> +	 */
> +	ret = clean >= 0 ? !clean : clean;
> +
> +	if (ret < 0)
> +		rollback_lock_file(&lock);
> +	else if (write_locked_index(o.repo->index, &lock,
> +				      COMMIT_LOCK | SKIP_IF_UNCHANGED))
> +		ret = error(_("could not write index"));
>  
> -	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
> -				      &result);

Nice.  We need a bit more boilerplate code, but all we need is a three-way
merge without any recursive magic.

> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 85c6a56f1b7..aaf4d880dbc 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -1385,7 +1385,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
>  	ensure_not_expanded stash &&
>  	ensure_not_expanded stash list &&
>  	ensure_not_expanded stash show stash@{0} &&
> -	! ensure_not_expanded stash apply stash@{0} &&
> +	ensure_not_expanded stash apply stash@{0} &&
>  	ensure_not_expanded stash drop stash@{0} &&
>  
>  	echo >>sparse-index/deep/new &&
> @@ -1399,7 +1399,7 @@ test_expect_success 'sparse-index is not expanded: stash' '
>  	oid=$(git -C sparse-index stash create) &&
>  	ensure_not_expanded stash store -m "test" $oid &&
>  	ensure_not_expanded reset --hard &&
> -	! ensure_not_expanded stash pop
> +	ensure_not_expanded stash pop
>  '
>  
>  test_expect_success 'sparse index is not expanded: diff' '

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()'
  2022-05-10 23:32     ` [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()' Victoria Dye via GitGitGadget
  2022-05-11  0:26       ` Junio C Hamano
@ 2022-05-12  1:01       ` Jonathan Tan
  2022-05-12 14:52         ` Elijah Newren
  2022-05-12 14:51       ` Elijah Newren
  2 siblings, 1 reply; 42+ messages in thread
From: Jonathan Tan @ 2022-05-12  1:01 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget
  Cc: Jonathan Tan, git, derrickstolee, newren, gitster, Victoria Dye

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> @@ -551,10 +553,26 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
>  	if (o.verbosity >= 3)
>  		printf_ln(_("Merging %s with %s"), o.branch1, o.branch2);
>  
> -	bases[0] = &info->b_tree;
> +	head = lookup_tree(o.repo, &c_tree);
> +	merge = lookup_tree(o.repo, &info->w_tree);
> +	merge_base = lookup_tree(o.repo, &info->b_tree);
> +
> +	repo_hold_locked_index(o.repo, &lock, LOCK_DIE_ON_ERROR);

What's the reason for locking the index? I would think that
merge_ort_nonrecursive() does not modify the index (in any case, I
removed the locking code and the tests still pass). And as for changes
in the worktree, I ran "strace" on the "stash apply" in the test and I
didn't see any changes in the worktree from here until the lock is
released.

Other than that, this patch set looks good to me.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()'
  2022-05-10 23:32     ` [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()' Victoria Dye via GitGitGadget
  2022-05-11  0:26       ` Junio C Hamano
  2022-05-12  1:01       ` Jonathan Tan
@ 2022-05-12 14:51       ` Elijah Newren
  2 siblings, 0 replies; 42+ messages in thread
From: Elijah Newren @ 2022-05-12 14:51 UTC (permalink / raw)
  To: Victoria Dye via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Victoria Dye

On Tue, May 10, 2022 at 4:32 PM Victoria Dye via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Victoria Dye <vdye@github.com>
>
> Update 'stash' to use 'merge_ort_nonrecursive()' to apply a stash to the
> current working tree. When 'git stash apply' was converted from its shell
> script implementation to a builtin in 8a0fc8d19d (stash: convert apply to
> builtin, 2019-02-25), 'merge_recursive_generic()' was used to merge a stash
> into the working tree as part of 'git stash (apply|pop)'. However, with the
> single merge base used in 'do_apply_stash()', the commit wrapping done by
> 'merge_recursive_generic()' is not only unnecessary, but misleading (the
> *real* merge base is labeled "constructed merge base"). Therefore, a
> non-recursive merge of the working tree, stashed tree, and stash base tree
> is more appropriate.
>
> There are two options for a non-recursive merge-then-update-worktree
> function: 'merge_trees()' and 'merge_ort_nonrecursive()'. Use
> 'merge_ort_nonrecursive()' to align with the default merge strategy used by
> 'git merge' (6a5fb96672 (Change default merge backend from recursive to ort,
> 2021-08-04)) and, because merge-ort does not operate in-place on the index,
> avoid unnecessary index expansion. Update tests in 't1092' verifying index
> expansion for 'git stash' accordingly.
>
> Signed-off-by: Victoria Dye <vdye@github.com>
> ---
>  builtin/stash.c                          | 30 +++++++++++++++++++-----
>  t/t1092-sparse-checkout-compatibility.sh |  4 ++--
>  2 files changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/builtin/stash.c b/builtin/stash.c
> index 1bfba532044..3fe549f7d3c 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -7,6 +7,7 @@
>  #include "cache-tree.h"
>  #include "unpack-trees.h"
>  #include "merge-recursive.h"
> +#include "merge-ort-wrappers.h"
>  #include "strvec.h"
>  #include "run-command.h"
>  #include "dir.h"
> @@ -492,13 +493,13 @@ static void unstage_changes_unless_new(struct object_id *orig_tree)
>  static int do_apply_stash(const char *prefix, struct stash_info *info,
>                           int index, int quiet)
>  {
> -       int ret;
> +       int clean, ret;
>         int has_index = index;
>         struct merge_options o;
>         struct object_id c_tree;
>         struct object_id index_tree;
> -       struct commit *result;
> -       const struct object_id *bases[1];
> +       struct tree *head, *merge, *merge_base;
> +       struct lock_file lock = LOCK_INIT;
>
>         read_cache_preload(NULL);
>         if (refresh_and_write_cache(REFRESH_QUIET, 0, 0))
> @@ -541,6 +542,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
>
>         o.branch1 = "Updated upstream";
>         o.branch2 = "Stashed changes";
> +       o.ancestor = "Stash base";
>
>         if (oideq(&info->b_tree, &c_tree))
>                 o.branch1 = "Version stash was based on";
> @@ -551,10 +553,26 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
>         if (o.verbosity >= 3)
>                 printf_ln(_("Merging %s with %s"), o.branch1, o.branch2);
>
> -       bases[0] = &info->b_tree;
> +       head = lookup_tree(o.repo, &c_tree);
> +       merge = lookup_tree(o.repo, &info->w_tree);
> +       merge_base = lookup_tree(o.repo, &info->b_tree);
> +
> +       repo_hold_locked_index(o.repo, &lock, LOCK_DIE_ON_ERROR);
> +       clean = merge_ort_nonrecursive(&o, head, merge, merge_base);

A very minor nit: I actually have a minor dislike for the
merge-ort-wrappers, but I included them in case people objected to the
slightly more verbose real API.  I assumed it'd only be used to
convert existing calls to merge_trees() and merge_recursive(); in this
case you were converting a call to merge_recursive_generic(), so I
would have preferred using merge_incore_nonrecursive().  That might
have answered Jonathan's question better too when he saw the explicit
merge_switch_to_result() call.  However, it's a minor point; I'm not
sure it's worth a re-roll.


This series looks good to me:
Reviewed-by: Elijah Newren <newren@gmail.com>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()'
  2022-05-12  1:01       ` Jonathan Tan
@ 2022-05-12 14:52         ` Elijah Newren
  2022-05-12 16:55           ` Jonathan Tan
  0 siblings, 1 reply; 42+ messages in thread
From: Elijah Newren @ 2022-05-12 14:52 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: Victoria Dye via GitGitGadget, Git Mailing List, Derrick Stolee,
	Junio C Hamano, Victoria Dye

On Wed, May 11, 2022 at 6:01 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > @@ -551,10 +553,26 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
> >       if (o.verbosity >= 3)
> >               printf_ln(_("Merging %s with %s"), o.branch1, o.branch2);
> >
> > -     bases[0] = &info->b_tree;
> > +     head = lookup_tree(o.repo, &c_tree);
> > +     merge = lookup_tree(o.repo, &info->w_tree);
> > +     merge_base = lookup_tree(o.repo, &info->b_tree);
> > +
> > +     repo_hold_locked_index(o.repo, &lock, LOCK_DIE_ON_ERROR);
>
> What's the reason for locking the index? I would think that
> merge_ort_nonrecursive() does not modify the index (in any case, I
> removed the locking code and the tests still pass). And as for changes
> in the worktree, I ran "strace" on the "stash apply" in the test and I
> didn't see any changes in the worktree from here until the lock is
> released.

merge_incore_nonrecursive() doesn't modify the index or working tree,
but merge_ort_nonrecursive() certainly modifies both in general (via
its call to merge_switch_to_result()).  I'm surprised you didn't see
working tree changes in a strace; was the stash being applied on top
of some commit that had equivalent changes already in its history, so
that the stash application involved a merge where the working tree was
already up-to-date?

More generally, I think it'd be nice if stash locked the index early
in the beginning of its process, did all the modifications, and then
released the lock.  But that's somewhat orthogonal to these changes,
and it'd require getting rid of the forking-subprocesses design of
git-stash.  Since stash is just a transliteration of a shell script,
it locks and unlocks after each individual subcomponent instead.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()'
  2022-05-12 14:52         ` Elijah Newren
@ 2022-05-12 16:55           ` Jonathan Tan
  0 siblings, 0 replies; 42+ messages in thread
From: Jonathan Tan @ 2022-05-12 16:55 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Jonathan Tan, Victoria Dye via GitGitGadget, Git Mailing List,
	Derrick Stolee, Junio C Hamano, Victoria Dye

Elijah Newren <newren@gmail.com> writes:
> On Wed, May 11, 2022 at 6:01 PM Jonathan Tan <jonathantanmy@google.com> wrote:
> >
> > "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > > @@ -551,10 +553,26 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
> > >       if (o.verbosity >= 3)
> > >               printf_ln(_("Merging %s with %s"), o.branch1, o.branch2);
> > >
> > > -     bases[0] = &info->b_tree;
> > > +     head = lookup_tree(o.repo, &c_tree);
> > > +     merge = lookup_tree(o.repo, &info->w_tree);
> > > +     merge_base = lookup_tree(o.repo, &info->b_tree);
> > > +
> > > +     repo_hold_locked_index(o.repo, &lock, LOCK_DIE_ON_ERROR);
> >
> > What's the reason for locking the index? I would think that
> > merge_ort_nonrecursive() does not modify the index (in any case, I
> > removed the locking code and the tests still pass). And as for changes
> > in the worktree, I ran "strace" on the "stash apply" in the test and I
> > didn't see any changes in the worktree from here until the lock is
> > released.
> 
> merge_incore_nonrecursive() doesn't modify the index or working tree,
> but merge_ort_nonrecursive() certainly modifies both in general (via
> its call to merge_switch_to_result()).  I'm surprised you didn't see
> working tree changes in a strace; was the stash being applied on top
> of some commit that had equivalent changes already in its history, so
> that the stash application involved a merge where the working tree was
> already up-to-date?
> 
> More generally, I think it'd be nice if stash locked the index early
> in the beginning of its process, did all the modifications, and then
> released the lock.  But that's somewhat orthogonal to these changes,
> and it'd require getting rid of the forking-subprocesses design of
> git-stash.  Since stash is just a transliteration of a shell script,
> it locks and unlocks after each individual subcomponent instead.

Thanks for the note. Yeah..I'm not sure what happened.

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2022-05-12 16:56 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
2022-04-25 17:49 ` [PATCH 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
2022-04-25 17:49 ` [PATCH 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
2022-04-25 21:34   ` Junio C Hamano
2022-04-26 12:53   ` Derrick Stolee
2022-04-26 15:26     ` Victoria Dye
2022-04-26 16:21     ` Junio C Hamano
2022-04-25 17:49 ` [PATCH 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
2022-04-25 17:49 ` [PATCH 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
2022-04-25 21:35   ` Junio C Hamano
2022-04-25 17:49 ` [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
2022-04-25 21:38   ` Junio C Hamano
2022-04-26 12:57     ` Derrick Stolee
2022-04-25 17:49 ` [PATCH 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
2022-04-26 13:02   ` Derrick Stolee
2022-04-25 17:49 ` [PATCH 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
2022-04-26 12:49 ` [PATCH 0/7] Sparse index: integrate with 'git stash' Derrick Stolee
2022-04-26 13:09   ` Derrick Stolee
2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
2022-05-06  7:23     ` Elijah Newren
2022-05-09 19:24       ` Victoria Dye
2022-05-10  7:06         ` Elijah Newren
2022-04-27 18:16   ` [PATCH v2 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
2022-05-06  7:46   ` [PATCH v2 0/7] Sparse index: integrate with 'git stash' Elijah Newren
2022-05-10 23:32   ` [PATCH v3 0/6] " Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 1/6] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 2/6] stash: integrate with sparse index Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 3/6] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 4/6] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()' Victoria Dye via GitGitGadget
2022-05-11  0:26       ` Junio C Hamano
2022-05-12  1:01       ` Jonathan Tan
2022-05-12 14:52         ` Elijah Newren
2022-05-12 16:55           ` Jonathan Tan
2022-05-12 14:51       ` Elijah Newren
2022-05-10 23:32     ` [PATCH v3 6/6] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).