From: Victoria Dye <vdye@github.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Cc: gitster@pobox.com, shaoxuan.yuan02@gmail.com,
Derrick Stolee <derrickstolee@github.com>,
Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 8/8] sparse-checkout: integrate with sparse index
Date: Mon, 16 May 2022 13:38:28 -0700
Message-ID: <9191a98c-087a-39b9-cff3-eb7eccd198ea@github.com>
In-Reply-To: <b8a349c6deeb4b970075629d0c292b2ae9f7d0d3.1652724693.git.gitgitgadget@gmail.com>

Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> When modifying the sparse-checkout definition, the sparse-checkout
> builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all
> cache entries in the index. Before, we needed the index to be fully
> expanded in order to ensure we had the full list of files necessary that
> match the new patterns.
>
> Insert a call to expand_to_pattern_list() that expands sparse
> directories that are within the new pattern list, but only far enough
> that every necessary file path now exists as a cache entry. The
> remaining logic within update_sparsity() will modify the SKIP_WORKTREE
> bits appropriately.
>
> This allows us to disable command_requires_full_index within the
> sparse-checkout builtin. Add tests that demonstrate that we are not
> expanding to a full index unnecessarily.
>
> We can see the improved performance in the p2000 test script:
>
> Test HEAD~1 HEAD
> ------------------------------------------------------------------------
> 2000.24: git ... (sparse-v3) 2.14(1.55+0.58) 1.57(1.03+0.53) -26.6%
> 2000.25: git ... (sparse-v4) 2.20(1.62+0.57) 1.58(0.98+0.59) -28.2%
>
> These reductions of 26-28% are small compared to most examples, but the
> time is dominated by writing a new copy of the base repository to the
> worktree and then deleting it again. The fact that the previous index
> expansion was such a large portion of the time is telling how important
> it is to complete this sparse index integration.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
> builtin/sparse-checkout.c | 3 +++
> t/t1092-sparse-checkout-compatibility.sh | 25 ++++++++++++++++++++++++
> unpack-trees.c | 4 ++++
> 3 files changed, 32 insertions(+)
>
> diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
> index cbff6ad00b0..0157b292b36 100644
> --- a/builtin/sparse-checkout.c
> +++ b/builtin/sparse-checkout.c
> @@ -937,6 +937,9 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix)
>
> git_config(git_default_config, NULL);
>
> + prepare_repo_settings(the_repository);
> + the_repository->settings.command_requires_full_index = 0;
> +
> if (argc > 0) {
> if (!strcmp(argv[0], "list"))
> return sparse_checkout_list(argc, argv);
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 93bcfd20bbc..614357fc48c 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -1552,6 +1552,31 @@ test_expect_success 'ls-files' '
> ensure_not_expanded ls-files --sparse
> '
>
> +test_expect_success 'sparse index is not expanded: sparse-checkout' '
> + init_repos &&
> +
> + ensure_not_expanded sparse-checkout set deep/deeper2 &&
> + ensure_not_expanded sparse-checkout set deep/deeper1 &&
> + ensure_not_expanded sparse-checkout set deep &&
> + ensure_not_expanded sparse-checkout add folder1 &&
> + ensure_not_expanded sparse-checkout set deep/deeper1 &&
> + ensure_not_expanded sparse-checkout set folder2 &&
> +
> + # Demonstrate that the checks that "folder1/a" is a file
> + # do not cause a sparse-index expansion (since it is in the
> + # sparse-checkout cone).
> + echo >>sparse-index/folder2/a &&
> + git -C sparse-index add folder2/a &&
> +
> + ensure_not_expanded sparse-checkout add folder1 &&
> +
> + # Skip checks here, since deep/deeper1 is inside a sparse directory
> + # that must be expanded to check whether `deep/deeper1` is a file
> + # or not.
> + ensure_not_expanded sparse-checkout set --skip-checks deep/deeper1 &&
> + ensure_not_expanded sparse-checkout set
> +'
> +

These tests look good for ensuring sparsity is preserved, but it'd be nice
to also have some "stress tests" of 'sparse-checkout (add|set)'. The purpose
would be to make sure the index has the right contents for various types of
pattern changes, e.g. running 'sparse-checkout (add|set) <path>', then
verifying the index contents with 'ls-files --sparse'. Paths to try might
include those that:

- are inside vs. outside of the (current) cone
- match an existing vs. a nonexistent directory

and so on.

> # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
> # in this scenario, but it shouldn't.
> test_expect_success 'reset mixed and checkout orphan' '
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 7f528d35cc2..9745e0dfc34 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -18,6 +18,7 @@
> #include "promisor-remote.h"
> #include "entry.h"
> #include "parallel-checkout.h"
> +#include "sparse-index.h"
>
> /*
> * Error messages expected by scripts out of plumbing commands such as
> @@ -2018,6 +2019,9 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
> goto skip_sparse_checkout;
> }
>
> + /* Expand sparse directories as needed */
> + expand_to_pattern_list(o->src_index, o->pl);
> +
> /* Set NEW_SKIP_WORKTREE on existing entries. */
> mark_all_ce_unused(o->src_index);
> mark_new_skip_worktree(o->pl, o->src_index, 0,