From: Derrick Stolee <stolee@gmail.com>
To: Elijah Newren <newren@gmail.com>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>,
Matheus Tavares Bernardino <matheus.bernardino@usp.br>,
Derrick Stolee <derrickstolee@github.com>,
Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 2/2] sparse-checkout: clear tracked sparse dirs
Date: Mon, 2 Aug 2021 10:34:28 -0400 [thread overview]
Message-ID: <76639e16-204d-7812-d4c5-56c70e280bed@gmail.com> (raw)
In-Reply-To: <CABPp-BGbRbyCYYS+NcYrC-T4hJf7BCoLE2HsXFM4K51A0wSgcg@mail.gmail.com>
On 7/30/2021 9:52 AM, Elijah Newren wrote:
> On Thu, Jul 29, 2021 at 11:27 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
...
>> Leaving these ignored files in the sparse directories makes it
>> impossible to gain performance benefits in the sparse index. When we
>> track into these directories, we need to know if the files are ignored
>> or not, which might depend on the _tracked_ .gitignore file(s) within
>> the sparse directory. This depends on the indexed version of the file,
>> so the sparse directory must be expanded.
>
> Is this the issue I highlighted at
> https://lore.kernel.org/git/CABPp-BHsiLTtv6AuycRrQ5K6q0=ZTJe0kd7uTUL+UT=nxj66zA@mail.gmail.com/,
> the paragraph "...I thought the point of add_patterns()..." or is
> there more or other things involved here?
This is exactly that point. I'm not sure why I didn't pick up on your
earlier concerns as something to do _immediately_, but for some reason
I thought I could delay it until later.
>> If users depend on ignored files within the sparse directories, then
>> they have created a bad shape in their repository. Regardless, such
>> shapes would create risk that changing the behavior for all cone mode
>> users might be too risky to take on at the moment. Since this data shape
>> makes it impossible to get performance benefits using the sparse index,
>> we limit the change to only be enabled when the sparse index is enabled.
>> Users can opt out of this behavior by disabline the sparse index.
>
> s/disabline/disabling/, otherwise, I fully agree with this paragraph.
Will fix. Thanks.
>> Depending on user feedback or real-world use, we might want to consider
>> expanding the behavior change to all of cone mode. Since we are
>> currently restricting to the sparse index case, we can use the existence
>> of sparse directory entries in the index as indicators of which
>> directories should be removed.
>
> Let's just expand it to all of cone mode.
I'm open to that. I'll leave this version open for feedback a bit longer
before doing so.
>> + /*
>> + * NEEDSWORK: For now, only use this behavior when index.sparse
>> + * is enabled. We may want this behavior enabled whenever using
>> + * cone mode patterns.
>> + */
>> + prepare_repo_settings(r);
>> + if (!r->worktree || !r->settings.sparse_index)
>> + return;
>
> s/may/do/ :-)
Maybe!
>> +
>> + /*
>> + * Since we now depend on the sparse index to enable this
>> + * behavior, use it to our advantage. This process is more
>> + * complicated without it.
>> + */
>
> Ah, the real reason why you limited this change to sparse-index comes out. :-)
>
>> + convert_to_sparse(r->index);
>> +
>> + strbuf_addstr(&path, r->worktree);
>> + strbuf_complete(&path, '/');
>> + pathlen = path.len;
>> +
>> + for (i = 0; i < r->index->cache_nr; i++) {
>> + struct cache_entry *ce = r->index->cache[i];
>> +
>> + /*
>> + * Is this a sparse directory? If so, then definitely
>> + * include it. All contained content is outside of the
>> + * patterns.
>
> "include"? I would have used the word "remove", but it gets confusing
> because the question is if we want to include it in our list of things
> to remove or remove it from the working directory.
This comment is a bit stale, sorry. Earlier I was populating a list of
the directories, but in this version I'm just deleting them immediately.
>> + */
>> + if (S_ISSPARSEDIR(ce->ce_mode) &&
>> + repo_file_exists(r, ce->name)) {
>> + strbuf_setlen(&path, pathlen);
>> + strbuf_addstr(&path, ce->name);
>> +
>> + /*
>> + * Removal is "best effort". If something blocks
>> + * the deletion, then continue with a warning.
>> + */
>> + if (remove_dir_recursively(&path, 0))
>> + warning(_("failed to remove directory '%s'"), path.buf);
>
> Um, doesn't this delete untracked files that are not ignored as well
> as the ignored files? If so, was that intentional? I'm fully on
> board with removing the gitignore'd files, but I'm worried removing
> other untracked files is dangerous.
I believe that 'git sparse-checkout (set|add|reapply)' will fail before
reaching this method if there are untracked files that could potentially
be removed. I will double-check to ensure this is the case. It is
definitely my intention to protect any untracked, non-ignored files in
these directories by failing the sparse-checkout modification.
> My implementation of this concept (in an external tool) was more along
> the lines of
>
> * Get $LIST_OF_NON_SPARSE_DIRECTORIES by walking `git ls-files -t`
> output and finding common fully-sparse directories
> * git clean -fX $LIST_OF_NON_SPARSE_DIRECTORIES
I initially was running 'git clean -dfx -- <dir> ...' but that also
requires parsing and expanding the index (or being very careful with
the sparse index).
>
>> + }
>> + }
>> +
>> + strbuf_release(&path);
>> +
>> + /*
>> + * This is temporary: the sparse-checkout builtin is not
>> + * integrated with the sparse-index yet, so we need to keep
>> + * it full during the process.
>> + */
>> + ensure_full_index(r->index);
>> +}
>> +
>> static int update_working_directory(struct pattern_list *pl)
>> {
>> enum update_sparsity_result result;
>> @@ -141,6 +207,8 @@ static int update_working_directory(struct pattern_list *pl)
>> else
>> rollback_lock_file(&lock_file);
>>
>> + clean_tracked_sparse_directories(r);
>> +
>> r->index->sparse_checkout_patterns = NULL;
>> return result;
>> }
>
> (Adding a comment here just to separate the two blocks below.)
>
>> @@ -540,8 +608,11 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m)
>> {
>> int result;
>> int changed_config = 0;
>> + struct pattern_list *old_pl = xcalloc(1, sizeof(*old_pl));
>> struct pattern_list *pl = xcalloc(1, sizeof(*pl));
>>
>> + get_sparse_checkout_patterns(old_pl);
>> +
>> switch (m) {
>> case ADD:
>> if (core_sparse_checkout_cone)
>> @@ -567,7 +638,9 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m)
>> set_config(MODE_NO_PATTERNS);
>>
>> clear_pattern_list(pl);
>> + clear_pattern_list(old_pl);
>> free(pl);
>> + free(old_pl);
>> return result;
>> }
>
> You create an old_pl, populate it with get_sparse_checkout_patterns(),
> do nothing with it, then clear and free it? I'm confused by the above
> two blocks.
Sorry that this is also stale. I should read my own patches more carefully.
I was originally going to focus only on the directories that were "leaving"
the sparse-checkout definition, but that did not catch the 'reapply' case
or cases where the user created the directories by other means.
Will delete these bits.
>> + git -C repo sparse-checkout reapply &&
>> + test_path_is_missing repo/folder1 &&
>> + test_path_is_missing repo/deep/deeper2 &&
>> + test_path_is_dir repo/obj &&
>> + test_path_is_file repo/file.o &&
>> +
>> + git -C repo status --porcelain=v2 >out &&
>> + test_must_be_empty out &&
>> +
>> + git -C repo sparse-checkout set deep/deeper2 &&
>> + test_path_is_missing repo/deep/deeper1 &&
>> + test_path_is_dir repo/deep/deeper2 &&
>
> What's the expectation for repo/obj?
I will add
test_path_is_dir repo/obj
because this ignored directory should not be removed. This is
simulating the build artifacts that might appear in ignored
directories whose parents are in the sparse-checkout definition.
Thanks,
-Stolee
next prev parent reply other threads:[~2021-08-02 14:34 UTC|newest]
Thread overview: 92+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-29 17:27 [PATCH 0/2] Sparse index: delete ignored files outside sparse cone Derrick Stolee via GitGitGadget
2021-07-29 17:27 ` [PATCH 1/2] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-07-29 17:27 ` [PATCH 2/2] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-07-30 13:52 ` Elijah Newren
2021-08-02 14:34 ` Derrick Stolee [this message]
2021-08-02 16:17 ` Elijah Newren
2021-08-05 1:55 ` Derrick Stolee
2021-08-05 3:54 ` Elijah Newren
2021-07-30 13:11 ` [PATCH 0/2] Sparse index: delete ignored files outside sparse cone Elijah Newren
2021-08-10 19:50 ` [PATCH v2 0/8] " Derrick Stolee via GitGitGadget
2021-08-10 19:50 ` [PATCH v2 1/8] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-08-10 19:50 ` [PATCH v2 2/8] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-08-10 19:50 ` [PATCH v2 3/8] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-08-19 18:24 ` Elijah Newren
2021-08-20 15:04 ` Derrick Stolee
2021-08-10 19:50 ` [PATCH v2 4/8] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-08-10 19:50 ` [PATCH v2 5/8] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-08-12 17:29 ` Derrick Stolee
2021-08-10 19:50 ` [PATCH v2 6/8] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-08-10 19:50 ` [PATCH v2 7/8] sparse-index: add SPARSE_INDEX_IGNORE_CONFIG flag Derrick Stolee via GitGitGadget
2021-08-10 19:50 ` [PATCH v2 8/8] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-08-17 13:23 ` [PATCH v3 0/8] Sparse index: delete ignored files outside sparse cone Derrick Stolee via GitGitGadget
2021-08-17 13:23 ` [PATCH v3 1/8] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-08-19 7:45 ` Johannes Schindelin
2021-08-20 15:09 ` Derrick Stolee
2021-08-20 16:40 ` Eric Sunshine
2021-08-17 13:23 ` [PATCH v3 2/8] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-08-17 13:23 ` [PATCH v3 3/8] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-08-17 13:23 ` [PATCH v3 4/8] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-08-19 8:01 ` Johannes Schindelin
2021-08-20 15:18 ` Derrick Stolee
2021-08-20 19:35 ` René Scharfe
2021-08-20 20:22 ` René Scharfe
2021-08-19 18:29 ` Elijah Newren
2021-08-17 13:23 ` [PATCH v3 5/8] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-08-19 8:07 ` Johannes Schindelin
2021-08-20 15:30 ` Derrick Stolee
2021-08-17 13:23 ` [PATCH v3 6/8] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-08-19 8:11 ` Johannes Schindelin
2021-08-20 15:36 ` Derrick Stolee
2021-08-19 20:53 ` Elijah Newren
2021-08-20 15:39 ` Derrick Stolee
2021-08-20 16:05 ` Elijah Newren
2021-08-17 13:23 ` [PATCH v3 7/8] sparse-index: add SPARSE_INDEX_IGNORE_CONFIG flag Derrick Stolee via GitGitGadget
2021-08-18 18:59 ` Derrick Stolee
2021-08-17 13:23 ` [PATCH v3 8/8] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-08-19 8:48 ` Johannes Schindelin
2021-08-20 15:49 ` Derrick Stolee
2021-08-20 16:15 ` Elijah Newren
2021-08-20 15:56 ` Elijah Newren
2021-08-23 20:00 ` Johannes Schindelin
2021-08-17 14:09 ` [PATCH v3 0/8] Sparse index: delete ignored files outside sparse cone Elijah Newren
2021-08-24 21:51 ` [PATCH v4 00/10] " Derrick Stolee via GitGitGadget
2021-08-24 21:51 ` [PATCH v4 01/10] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-08-24 21:51 ` [PATCH v4 02/10] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-08-24 21:51 ` [PATCH v4 03/10] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-08-24 21:51 ` [PATCH v4 04/10] sparse-index: use WRITE_TREE_MISSING_OK Derrick Stolee via GitGitGadget
2021-08-27 21:33 ` Elijah Newren
2021-08-30 13:19 ` Derrick Stolee
2021-08-30 20:08 ` Elijah Newren
2021-08-24 21:51 ` [PATCH v4 05/10] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-08-24 22:21 ` René Scharfe
2021-08-25 1:09 ` Derrick Stolee
2021-08-24 21:51 ` [PATCH v4 06/10] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-08-24 21:51 ` [PATCH v4 07/10] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-08-24 21:51 ` [PATCH v4 08/10] sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag Derrick Stolee via GitGitGadget
2021-08-24 21:51 ` [PATCH v4 09/10] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-08-24 21:51 ` [PATCH v4 10/10] sparse-checkout: add config to disable deleting dirs Derrick Stolee via GitGitGadget
2021-08-27 20:58 ` Elijah Newren
2021-08-30 13:30 ` Derrick Stolee
2021-08-30 20:11 ` Elijah Newren
2021-08-27 21:56 ` [PATCH v4 00/10] Sparse index: delete ignored files outside sparse cone Elijah Newren
2021-08-27 22:01 ` Elijah Newren
2021-08-30 13:34 ` Derrick Stolee
2021-08-30 20:14 ` Elijah Newren
2021-08-30 13:54 ` Derrick Stolee
2021-08-30 20:23 ` Elijah Newren
2021-09-08 1:42 ` [PATCH v5 0/9] " Derrick Stolee via GitGitGadget
2021-09-08 1:42 ` [PATCH v5 1/9] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-09-08 1:42 ` [PATCH v5 2/9] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-09-08 1:42 ` [PATCH v5 3/9] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-09-08 1:42 ` [PATCH v5 4/9] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-09-08 1:42 ` [PATCH v5 5/9] sparse-index: use WRITE_TREE_MISSING_OK Derrick Stolee via GitGitGadget
2021-09-08 1:42 ` [PATCH v5 6/9] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-09-08 1:42 ` [PATCH v5 7/9] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-09-08 1:42 ` [PATCH v5 8/9] sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag Derrick Stolee via GitGitGadget
2021-09-08 1:42 ` [PATCH v5 9/9] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-09-08 5:21 ` [PATCH v5 0/9] Sparse index: delete ignored files outside sparse cone Junio C Hamano
2021-09-08 6:56 ` Junio C Hamano
2021-09-08 11:39 ` Derrick Stolee
2021-09-08 16:11 ` Junio C Hamano
2021-09-08 5:30 ` Elijah Newren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=76639e16-204d-7812-d4c5-56c70e280bed@gmail.com \
--to=stolee@gmail.com \
--cc=derrickstolee@github.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=matheus.bernardino@usp.br \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).