git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Matheus Tavares Bernardino <matheus.bernardino@usp.br>,
	Derrick Stolee <stolee@gmail.com>,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v3 8/8] sparse-checkout: clear tracked sparse dirs
Date: Fri, 20 Aug 2021 08:56:26 -0700	[thread overview]
Message-ID: <CABPp-BFUoPQUQDRaRj3Rq2gShOD8WPv6Oh-Wdj4m-7sf=okY6Q@mail.gmail.com> (raw)
In-Reply-To: <nycvar.QRO.7.76.6.2108191015260.55@tvgsbejvaqbjf.bet>

On Thu, Aug 19, 2021 at 1:48 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Stolee,
>
> On Tue, 17 Aug 2021, Derrick Stolee via GitGitGadget wrote:
>
> > From: Derrick Stolee <dstolee@microsoft.com>
> >
> > When changing the scope of a sparse-checkout using cone mode, we might
> > have some tracked directories go out of scope. The current logic removes
> > the tracked files from within those directories, but leaves the ignored
> > files within those directories. This is a bit unexpected to users who
> > have given input to Git saying they don't need those directories
> > anymore.
> >
> > This is something that is new to the cone mode pattern type: the user
> > has explicitly said "I want these directories and _not_ those
> > directories." The typical sparse-checkout patterns more generally apply
> > to "I want files with with these patterns" so it is natural to leave
> > ignored files as they are. This focus on directories in cone mode
> > provides us an opportunity to change the behavior.
> >
> > Leaving these ignored files in the sparse directories makes it
> > impossible to gain performance benefits in the sparse index. When we
> > track into these directories, we need to know if the files are ignored
> > or not, which might depend on the _tracked_ .gitignore file(s) within
> > the sparse directory. This depends on the indexed version of the file,
> > so the sparse directory must be expanded.
> >
> > By deleting the sparse directories when changing scope (or running 'git
> > sparse-checkout reapply') we regain these performance benefits as if the
> > repository was in a clean state.
> >
> > Since these ignored files are frequently build output or helper files
> > from IDEs, the users should not need the files now that the tracked
> > files are removed. If the tracked files reappear, then they will have
> > newer timestamps than the build artifacts, so the artifacts will need to
> > be regenerated anyway.
> >
> > Use the sparse-index as a data structure in order to find the sparse
> > directories that can be safely deleted. Re-expand the index to a full
> > one if it was full before.
>
> This description makes sense, and is easy to explain.
>
> It does not cover the case where untracked files are found and the
> directory is _not_ removed as a consequence, though. I would like to ask
> to add this to the commit message, because it is kind of important.
>
> The implementation of this behavior looks fine to me.
>
> About this behavior itself: in my experience, the more tricky a feature is
> to explain, the more likely the design should have been adjusted in the
> first place. And I find myself struggling a little bit to explain what
> files `git switch` touches in cone mode in addition to tracked files.

I share this concern.

> So I wonder whether an easier-to-explain behavior would be the following:
> ignored files in directories that fell out of the sparse-checkout cone are
> deleted. (Even if there are untracked, unignored files in the same
> directory tree.)
>
> This is different than what this patch implements: we would now have to
> delete the ignored and out-of-cone files _also_ when there are untracked
> files in the same directory, i.e. we could no longer use the sweet
> `remove_dir_recursively()` call. Therefore, the implementation of what I
> suggested would be much more complicated: you would have to enumerate the
> ignored files and remove them individually.

The ignored files are already enumerated by the fill_directory call;
you just need to iterate over the dir->ignored_nr entries in
dir->ignored.

> Having said that, even after mulling over this behavior and sleeping over
> it, I am unsure what the best way forward would be. Just because it is
> easy to explain does not make it right.
>
> It is tricky to decide mostly because "ignored" files are definitely not
> always build output. Apart from VIM's temporary files, users like me
> frequently write other files and/or directories that we simply do not want
> to see tracked in Git. For example, I often test things in an `a1.c` file
> that -- for convenience -- lives in the current worktree. Obviously I
> don't want Git to track it, but I also don't want it to be deleted, so I
> often add corresponding lines to `.git/info/exclude`. Likewise, I
> sometimes download additional information related to what I am
> implementing, and that also lives in the current worktree (but then, I
> usually am too lazy to add an entry to `.git/info/exclude` in those
> cases).

I do the same thing, and I know other users that do as well...but I
don't put such files in directories that are irrelevant to me.  I
create cruft files near other files that I'm working on, or in a
special directory of its own, but not under some directory that is
irrelevant to the areas I'm working on.

For reference, we implemented something like this in our `sparsify`
wrapper we have internally, where 'git clean -fdX <all sparse
directories>` is executed whenever folks sparsify.  (We have our own
tool and don't have users use sparse-checkout directly, because our
tool computes dependencies to determine which directories are needed.)
 I was really hesitant to add that cleaning behavior by default, and
just made it an option.  My colleagues tired of all the bug reports
about left-around directories and made it the default, waiting to hear
complaints.  We never got one.  It's been over a year.

> Now, I don't want to over-index on my own habits. There are so many users
> out there, and most of them have different preferences from mine.
>
> Which leaves me to wonder whether we need at least a flag to turn this
> behavior on and off? Something like
> `core.ignoredFilesInSparseConesArePrecious = true` (obviously with a
> better, shorter name).

That seems fine, but I suspect you're projecting the same concerns I
had, and it turns out much less useful than you'd expect (perhaps even
totally useless).

  parent reply	other threads:[~2021-08-20 15:56 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-29 17:27 [PATCH 0/2] Sparse index: delete ignored files outside sparse cone Derrick Stolee via GitGitGadget
2021-07-29 17:27 ` [PATCH 1/2] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-07-29 17:27 ` [PATCH 2/2] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-07-30 13:52   ` Elijah Newren
2021-08-02 14:34     ` Derrick Stolee
2021-08-02 16:17       ` Elijah Newren
2021-08-05  1:55         ` Derrick Stolee
2021-08-05  3:54           ` Elijah Newren
2021-07-30 13:11 ` [PATCH 0/2] Sparse index: delete ignored files outside sparse cone Elijah Newren
2021-08-10 19:50 ` [PATCH v2 0/8] " Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 1/8] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 2/8] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 3/8] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-08-19 18:24     ` Elijah Newren
2021-08-20 15:04       ` Derrick Stolee
2021-08-10 19:50   ` [PATCH v2 4/8] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 5/8] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-08-12 17:29     ` Derrick Stolee
2021-08-10 19:50   ` [PATCH v2 6/8] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 7/8] sparse-index: add SPARSE_INDEX_IGNORE_CONFIG flag Derrick Stolee via GitGitGadget
2021-08-10 19:50   ` [PATCH v2 8/8] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-08-17 13:23   ` [PATCH v3 0/8] Sparse index: delete ignored files outside sparse cone Derrick Stolee via GitGitGadget
2021-08-17 13:23     ` [PATCH v3 1/8] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-08-19  7:45       ` Johannes Schindelin
2021-08-20 15:09         ` Derrick Stolee
2021-08-20 16:40           ` Eric Sunshine
2021-08-17 13:23     ` [PATCH v3 2/8] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-08-17 13:23     ` [PATCH v3 3/8] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-08-17 13:23     ` [PATCH v3 4/8] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-08-19  8:01       ` Johannes Schindelin
2021-08-20 15:18         ` Derrick Stolee
2021-08-20 19:35           ` René Scharfe
2021-08-20 20:22             ` René Scharfe
2021-08-19 18:29       ` Elijah Newren
2021-08-17 13:23     ` [PATCH v3 5/8] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-08-19  8:07       ` Johannes Schindelin
2021-08-20 15:30         ` Derrick Stolee
2021-08-17 13:23     ` [PATCH v3 6/8] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-08-19  8:11       ` Johannes Schindelin
2021-08-20 15:36         ` Derrick Stolee
2021-08-19 20:53       ` Elijah Newren
2021-08-20 15:39         ` Derrick Stolee
2021-08-20 16:05           ` Elijah Newren
2021-08-17 13:23     ` [PATCH v3 7/8] sparse-index: add SPARSE_INDEX_IGNORE_CONFIG flag Derrick Stolee via GitGitGadget
2021-08-18 18:59       ` Derrick Stolee
2021-08-17 13:23     ` [PATCH v3 8/8] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-08-19  8:48       ` Johannes Schindelin
2021-08-20 15:49         ` Derrick Stolee
2021-08-20 16:15           ` Elijah Newren
2021-08-20 15:56         ` Elijah Newren [this message]
2021-08-23 20:00           ` Johannes Schindelin
2021-08-17 14:09     ` [PATCH v3 0/8] Sparse index: delete ignored files outside sparse cone Elijah Newren
2021-08-24 21:51     ` [PATCH v4 00/10] " Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 01/10] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 02/10] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 03/10] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 04/10] sparse-index: use WRITE_TREE_MISSING_OK Derrick Stolee via GitGitGadget
2021-08-27 21:33         ` Elijah Newren
2021-08-30 13:19           ` Derrick Stolee
2021-08-30 20:08             ` Elijah Newren
2021-08-24 21:51       ` [PATCH v4 05/10] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-08-24 22:21         ` René Scharfe
2021-08-25  1:09           ` Derrick Stolee
2021-08-24 21:51       ` [PATCH v4 06/10] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 07/10] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 08/10] sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 09/10] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-08-24 21:51       ` [PATCH v4 10/10] sparse-checkout: add config to disable deleting dirs Derrick Stolee via GitGitGadget
2021-08-27 20:58         ` Elijah Newren
2021-08-30 13:30           ` Derrick Stolee
2021-08-30 20:11             ` Elijah Newren
2021-08-27 21:56       ` [PATCH v4 00/10] Sparse index: delete ignored files outside sparse cone Elijah Newren
2021-08-27 22:01         ` Elijah Newren
2021-08-30 13:34           ` Derrick Stolee
2021-08-30 20:14             ` Elijah Newren
2021-08-30 13:54         ` Derrick Stolee
2021-08-30 20:23           ` Elijah Newren
2021-09-08  1:42       ` [PATCH v5 0/9] " Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 1/9] t7519: rewrite sparse index test Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 2/9] sparse-index: silently return when not using cone-mode patterns Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 3/9] unpack-trees: fix nested sparse-dir search Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 4/9] sparse-index: silently return when cache tree fails Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 5/9] sparse-index: use WRITE_TREE_MISSING_OK Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 6/9] sparse-checkout: create helper methods Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 7/9] attr: be careful about sparse directories Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 8/9] sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag Derrick Stolee via GitGitGadget
2021-09-08  1:42         ` [PATCH v5 9/9] sparse-checkout: clear tracked sparse dirs Derrick Stolee via GitGitGadget
2021-09-08  5:21         ` [PATCH v5 0/9] Sparse index: delete ignored files outside sparse cone Junio C Hamano
2021-09-08  6:56           ` Junio C Hamano
2021-09-08 11:39             ` Derrick Stolee
2021-09-08 16:11               ` Junio C Hamano
2021-09-08  5:30         ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABPp-BFUoPQUQDRaRj3Rq2gShOD8WPv6Oh-Wdj4m-7sf=okY6Q@mail.gmail.com' \
    --to=newren@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=matheus.bernardino@usp.br \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).