From: Martin von Zweigbergk <martinvonz@google.com>
To: Elijah Newren <newren@gmail.com>
Cc: Victoria Dye <vdye@github.com>, Glen Choo <chooglen@google.com>,
Git Mailing List <git@vger.kernel.org>,
Derrick Stolee <derrickstolee@github.com>
Subject: Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
Date: Wed, 5 Oct 2022 13:00:00 -0700 [thread overview]
Message-ID: <CAESOdVDt7SU=OJhF0mgyZ=B3sncB49aML8oOzKTKAnmGO5BaVQ@mail.gmail.com> (raw)
In-Reply-To: <CABPp-BH5_=Tq9DM6iAfG3+DuzEE7dR-H8rhP34x-A5FQhLO+bg@mail.gmail.com>
On Wed, Oct 5, 2022 at 12:51 AM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Oct 4, 2022 at 9:53 PM Martin von Zweigbergk
> <martinvonz@google.com> wrote:
> >
> > On Tue, Oct 4, 2022 at 1:35 PM Victoria Dye <vdye@github.com> wrote:
> > >
> > > Martin von Zweigbergk wrote:
> > > >> On Tue, Oct 4, 2022 at 9:34 AM Victoria Dye <vdye@github.com <mailto:vdye@github.com>> wrote:
> > > >>
> [...]
> > > >> Thanks for reporting this! There are a few confusing things going on with
> > > >> 'restore' here.
> > > >>
> > > >> First is that the out-of-cone was even restored in the first place.
> > > >>
> > > >
> > > > I was actually happy that the out-of-cone paths were restored. I ran that
> > > > command as an experiment while reading Elijah's doc because I was curious
> > > > what would happen. The reason I think it should restore out-of-cone paths is
> > > > so you can do `git restore --staged --source <some commit> && git commit -m
> > > > "restore to old commit"` without caring about the sparse spec.
> > >
> > > Conversely, that's behavior a user *wouldn't* want if they want to keep
> > > their sparse cone intact (not to mention the performance impact of checking
> > > out the entire worktree). I think it does more harm to those users than it
> > > would benefit the ones that want to checkout out-of-cone files.
> > >
> > > The use-case you're describing should be served by the
> > > '--ignore-skip-worktree-bits' option (not the most intuitive name,
> > > unfortunately). Luckily, there's an increasing desire to improve the naming
> > > of sparse-related options, so the UX situation should improve in the future.
> >
> > I realized after sending my previous email that I might have a
> > different view of what
> > sparse checkout is about. To me, it seems like it should be just a performance
> > optimization. That's why I feel like commands should behave the same way with
> > or without a sparse spec (unless that proposed `--restrict` flag is passed). I
> > understand if that's just not feasible. Sorry about the noise in that case :)
>
> The problem I see with that definition is that I'm not even sure what
> that means. Behaving the same way with or without a sparse
> specification at the extreme means that switching branches should
> populate all the files in that branch...meaning a dense checkout.
> That kind of defeats the point. So, I'm sure you don't mean that they
> behave the same in all cases...but where do you draw the line?
I agree with you and Stolee that there are two different cases: some
people use sparse checkouts to restrict what they see (behavior A), and
some people use it just as a performance optimization (behavior B). So I
suspect we roughly agree about what should happen if you pass
`--restrict` (or if that becomes the default so you don't actually need to
pass it). My arguments were about the `--no-restrict` case. Sorry, I
should have made that clear.
I also agree that having a way to make commands restrict to certain paths
by default is useful, and I agree that tying that set of paths to the current
worktree's sparse spec makes sense.
I'll answer the questions below for the `--no-restrict` case
(behavior B).
>
> Is it just switch/checkout & `reset --hard` that avoid reading and
> writing outside the sparse specification?
>
> Should diff & status ignore files outside the sparse specification
> even if users wrote to such files? A "performance optimization" might
> suggest we should, but would users get confused?
I think they should be included (again, in the `--no-restrict` case).
>
> What about merge/rebase/cherry-pick/revert? Should those write
> additional files to the working tree or avoid it? What about if there
> are conflicts outside the sparse specification?
I think they should avoid it, but since the user will need to resolve
that conflict anyway, I can see it makes sense to write them to disk
if there are conflicts.
>
> And if extra files get written outside the sparse specification, are
> there ways for this to get "cleaned up" where after resolving
> conflicts or changes we can again remove the file from the working
> tree?
I've never really used `git sparse-checkout` (until I read your doc),
but isn't that what `git sparse-checkout reapply` is for?
>
> What about `git grep PATTERN`? That's documented to search the
> tracked files in the working tree. But should that include the files
> that would have been there were it not for the "performance
> optimization" of not checking them out? (Similarly, what about `git
> grep --cached PATTERN` or `git grep PATTERN REVISION`?) I mean, if
> these commands should behave the same regardless of sparse
> specification, then you should search those other files, right? But
> isn't that a nasty performance penalty if the user has a
> sparse/partial clone since Git will have to do many network operations
> to get additional blobs in order to search them? Is that really
> wanted?
I think it's consistent to search them with `--no-restrict` (but not
with `--restrict`, of course).
>
> What about `git rm '*.png'` to remove all the tracked png files from
> my working tree. Should that also remove all the files that would
> have been there were it not for the "performance optimization"? Will
> that result in very negative surprises for those with a "I want to
> concentrate on just this subset of files" mental model?
Same here.
>
> What about `git worktree add`? Should the sparse specification be the
> same for all worktrees? Be per-worktree? Should it default to dense?
> To just top-level sparse? To the same sparsity as the worktree you
> were in when you created a new one?
That's an interesting case. If someone does `git worktree add` and expects
all files to be available in the working copy, they might be surprised, yes.
I think that's a much smaller risk than
`git restore --source HEAD^ --staged && git commit -m 'undo changes'` being
partial, however.
next prev parent reply other threads:[~2022-10-05 20:00 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-03 22:05 Bug report: `git restore --source --staged` deals poorly with sparse-checkout Glen Choo
2022-10-04 16:34 ` Victoria Dye
[not found] ` <CAESOdVAh68HoQoyicfZn4XbjGfiRFCu1zFQmUjMcSAg3tUzr4Q@mail.gmail.com>
2022-10-04 20:34 ` Victoria Dye
2022-10-05 4:53 ` Martin von Zweigbergk
2022-10-05 7:51 ` Elijah Newren
2022-10-05 20:00 ` Martin von Zweigbergk [this message]
2022-10-06 4:20 ` Elijah Newren
2022-10-05 16:11 ` Victoria Dye
2022-10-05 5:22 ` Elijah Newren
2022-10-06 19:30 ` Junio C Hamano
2022-10-06 19:38 ` Junio C Hamano
2022-10-05 5:00 ` Elijah Newren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAESOdVDt7SU=OJhF0mgyZ=B3sncB49aML8oOzKTKAnmGO5BaVQ@mail.gmail.com' \
--to=martinvonz@google.com \
--cc=chooglen@google.com \
--cc=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=newren@gmail.com \
--cc=vdye@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).