git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Bug report: `git restore --source --staged` deals poorly with sparse-checkout
@ 2022-10-03 22:05 Glen Choo
  2022-10-04 16:34 ` Victoria Dye
  0 siblings, 1 reply; 12+ messages in thread
From: Glen Choo @ 2022-10-03 22:05 UTC (permalink / raw)
  To: git, vdye, Derrick Stolee; +Cc: martinvonz

Filing a `git bugreport` on behalf of a user at $DAYJOB. I'm also pretty
surprised by this behavior, perhaps someone who knows more could shed
some light?

What did you do before the bug happened? (Steps to reproduce your issue)

  git clone git@github.com:git/git.git . &&
  git sparse-checkout set t &&
  git restore --source v2.38.0-rc1 --staged Documentation &&
  git status

What did you expect to happen? (Expected behavior)

I expected to see staged changes only, since I restored only paths
outside of my sparse spec (which was t/, plus the implicit root
directory).

What happened instead? (Actual behavior)

I saw a staged modification (Documentation/cmd-list.perl) and the same
file reported as deleted in the working copy. Specifically,

  $ git status

  On branch master
  Your branch is up to date with 'origin/master'.

  You are in a sparse checkout with 64% of tracked files present.

  Changes to be committed:
    (use "git restore --staged <file>..." to unstage)
          modified:   Documentation/cmd-list.perl

  Changes not staged for commit:
    (use "git add/rm <file>..." to update what will be committed)
    (use "git restore <file>..." to discard changes in working directory)
          deleted:    Documentation/cmd-list.perl

What's different between what you expected and what actually happened?

git status should not have said that the file was deleted in the
working copy

[System Info]
git version: git version 2.37.3.998.g577e59143f-goog
cpu: x86_64 no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 5.17.11-1rodete2-amd64 #1 SMP PREEMPT Debian
5.17.11-1rodete2 (2022-06-09) x86_64
compiler info: gnuc: 12.2
libc info: glibc: 2.33
$SHELL (typically, interactive shell): /bin/bash

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
  2022-10-03 22:05 Bug report: `git restore --source --staged` deals poorly with sparse-checkout Glen Choo
@ 2022-10-04 16:34 ` Victoria Dye
       [not found]   ` <CAESOdVAh68HoQoyicfZn4XbjGfiRFCu1zFQmUjMcSAg3tUzr4Q@mail.gmail.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Victoria Dye @ 2022-10-04 16:34 UTC (permalink / raw)
  To: Glen Choo, git, Derrick Stolee, Elijah Newren; +Cc: martinvonz

Glen Choo wrote:
> Filing a `git bugreport` on behalf of a user at $DAYJOB. I'm also pretty
> surprised by this behavior, perhaps someone who knows more could shed
> some light?
> 
> What did you do before the bug happened? (Steps to reproduce your issue)
> 
>   git clone git@github.com:git/git.git . &&
>   git sparse-checkout set t &&
>   git restore --source v2.38.0-rc1 --staged Documentation &&
>   git status
> ...> 
> What happened instead? (Actual behavior)
> 
> I saw a staged modification (Documentation/cmd-list.perl) and the same
> file reported as deleted in the working copy. Specifically,
> 
>   $ git status
> 
>   On branch master
>   Your branch is up to date with 'origin/master'.
> 
>   You are in a sparse checkout with 64% of tracked files present.
> 
>   Changes to be committed:
>     (use "git restore --staged <file>..." to unstage)
>           modified:   Documentation/cmd-list.perl
> 
>   Changes not staged for commit:
>     (use "git add/rm <file>..." to update what will be committed)
>     (use "git restore <file>..." to discard changes in working directory)
>           deleted:    Documentation/cmd-list.perl
> 

Thanks for reporting this! There are a few confusing things going on with
'restore' here.

First is that the out-of-cone was even restored in the first place.
Theoretically, 'restore' (like 'checkout') should be limited to pathspecs
inside the sparse-checkout patterns (per the documentation of
'--ignore-skip-worktree-bits'), but 'Documentation' does not match them.
Then, there's a difference between 'restore' and 'checkout' that doesn't
seem intentional; both remove the 'SKIP_WORKTREE' flag from the file, but
only 'checkout' creates the file on-disk (therefore avoiding the "deleted"
status).

Elijah's WIP design doc [1] describes 'restore' as one of:

> commands that restore files to the working tree that match sparsity
> patterns, and remove unmodified files that don't match those patterns

albeit with other (probably related?) bugs. Given that, I think the correct
behavior would be:

1. if '--ignore-skip-worktree-bits' *is not* specified, do not restore any
   files outside of the sparse-checkout patterns.
2. if '--ignore-skip-worktree-bits' *is* specified, remove the
   'SKIP_WORKTREE' bit & check out all entries matching the pathspec to
   disk.

Fixing this should probably wait until the design doc is finalized (I've
meant to follow up on it for a while now, but I should have some time to do
that this week). In the meantime, 'checkout' will at least allow you to (for
better or for worse) avoid the "deleted" status.

Hope that helps!
- Victoria

[1] https://lore.kernel.org/git/pull.1367.v2.git.1664353951797.gitgitgadget@gmail.com/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
       [not found]   ` <CAESOdVAh68HoQoyicfZn4XbjGfiRFCu1zFQmUjMcSAg3tUzr4Q@mail.gmail.com>
@ 2022-10-04 20:34     ` Victoria Dye
  2022-10-05  4:53       ` Martin von Zweigbergk
                         ` (2 more replies)
  2022-10-05  5:00     ` Elijah Newren
  1 sibling, 3 replies; 12+ messages in thread
From: Victoria Dye @ 2022-10-04 20:34 UTC (permalink / raw)
  To: Martin von Zweigbergk; +Cc: Glen Choo, git, Derrick Stolee, Elijah Newren

Hi Martin!

Please make sure you respond in plaintext - it looks like your message didn't
make it to the mailing list. I've reformatted it to render nicely in this 
reply.

Martin von Zweigbergk wrote:
>> On Tue, Oct 4, 2022 at 9:34 AM Victoria Dye <vdye@github.com <mailto:vdye@github.com>> wrote:
>> 
>> Glen Choo wrote:
>>> Filing a `git bugreport` on behalf of a user at $DAYJOB. I'm also pretty
>>> surprised by this behavior, perhaps someone who knows more could shed
>>> some light?
>>>
>>> What did you do before the bug happened? (Steps to reproduce your issue)
>>>
>>>   git clone git@github.com:git/git.git . &&
>>>   git sparse-checkout set t &&
>>>   git restore --source v2.38.0-rc1 --staged Documentation &&
>>>   git status
>>> ...>
>>> What happened instead? (Actual behavior)
>>>
>>> I saw a staged modification (Documentation/cmd-list.perl) and the same
>>> file reported as deleted in the working copy. Specifically,
>>>
>>>   $ git status
>>>
>>>   On branch master
>>>   Your branch is up to date with 'origin/master'.
>>>
>>>   You are in a sparse checkout with 64% of tracked files present.
>>>
>>>   Changes to be committed:
>>>     (use "git restore --staged <file>..." to unstage)
>>>           modified:   Documentation/cmd-list.perl
>>>
>>>   Changes not staged for commit:
>>>     (use "git add/rm <file>..." to update what will be committed)
>>>     (use "git restore <file>..." to discard changes in working directory)
>>>           deleted:    Documentation/cmd-list.perl
>>>
>> 
>> Thanks for reporting this! There are a few confusing things going on with
>> 'restore' here.
>> 
>> First is that the out-of-cone was even restored in the first place.
>> 
> 
> I was actually happy that the out-of-cone paths were restored. I ran that
> command as an experiment while reading Elijah's doc because I was curious
> what would happen. The reason I think it should restore out-of-cone paths is
> so you can do `git restore --staged --source <some commit> && git commit -m
> "restore to old commit"` without caring about the sparse spec.

Conversely, that's behavior a user *wouldn't* want if they want to keep
their sparse cone intact (not to mention the performance impact of checking
out the entire worktree). I think it does more harm to those users than it
would benefit the ones that want to checkout out-of-cone files.

The use-case you're describing should be served by the
'--ignore-skip-worktree-bits' option (not the most intuitive name,
unfortunately). Luckily, there's an increasing desire to improve the naming
of sparse-related options, so the UX situation should improve in the future.

> 
>> Theoretically, 'restore' (like 'checkout') should be limited to pathspecs
>> inside the sparse-checkout patterns (per the documentation of
>> '--ignore-skip-worktree-bits'), but 'Documentation' does not match them.
>> Then, there's a difference between 'restore' and 'checkout' that doesn't
>> seem intentional; both remove the 'SKIP_WORKTREE' flag from the file, but
>> only 'checkout' creates the file on-disk (therefore avoiding the "deleted"
>> status).
> 
> Restoring only into the index (as I think `git restore --staged` is supposed
> to do) is weird. 

'git restore --staged' is intended to restore to both the worktree and index
(per 183fb44fd2 (restore: add --worktree and --staged, 2019-04-25)). The bug
you've identified is that it's not restoring to the worktree.

Assuming everything was working properly, users could still choose to
restore only to the index (using the '--no-worktree' option).

> Let's say we do a clean checkout of a commit with tree A
> (i.e. the root tree's hash is A). If we do `git sparse-checkout set
> non-existent`, the index and the working copy still logically contain state
> A, right? 

The index will, but the working tree will be empty because all index entries
not matching 'non-existent' will have SKIP_WORKTREE applied.

> If we now do `git restore --staged --source HEAD^` and that
> command doesn't remove the `SKIP_WORKTREE` flag on any paths, that logically
> means that we have modified the working copy, and I think `git
> sparse-checkout disable` would agree with me. 

If you aren't using '--ignore-skip-worktree-bits', the entries with
SKIP_WORKTREE shouldn't be touched in the first place. If you *do* specify
it, by virtue of restoring to the working tree, SKIP_WORKTREE must be
removed.

But suppose you're doing something like 'git restore --staged --no-worktree
--ignore-skip-worktree-bits --source HEAD^'. In that case:

- you are restoring to the index
- you are *not* restoring to the worktree
- you're restoring files with SKIP_WORKTREE applied

Right now, SKIP_WORKTREE is removed from the matched entries, but suppose
(per your comment) it wasn't. That wouldn't mean that we've "modified the
working copy"; the working tree is defined with respect to the index, and if
the index entry says "I don't care about the worktree", then there are no
differences to reflect. 

This raises an interesting question about the current behavior, though: if
you restore a SKIP_WORKTREE entry with '--staged' and '--no-worktree',
should we remove SKIP_WORKTREE? I'd lean towards "no", but I'm interested to
hear other contributors' thoughts.

> That's different from how `git
> restore --staged` without sparse-checkout would have worked (it would not
> have updated the working copy). So from that perspective, it might make
> sense to remove the `SKIP_WORKTREE` and add the old file contents back in
> the working (i.e. from state A in this example), and maybe that's why the
> commands do that? 

It's important to avoid restoring a file to the worktree when it has
SKIP_WORKTREE enabled. See af6a51875a (repo_read_index: clear SKIP_WORKTREE
bit from files present in worktree, 2022-01-14) and the corresponding
discussion in [1].

[1] https://lore.kernel.org/all/pull.1114.v2.git.1642175983.gitgitgadget@gmail.com/

> Actually, `git checkout HEAD^ .` would update both the
> index and the working copy to match
> HEAD^, so that shouldn't have to remove the `SKIP_WORKTREE`, maybe?
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
  2022-10-04 20:34     ` Victoria Dye
@ 2022-10-05  4:53       ` Martin von Zweigbergk
  2022-10-05  7:51         ` Elijah Newren
  2022-10-05 16:11         ` Victoria Dye
  2022-10-05  5:22       ` Elijah Newren
  2022-10-06 19:30       ` Junio C Hamano
  2 siblings, 2 replies; 12+ messages in thread
From: Martin von Zweigbergk @ 2022-10-05  4:53 UTC (permalink / raw)
  To: Victoria Dye; +Cc: Glen Choo, git, Derrick Stolee, Elijah Newren

On Tue, Oct 4, 2022 at 1:35 PM Victoria Dye <vdye@github.com> wrote:
>
> Hi Martin!
>
> Please make sure you respond in plaintext - it looks like your message didn't
> make it to the mailing list. I've reformatted it to render nicely in this
> reply.

Hi! Sorry about that, and thanks for fixing! I've switched to plain
text now, and I've
manually added newlines to keep lines short (unsure if I was supposed to).

>
> Martin von Zweigbergk wrote:
> >> On Tue, Oct 4, 2022 at 9:34 AM Victoria Dye <vdye@github.com <mailto:vdye@github.com>> wrote:
> >>
> >> Glen Choo wrote:
> >>> Filing a `git bugreport` on behalf of a user at $DAYJOB. I'm also pretty
> >>> surprised by this behavior, perhaps someone who knows more could shed
> >>> some light?
> >>>
> >>> What did you do before the bug happened? (Steps to reproduce your issue)
> >>>
> >>>   git clone git@github.com:git/git.git . &&
> >>>   git sparse-checkout set t &&
> >>>   git restore --source v2.38.0-rc1 --staged Documentation &&
> >>>   git status
> >>> ...>
> >>> What happened instead? (Actual behavior)
> >>>
> >>> I saw a staged modification (Documentation/cmd-list.perl) and the same
> >>> file reported as deleted in the working copy. Specifically,
> >>>
> >>>   $ git status
> >>>
> >>>   On branch master
> >>>   Your branch is up to date with 'origin/master'.
> >>>
> >>>   You are in a sparse checkout with 64% of tracked files present.
> >>>
> >>>   Changes to be committed:
> >>>     (use "git restore --staged <file>..." to unstage)
> >>>           modified:   Documentation/cmd-list.perl
> >>>
> >>>   Changes not staged for commit:
> >>>     (use "git add/rm <file>..." to update what will be committed)
> >>>     (use "git restore <file>..." to discard changes in working directory)
> >>>           deleted:    Documentation/cmd-list.perl
> >>>
> >>
> >> Thanks for reporting this! There are a few confusing things going on with
> >> 'restore' here.
> >>
> >> First is that the out-of-cone was even restored in the first place.
> >>
> >
> > I was actually happy that the out-of-cone paths were restored. I ran that
> > command as an experiment while reading Elijah's doc because I was curious
> > what would happen. The reason I think it should restore out-of-cone paths is
> > so you can do `git restore --staged --source <some commit> && git commit -m
> > "restore to old commit"` without caring about the sparse spec.
>
> Conversely, that's behavior a user *wouldn't* want if they want to keep
> their sparse cone intact (not to mention the performance impact of checking
> out the entire worktree). I think it does more harm to those users than it
> would benefit the ones that want to checkout out-of-cone files.
>
> The use-case you're describing should be served by the
> '--ignore-skip-worktree-bits' option (not the most intuitive name,
> unfortunately). Luckily, there's an increasing desire to improve the naming
> of sparse-related options, so the UX situation should improve in the future.

I realized after sending my previous email that I might have a
different view of what
sparse checkout is about. To me, it seems like it should be just a performance
optimization. That's why I feel like commands should behave the same way with
or without a sparse spec (unless that proposed `--restrict` flag is passed). I
understand if that's just not feasible. Sorry about the noise in that case :)

>
> >
> >> Theoretically, 'restore' (like 'checkout') should be limited to pathspecs
> >> inside the sparse-checkout patterns (per the documentation of
> >> '--ignore-skip-worktree-bits'), but 'Documentation' does not match them.
> >> Then, there's a difference between 'restore' and 'checkout' that doesn't
> >> seem intentional; both remove the 'SKIP_WORKTREE' flag from the file, but
> >> only 'checkout' creates the file on-disk (therefore avoiding the "deleted"
> >> status).
> >
> > Restoring only into the index (as I think `git restore --staged` is supposed
> > to do) is weird.
>
> 'git restore --staged' is intended to restore to both the worktree and index
> (per 183fb44fd2 (restore: add --worktree and --staged, 2019-04-25)). The bug
> you've identified is that it's not restoring to the worktree.

Ah, `--worktree` is on by default even if I pass `--staged`, I see. Hmm, the
help text actually says "Specifying --staged will only restore the index."...

>
> Assuming everything was working properly, users could still choose to
> restore only to the index (using the '--no-worktree' option).
>
> > Let's say we do a clean checkout of a commit with tree A
> > (i.e. the root tree's hash is A). If we do `git sparse-checkout set
> > non-existent`, the index and the working copy still logically contain state
> > A, right?
>
> The index will, but the working tree will be empty because all index entries
> not matching 'non-existent' will have SKIP_WORKTREE applied.
>
> > If we now do `git restore --staged --source HEAD^` and that
> > command doesn't remove the `SKIP_WORKTREE` flag on any paths, that logically
> > means that we have modified the working copy, and I think `git
> > sparse-checkout disable` would agree with me.
>
> If you aren't using '--ignore-skip-worktree-bits', the entries with
> SKIP_WORKTREE shouldn't be touched in the first place. If you *do* specify
> it, by virtue of restoring to the working tree, SKIP_WORKTREE must be
> removed.
>
> But suppose you're doing something like 'git restore --staged --no-worktree
> --ignore-skip-worktree-bits --source HEAD^'. In that case:
>
> - you are restoring to the index
> - you are *not* restoring to the worktree
> - you're restoring files with SKIP_WORKTREE applied
>
> Right now, SKIP_WORKTREE is removed from the matched entries, but suppose
> (per your comment) it wasn't. That wouldn't mean that we've "modified the
> working copy"; the working tree is defined with respect to the index, and if
> the index entry says "I don't care about the worktree", then there are no
> differences to reflect.

Yes, not technically changed. I was (and still am) thinking of the working copy
as logically containing all the files even if some of them are not written out
to disk. I understand if that seems like an odd way of thinking about
it. It might
help to think about how it might appear in a virtual file system where clean
files outside the sparse spec are presented from the index. Or it may be better
to not try to think about it in this weird way at all :)

>
> This raises an interesting question about the current behavior, though: if
> you restore a SKIP_WORKTREE entry with '--staged' and '--no-worktree',
> should we remove SKIP_WORKTREE?

FWIW, that's the case I meant.

> I'd lean towards "no", but I'm interested to
> hear other contributors' thoughts.
>
> > That's different from how `git
> > restore --staged` without sparse-checkout would have worked (it would not
> > have updated the working copy). So from that perspective, it might make
> > sense to remove the `SKIP_WORKTREE` and add the old file contents back in
> > the working (i.e. from state A in this example), and maybe that's why the
> > commands do that?
>
> It's important to avoid restoring a file to the worktree when it has
> SKIP_WORKTREE enabled. See af6a51875a (repo_read_index: clear SKIP_WORKTREE
> bit from files present in worktree, 2022-01-14) and the corresponding
> discussion in [1].

Makes sense. That's why I said "and add the old file contents back in the
working copy" (if I understood correctly what that discussion was about).

>
> [1] https://lore.kernel.org/all/pull.1114.v2.git.1642175983.gitgitgadget@gmail.com/
>
> > Actually, `git checkout HEAD^ .` would update both the
> > index and the working copy to match
> > HEAD^, so that shouldn't have to remove the `SKIP_WORKTREE`, maybe?
> >
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
       [not found]   ` <CAESOdVAh68HoQoyicfZn4XbjGfiRFCu1zFQmUjMcSAg3tUzr4Q@mail.gmail.com>
  2022-10-04 20:34     ` Victoria Dye
@ 2022-10-05  5:00     ` Elijah Newren
  1 sibling, 0 replies; 12+ messages in thread
From: Elijah Newren @ 2022-10-05  5:00 UTC (permalink / raw)
  To: Martin von Zweigbergk
  Cc: Victoria Dye, Glen Choo, Git Mailing List, Derrick Stolee

Hi Marin!

On Tue, Oct 4, 2022 at 10:52 AM Martin von Zweigbergk
<martinvonz@google.com> wrote:
>
> On Tue, Oct 4, 2022 at 9:34 AM Victoria Dye <vdye@github.com> wrote:
>>
>> Glen Choo wrote:
>> > Filing a `git bugreport` on behalf of a user at $DAYJOB. I'm also pretty
>> > surprised by this behavior, perhaps someone who knows more could shed
>> > some light?
>> >
>> > What did you do before the bug happened? (Steps to reproduce your issue)
>> >
>> >   git clone git@github.com:git/git.git . &&
>> >   git sparse-checkout set t &&
>> >   git restore --source v2.38.0-rc1 --staged Documentation &&
>> >   git status
>> > ...>
>> > What happened instead? (Actual behavior)
>> >
>> > I saw a staged modification (Documentation/cmd-list.perl) and the same
>> > file reported as deleted in the working copy. Specifically,
>> >
>> >   $ git status
>> >
>> >   On branch master
>> >   Your branch is up to date with 'origin/master'.
>> >
>> >   You are in a sparse checkout with 64% of tracked files present.
>> >
>> >   Changes to be committed:
>> >     (use "git restore --staged <file>..." to unstage)
>> >           modified:   Documentation/cmd-list.perl
>> >
>> >   Changes not staged for commit:
>> >     (use "git add/rm <file>..." to update what will be committed)
>> >     (use "git restore <file>..." to discard changes in working directory)
>> >           deleted:    Documentation/cmd-list.perl
>> >
>>
>> Thanks for reporting this! There are a few confusing things going on with
>> 'restore' here.
>>
>> First is that the out-of-cone was even restored in the first place.
>
>
> I was actually happy that the out-of-cone paths were restored. I ran that command as an experiment while reading Elijah's doc because I was curious what would happen. The reason I think it should restore out-of-cone paths is so you can do `git restore --staged --source <some commit> && git commit -m "restore to old commit"` without caring about the sparse spec.

I think that could lead to something that would be very dangerous or
highly confusing to other users.  In particular, if they run

  `git restore --staged --source <some commit> -- '*.rs'`

and git changes not only all the Rust files inside the sparsity
specification (i.e. the files they are interested in), but all the
ones outside too, then they'll be rather unhappy.  So, I think if the
paths you specify aren't within the sparse specification, we should
throw an error, much like we already do with `git add` and `git rm`
when in a sparse checkout.  And if you don't care about the sparse
spec despite having set up a sparse checkout, you can always specify
that (with --no-restrict or --scope=all or whatever).

>> Theoretically, 'restore' (like 'checkout') should be limited to pathspecs
>> inside the sparse-checkout patterns (per the documentation of
>> '--ignore-skip-worktree-bits'), but 'Documentation' does not match them.
>> Then, there's a difference between 'restore' and 'checkout' that doesn't
>> seem intentional; both remove the 'SKIP_WORKTREE' flag from the file, but
>> only 'checkout' creates the file on-disk (therefore avoiding the "deleted"
>> status).
>
>
> Restoring only into the index (as I think `git restore --staged` is supposed to do) is weird. Let's say we do a clean checkout of a commit with tree A (i.e. the root tree's hash is A). If we do `git sparse-checkout set non-existent`, the index and the working copy still logically contain state A, right? If we now do `git restore --staged --source HEAD^` and that command doesn't remove the `SKIP_WORKTREE` flag on any paths, that logically means that we have modified the working copy, and I think `git sparse-checkout disable` would agree with me. That's different from how `git restore --staged` without sparse-checkout would have worked (it would not have updated the working copy). So from that perspective, it might make sense to remove the `SKIP_WORKTREE` and add the old file contents back in the working (i.e. from state A in this example), and maybe that's why the commands do that? Actually, `git checkout HEAD^ .` would update both the index and the working copy to match HEAD^, so that shouldn't have to remove the `SKIP_WORKTREE`, maybe?

Yes, you've flagged this correctly as an issue, but I think only
touching files within the sparse specification is much safer and
throwing an error telling the user what flag to add if they specified
a path outside the sparse specification would be better.

Now, if they do provide the override...then your question becomes
valid.  My inclination there is that they provided --staged without
--worktree and provided an override so although they'll get weird
results (much as they also would with git-add or git-rm when they
override) we just follow what they said and only update the index and
leave the file as SKIP_WORKTREE.

> I barely ever use Git, so take all that with a grain of salt.
>
>>
>> Elijah's WIP design doc [1] describes 'restore' as one of:
>>
>> > commands that restore files to the working tree that match sparsity
>> > patterns, and remove unmodified files that don't match those patterns
>
>
> I *think* that only applies to `git restore` without `--staged`.

Yeah, you brought up a really good example here.  I think `restore`
and `checkout -- <paths>`  should probably be better grouped with
add/rm/mv

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
  2022-10-04 20:34     ` Victoria Dye
  2022-10-05  4:53       ` Martin von Zweigbergk
@ 2022-10-05  5:22       ` Elijah Newren
  2022-10-06 19:30       ` Junio C Hamano
  2 siblings, 0 replies; 12+ messages in thread
From: Elijah Newren @ 2022-10-05  5:22 UTC (permalink / raw)
  To: Victoria Dye
  Cc: Martin von Zweigbergk, Glen Choo, Git Mailing List,
	Derrick Stolee

On Tue, Oct 4, 2022 at 1:35 PM Victoria Dye <vdye@github.com> wrote:
>
[...]
> >> Thanks for reporting this! There are a few confusing things going on with
> >> 'restore' here.
> >>
> >> First is that the out-of-cone was even restored in the first place.

Yep, agreed.  I'll add this example to the Known bugs section.

[...]
> >> Theoretically, 'restore' (like 'checkout') should be limited to pathspecs
> >> inside the sparse-checkout patterns (per the documentation of
> >> '--ignore-skip-worktree-bits'), but 'Documentation' does not match them.
> >> Then, there's a difference between 'restore' and 'checkout' that doesn't
> >> seem intentional; both remove the 'SKIP_WORKTREE' flag from the file, but
> >> only 'checkout' creates the file on-disk (therefore avoiding the "deleted"
> >> status).
> >
> > Restoring only into the index (as I think `git restore --staged` is supposed
> > to do) is weird.
>
> 'git restore --staged' is intended to restore to both the worktree and index
> (per 183fb44fd2 (restore: add --worktree and --staged, 2019-04-25)). The bug
> you've identified is that it's not restoring to the worktree.

I think that commit message is easy to mis-read.  --staged means
restore to the index, and does not restore to the working tree unless
the user also specifies --worktree.  Duy considered adding a
`--no-worktree` option but decided against it.

(In constrast, `checkout` always restores to both places and gives no
way to do just one.  Defaulting to just the worktree and letting the
user pick was one of the two big warts that `restore` fixed relative
to `checkout` -- the other being the --no-overlay mode.)

[...]
> > If we now do `git restore --staged --source HEAD^` and that
> > command doesn't remove the `SKIP_WORKTREE` flag on any paths, that logically
> > means that we have modified the working copy, and I think `git
> > sparse-checkout disable` would agree with me.
>
> If you aren't using '--ignore-skip-worktree-bits', the entries with
> SKIP_WORKTREE shouldn't be touched in the first place. If you *do* specify
> it, by virtue of restoring to the working tree, SKIP_WORKTREE must be
> removed.
>
> But suppose you're doing something like 'git restore --staged --no-worktree
> --ignore-skip-worktree-bits --source HEAD^'. In that case:

That'd be `git restore --staged --source=HEAD^
--ignore-skip-worktree-bits -- <paths>`, but I think I understand the
point you're conveying...

> - you are restoring to the index
> - you are *not* restoring to the worktree
> - you're restoring files with SKIP_WORKTREE applied

minor nit: you may also be restoring files that do not exist in HEAD
or the index.  So, I'd rather say here "you're restoring files outside
the sparse specification" to handle those too.

> Right now, SKIP_WORKTREE is removed from the matched entries, but suppose

Right, any code outside of unpack-trees.c that tweaks index entries
tends to replace existing ones with new ones without carefully setting
or copying various flags (like SKIP_WORKTREE) over.  I suspect we just
have another case of that here.  sparse-checkouts are forcing us to
audit all these paths...

> (per your comment) it wasn't. That wouldn't mean that we've "modified the
> working copy"; the working tree is defined with respect to the index, and if
> the index entry says "I don't care about the worktree", then there are no
> differences to reflect.
>
> This raises an interesting question about the current behavior, though: if
> you restore a SKIP_WORKTREE entry with '--staged' and '--no-worktree',
> should we remove SKIP_WORKTREE? I'd lean towards "no", but I'm interested to
> hear other contributors' thoughts.

My current $0.02...

I'd say that, without the --ignore-skip-worktree-bits override, we
should behave like `git-add` and `git-rm` and not operate on paths
outside the sparse specification.  If the paths the user specified
don't match anything in the sparse specification, we should throw an
error and tell the user of the flag to use to override.

With the override...I'd agree that we should only update the index and
still keep (or appropriately set) the SKIP_WORKTREE bits.  Yes, it has
some weirdness (as Martin points out), but that's part of the reason
of requiring the --scope=all override, much as we do with `git-add`
and `git-rm`.  (The override for add/rm is currently named `--sparse`
so it's not quite the same, but we're planning on renaming all these
so that'll make it clearer that it's the same situation for them all.)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
  2022-10-05  4:53       ` Martin von Zweigbergk
@ 2022-10-05  7:51         ` Elijah Newren
  2022-10-05 20:00           ` Martin von Zweigbergk
  2022-10-05 16:11         ` Victoria Dye
  1 sibling, 1 reply; 12+ messages in thread
From: Elijah Newren @ 2022-10-05  7:51 UTC (permalink / raw)
  To: Martin von Zweigbergk
  Cc: Victoria Dye, Glen Choo, Git Mailing List, Derrick Stolee

On Tue, Oct 4, 2022 at 9:53 PM Martin von Zweigbergk
<martinvonz@google.com> wrote:
>
> On Tue, Oct 4, 2022 at 1:35 PM Victoria Dye <vdye@github.com> wrote:
> >
> > Martin von Zweigbergk wrote:
> > >> On Tue, Oct 4, 2022 at 9:34 AM Victoria Dye <vdye@github.com <mailto:vdye@github.com>> wrote:
> > >>
[...]
> > >> Thanks for reporting this! There are a few confusing things going on with
> > >> 'restore' here.
> > >>
> > >> First is that the out-of-cone was even restored in the first place.
> > >>
> > >
> > > I was actually happy that the out-of-cone paths were restored. I ran that
> > > command as an experiment while reading Elijah's doc because I was curious
> > > what would happen. The reason I think it should restore out-of-cone paths is
> > > so you can do `git restore --staged --source <some commit> && git commit -m
> > > "restore to old commit"` without caring about the sparse spec.
> >
> > Conversely, that's behavior a user *wouldn't* want if they want to keep
> > their sparse cone intact (not to mention the performance impact of checking
> > out the entire worktree). I think it does more harm to those users than it
> > would benefit the ones that want to checkout out-of-cone files.
> >
> > The use-case you're describing should be served by the
> > '--ignore-skip-worktree-bits' option (not the most intuitive name,
> > unfortunately). Luckily, there's an increasing desire to improve the naming
> > of sparse-related options, so the UX situation should improve in the future.
>
> I realized after sending my previous email that I might have a
> different view of what
> sparse checkout is about. To me, it seems like it should be just a performance
> optimization. That's why I feel like commands should behave the same way with
> or without a sparse spec (unless that proposed `--restrict` flag is passed). I
> understand if that's just not feasible. Sorry about the noise in that case :)

The problem I see with that definition is that I'm not even sure what
that means.  Behaving the same way with or without a sparse
specification at the extreme means that switching branches should
populate all the files in that branch...meaning a dense checkout.
That kind of defeats the point.  So, I'm sure you don't mean that they
behave the same in all cases...but where do you draw the line?

Is it just switch/checkout & `reset --hard` that avoid reading and
writing outside the sparse specification?

Should diff & status ignore files outside the sparse specification
even if users wrote to such files?  A "performance optimization" might
suggest we should, but would users get confused?

What about merge/rebase/cherry-pick/revert?  Should those write
additional files to the working tree or avoid it?  What about if there
are conflicts outside the sparse specification?

And if extra files get written outside the sparse specification, are
there ways for this to get "cleaned up" where after resolving
conflicts or changes we can again remove the file from the working
tree?

What about `git grep PATTERN`?  That's documented to search the
tracked files in the working tree.  But should that include the files
that would have been there were it not for the "performance
optimization" of not checking them out?  (Similarly, what about `git
grep --cached PATTERN` or `git grep PATTERN REVISION`?)  I mean, if
these commands should behave the same regardless of sparse
specification, then you should search those other files, right?  But
isn't that a nasty performance penalty if the user has a
sparse/partial clone since Git will have to do many network operations
to get additional blobs in order to search them?  Is that really
wanted?

What about `git rm '*.png'` to remove all the tracked png files from
my working tree.  Should that also remove all the files that would
have been there were it not for the "performance optimization"?  Will
that result in very negative surprises for those with a "I want to
concentrate on just this subset of files" mental model?

What about `git worktree add`?  Should the sparse specification be the
same for all worktrees?  Be per-worktree?  Should it default to dense?
 To just top-level sparse?  To the same sparsity as the worktree you
were in when you created a new one?

There are more questions along these lines, but perhaps this gives a
flavor of the kinds of questions that can come up which need
underlying user mental models and usecases to help us answer how these
commands should behave.

> > But suppose you're doing something like 'git restore --staged --no-worktree
> > --ignore-skip-worktree-bits --source HEAD^'. In that case:
> >
> > - you are restoring to the index
> > - you are *not* restoring to the worktree
> > - you're restoring files with SKIP_WORKTREE applied
> >
> > Right now, SKIP_WORKTREE is removed from the matched entries, but suppose
> > (per your comment) it wasn't. That wouldn't mean that we've "modified the
> > working copy"; the working tree is defined with respect to the index, and if
> > the index entry says "I don't care about the worktree", then there are no
> > differences to reflect.
>
> Yes, not technically changed. I was (and still am) thinking of the working copy
> as logically containing all the files even if some of them are not written out
> to disk.

Git started with a nearly identical definition and still sadly
includes it (though buried deep in a plumbing manual that thankfully
virtually no one will ever read):

       Skip-worktree bit can be defined in one (long) sentence: When reading
       an entry, if it is marked as skip-worktree, then Git pretends its
       working directory version is up to date and read the index version
       instead.

Such a mental model is useless to me in answering how commands should
behave in any interesting cases, and actually led to several
inconsistencies and bugs[1].  However, you didn't leave it there; you
took it a step further...

[1] https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/

> I understand if that seems like an odd way of thinking about
> it. It might
> help to think about how it might appear in a virtual file system where clean
> files outside the sparse spec are presented from the index. Or it may be better
> to not try to think about it in this weird way at all :)

Now you've provided a real usecase (virtual file system really
pretending that all files are present), which gives us some context
with which we can try to answer those questions I asked above.  I
think some of the answers might be different for this usecase than the
"behavior A" and "behavior B" in my current sparse-checkout.txt
document, so maybe we want to add this usecase to the mix.

Anyway, thanks for the interesting bug report, and the context of
another usecase to consider.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
  2022-10-05  4:53       ` Martin von Zweigbergk
  2022-10-05  7:51         ` Elijah Newren
@ 2022-10-05 16:11         ` Victoria Dye
  1 sibling, 0 replies; 12+ messages in thread
From: Victoria Dye @ 2022-10-05 16:11 UTC (permalink / raw)
  To: Martin von Zweigbergk; +Cc: Glen Choo, git, Derrick Stolee, Elijah Newren

Martin von Zweigbergk wrote:>>>> Theoretically, 'restore' (like 'checkout') should be limited to pathspecs
>>>> inside the sparse-checkout patterns (per the documentation of
>>>> '--ignore-skip-worktree-bits'), but 'Documentation' does not match them.
>>>> Then, there's a difference between 'restore' and 'checkout' that doesn't
>>>> seem intentional; both remove the 'SKIP_WORKTREE' flag from the file, but
>>>> only 'checkout' creates the file on-disk (therefore avoiding the "deleted"
>>>> status).
>>>
>>> Restoring only into the index (as I think `git restore --staged` is supposed
>>> to do) is weird.
>>
>> 'git restore --staged' is intended to restore to both the worktree and index
>> (per 183fb44fd2 (restore: add --worktree and --staged, 2019-04-25)). The bug
>> you've identified is that it's not restoring to the worktree.
> 
> Ah, `--worktree` is on by default even if I pass `--staged`, I see. Hmm, the
> help text actually says "Specifying --staged will only restore the index."...

You (and Elijah [1]) are correct. '--staged' overrides the "checkout to
worktree" default behavior of 'git restore' to only restore to the index. If
you want to checkout to the worktree _and_ the index, 'git restore --staged
--worktree' is what you'd use.

Sorry for the incorrect information! 

[1] https://lore.kernel.org/git/CABPp-BGeC3hXw-v3voniY5ZU2f6W8NXfXVvq0C03eGGhvSefgg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
  2022-10-05  7:51         ` Elijah Newren
@ 2022-10-05 20:00           ` Martin von Zweigbergk
  2022-10-06  4:20             ` Elijah Newren
  0 siblings, 1 reply; 12+ messages in thread
From: Martin von Zweigbergk @ 2022-10-05 20:00 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Victoria Dye, Glen Choo, Git Mailing List, Derrick Stolee

On Wed, Oct 5, 2022 at 12:51 AM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Oct 4, 2022 at 9:53 PM Martin von Zweigbergk
> <martinvonz@google.com> wrote:
> >
> > On Tue, Oct 4, 2022 at 1:35 PM Victoria Dye <vdye@github.com> wrote:
> > >
> > > Martin von Zweigbergk wrote:
> > > >> On Tue, Oct 4, 2022 at 9:34 AM Victoria Dye <vdye@github.com <mailto:vdye@github.com>> wrote:
> > > >>
> [...]
> > > >> Thanks for reporting this! There are a few confusing things going on with
> > > >> 'restore' here.
> > > >>
> > > >> First is that the out-of-cone was even restored in the first place.
> > > >>
> > > >
> > > > I was actually happy that the out-of-cone paths were restored. I ran that
> > > > command as an experiment while reading Elijah's doc because I was curious
> > > > what would happen. The reason I think it should restore out-of-cone paths is
> > > > so you can do `git restore --staged --source <some commit> && git commit -m
> > > > "restore to old commit"` without caring about the sparse spec.
> > >
> > > Conversely, that's behavior a user *wouldn't* want if they want to keep
> > > their sparse cone intact (not to mention the performance impact of checking
> > > out the entire worktree). I think it does more harm to those users than it
> > > would benefit the ones that want to checkout out-of-cone files.
> > >
> > > The use-case you're describing should be served by the
> > > '--ignore-skip-worktree-bits' option (not the most intuitive name,
> > > unfortunately). Luckily, there's an increasing desire to improve the naming
> > > of sparse-related options, so the UX situation should improve in the future.
> >
> > I realized after sending my previous email that I might have a
> > different view of what
> > sparse checkout is about. To me, it seems like it should be just a performance
> > optimization. That's why I feel like commands should behave the same way with
> > or without a sparse spec (unless that proposed `--restrict` flag is passed). I
> > understand if that's just not feasible. Sorry about the noise in that case :)
>
> The problem I see with that definition is that I'm not even sure what
> that means.  Behaving the same way with or without a sparse
> specification at the extreme means that switching branches should
> populate all the files in that branch...meaning a dense checkout.
> That kind of defeats the point.  So, I'm sure you don't mean that they
> behave the same in all cases...but where do you draw the line?

I agree with you and Stolee that there are two different cases: some
people use sparse checkouts to restrict what they see (behavior A), and
some people use it just as a performance optimization (behavior B). So I
suspect we roughly agree about what should happen if you pass
`--restrict` (or if that becomes the default so you don't actually need to
pass it). My arguments were about the `--no-restrict` case. Sorry, I
should have made that clear.

I also agree that having a way to make commands restrict to certain paths
by default is useful, and I agree that tying that set of paths to the current
worktree's sparse spec makes sense.

I'll answer the questions below for the `--no-restrict` case
(behavior B).

>
> Is it just switch/checkout & `reset --hard` that avoid reading and
> writing outside the sparse specification?
>
> Should diff & status ignore files outside the sparse specification
> even if users wrote to such files?  A "performance optimization" might
> suggest we should, but would users get confused?

I think they should be included (again, in the `--no-restrict` case).

>
> What about merge/rebase/cherry-pick/revert?  Should those write
> additional files to the working tree or avoid it?  What about if there
> are conflicts outside the sparse specification?

I think they should avoid it, but since the user will need to resolve
that conflict anyway, I can see it makes sense to write them to disk
if there are conflicts.

>
> And if extra files get written outside the sparse specification, are
> there ways for this to get "cleaned up" where after resolving
> conflicts or changes we can again remove the file from the working
> tree?

I've never really used `git sparse-checkout` (until I read your doc),
but isn't that what `git sparse-checkout reapply` is for?

>
> What about `git grep PATTERN`?  That's documented to search the
> tracked files in the working tree.  But should that include the files
> that would have been there were it not for the "performance
> optimization" of not checking them out?  (Similarly, what about `git
> grep --cached PATTERN` or `git grep PATTERN REVISION`?)  I mean, if
> these commands should behave the same regardless of sparse
> specification, then you should search those other files, right?  But
> isn't that a nasty performance penalty if the user has a
> sparse/partial clone since Git will have to do many network operations
> to get additional blobs in order to search them?  Is that really
> wanted?

I think it's consistent to search them with `--no-restrict` (but not
with `--restrict`, of course).

>
> What about `git rm '*.png'` to remove all the tracked png files from
> my working tree.  Should that also remove all the files that would
> have been there were it not for the "performance optimization"?  Will
> that result in very negative surprises for those with a "I want to
> concentrate on just this subset of files" mental model?

Same here.

>
> What about `git worktree add`?  Should the sparse specification be the
> same for all worktrees?  Be per-worktree?  Should it default to dense?
>  To just top-level sparse?  To the same sparsity as the worktree you
> were in when you created a new one?

That's an interesting case. If someone does `git worktree add` and expects
all files to be available in the working copy, they might be surprised, yes.
I think that's a much smaller risk than
`git restore --source HEAD^ --staged && git commit -m 'undo changes'` being
partial, however.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
  2022-10-05 20:00           ` Martin von Zweigbergk
@ 2022-10-06  4:20             ` Elijah Newren
  0 siblings, 0 replies; 12+ messages in thread
From: Elijah Newren @ 2022-10-06  4:20 UTC (permalink / raw)
  To: Martin von Zweigbergk
  Cc: Victoria Dye, Glen Choo, Git Mailing List, Derrick Stolee

On Wed, Oct 5, 2022 at 1:00 PM Martin von Zweigbergk
<martinvonz@google.com> wrote:
>
> On Wed, Oct 5, 2022 at 12:51 AM Elijah Newren <newren@gmail.com> wrote:
> >
[...]
> I agree with you and Stolee that there are two different cases: some
> people use sparse checkouts to restrict what they see (behavior A), and
> some people use it just as a performance optimization (behavior B). So I
> suspect we roughly agree about what should happen if you pass
> `--restrict` (or if that becomes the default so you don't actually need to
> pass it). My arguments were about the `--no-restrict` case. Sorry, I
> should have made that clear.
>
> I also agree that having a way to make commands restrict to certain paths
> by default is useful, and I agree that tying that set of paths to the current
> worktree's sparse spec makes sense.
>
> I'll answer the questions below for the `--no-restrict` case
> (behavior B).

I don't think your usecase matches behavior B.  I think we should
label your VFS usecase as behavior C and define it separately.  More
on that below...

[...]
> > What about merge/rebase/cherry-pick/revert?  Should those write
> > additional files to the working tree or avoid it?  What about if there
> > are conflicts outside the sparse specification?
>
> I think they should avoid it, but since the user will need to resolve
> that conflict anyway, I can see it makes sense to write them to disk
> if there are conflicts.
>
> >
> > And if extra files get written outside the sparse specification, are
> > there ways for this to get "cleaned up" where after resolving
> > conflicts or changes we can again remove the file from the working
> > tree?
>
> I've never really used `git sparse-checkout` (until I read your doc),
> but isn't that what `git sparse-checkout reapply` is for?

While that command is available for users that want to manually clean
things up proactively, my suspicion is that it is used very rarely --
especially now that we have the present-despite-skipped class of
issues fixed.  I suspect nearly all cleaning up is actually done as an
implicit side-effect of calls to unpack_trees(), which would affect
commands such `switch`, the switch-like portion of `checkout`, `reset
--hard`, `merge`, `rebase`, and many others.

All of these commands have two types of implicit clean-up they do as
part of their operation (which could be thought of as a
post-processing step): (1) marking *unmodified* files outside the
sparsity patterns as SKIP_WORKTREE in the index and removing them from
the working tree, and (2) taking files which match the sparsity
patterns which were previously SKIP_WORKTREE and flip them to
!SKIP_WORKTREE and restore them to the working tree.  I've got a few
examples of what this clean up looks like over at:
https://lore.kernel.org/git/CABPp-BHGrxLPu_S3y2zG-U6uo0rM5TYYEREZa2A=e=d9VZb2PA@mail.gmail.com/

I have no idea how this cleanup affects the VFS usecase; it's very
focused towards "sparse checkout means many files should NOT be
present in the working tree" which may be at odds with how the VFS
stuff is intended to behave.  But it's also been part of
sparse-checkout behavior the longest; for well over a decade now.

> > What about `git grep PATTERN`?  That's documented to search the
> > tracked files in the working tree.  But should that include the files
> > that would have been there were it not for the "performance
> > optimization" of not checking them out?  (Similarly, what about `git
> > grep --cached PATTERN` or `git grep PATTERN REVISION`?)  I mean, if
> > these commands should behave the same regardless of sparse
> > specification, then you should search those other files, right?  But
> > isn't that a nasty performance penalty if the user has a
> > sparse/partial clone since Git will have to do many network operations
> > to get additional blobs in order to search them?  Is that really
> > wanted?
>
> I think it's consistent to search them with `--no-restrict` (but not
> with `--restrict`, of course).
>
> > What about `git rm '*.png'` to remove all the tracked png files from
> > my working tree.  Should that also remove all the files that would
> > have been there were it not for the "performance optimization"?  Will
> > that result in very negative surprises for those with a "I want to
> > concentrate on just this subset of files" mental model?
>
> Same here.
>
> >
> > What about `git worktree add`?  Should the sparse specification be the
> > same for all worktrees?  Be per-worktree?  Should it default to dense?
> >  To just top-level sparse?  To the same sparsity as the worktree you
> > were in when you created a new one?
>
> That's an interesting case. If someone does `git worktree add` and expects
> all files to be available in the working copy, they might be surprised, yes.
> I think that's a much smaller risk than
> `git restore --source HEAD^ --staged && git commit -m 'undo changes'` being
> partial, however.

After you described the VFS usecase, I was guessing you'd answer how
you did for most of these commands.  Most of your answers do not match
the answers I'd expect for behavior B, which seems to me to support my
suspicion that you've got a third usecase.

In particular, I think the difference between Behavior B and your
usecase hinges on the expectation for the working tree:
   Behavior B: Files outside the sparse specification are NOT present
in the working tree.
   Behavior C (your usecase): Files outside the sparse specification
ARE "present" in the working tree, but Git doesn't have to put them
there (they'll be lazily put into place by something else, and the VFS
will ensure that users don't ever notice them actually missing, so far
all intents and purposes, the files are present).

In particular, that difference is perhaps most notable with `git grep`
(without --cached or REVISION flags); such a command is supposed to
search the worktree.  For Behavior B, files outside the sparse
specification are NOT present in the working tree, and hence those
files should NOT be searched.  For your usecase, as you highlight
above, you view all files as present in the working tree (even if Git
isn't the thing writing those files to the working tree and even if
they aren't technically present until you query whether they are
there), so all those files SHOULD be searched.

This difference about the "presence" of files has other knock-on
effects too.  Under Behavior B, users get used to working on just a
subset of files.  Thus `git rm '*.jpg'` or `git restore --source HEAD^
-- '*.md'` should NOT overwrite files outside the sparse specification
(but an error should be shown if the pathspec doesn't match anything,
and that error should point out how users can affect other files
outside the sparse specification).  Under your usecase, users are
always working on the full set of files and all of them can be viewed
in their working copy (as enforced by the filesystem intercepting any
attempts to view or edit files and suddenly magically materializing
them when users look) -- so users in your usecase are not expecting to
be working on a subset of files, and thus those commands would operate
tree-wide.

Similarly, under Behavior B, `git add outside/of/cone/path` should
throw an error.  If it doesn't, some future command will silently
remove the file from the working copy, which may confuse the user;
they are getting themselves into an erroneous state.  Users are
pointed to an override flag they can use if they want to accept the
consequences.  Under your usecase, since ALL files are always
"present" (but not materialized until an attempt to access is made),
that same command would be expected to run without an override and
with no error or warning.

Related to the above, under Behavior B, `git status` should probably
report the existence of any untracked files that do not match the
sparsity patterns as an erroneous condition (because why wait until
git-add to throw an error; let the user know early).  Under your
usecase, we wouldn't.


You might think I'm describing Behavior A above, but only because
Behavior A and Behavior B overlap on worktree-related operations.  The
primary difference between Behavior A and Behavior B is with behavior
of history-related operations.  Behavior B says "the working tree is
sparse, but history is dense; if I do a query on older revisions of
history (grep/diff/log/etc.) then give me results across all paths",
whereas Behavior A says "I only care about this subset of files, both
for the working tree and for history.  Unless I override,
grep/diff/etc. on history should restrict all output to files within
the sparse specification.

One potential way to (over)simplify this would be:
    Behavior A: `--scope=sparse` for both worktree and history operations
    Behavior B: `--scope=sparse` for worktree operations,
`--scope=all` for history operations
    Behavior C: `--scope=all` for both worktree and history operations

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
  2022-10-04 20:34     ` Victoria Dye
  2022-10-05  4:53       ` Martin von Zweigbergk
  2022-10-05  5:22       ` Elijah Newren
@ 2022-10-06 19:30       ` Junio C Hamano
  2022-10-06 19:38         ` Junio C Hamano
  2 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2022-10-06 19:30 UTC (permalink / raw)
  To: Victoria Dye
  Cc: Martin von Zweigbergk, Glen Choo, git, Derrick Stolee,
	Elijah Newren

Victoria Dye <vdye@github.com> writes:

>> Restoring only into the index (as I think `git restore --staged` is supposed
>> to do) is weird. 
>
> 'git restore --staged' is intended to restore to both the worktree and index
> (per 183fb44fd2 (restore: add --worktree and --staged, 2019-04-25)). The bug
> you've identified is that it's not restoring to the worktree.

I think you misread 183fb44fd2, which says --staged is to restore
the contents in the index and --worktree is to restore the contents
in the working tree, and both of them can be used at the same time
to affect both destinations.  As gitcli(7) says, --staged here is a
synonym to --cached here and should not touch the working tree.

Having needless synonym may be a source of confusion, so we may want
to straighten out the UI a bit around here, but that is a separate
topic.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Bug report: `git restore --source --staged` deals poorly with sparse-checkout
  2022-10-06 19:30       ` Junio C Hamano
@ 2022-10-06 19:38         ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2022-10-06 19:38 UTC (permalink / raw)
  To: Victoria Dye
  Cc: Martin von Zweigbergk, Glen Choo, git, Derrick Stolee,
	Elijah Newren

Junio C Hamano <gitster@pobox.com> writes:

> I think you misread 183fb44fd2, ...

Ah, now I see this was already resolved yesterday while I was
offline.  Sorry for the noise.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-10-06 19:38 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-03 22:05 Bug report: `git restore --source --staged` deals poorly with sparse-checkout Glen Choo
2022-10-04 16:34 ` Victoria Dye
     [not found]   ` <CAESOdVAh68HoQoyicfZn4XbjGfiRFCu1zFQmUjMcSAg3tUzr4Q@mail.gmail.com>
2022-10-04 20:34     ` Victoria Dye
2022-10-05  4:53       ` Martin von Zweigbergk
2022-10-05  7:51         ` Elijah Newren
2022-10-05 20:00           ` Martin von Zweigbergk
2022-10-06  4:20             ` Elijah Newren
2022-10-05 16:11         ` Victoria Dye
2022-10-05  5:22       ` Elijah Newren
2022-10-06 19:30       ` Junio C Hamano
2022-10-06 19:38         ` Junio C Hamano
2022-10-05  5:00     ` Elijah Newren

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).