git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Sparse Checkout Trouble (2.5.0)
@ 2020-01-22  0:42 JunkYardMail1
  2020-01-22  1:19 ` Derrick Stolee
  0 siblings, 1 reply; 11+ messages in thread
From: JunkYardMail1 @ 2020-01-22  0:42 UTC (permalink / raw)
  To: git

I have a shallow repository clone and using sparse-checkout of just a
handful of directories.  When I upgraded from git version 2.24.1 to 2.25.0
some files not in the sparse-checkout were staged to be deleted.  The
directory path of these files contain the Windows reserved name of "prn".
Ex: "japanese/prn/. . ."  Unable to un-stage these files and reset the
changes.

.git/inof/Sparse-checkout:
!/*/
/*/directory-*/

Reverting back to the previous version (2.24.1) allowed to un-stage the
files and reset the changes.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Sparse Checkout Trouble (2.5.0)
  2020-01-22  0:42 Sparse Checkout Trouble (2.5.0) JunkYardMail1
@ 2020-01-22  1:19 ` Derrick Stolee
  2020-01-22  2:06   ` JunkYardMail1
  2020-01-24 15:40   ` Derrick Stolee
  0 siblings, 2 replies; 11+ messages in thread
From: Derrick Stolee @ 2020-01-22  1:19 UTC (permalink / raw)
  To: JunkYardMail1, git

On 1/21/2020 7:42 PM, JunkYardMail1 wrote:
> I have a shallow repository clone and using sparse-checkout of just a
> handful of directories.  When I upgraded from git version 2.24.1 to 2.25.0
> some files not in the sparse-checkout were staged to be deleted.  The
> directory path of these files contain the Windows reserved name of "prn".
> Ex: "japanese/prn/. . ."  Unable to un-stage these files and reset the
> changes.

Are you using Git for Windows? I'm not sure why the "prn" would be important
otherwise.

> !/*/
> /*/directory-*/

I don't see anything complicated with these patterns, although they seem
a bit strange. Why would you not want anything in a directory except one
named "<something>/directory-<anotherthing>/"? Is this actually the exact
pattern set?
 
> Reverting back to the previous version (2.24.1) allowed to un-stage the
> files and reset the changes.

This is definitely pointing to a regression, and the feature did get a
bit of an overhaul. The goal was to not change how existing users
interacted with it, though, so this is a bit alarming.

I did find a behavior change related to these paths in Git for Windows
2.25.0 versus 2.24.1:

  $ git clone https://github.com/derrickstolee/git-sparse-checkout-test
  Cloning into 'git-sparse-checkout-test'...
  remote: Enumerating objects: 6, done.
  remote: Counting objects: 100% (6/6), done.
  remote: Compressing objects: 100% (4/4), done.
  remote: Total 6 (delta 0), reused 6 (delta 0), pack-reused 0
  Receiving objects: 100% (6/6), done.
  error: invalid path 'directory-1/prn/a'
  fatal: unable to checkout working tree
  warning: Clone succeeded, but checkout failed.
  You can inspect what was checked out with 'git status'
  and retry with 'git restore --source=HEAD :/'
  
The checkout here is _expected_ to fail, due to the protected
filenames. What is troublesome is the following sparse-checkout
commands fail, and the read-tree command fails in a way that
2.24.1 does not:
 
  $ git sparse-checkout init
  error: invalid path 'directory-1/prn/a'
  error: invalid path 'directory-1/prn/a'
  
  $ git sparse-checkout set "/*" "\!/*/"
  error: invalid path 'directory-1/prn/a'
  error: invalid path 'directory-1/prn/a'
  
  $ git read-tree -mu HEAD
  error: invalid path 'directory-1/prn/a'

The double error messages are due to the "try with an in-memory
pattern set, then roll back if there is a failure." The patterns
we are trying to create do not include the directory that is failing.

I'll keep looking into this.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: Sparse Checkout Trouble (2.5.0)
  2020-01-22  1:19 ` Derrick Stolee
@ 2020-01-22  2:06   ` JunkYardMail1
  2020-01-24 15:40   ` Derrick Stolee
  1 sibling, 0 replies; 11+ messages in thread
From: JunkYardMail1 @ 2020-01-22  2:06 UTC (permalink / raw)
  To: 'Derrick Stolee', git

Yes.  Using git for Windows with bash terminal and Git Extensions GUI.

The reason for the sparse checkout pattern is that the repo is very large, approximately a gigabyte, and I only need of few sub-directories that amount to only about 60 megabytes.
The first sparse checkout rule "!/*/" is expected to exclude everything.
The second sparse checkout rule "/*/directory-*/" is expected to include only sub-directories that contain "directory-" as the leading part of a directory element.

Looks like you're able to reproduce the issue and are hot on the trail.  Thanks.


-----Original Message-----
From: Derrick Stolee [mailto:stolee@gmail.com] 
Sent: Tuesday, January 21, 2020 5:19 PM
To: JunkYardMail1 <JunkYardMail1@Frontier.com>; git@vger.kernel.org
Subject: Re: Sparse Checkout Trouble (2.5.0)

On 1/21/2020 7:42 PM, JunkYardMail1 wrote:
> I have a shallow repository clone and using sparse-checkout of just a 
> handful of directories.  When I upgraded from git version 2.24.1 to 
> 2.25.0 some files not in the sparse-checkout were staged to be 
> deleted.  The directory path of these files contain the Windows reserved name of "prn".
> Ex: "japanese/prn/. . ."  Unable to un-stage these files and reset the 
> changes.

Are you using Git for Windows? I'm not sure why the "prn" would be important otherwise.

> !/*/
> /*/directory-*/

I don't see anything complicated with these patterns, although they seem a bit strange. Why would you not want anything in a directory except one named "<something>/directory-<anotherthing>/"? Is this actually the exact pattern set?
 
> Reverting back to the previous version (2.24.1) allowed to un-stage 
> the files and reset the changes.

This is definitely pointing to a regression, and the feature did get a bit of an overhaul. The goal was to not change how existing users interacted with it, though, so this is a bit alarming.

I did find a behavior change related to these paths in Git for Windows
2.25.0 versus 2.24.1:

  $ git clone https://github.com/derrickstolee/git-sparse-checkout-test
  Cloning into 'git-sparse-checkout-test'...
  remote: Enumerating objects: 6, done.
  remote: Counting objects: 100% (6/6), done.
  remote: Compressing objects: 100% (4/4), done.
  remote: Total 6 (delta 0), reused 6 (delta 0), pack-reused 0
  Receiving objects: 100% (6/6), done.
  error: invalid path 'directory-1/prn/a'
  fatal: unable to checkout working tree
  warning: Clone succeeded, but checkout failed.
  You can inspect what was checked out with 'git status'
  and retry with 'git restore --source=HEAD :/'
  
The checkout here is _expected_ to fail, due to the protected filenames. What is troublesome is the following sparse-checkout commands fail, and the read-tree command fails in a way that
2.24.1 does not:
 
  $ git sparse-checkout init
  error: invalid path 'directory-1/prn/a'
  error: invalid path 'directory-1/prn/a'
  
  $ git sparse-checkout set "/*" "\!/*/"
  error: invalid path 'directory-1/prn/a'
  error: invalid path 'directory-1/prn/a'
  
  $ git read-tree -mu HEAD
  error: invalid path 'directory-1/prn/a'

The double error messages are due to the "try with an in-memory pattern set, then roll back if there is a failure." The patterns we are trying to create do not include the directory that is failing.

I'll keep looking into this.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Sparse Checkout Trouble (2.5.0)
  2020-01-22  1:19 ` Derrick Stolee
  2020-01-22  2:06   ` JunkYardMail1
@ 2020-01-24 15:40   ` Derrick Stolee
  2020-01-25 20:59     ` Elijah Newren
  1 sibling, 1 reply; 11+ messages in thread
From: Derrick Stolee @ 2020-01-24 15:40 UTC (permalink / raw)
  To: JunkYardMail1, git, Elijah Newren

On 1/21/2020 8:19 PM, Derrick Stolee wrote:
> On 1/21/2020 7:42 PM, JunkYardMail1 wrote:
>> Reverting back to the previous version (2.24.1) allowed to un-stage the
>> files and reset the changes.
> 
> This is definitely pointing to a regression, and the feature did get a
> bit of an overhaul. The goal was to not change how existing users
> interacted with it, though, so this is a bit alarming.
> 
> I did find a behavior change related to these paths in Git for Windows
> 2.25.0 versus 2.24.1:
> 
>   $ git clone https://github.com/derrickstolee/git-sparse-checkout-test
>   Cloning into 'git-sparse-checkout-test'...
>   remote: Enumerating objects: 6, done.
>   remote: Counting objects: 100% (6/6), done.
>   remote: Compressing objects: 100% (4/4), done.
>   remote: Total 6 (delta 0), reused 6 (delta 0), pack-reused 0
>   Receiving objects: 100% (6/6), done.
>   error: invalid path 'directory-1/prn/a'
>   fatal: unable to checkout working tree
>   warning: Clone succeeded, but checkout failed.
>   You can inspect what was checked out with 'git status'
>   and retry with 'git restore --source=HEAD :/'
>   
> The checkout here is _expected_ to fail, due to the protected
> filenames. What is troublesome is the following sparse-checkout
> commands fail, and the read-tree command fails in a way that
> 2.24.1 does not:
>  
>   $ git sparse-checkout init
>   error: invalid path 'directory-1/prn/a'
>   error: invalid path 'directory-1/prn/a'
>   
>   $ git sparse-checkout set "/*" "\!/*/"
>   error: invalid path 'directory-1/prn/a'
>   error: invalid path 'directory-1/prn/a'
>   
>   $ git read-tree -mu HEAD
>   error: invalid path 'directory-1/prn/a'
> 
> The double error messages are due to the "try with an in-memory
> pattern set, then roll back if there is a failure." The patterns
> we are trying to create do not include the directory that is failing.
> 
> I'll keep looking into this.

I am still struggling to find out exactly what goes wrong here.

I'm CC'ing Elijah because he also made changes to dir.c, and
perhaps he has some idea of what's going on.

Using GIT_TRACE2_PERF=1, I can see that the error appears during
the traverse_trees() which is after the sparse-checkout patterns
have applied the skip-worktree bit. 

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Sparse Checkout Trouble (2.5.0)
  2020-01-24 15:40   ` Derrick Stolee
@ 2020-01-25 20:59     ` Elijah Newren
  2020-01-26 14:09       ` Derrick Stolee
  0 siblings, 1 reply; 11+ messages in thread
From: Elijah Newren @ 2020-01-25 20:59 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: JunkYardMail1, Git Mailing List

On Fri, Jan 24, 2020 at 7:41 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 1/21/2020 8:19 PM, Derrick Stolee wrote:
> > On 1/21/2020 7:42 PM, JunkYardMail1 wrote:
> >> Reverting back to the previous version (2.24.1) allowed to un-stage the
> >> files and reset the changes.
> >
> > This is definitely pointing to a regression, and the feature did get a
> > bit of an overhaul. The goal was to not change how existing users
> > interacted with it, though, so this is a bit alarming.
> >
> > I did find a behavior change related to these paths in Git for Windows
> > 2.25.0 versus 2.24.1:
> >
> >   $ git clone https://github.com/derrickstolee/git-sparse-checkout-test
> >   Cloning into 'git-sparse-checkout-test'...
> >   remote: Enumerating objects: 6, done.
> >   remote: Counting objects: 100% (6/6), done.
> >   remote: Compressing objects: 100% (4/4), done.
> >   remote: Total 6 (delta 0), reused 6 (delta 0), pack-reused 0
> >   Receiving objects: 100% (6/6), done.
> >   error: invalid path 'directory-1/prn/a'
> >   fatal: unable to checkout working tree
> >   warning: Clone succeeded, but checkout failed.
> >   You can inspect what was checked out with 'git status'
> >   and retry with 'git restore --source=HEAD :/'
> >
> > The checkout here is _expected_ to fail, due to the protected
> > filenames. What is troublesome is the following sparse-checkout
> > commands fail, and the read-tree command fails in a way that
> > 2.24.1 does not:
> >
> >   $ git sparse-checkout init
> >   error: invalid path 'directory-1/prn/a'
> >   error: invalid path 'directory-1/prn/a'
> >
> >   $ git sparse-checkout set "/*" "\!/*/"
> >   error: invalid path 'directory-1/prn/a'
> >   error: invalid path 'directory-1/prn/a'
> >
> >   $ git read-tree -mu HEAD
> >   error: invalid path 'directory-1/prn/a'
> >
> > The double error messages are due to the "try with an in-memory
> > pattern set, then roll back if there is a failure." The patterns
> > we are trying to create do not include the directory that is failing.
> >
> > I'll keep looking into this.
>
> I am still struggling to find out exactly what goes wrong here.
>
> I'm CC'ing Elijah because he also made changes to dir.c, and
> perhaps he has some idea of what's going on.

If you think it might be related to the dir.c changes, I can take a
look.  I don't have any immediate ideas off the top of my head.
However, since I'm really suffering with "git read-tree -mu HEAD"
being the mechanism for updating sparsity (for reasons independent of
the issue being discussed here), I've been tempted to dig into that
anyway to write a replacement without the nasty side-effects.  I'll
take a look early next week and see if I can spot anything.

> Using GIT_TRACE2_PERF=1, I can see that the error appears during
> the traverse_trees() which is after the sparse-checkout patterns
> have applied the skip-worktree bit.
>
> Thanks,
> -Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Sparse Checkout Trouble (2.5.0)
  2020-01-25 20:59     ` Elijah Newren
@ 2020-01-26 14:09       ` Derrick Stolee
  2020-01-28  5:21         ` Elijah Newren
  0 siblings, 1 reply; 11+ messages in thread
From: Derrick Stolee @ 2020-01-26 14:09 UTC (permalink / raw)
  To: Elijah Newren; +Cc: JunkYardMail1, Git Mailing List

On 1/25/2020 3:59 PM, Elijah Newren wrote:
> On Fri, Jan 24, 2020 at 7:41 AM Derrick Stolee <stolee@gmail.com> wrote:
>> I'm CC'ing Elijah because he also made changes to dir.c, and
>> perhaps he has some idea of what's going on.
> 
> If you think it might be related to the dir.c changes, I can take a
> look.  I don't have any immediate ideas off the top of my head.

The only thing I can think of is that these paths are already
marked as sparse, but something is requiring us to test if the
path can be created with the filesystem. I'll try to debug
more into exactly where that is. It's telling that this happens
both in cone mode and without.

> However, since I'm really suffering with "git read-tree -mu HEAD"
> being the mechanism for updating sparsity (for reasons independent of
> the issue being discussed here), I've been tempted to dig into that
> anyway to write a replacement without the nasty side-effects.  I'll
> take a look early next week and see if I can spot anything.

If by "nasty side-effects" you mean "overwrites staged changes, even
if in the sparse set before and after" then I would welcome such a
change. Otherwise, it will fall to me, and this is far outside of
my expertise. Of course, it is something I should learn, but I can
learn that during code review, too. ;)

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Sparse Checkout Trouble (2.5.0)
  2020-01-26 14:09       ` Derrick Stolee
@ 2020-01-28  5:21         ` Elijah Newren
  2020-01-28 20:47           ` Derrick Stolee
  0 siblings, 1 reply; 11+ messages in thread
From: Elijah Newren @ 2020-01-28  5:21 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: JunkYardMail1, Git Mailing List

On Sun, Jan 26, 2020 at 6:09 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 1/25/2020 3:59 PM, Elijah Newren wrote:
> > On Fri, Jan 24, 2020 at 7:41 AM Derrick Stolee <stolee@gmail.com> wrote:
> >> I'm CC'ing Elijah because he also made changes to dir.c, and
> >> perhaps he has some idea of what's going on.
> >
> > If you think it might be related to the dir.c changes, I can take a
> > look.  I don't have any immediate ideas off the top of my head.
>
> The only thing I can think of is that these paths are already
> marked as sparse, but something is requiring us to test if the
> path can be created with the filesystem. I'll try to debug
> more into exactly where that is. It's telling that this happens
> both in cone mode and without.

Yeah, I'll take a look.  The exponentially slow 'status --ignored'
report is forcing me to look at dir.c again anyway, though it's also
delaying me from getting a chance to look at this particular report.

> > However, since I'm really suffering with "git read-tree -mu HEAD"
> > being the mechanism for updating sparsity (for reasons independent of
> > the issue being discussed here), I've been tempted to dig into that
> > anyway to write a replacement without the nasty side-effects.  I'll
> > take a look early next week and see if I can spot anything.
>
> If by "nasty side-effects" you mean "overwrites staged changes, even
> if in the sparse set before and after" then I would welcome such a
> change. Otherwise, it will fall to me, and this is far outside of
> my expertise. Of course, it is something I should learn, but I can
> learn that during code review, too. ;)

Yep, that's exactly what I mean.  It'd be nice if Duy were still
around to bug, but I've touched unpack-trees a few times so I might be
able to find my way around.  I'll try to take a stab at it.

> Thanks,
> -Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Sparse Checkout Trouble (2.5.0)
  2020-01-28  5:21         ` Elijah Newren
@ 2020-01-28 20:47           ` Derrick Stolee
  2020-01-29 15:30             ` Johannes Schindelin
  0 siblings, 1 reply; 11+ messages in thread
From: Derrick Stolee @ 2020-01-28 20:47 UTC (permalink / raw)
  To: Elijah Newren, Johannes Schindelin; +Cc: JunkYardMail1, Git Mailing List

On 1/28/2020 12:21 AM, Elijah Newren wrote:
> On Sun, Jan 26, 2020 at 6:09 AM Derrick Stolee <stolee@gmail.com> wrote:
>>
>> On 1/25/2020 3:59 PM, Elijah Newren wrote:
>>> On Fri, Jan 24, 2020 at 7:41 AM Derrick Stolee <stolee@gmail.com> wrote:
>>>> I'm CC'ing Elijah because he also made changes to dir.c, and
>>>> perhaps he has some idea of what's going on.
>>>
>>> If you think it might be related to the dir.c changes, I can take a
>>> look.  I don't have any immediate ideas off the top of my head.
>>
>> The only thing I can think of is that these paths are already
>> marked as sparse, but something is requiring us to test if the
>> path can be created with the filesystem. I'll try to debug
>> more into exactly where that is. It's telling that this happens
>> both in cone mode and without.
> 
> Yeah, I'll take a look.  The exponentially slow 'status --ignored'
> report is forcing me to look at dir.c again anyway, though it's also
> delaying me from getting a chance to look at this particular report.

I made some progress, at least, in root-causing the issue.
The problem bisects down to 4dc42c6c1 (mingw: refuse paths
containing reserved names, 2019-12-21) [1]. CC'ing Dscho.

That commit updates is_valid_win32_path() to fail on these
paths. We were _already_ calling this method even for paths
outside the sparse cone, but the method didn't fail for these
examples.

This means the fix is probably even more complicated: we need
to not call this method when traversing paths that have the
skip-worktree bit enabled. This may lead to some tiny
performance gains when hydrating a very small fraction of a
very large index.

Thanks,
-Stolee

[1] http://github.com/git/git/commit/4dc42c6c1867a52e22f1f04a1a247b5a7538b8af

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Sparse Checkout Trouble (2.5.0)
  2020-01-28 20:47           ` Derrick Stolee
@ 2020-01-29 15:30             ` Johannes Schindelin
  2020-01-29 15:40               ` Derrick Stolee
  0 siblings, 1 reply; 11+ messages in thread
From: Johannes Schindelin @ 2020-01-29 15:30 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren, JunkYardMail1, Git Mailing List

Hi,

On Tue, 28 Jan 2020, Derrick Stolee wrote:

> On 1/28/2020 12:21 AM, Elijah Newren wrote:
> > On Sun, Jan 26, 2020 at 6:09 AM Derrick Stolee <stolee@gmail.com> wrote:
> >>
> >> On 1/25/2020 3:59 PM, Elijah Newren wrote:
> >>> On Fri, Jan 24, 2020 at 7:41 AM Derrick Stolee <stolee@gmail.com> wrote:
> >>>> I'm CC'ing Elijah because he also made changes to dir.c, and
> >>>> perhaps he has some idea of what's going on.
> >>>
> >>> If you think it might be related to the dir.c changes, I can take a
> >>> look.  I don't have any immediate ideas off the top of my head.
> >>
> >> The only thing I can think of is that these paths are already
> >> marked as sparse, but something is requiring us to test if the
> >> path can be created with the filesystem. I'll try to debug
> >> more into exactly where that is. It's telling that this happens
> >> both in cone mode and without.
> >
> > Yeah, I'll take a look.  The exponentially slow 'status --ignored'
> > report is forcing me to look at dir.c again anyway, though it's also
> > delaying me from getting a chance to look at this particular report.
>
> I made some progress, at least, in root-causing the issue.
> The problem bisects down to 4dc42c6c1 (mingw: refuse paths
> containing reserved names, 2019-12-21) [1]. CC'ing Dscho.
>
> That commit updates is_valid_win32_path() to fail on these
> paths. We were _already_ calling this method even for paths
> outside the sparse cone, but the method didn't fail for these
> examples.
>
> This means the fix is probably even more complicated: we need
> to not call this method when traversing paths that have the
> skip-worktree bit enabled. This may lead to some tiny
> performance gains when hydrating a very small fraction of a
> very large index.

Hmm. I am actually not sure that this would be a fix. It is all too easy
for a skip-worktree entry to become checked out (think e.g. a merge
conflict in a cherry-pick, during a three-way merge of a file that is not
in the cone but still needs to be handled).

My preferred solution for this issue would be for the files/directories to
be renamed using `git -c core.protectntfs=false mv <reserved-name>
<non-reserved-name>`.

I think if we try to play extra games with the skip-worktree bit, we risk
opening a vulnerability again.

Ciao,
Dscho

>
> Thanks,
> -Stolee
>
> [1] http://github.com/git/git/commit/4dc42c6c1867a52e22f1f04a1a247b5a7538b8af
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Sparse Checkout Trouble (2.5.0)
  2020-01-29 15:30             ` Johannes Schindelin
@ 2020-01-29 15:40               ` Derrick Stolee
  2020-01-29 16:31                 ` Derrick Stolee
  0 siblings, 1 reply; 11+ messages in thread
From: Derrick Stolee @ 2020-01-29 15:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Elijah Newren, JunkYardMail1, Git Mailing List

On 1/29/2020 10:30 AM, Johannes Schindelin wrote:
> Hi,
> 
> On Tue, 28 Jan 2020, Derrick Stolee wrote:
>> I made some progress, at least, in root-causing the issue.
>> The problem bisects down to 4dc42c6c1 (mingw: refuse paths
>> containing reserved names, 2019-12-21) [1]. CC'ing Dscho.
>>
>> That commit updates is_valid_win32_path() to fail on these
>> paths. We were _already_ calling this method even for paths
>> outside the sparse cone, but the method didn't fail for these
>> examples.
>>
>> This means the fix is probably even more complicated: we need
>> to not call this method when traversing paths that have the
>> skip-worktree bit enabled. This may lead to some tiny
>> performance gains when hydrating a very small fraction of a
>> very large index.
> 
> Hmm. I am actually not sure that this would be a fix. It is all too easy
> for a skip-worktree entry to become checked out (think e.g. a merge
> conflict in a cherry-pick, during a three-way merge of a file that is not
> in the cone but still needs to be handled).

This is a very good point, and something worth investigating.

> My preferred solution for this issue would be for the files/directories to
> be renamed using `git -c core.protectntfs=false mv <reserved-name>
> <non-reserved-name>`.

One thing that I realized after root-causing the issue is that now the
Linux kernel repository cannot be checked out _at all_ on Windows due to
the existence of an aux.c file. Git complains that the path is invalid
and does not write a single file to the working directory.

At least we _could_ allow someone to create most of the working directory
(as we did before) by allowing invalid paths outside the sparse cone.
 
> I think if we try to play extra games with the skip-worktree bit, we risk
> opening a vulnerability again.

I agree that we need to be _very_ careful with this.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Sparse Checkout Trouble (2.5.0)
  2020-01-29 15:40               ` Derrick Stolee
@ 2020-01-29 16:31                 ` Derrick Stolee
  0 siblings, 0 replies; 11+ messages in thread
From: Derrick Stolee @ 2020-01-29 16:31 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Elijah Newren, JunkYardMail1, Git Mailing List

On 1/29/2020 10:40 AM, Derrick Stolee wrote:
> On 1/29/2020 10:30 AM, Johannes Schindelin wrote:
>> Hi,
>>
>> On Tue, 28 Jan 2020, Derrick Stolee wrote:
>>> I made some progress, at least, in root-causing the issue.
>>> The problem bisects down to 4dc42c6c1 (mingw: refuse paths
>>> containing reserved names, 2019-12-21) [1]. CC'ing Dscho.
>>>
>>> That commit updates is_valid_win32_path() to fail on these
>>> paths. We were _already_ calling this method even for paths
>>> outside the sparse cone, but the method didn't fail for these
>>> examples.
>>>
>>> This means the fix is probably even more complicated: we need
>>> to not call this method when traversing paths that have the
>>> skip-worktree bit enabled. This may lead to some tiny
>>> performance gains when hydrating a very small fraction of a
>>> very large index.

After digging into this more, at the place where we call
verify_path() and create this warning, all cache entries have
the same ce_flags  (0x0209_0000):

* 0x0200_0000: CE_NEW_SKIP_WORKTREE
* 0x0008_0000: CE_ADDED
* 0x0001_0000: CE_UPDATE

So, at this point there is no way to differentiate between
paths that will be written to disk or paths that will not.
That decision appears to be made at another time, and connecting
these decisions at a distance is ripe for unintended behavior.

>> My preferred solution for this issue would be for the files/directories to
>> be renamed using `git -c core.protectntfs=false mv <reserved-name>
>> <non-reserved-name>`.
> 
> One thing that I realized after root-causing the issue is that now the
> Linux kernel repository cannot be checked out _at all_ on Windows due to
> the existence of an aux.c file. Git complains that the path is invalid
> and does not write a single file to the working directory.
>
> At least we _could_ allow someone to create most of the working directory
> (as we did before) by allowing invalid paths outside the sparse cone.

Of course, as you say, setting core.protectntfs=false allows the Linux
kernel to be checked out as before, but without the safety valve.
  
>> I think if we try to play extra games with the skip-worktree bit, we risk
>> opening a vulnerability again.
> 
> I agree that we need to be _very_ careful with this.

After my investigation, and your workaround, I'm content to settle
this situation "as designed" and to leave it at that.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-01-29 16:31 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-22  0:42 Sparse Checkout Trouble (2.5.0) JunkYardMail1
2020-01-22  1:19 ` Derrick Stolee
2020-01-22  2:06   ` JunkYardMail1
2020-01-24 15:40   ` Derrick Stolee
2020-01-25 20:59     ` Elijah Newren
2020-01-26 14:09       ` Derrick Stolee
2020-01-28  5:21         ` Elijah Newren
2020-01-28 20:47           ` Derrick Stolee
2020-01-29 15:30             ` Johannes Schindelin
2020-01-29 15:40               ` Derrick Stolee
2020-01-29 16:31                 ` Derrick Stolee

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).