git@vger.kernel.org mailing list mirror (one of many)
* Why is "Sparse checkout leaves no entry on working directory" a fatal error?
From: Josef Wolf @ 2019-10-08  6:45 UTC
  To: git

Hello,

This is a repost, since the original message seems to have been lost somehow.


I am trying to add a file to an arbitrary branch with as little overhead as
possible. This should work no matter which state the current worktree is in,
and it should not touch the current worktree in any way.

For this, the sparse-checkout feature in conjunction with the "shared
repository" feature seems to be perfect.

The basic idea goes like this:


   TMP=`mktemp -d /var/tmp/test-XXXXXXXXX`
   GD=$TMP/git
   WD=$TMP/wd
   
   git --work-tree $WD --git-dir $GD clone -qns -n . $GD
   git --work-tree $WD --git-dir $GD config core.sparsecheckout true
   echo path/of/file/which/I/want/to/create >>$GD/info/sparse-checkout
   
   git --work-tree $WD --git-dir $GD checkout -b some-branch remotes/origin/some-branch  # !!!
   
   ( cd $WD
     mkdir -p path/of/file/which/I/want/to
     echo huhuh >path/of/file/which/I/want/to/create
     git --work-tree $WD --git-dir $GD add path/of/file/which/I/want/to/create
     git --work-tree $WD --git-dir $GD commit
     git --work-tree $WD --git-dir $GD push
   )
   
   rm -rf $TMP


Unfortunately, the marked command errors out with

   "error: Sparse checkout leaves no entry on working directory"

and won't create/switch to the branch that is to be modified.

Why is this an error? Since there are no matching files, an empty worktree
is EXACTLY what I wanted. Why will the "git checkout -b" command error out?


Strangely enough, I have some repositories on this machine where the
.git/info/sparse-checkout file contains only non-existing files and git
happily executes this "git checkout -b XXX remotes/origin/XXX" command, leaving
the working tree totally empty all the time.

Does anyone understand this inconsistent behaviour?

Thanks,

-- 
Josef Wolf
jw@raven.inka.de


* Re: Why is "Sparse checkout leaves no entry on working directory" a fatal error?
From: Elijah Newren @ 2019-10-08 16:14 UTC
  To: Josef Wolf; +Cc: Git Mailing List

On Mon, Oct 7, 2019 at 11:52 PM Josef Wolf <jw@raven.inka.de> wrote:
>
> Hello,
>
> This is a repost, since the original message seems to have been lost somehow.
>
>
> I am trying to add a file to an arbitrary branch without touching the current
> worktree with as little overhead as possible. This should work no matter in
> which state the current worktree is in. And it should not touch the current WT
> in any way.
>
> For this, the sparse-checkout feature in conjunction with the "shared
> repository" feature seems to be perfect.

I can see the logical progression that a sparse worktree would be less
overhead than a full worktree, and that a bare worktree would be even
better.  But you're still dealing with unnecessary overhead; you don't
need a worktree at all to achieve what you want.

Traditionally, if you wanted to modify another branch without touching
the worktree at all, you would use a combination of hash-object,
mktree, commit-tree, and update-ref.  That would be a better solution
to your problem than trying to approximate it with a sparse checkout.
However, that's at least four invocations of git, and you said as
little overhead as possible, so I'd recommend you use fast-import.
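
As a rough illustration of the plumbing route, here is a minimal sketch
that adds a file to a branch which is never checked out. It is not the
exact command sequence named above: instead of mktree it uses a temporary
index plus write-tree, because mktree builds a single tree level per call
and the path here is nested. The scratch repository, the identity, and
the branch and file names are placeholders chosen for the example.

```shell
#!/bin/sh
set -eu

# Scratch repository standing in for the real one (all names are hypothetical).
REPO=$(mktemp -d)
git init -q "$REPO"
cd "$REPO"
git config user.name "Example User"
git config user.email "user@example.invalid"
git commit -q --allow-empty -m "initial commit"
git branch some-branch            # branch to modify; it is never checked out

BRANCH=refs/heads/some-branch
FILE=path/of/file/which/I/want/to/create

# 1. Store the new file content as a blob in the object database.
BLOB=$(echo huhuh | git hash-object -w --stdin)

# 2. Build the new tree in a throwaway index so the real index stays untouched.
#    The index file must not exist yet; read-tree creates it.
GIT_INDEX_FILE="$REPO/.git/scratch-index"
export GIT_INDEX_FILE
git read-tree "$BRANCH"
git update-index --add --cacheinfo 100644,"$BLOB","$FILE"
TREE=$(git write-tree)
unset GIT_INDEX_FILE
rm -f "$REPO/.git/scratch-index"

# 3. Create a commit on top of the current branch tip.
PARENT=$(git rev-parse "$BRANCH")
COMMIT=$(git commit-tree "$TREE" -p "$PARENT" -m "Add $FILE")

# 4. Move the branch, but only if it still points at the parent we read.
git update-ref "$BRANCH" "$COMMIT" "$PARENT"
```

Passing the old value to update-ref makes the last step fail rather than
silently discard a commit that someone else added to the branch in the
meantime.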

But, since you asked some other questions about sparse checkouts...

> The basic idea goes like this:
>
>
>    TMP=`mktemp -d /var/tmp/test-XXXXXXXXX`
>    GD=$TMP/git
>    WD=$TMP/wd
>
>    git --work-tree $WD --git-dir $GD clone -qns -n . $GD
>    git --work-tree $WD --git-dir $GD config core.sparsecheckout true
>    echo path/of/file/which/I/want/to/create >>$GD/info/sparse-checkout
>
>    git --work-tree $WD --git-dir $GD checkout -b some-branch remotes/origin/some-branch  # !!!
>
>    ( cd $WD
>      mkdir -p path/of/file/which/I/want/to
>      echo huhuh >path/of/file/which/I/want/to/create
>      git --work-tree $WD --git-dir $GD add path/of/file/which/I/want/to/create
>      git --work-tree $WD --git-dir $GD commit
>      git --work-tree $WD --git-dir $GD push
>    )
>
>    rm -rf $TMP
>
>
> Unfortunately, the marked command errors out with
>
>    "error: Sparse checkout leaves no entry on working directory"
>
> and won't create/switch to the branch that is to be modified.
>
> Why is this an error? Since there are no matching files, an empty worktree
> is EXACTLY what I wanted. Why will the "git checkout -b" command error out?

It is very easy to mess up the sparse specifications.  We can't check
for all errors, but a pretty obvious one is when people specify
restrictions that match no path.  We can at least give an error in
that case.  There are times when folks might intentionally specify
paths that don't match anything, but they are quite rare.  The ones I
can think of:

1) When they are doing something exotic where they are just trying to
approximate something else rather than actually using sparse checkouts
as intended.
2) When they've learned about sparse checkouts and just want to test
what things are like in extreme situations.

Case 1 consists of stuff like what you are doing here, for which there
are better solutions, or when I was attempting to simulate the
performance issues Microsoft folks were having with a really large
repo, knowing they used sparse checkouts as part of VFS-for-git (I
created a very large index and had no entries checked out at first,
but then ran into these errors, and added one file to the index and
had a sparse specification match it.)

For case 2, people learn that an empty working tree is too extreme a
situation, one that we'll throw an error for, and so they adjust and
make sure to match at least one path.

> Strange enough, I have some repositories at this machine where the
> .git/info/sparse-checkout file contains only non-existing files and git
> happily executes this "git checkout -b XXX remotes/origin/XXX" command leaving
> the working tree totally empty all the time.

I can't reproduce:

$ git config core.sparseCheckout true
$ echo 'non-existent' > .git/info/sparse-checkout
$ git checkout -b next origin/next
error: Sparse checkout leaves no entry on working directory

Can you provide any more details about how you get into this state?

> Someone understands this inconsistent behaviour?

No, but I wouldn't be surprised if there are bugs and edge cases.  I
think I ran into one or two when testing things out, didn't take good
enough notes, and had trouble reproducing later.  The sparse checkout
stuff has been under-tested and not well documented, something Stolee
is trying to fix right now.


* Re: Why is "Sparse checkout leaves no entry on working directory" a fatal error?
From: Josef Wolf @ 2019-10-09  9:37 UTC
  To: Git Mailing List

Thanks for your comprehensive answer, Elijah!

On Tue, Oct 08, 2019 at 09:14:27 -0700, Elijah Newren wrote:
> On Mon, Oct 7, 2019 at 11:52 PM Josef Wolf <jw@raven.inka.de> wrote:
> >
> > I am trying to add a file to an arbitrary branch without touching the current
> > worktree with as little overhead as possible.
>
> I can see the logical progression that a sparse worktree would be less
> overhead than a full worktree, and that a bare worktree would be even
> better.  But you're still dealing with unnecessary overhead; you don't
> need a worktree at all to achieve what you want.

Well, the "as little overhead as possible" must be seen in context. This
is a repository of roughly 10GB with more than 6200 files. Shared clones
with a sparse worktree are a BIG BIG BIG improvement here, reducing
operations from "minutes" to "within a second".

> Traditionally, if you wanted to modify another branch without touching
> the worktree at all, you would use a combination of hash-object,
> mktree, commit-tree, and update-ref.  That would be a better solution
> to your problem than trying to approximate it with a sparse checkout.
> However, that's at least four invocations of git, and you said as
> little overhead as possible, so I'd recommend you use fast-import.

I have taken a look at the commands you are recommending, and indeed, they
seem to be better suited. Especially fast-import looks very
promising. Unfortunately, those commands require intimate knowledge of git
internals. I'll take a closer look at this!
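
For reference, a fast-import stream for this use case can be quite short.
The following is only a minimal sketch under stated assumptions: the
scratch repository, the committer identity, and the branch and file names
are placeholders, and the "from ...^0" line follows the form the
git-fast-import documentation gives for continuing an existing branch.

```shell
#!/bin/sh
set -eu

# Scratch repository standing in for the real one (all names are hypothetical).
REPO=$(mktemp -d)
git init -q "$REPO"
cd "$REPO"
git config user.name "Example User"
git config user.email "user@example.invalid"
git commit -q --allow-empty -m "initial commit"
git branch some-branch            # branch to modify; it is never checked out

NOW=$(date +%s)                   # committer timestamp, seconds since the epoch

# One git invocation: a single commit on top of the existing branch tip,
# adding one file whose content is given inline in the stream.
# The inner "<<MSG" and "<<CONTENT" markers are fast-import syntax, not
# shell here-documents; the shell passes them through as plain text.
git fast-import --quiet <<EOF
commit refs/heads/some-branch
committer Example User <user@example.invalid> $NOW +0000
data <<MSG
Add a file without a worktree
MSG
from refs/heads/some-branch^0
M 100644 inline path/of/file/which/I/want/to/create
data <<CONTENT
huhuh
CONTENT
EOF
```

No index or working tree is involved: the stream names the branch, its
parent, and the file content, and fast-import updates the ref when the
stream ends.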

> It is very easy to mess up the sparse specifications.  We can't check
> for all errors, but a pretty obvious one is when people specify
> restrictions that match no path.

But why error out only on a completely empty tree? Why not require that
_every_ line in .git/info/sparse-checkout match at least one file?
That would make no sense, right?

> We can at least give an error in that case.

Why must this be a fatal error? Wouldn't a warning suffice?

> 2) When they've learned about sparse checkouts and just want to test
> what things are like in extreme situations.
[ ... ]
> For case 2, people learn that an empty working tree is a too extreme
> situation that we'll throw an error at and so they adjust and make
> sure to match at least one path.

When I am trying to learn how a new feature works, I tend to double-check the
results. If I expect contents but end up with an empty WT, I'd go and
double-check the specifications I've given anyway.

I can easily understand that a warning might be desirable. But erroring out
and failing to honor the "-b" flag is a bit too drastic, IMHO.

> > Strange enough, I have some repositories at this machine where the
> > .git/info/sparse-checkout file contains only non-existing files and git
> > happily executes this "git checkout -b XXX remotes/origin/XXX" command leaving
> > the working tree totally empty all the time.
> 
> I can't reproduce:
> 
> $ git config core.sparseCheckout true
> $ echo 'non-existent' > .git/info/sparse-checkout
> $ git checkout -b next origin/next
> error: Sparse checkout leaves no entry on working directory
> 
> Can you provide any more details about how you get into this state?

Unfortunately not.

Honestly, I have tried to reproduce it for several days, since I was trying
to find a way to work around that fatal error. Unfortunately, I could not
find out how to reproduce it. The only thing I can say is: there are several
clones on my disk which happily switch branches with an empty WT and without
any complaints.

> > Someone understands this inconsistent behaviour?
> 
> No, but I wouldn't be surprised if there are bugs and edge cases.  I
> think I ran into one or two when testing things out, didn't take good
> enough notes, and had trouble reproducing later.  The sparse checkout
> stuff has been under-tested and not well documented, something Stolee
> is trying to fix right now.

Yes, I've seen the work on the ML. But I am only a user of git and have a very
hard time understanding what is going on there.

-- 
Josef Wolf
jw@raven.inka.de
