git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Error when "git mv" file in a sparsed checkout
@ 2023-11-07 13:03 Josef Wolf
  2023-11-08  2:21 ` Elijah Newren
  0 siblings, 1 reply; 4+ messages in thread
From: Josef Wolf @ 2023-11-07 13:03 UTC (permalink / raw
  To: git

Hello,

I have used the procedure described below for many years. In fact,
this procedure is part of a script which I am using for about 10 years.
This procedure was definitely working with git-2-25-1 and git-2.26.2.

Now, with git-2.34.1 (on a freshly installed ubuntu-22.04), this
procedure fails.

Here is what I do:

I want to rename a file on a branch which is not currently checked out
without messing/touching my current working directory.

For this, I first create a clone of the repo with shared git-directory:

  $ SANDBOX=/var/tmp/manage-scans-X1pKZQiey
  $ WT=$SANDBOX/wt
  $ GIT=$SANDBOX/git

  $ mkdir -p $SANDBOX
  $ git --work-tree $WT --git-dir $GIT clone -qns -n ~/upstream-repo $GIT

Then, I do a sparse checkout in this clone, containing only the file
that is to be renamed:

  $ cd $WT
  $ echo 'path/to/old-filename' >>$GIT/info/sparse-checkout
  $ git --work-tree $WT --git-dir $GIT config core.sparsecheckout true
  $ git --work-tree $WT --git-dir $GIT checkout -b the-branch remotes/origin/the-branch
  Switched to a new branch 'the-branch'

Next step would be to "git mv" the file:

  $ mkdir -p /path/to  # already exists, but should do no harm
  $ git --work-tree $WT --git-dir $GIT mv path/to/old-filename path/to/new-filename
  The following paths and/or pathspecs matched paths that exist
  outside of your sparse-checkout definition, so will not be
  updated in the index:
  path/to/new-filename
  hint: If you intend to update such entries, try one of the following:
  hint: * Use the --sparse option.
  hint: * Disable or modify the sparsity rules.
  hint: Disable this message with "git config advice.updateSparsePath false"

This error is something I have not expected.

Error message suggests, there already exists a file named "new-filename". This
is not true at all. There is no file named "new-filename" in the entire
repository. Not in any directory of any branch.

-- 
Josef Wolf
jw@raven.inka.de


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Error when "git mv" file in a sparsed checkout
  2023-11-07 13:03 Error when "git mv" file in a sparsed checkout Josef Wolf
@ 2023-11-08  2:21 ` Elijah Newren
  2023-11-08 11:36   ` Josef Wolf
  0 siblings, 1 reply; 4+ messages in thread
From: Elijah Newren @ 2023-11-08  2:21 UTC (permalink / raw
  To: Josef Wolf, git

Hi,

On Tue, Nov 7, 2023 at 5:32 AM Josef Wolf <jw@raven.inka.de> wrote:
>
> Hello,
>
> I have used the procedure described below for many years. In fact,
> this procedure is part of a script which I am using for about 10 years.
> This procedure was definitely working with git-2-25-1 and git-2.26.2.
>
> Now, with git-2.34.1 (on a freshly installed ubuntu-22.04), this
> procedure fails.
>
> Here is what I do:
>
> I want to rename a file on a branch which is not currently checked out
> without messing/touching my current working directory.
>
> For this, I first create a clone of the repo with shared git-directory:
>
>   $ SANDBOX=/var/tmp/manage-scans-X1pKZQiey
>   $ WT=$SANDBOX/wt
>   $ GIT=$SANDBOX/git
>
>   $ mkdir -p $SANDBOX
>   $ git --work-tree $WT --git-dir $GIT clone -qns -n ~/upstream-repo $GIT
>
> Then, I do a sparse checkout in this clone, containing only the file
> that is to be renamed:
>
>   $ cd $WT
>   $ echo 'path/to/old-filename' >>$GIT/info/sparse-checkout
>   $ git --work-tree $WT --git-dir $GIT config core.sparsecheckout true
>   $ git --work-tree $WT --git-dir $GIT checkout -b the-branch remotes/origin/the-branch
>   Switched to a new branch 'the-branch'
>
> Next step would be to "git mv" the file:
>
>   $ mkdir -p /path/to  # already exists, but should do no harm
>   $ git --work-tree $WT --git-dir $GIT mv path/to/old-filename path/to/new-filename

sparse checkouts are designed such that only files matching the
patterns in the sparse-checkout file should be present in the working
tree, so renaming to a path that should not be present is problematic.
We could possibly have "git-mv" immediately remove the path from the
working tree (while leaving the new pathname in the index), but that's
problematic in that users often overlook the index and only look at
the working tree and might think the file was deleted instead of
renamed.  Not immediately removing it is potentially even worse,
because any subsequent operation (particularly ones like checkout,
reset, merge, rebase, etc.) are likely to nuke the file from the
working tree and the fact that the removal is delayed makes it much
harder for users to understand and diagnose.

So, Stolee fixed this to make it throw an error; see
https://lore.kernel.org/git/pull.1018.v4.git.1632497954.gitgitgadget@gmail.com/
for details.  His description did focus on cone mode, but you'll note
that none of my explanation here did.  The logic for making this an
error fully applies to non-cone mode for all the same reasons.

If you want to interact with `path/to/new-filename` as a path within
your sparse checkout (as suggested by your git-mv command), then that
path should actually be part of your sparse checkout.  In other words,
you should add `path/to/new-filename` to $GIT/info/sparse-checkout and
do so _before_ attempting your `git mv` command.  If you don't like
that for some reason, you are allowed to instead ignore the
problematic consequences of renaming outside the sparse-checkout by
providing the `--sparse` flag.  Both of these possibilities are
documented in the hints provided along with the error message you
showed below:

>   The following paths and/or pathspecs matched paths that exist
>   outside of your sparse-checkout definition, so will not be
>   updated in the index:
>   path/to/new-filename
>   hint: If you intend to update such entries, try one of the following:
>   hint: * Use the --sparse option.
>   hint: * Disable or modify the sparsity rules.
>   hint: Disable this message with "git config advice.updateSparsePath false"
>
> This error is something I have not expected.
>
> Error message suggests, there already exists a file named "new-filename". This
> is not true at all. There is no file named "new-filename" in the entire
> repository. Not in any directory of any branch.

You are correct; the wording of the error message here is suboptimal
and seems to have been focused more on the git-add case (the error
message is shared by git-add, git-mv, and git-rm).  Thanks for
pointing it out!  We could improve that wording, perhaps with
something like:

    The following paths and/or pathspecs match paths that are
    outside of your sparse-checkout definition, so will not be
    updated:

Which is still slightly slanted towards git-add and git-rm cases, but
I hope it works better than the current message.  Thoughts?


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Error when "git mv" file in a sparsed checkout
  2023-11-08  2:21 ` Elijah Newren
@ 2023-11-08 11:36   ` Josef Wolf
  2023-11-10 20:11     ` Elijah Newren
  0 siblings, 1 reply; 4+ messages in thread
From: Josef Wolf @ 2023-11-08 11:36 UTC (permalink / raw
  To: git

Thanks for the reply, Elijah!

On Tue, Nov 07, 2023 at 06:21:00PM -0800, Elijah Newren wrote:
> On Tue, Nov 7, 2023 at 5:32 AM Josef Wolf <jw@raven.inka.de> wrote:
> > I have used the procedure described below for many years. In fact,
> > this procedure is part of a script which I am using for about 10 years.
> > This procedure was definitely working with git-2-25-1 and git-2.26.2.
> >
> > Now, with git-2.34.1 (on a freshly installed ubuntu-22.04), this
> > procedure fails.
> >
> > Here is what I do:
> >
> > I want to rename a file on a branch which is not currently checked out
> > without messing/touching my current working directory.
> >
> > For this, I first create a clone of the repo with shared git-directory:
> >
> >   $ SANDBOX=/var/tmp/manage-scans-X1pKZQiey
> >   $ WT=$SANDBOX/wt
> >   $ GIT=$SANDBOX/git
> >
> >   $ mkdir -p $SANDBOX
> >   $ git --work-tree $WT --git-dir $GIT clone -qns -n ~/upstream-repo $GIT
> >
> > Then, I do a sparse checkout in this clone, containing only the file
> > that is to be renamed:
> >
> >   $ cd $WT
> >   $ echo 'path/to/old-filename' >>$GIT/info/sparse-checkout
> >   $ git --work-tree $WT --git-dir $GIT config core.sparsecheckout true
> >   $ git --work-tree $WT --git-dir $GIT checkout -b the-branch remotes/origin/the-branch
> >   Switched to a new branch 'the-branch'
> >
> > Next step would be to "git mv" the file:
> >
> >   $ mkdir -p /path/to  # already exists, but should do no harm
> >   $ git --work-tree $WT --git-dir $GIT mv path/to/old-filename path/to/new-filename
> 
> sparse checkouts are designed such that only files matching the
> patterns in the sparse-checkout file should be present in the working
> tree, so renaming to a path that should not be present is problematic.
> We could possibly have "git-mv" immediately remove the path from the
> working tree (while leaving the new pathname in the index), but that's
> problematic in that users often overlook the index and only look at
> the working tree and might think the file was deleted instead of
> renamed.  Not immediately removing it is potentially even worse,
> because any subsequent operation (particularly ones like checkout,
> reset, merge, rebase, etc.) are likely to nuke the file from the
> working tree and the fact that the removal is delayed makes it much
> harder for users to understand and diagnose.
> 
> So, Stolee fixed this to make it throw an error; see
> https://lore.kernel.org/git/pull.1018.v4.git.1632497954.gitgitgadget@gmail.com/
> for details.  His description did focus on cone mode, but you'll note
> that none of my explanation here did.  The logic for making this an
> error fully applies to non-cone mode for all the same reasons.
> 
> If you want to interact with `path/to/new-filename` as a path within
> your sparse checkout (as suggested by your git-mv command), then that
> path should actually be part of your sparse checkout.  In other words,
> you should add `path/to/new-filename` to $GIT/info/sparse-checkout and
> do so _before_ attempting your `git mv` command.  If you don't like
> that for some reason, you are allowed to instead ignore the
> problematic consequences of renaming outside the sparse-checkout by
> providing the `--sparse` flag.  Both of these possibilities are
> documented in the hints provided along with the error message you
> showed below:
> 
> >   The following paths and/or pathspecs matched paths that exist
> >   outside of your sparse-checkout definition, so will not be
> >   updated in the index:
> >   path/to/new-filename
> >   hint: If you intend to update such entries, try one of the following:
> >   hint: * Use the --sparse option.
> >   hint: * Disable or modify the sparsity rules.
> >   hint: Disable this message with "git config advice.updateSparsePath false"
> >
> > This error is something I have not expected.
> >
> > Error message suggests, there already exists a file named "new-filename". This
> > is not true at all. There is no file named "new-filename" in the entire
> > repository. Not in any directory of any branch.
> 
> You are correct; the wording of the error message here is suboptimal
> and seems to have been focused more on the git-add case (the error
> message is shared by git-add, git-mv, and git-rm).  Thanks for
> pointing it out!  We could improve that wording, perhaps with
> something like:
> 
>     The following paths and/or pathspecs match paths that are
>     outside of your sparse-checkout definition, so will not be
>     updated:
> 
> Which is still slightly slanted towards git-add and git-rm cases, but
> I hope it works better than the current message.  Thoughts?

Yes, the wording was pretty much confusing me, since i could not find a file
named "new-file" anywhere in the repo.


There are more things confusing concerning sparse mode:

- It is not clear from git-sparse-checkout(1) when changes to
  $GIT_DIR/info/sparse-checkout are catched up. In my case: would it be enough
  to add the new pathname just before git-mv or would a fresh git-checkout be
  needed after modifying $GIT_DIR/info/sparse-checkout? You have clarified
  this in your response, but shouldn't this be clear from the manpage?

- git-sparse-checkout(1) refers to "skip-worktree bit". This concept is
  potentially not very familiar to the average git user which uses mostly
  porcelain. Thus, edge cases remain to be unclear.

- The pathspecs refers to .gitignore (which by itself is not very clear). But
  there are differences:
  1. giignore is relative to containing directoy, which don't seem to make
     much sense for sparse mode
  2. sparse specs are the opposite of gitignore, which seems to have different
     meaning in some edge-cases.

- For cone, it is not clear how the two "accepted patterns" look like what the
  semantics are. I understand that specifying a directory adds siblings
  recursively. But what does the "Parent" mode mean exactly and when/how is
  this recognized? I guess, this is just a mis-namer? IMHO, parent of /a/b/c would
  be /a/b and not /a/b/c/* (as git-sparse-chekout(1) suggests).

I guess all this is very clear to the core-developers. But for the occasional
user like myself, all this is pretty much confusing.


-- 
Josef Wolf
jw@raven.inka.de


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Error when "git mv" file in a sparsed checkout
  2023-11-08 11:36   ` Josef Wolf
@ 2023-11-10 20:11     ` Elijah Newren
  0 siblings, 0 replies; 4+ messages in thread
From: Elijah Newren @ 2023-11-10 20:11 UTC (permalink / raw
  To: Josef Wolf, git

Hi,

On Wed, Nov 8, 2023 at 3:38 AM Josef Wolf <jw@raven.inka.de> wrote:
>
> Thanks for the reply, Elijah!
>
[...]
> > > Error message suggests, there already exists a file named "new-filename". This
> > > is not true at all. There is no file named "new-filename" in the entire
> > > repository. Not in any directory of any branch.
> >
> > You are correct; the wording of the error message here is suboptimal
> > and seems to have been focused more on the git-add case (the error
> > message is shared by git-add, git-mv, and git-rm).  Thanks for
> > pointing it out!  We could improve that wording, perhaps with
> > something like:
> >
> >     The following paths and/or pathspecs match paths that are
> >     outside of your sparse-checkout definition, so will not be
> >     updated:
> >
> > Which is still slightly slanted towards git-add and git-rm cases, but
> > I hope it works better than the current message.  Thoughts?
>
> Yes, the wording was pretty much confusing me, since i could not find a file
> named "new-file" anywhere in the repo.
>
>
> There are more things confusing concerning sparse mode:

Sweet, thanks for taking the time to write these up.  It certainly
helps confirm some of the directions we picked and changes we made to
make things a little clearer, and helps us continue working in that
direction.  Some comments below on individual points...

> - It is not clear from git-sparse-checkout(1) when changes to
>   $GIT_DIR/info/sparse-checkout are catched up. In my case: would it be enough
>   to add the new pathname just before git-mv or would a fresh git-checkout be
>   needed after modifying $GIT_DIR/info/sparse-checkout? You have clarified
>   this in your response, but shouldn't this be clear from the manpage?

I believe at least part of this confusion is due to using the old
style of handling sparse checkouts; namely, by actually editing
$GIT_DIR/info/sparse-checkout.  We have taken pains to guide people
away from that workflow, because it is both more work, and leads to
more confusion.  If you instead do a
   git sparse-checkout set --no-cone <pattern1> <pattern2> ... <patternN>
or a
   git sparse-checkout add <another-pattern>
then the sparse-checkout command handles populating the
$GIT_DIR/info/sparse-checkout for you as well as any needed checkout,
meaning that there isn't a "catch up" step as there traditionally was.
And it makes it a bit clearer that if you add some path to your
sparse-checkout, then your sparse-checkout is ready to handle the
additional path right away.

> - git-sparse-checkout(1) refers to "skip-worktree bit". This concept is
>   potentially not very familiar to the average git user which uses mostly
>   porcelain. Thus, edge cases remain to be unclear.

I totally agree that the "skip-worktree bit" is something we should
avoid exposing to the user.  I called it out previously:
"""
Most sparse checkout users are unaware of this implementation
detail, and the term should generally be avoided in user-facing
descriptions and command flags.  Unfortunately, prior to the
`sparse-checkout` subcommand this low-level detail was exposed,
and as of time of writing, is still exposed in various places.
"""

However, we should also note that you reported using v2.34.1, which is
really quite old.  The current git-sparse-checkout(1) has been almost
completely overhauled in the meantime:

$ git show v2.34.1:Documentation/git-sparse-checkout.txt | wc -l
263
$ git diff --stat v2.34.1 v2.43.0-rc1 -- Documentation/git-sparse-checkout.txt
 Documentation/git-sparse-checkout.txt | 446 +++++++++++++++++++++++++---------
 1 file changed, 333 insertions(+), 113 deletions(-)

And, in particular, "skip-worktree" doesn't appear until quite a bit
later in the file, and then only appears in a section labelled
"INTERNALS -- SPARSE CHECKOUT".

> - The pathspecs refers to .gitignore (which by itself is not very clear). But
>   there are differences:
>   1. giignore is relative to containing directoy, which don't seem to make
>      much sense for sparse mode
>   2. sparse specs are the opposite of gitignore, which seems to have different
>      meaning in some edge-cases.

Yeah, copying .gitignore syntax and merely referring to the gitignore
manual for specification of the patterns was a huge design mistake.
I've hated it since the beginning.  The internals actually managed to
make it *even more* confusing for quite some time as it referred to
everything as "excludes", regardless of whether used for gitignore or
sparse-checkout.  But yeah, these and other inherent problems with
non-cone mode are called out in the "INTERNALS -- NON-CONE PROBLEMS"
section of the manual, and is a big piece of why we recommend users
migrate away from it if possible.

> - For cone, it is not clear how the two "accepted patterns" look like what the
>   semantics are.

Yeah, it is a bit complex, and I advocated for just using a different
control file for cone mode to help us step away from the blight of
$GIT_DIR/info/sparse-checkout and its inherent tie to gitignore.  I
didn't win that one.

However, why does it actually matter?  You shouldn't be bothering with
the patterns or editing the $GIT_DIR/info/sparse-checkout file for
cone mode.  You just
    git sparse-checkout set <directory1> <directory2> ... <directoryN>
and then the file will be set up for you.

>  I understand that specifying a directory adds siblings
>   recursively. But what does the "Parent" mode mean exactly and when/how is
>   this recognized? I guess, this is just a mis-namer? IMHO, parent of /a/b/c would
>   be /a/b and not /a/b/c/* (as git-sparse-chekout(1) suggests).

No, /a/b/c/* is not the parent, that's the portion that says that all
things below a/b/c (i.e. all descendant files) should be included.

The parent parts would be where it adds /a/b/ and /a/ as patterns to
ensure things directly under a/ and directly under a/b/ are included.

I think the newer version of the manual might explain this a little
better, but honestly, attempting to explain it is a losing battle.
Users shouldn't read or edit the sparse-checkout file in cone mode.
We let them, but we strongly recommend against it.  Just pass the
actual directory names to `git sparse-checkout {set,add}` and let it
take care of the patterns for you.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-11-10 20:12 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-07 13:03 Error when "git mv" file in a sparsed checkout Josef Wolf
2023-11-08  2:21 ` Elijah Newren
2023-11-08 11:36   ` Josef Wolf
2023-11-10 20:11     ` Elijah Newren

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).