From: Jeff King <peff@peff.net>
To: Torsten Krah <krah.tm@gmail.com>
Cc: Emily Shaffer <emilyshaffer@google.com>,
"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Broken branch after git commit - tracked files in staging area can't be removed with restore --staged, or commit or stash
Date: Wed, 8 Jan 2020 05:40:08 -0500 [thread overview]
Message-ID: <20200108104008.GA2207365@coredump.intra.peff.net> (raw)
In-Reply-To: <2423f8c0b91578c0faf7527b7d97b0e1e9666261.camel@gmail.com>
On Wed, Jan 08, 2020 at 11:02:41AM +0100, Torsten Krah wrote:
> Hi Jeff, I have a poc you can try:
Thanks, I can see the problem now.
> cd /tmp
> mkdir testrepo
> cd testrepo
> touch TEST1 TEST2
> git add -A
> git commit -m First
> touch TEST3 TEST4
> git add -A
> git commit -m Second
> git reset --soft HEAD~1
OK, so we'd expect to still see TEST3 and TEST4 in the index. And we do:
> git status
>
> Auf Branch master
> Zum Commit vorgemerkte Änderungen:
> (benutzen Sie "git restore --staged <Datei>..." zum Entfernen aus
> der Staging-Area)
> neue Datei: TEST3
> neue Datei: TEST4
And then if we try to unstage one of them, we should see that:
> git restore --staged TEST3
>
> [10:57:26][tkrah@torstenknbl:/tmp/testrepo] (master) $ LC_ALL=C git
> status
> On branch master
> Changes to be committed:
> (use "git restore --staged <file>..." to unstage)
> new file: TEST4
>
> Untracked files:
> (use "git add <file>..." to include in what will be committed)
> TEST3
So far so good. But I think there is a lurking problem in the index,
because...
> git commit -m Second
>
> [master 5b62331] Second
> 2 files changed, 0 insertions(+), 0
> deletions(-)
> create mode 100644 TEST3
> create mode 100644 TEST4
This is very wrong (and is a bug, not a problem with what you're doing).
We wrote a commit with TEST3 in it, even though it wasn't in the index.
Why would we do that? Almost certainly it's because of a cache-tree
extension in the index. Presumably "git restore --staged" is not
properly invalidating the cache-tree entry. And that would explain what
you see next:
> And now TEST3 is in the commit and what is even more "interesting" is
> the next one:
>
> [10:59:16][tkrah@torstenknbl:/tmp/testrepo] (master) $ LC_ALL=C git
> status
> On branch master
> Changes to be committed:
> (use "git restore --staged <file>..." to unstage)
> deleted: TEST3
>
> Untracked files:
> (use "git add <file>..." to include in what will be committed)
> TEST3
>
> TEST3 is unstaged and deleted now.
That happens because we wrote out a commit with TEST3, but it's still
not in the index! So it appears as if a deletion was staged (remember
that what is "staged" is really just the diff between the index and
HEAD).
We can repeat the same process, substituting "git reset -- TEST3" for
git-restore, and it works fine. So the problem is in git-restore itself.
Here's another observation. If we do:
git restore --staged TEST3 TEST4
then running "git commit" won't do anything, since there's nothing
staged (and this is why I had trouble reproducing the problem earlier).
If we add a new file, like:
echo whatever >new
git add new
then the problem won't exhibit, because that will also invalidate the
cache-tree (because it has to account for "new"). But we could put those
files in their own subdirectory and do:
$ git restore --staged subdir/TEST3
$ git status
On branch master
Changes to be committed:
new file: subdir/TEST4
Untracked files:
subdir/TEST3
and if I commit that, I see the bug again:
$ git commit -m second
[master 5c3b8c1] second
2 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 subdir/TEST3
create mode 100644 subdir/TEST4
Now here's even more fun. I wanted to do another test, so I tried to
reset back to the original "second" commit, but it segfaults! Yikes.
$ git reset --hard HEAD@{2}
Segmentation fault
Running it under valgrind shows we're indeed seeing a bogus cache-tree:
==2214443== Invalid read of size 4
==2214443== at 0x362DFF: traverse_by_cache_tree (unpack-trees.c:734)
==2214443== by 0x3630A3: traverse_trees_recursive (unpack-trees.c:799)
==2214443== by 0x364209: unpack_callback (unpack-trees.c:1258)
==2214443== by 0x35F925: traverse_trees (tree-walk.c:497)
==2214443== by 0x364DDD: unpack_trees (unpack-trees.c:1589)
==2214443== by 0x1BE6CF: reset_index (reset.c:95)
==2214443== by 0x1BF894: cmd_reset (reset.c:421)
==2214443== by 0x124868: run_builtin (git.c:444)
==2214443== by 0x124BAE: handle_builtin (git.c:674)
==2214443== by 0x124E24: run_argv (git.c:741)
==2214443== by 0x1252AB: cmd_main (git.c:872)
==2214443== by 0x1E811A: main (common-main.c:52)
==2214443== Address 0x40 is not stack'd, malloc'd or (recently) free'd
==2214443==
==2214443== Process terminating with default action of signal 11 (SIGSEGV)
This isn't the same spot that Emily fixed earlier today in [1], but it
seems like a very similar problem. Weird.
So instead I blew away the index and then did hard reset:
$ rm .git/index
$ git reset --hard
HEAD is now at 5c3b8c1 second
And now we can see what happens if both files are unstaged, and then we
add a new file, and then commit:
$ git reset --soft HEAD^
$ git restore --staged subdir/TEST3 subdir/TEST4
$ echo whatever >new
$ git add new
$ git commit -m again
[master 119d9e4] again
1 file changed, 1 insertion(+)
create mode 100644 new
So that _doesn't_ exhibit the problem. I guess because adding the new
file causes git-add to reexamine the whole cache-tree and clean it up.
So there seem to be at least two bugs:
- git-restore doesn't properly invalidate the cache-tree
- the index-reading code is not careful enough about bogus cache-trees,
and may segfault
-Peff
[1] https://lore.kernel.org/git/20200108023127.219429-1-emilyshaffer@google.com/
next prev parent reply other threads:[~2020-01-08 10:40 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-07 12:55 Broken branch after git commit - tracked files in staging area can't be removed with restore --staged, or commit or stash Torsten Krah
2020-01-07 13:43 ` Torsten Krah
2020-01-07 15:28 ` Torsten Krah
2020-01-08 9:11 ` Jeff King
2020-01-08 10:02 ` Torsten Krah
2020-01-08 10:31 ` Torsten Krah
2020-01-08 10:40 ` Jeff King [this message]
2020-01-08 11:43 ` [PATCH] restore: invalidate cache-tree when removing entries with --staged Jeff King
2020-01-08 15:41 ` Junio C Hamano
2020-02-05 20:24 ` Dennis Kaarsemaker
2020-01-08 12:42 ` Broken branch after git commit - tracked files in staging area can't be removed with restore --staged, or commit or stash Torsten Krah
2020-01-09 9:39 ` Jeff King
2020-01-09 10:49 ` Torsten Krah
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200108104008.GA2207365@coredump.intra.peff.net \
--to=peff@peff.net \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=krah.tm@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).