git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Thomas Gummerer <t.gummerer@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Grzegorz Rajchman <rayman17@gmail.com>, git@vger.kernel.org
Subject: Re: [BUG] git stash pop --quiet deletes files in git 2.24.0
Date: Mon, 11 Nov 2019 19:56:41 +0000	[thread overview]
Message-ID: <20191111195641.GC3115@cat> (raw)
In-Reply-To: <xmqqk188l0pn.fsf@gitster-ct.c.googlers.com>

On 11/10, Junio C Hamano wrote:
> Thomas Gummerer <t.gummerer@gmail.com> writes:
> 
> > On 11/08, Junio C Hamano wrote:
> >> So, I do not think removing that discard_cache() alone solves the
> >> breakage exposed by 34933d0eff.  Discarding and re-reading the
> >> on-disk index there would restore correctness, but then you would
> >> want to make sure that we are not wasting the overall cost for the
> >> I/O and refreshing.
> >> 
> >> I think the safer immediate short-term fix is to revert the change
> >> to the quiet codepath and let it only refresh the in-core index.
> >
> > Yup, this is certainly my bad, we shouldn't be writing the discarded
> > index of course.  I don't think what we were doing here before was
> > correct either though.  The only thing that would be called after this
> > is 'do_stash_drop()', which only executes external commands.
> 
> Right.  Removing discard alone would not be a correct fix exactly
> for that reason: the in-core index was stale wrt the on-disk index.
> 
> If the program later used in-core index for further processing
> (which is not, and that is why the short-term solution of reverting
> that hunk would work), we would have been operating on a wrong data.
> So for the fix that keeps data we have in-core always up-to-date, we
> should be re-reading from the on-disk index there after discard().
> 
> And in the longer term, it would likely be the right direction, as
> the "git status" invocation on the non-quiet codepath would want to
> become an in-core direct calls into wt-status machinery instead of
> fork+exec eventually.

Right.  I'd argue that that's even the right direction in the short
term.  It does require some more I/O but it also prevents similar
mistakes.  And I don't think one additional read of the index is going
to make it or break it for performance here, there are plenty of reads
already, and there's probably better ways to speed 'git stash' up.

> > From what you are saying above, and from my testing I think this
> > refresh is actually unnecessary, and we could just remove it outright.
> 
> Perhaps.  But later it will bite us when somebody wants to rewrite
> the "status at the end" part in C.

Hmm, wouldn't the not re-reading the index part bite us there, rather
than the not refreshing the index?

In the 'has_index' codepath, we write the index to disk, so we already
have a fresh one in-core.  This codepath is what used to require
refreshing the index afterwards, but no longer does.

Previously we used to use 'git read-tree "$unstashed_index_tree"'
there, which does require a 'git update-index -q --refresh'
afterwards.  However we have replaced that with an internal call to
'reset_tree', which always sets 'o.merge = 1' for unpack-trees.  Which
in turn means that the index is already refreshed appropriately iiuc.

In the other codepath we do 'git update-index --add --stdin', which
also doesn't require refreshing the index, but does require the
'discard_cache()' + 'read_cache()' afterwards, so we're not left in a
half state.

> Besides, if the original was "update-index -q --refresh" in the
> scripted Porcelain after an pop was attempted, it would have shown
> the unmerged paths as "needs merge", wouldn't it?  For that, we need
> to have something (I do not remember if refresh_and_write_cache()
> would be the in-core API call to do so offhand).

The original used 'git status >/dev/null 2>&1' to refresh the index
after the 'git read-tree' I mentioned above, but would not show the
"needs merge" message, so I think we're okay on that front.

Below is the patch that I believe has the least chances of biting us
in the future, with the appropriate updated tests.  I had considered
leaving the 'refresh_and_write_cache()' call there, but as I was
writing the commit message I had a harder and harder time justifying
that, so it's gone now, which I think is the right thing to do.
Leaving it there would be okay as well, however I don't think it would
have any benefit.

--- >8 ---
Subject: [PATCH] stash: make sure we have a valid index before writing it

In 'do_apply_stash()' we refresh the index in the end.  Since
34933d0eff ("stash: make sure to write refreshed cache", 2019-09-11),
we also write that refreshed index when --quiet is given to 'git stash
apply'.

However if '--index' is not given to 'git stash apply', we also
discard the index in the else clause just before.  This leads to
writing the discarded index, which means we essentially write an empty
index file.  This is obviously not correct, or the behaviour the user
wanted.  We should not modify the users index without being asked to
do so.

Make sure to re-read the index after discarding the current in-core
index, to avoid dealing with outdated information.

We can drop the 'refresh_and_write_cache' completely in the quiet
case.  Previously in legacy stash we relied on 'git status' to refresh
the index after calling 'git read-tree' when '--index' was passed to
'git apply'.  However the 'reset_tree()' call that replaced 'git
read-tree' always passes options that are equivalent to '-m', making
the refresh of the index unnecessary.

We could also drop the 'discard_cache()' + 'read_cache()', however
that would make it easy to fall into the same trap as 34933d0eff did,
so it's better to avoid that.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 builtin/stash.c  | 6 ++----
 t/t3903-stash.sh | 5 ++++-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index ab30d1e920..d00567285f 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -482,12 +482,10 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
 			return -1;
 
 		discard_cache();
+		read_cache();
 	}
 
-	if (quiet) {
-		if (refresh_and_write_cache(REFRESH_QUIET, 0, 0))
-			warning("could not refresh index");
-	} else {
+	if (!quiet) {
 		struct child_process cp = CHILD_PROCESS_INIT;
 
 		/*
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 392954d6dd..b1c973e3d9 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -232,8 +232,9 @@ test_expect_success 'save -q is quiet' '
 	test_must_be_empty output.out
 '
 
-test_expect_success 'pop -q is quiet' '
+test_expect_success 'pop -q works and is quiet' '
 	git stash pop -q >output.out 2>&1 &&
+	test bar = "$(git show :file)" &&
 	test_must_be_empty output.out
 '
 
@@ -242,6 +243,8 @@ test_expect_success 'pop -q --index works and is quiet' '
 	git add file &&
 	git stash save --quiet &&
 	git stash pop -q --index >output.out 2>&1 &&
+	git diff-files file2 >file2.diff &&
+	test_must_be_empty file2.diff &&
 	test foo = "$(git show :file)" &&
 	test_must_be_empty output.out
 '
-- 
2.24.0.155.gd9f6f3b619


  reply	other threads:[~2019-11-11 19:56 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-07 10:36 [BUG] git stash pop --quiet deletes files in git 2.24.0 Grzegorz Rajchman
2019-11-07 18:49 ` Thomas Gummerer
2019-11-08  2:32   ` Junio C Hamano
2019-11-08 16:59     ` Thomas Gummerer
2019-11-10  6:11       ` Junio C Hamano
2019-11-11 19:56         ` Thomas Gummerer [this message]
2019-11-12  5:21           ` Junio C Hamano
2019-11-13 11:15             ` Thomas Gummerer
2019-11-13 13:31               ` Junio C Hamano
2019-11-13 15:01                 ` [PATCH v3] stash: make sure we have a valid index before writing it Thomas Gummerer
2019-11-14  2:07                   ` Junio C Hamano
2019-11-13 11:17           ` [PATCH v2 1/2] t3903: avoid git commands inside command substitution Thomas Gummerer
2019-11-13 11:17             ` [PATCH v2 2/2] stash: make sure we have a valid index before writing it Thomas Gummerer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191111195641.GC3115@cat \
    --to=t.gummerer@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=rayman17@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).