git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: matheus.bernardino@usp.br, dstolee@microsoft.com,
	Elijah Newren <newren@gmail.com>,
	Elijah Newren <newren@gmail.com>
Subject: [PATCH 3/3] stash: fix stash application in sparse-checkouts
Date: Fri, 20 Nov 2020 16:53:42 +0000	[thread overview]
Message-ID: <5143cba7047d25137b3d7f8c7811a875c1931aee.1605891222.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.919.git.git.1605891222.gitgitgadget@gmail.com>

From: Elijah Newren <newren@gmail.com>

sparse-checkouts are built on the patterns in the
$GIT_DIR/info/sparse-checkout file, where commands have modified
behavior for paths that do not match those patterns.  The differences in
behavior, as far as the bugs concerned here, fall into three different
categories (with git subcommands that fall into each category listed):

  * commands that only look at files matching the patterns:
      * status
      * diff
      * clean
      * update-index
  * commands that remove files from the working tree that do not match
    the patterns, and restore files that do match them:
      * read-tree
      * switch
      * checkout
      * reset (--hard)
  * commands that omit writing files to the working tree that do not
    match the patterns, unless those files are not clean:
      * merge
      * rebase
      * cherry-pick
      * revert

There are some caveats above, e.g. a plain `git diff` ignores files
outside the sparsity patterns but will show diffs for paths outside the
sparsity patterns when revision arguments are passed.  (Technically,
diff is treating the sparse paths as matching HEAD.)  So, there is some
internal inconsistency among these commands.  There are also additional
commands that should behave differently in the face of sparse-checkouts,
as the sparse-checkout documentation alludes to, but the above is
sufficient for me to explain how `git stash` is affected.

What is relevant here is that logically 'stash' should behave like a
merge; it three-way merges the changes the user had in progress at stash
creation time, the HEAD at the time the stash was created, and the
current HEAD, in order to get the stashed changes applied to the current
branch.  However, this simplistic view doesn't quite work in practice,
because stash tweaks it a bit due to two factors: (1) flags like
--keep-index and --include-untracked (why we used two different verbs,
'keep' and 'include', is a rant for another day) modify what should be
staged at the end and include more things that should be quasi-merged,
(2) stash generally wants changes to NOT be staged.  It only provides
exceptions when (a) some of the changes had conflicts and thus we want
to use staged to denote the clean merges and higher order stages to
mark the conflicts, or (b) if there is a brand new file we don't want
it to become untracked.

stash has traditionally gotten this special behavior by first doing a
merge, and then when it's clean, applying a pipeline of commands to
modify the result.  This series of commands for
unstaging-non-newly-added-files came from the following commands:

    git diff-index --cached --name-only --diff-filter=A $CTREE >"$a"
    git read-tree --reset $CTREE
    git update-index --add --stdin <"$a"
    rm -f "$a"

You might that between the merge that proceeded these commands and these
different commands here, that we have commands from each of the
different types of special sparsity handling listed at the beginning of
this message, and in fact this precisely led to the following buggy
behaviors:

(1) If a path merged cleanly and it didn't match the sparsity patterns,
the merge backend would know to avoid writing it to the working tree and
keep the SKIP_WORKTREE bit, simply only updating it in the index.
Unfortunately, the subsequent commands would essentially undo the
changes in the index and thus simply toss the changes altogether since
there was nothing left in the working tree.  This means the stash is
only partially applied.

(2) If a path existed in the worktree before `git stash apply` despite
having the SKIP_WORKTREE bit set, then the `git read-tree --reset` would
print an error message of the form
      error: Entry 'modified' not uptodate. Cannot merge.
and cause stash to abort early.

(3) If there was a brand new file added by the stash, then the
diff-index command would save that pathname to the temporary file, the
read-tree --reset would remove it from the index, and the update-index
command would barf due to no such file being present in the working
copy; it would print a message of the form:
      error: NEWFILE: does not exist and --remove not passed
      fatal: Unable to process path NEWFILE
and then cause stash to abort early.

Basically, the whole idea of unstage-unless-brand-new requires special
care when you are dealing with a sparse-checkout.  Fix these problems
by applying the following simple rule:

  When we unstage files, if they have the SKIP_WORKTREE bit set,
  clear that bit and write the file out to the working directory.

  (*) If there's already a file present in the way, rename it first.

This fixes all three problems in t7012.13 and allows us to mark it as
passing.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/stash.c                  | 36 +++++++++++++++++++++++++++++++-
 t/t7012-skip-worktree-writing.sh |  2 +-
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/builtin/stash.c b/builtin/stash.c
index 8117d7647d..0f7e78d315 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -331,13 +331,23 @@ static void unstage_changes_unless_new(struct object_id *cache_tree)
 	 * When we enter this function, there has been a clean merge of
 	 * relevant trees, and the merge logic always stages whatever merges
 	 * cleanly.  We want to unstage those changes, unless it corresponds
-	 * to a file that didn't exist as of cache_tree.
+	 * to a file that didn't exist as of cache_tree.  However, if any
+	 * SKIP_WORKTREE path is modified relative to cache_tree, then we
+	 * want to clear the SKIP_WORKTREE bit and write it to the worktree
+	 * before unstaging.
 	 */
 
+	struct checkout state = CHECKOUT_INIT;
 	struct diff_options diff_opts;
 	struct lock_file lock = LOCK_INIT;
 	int i;
 
+	/* If any entries have skip_worktree set, we'll have to check 'em out */
+	state.force = 1;
+	state.quiet = 1;
+	state.refresh_cache = 1;
+	state.istate = &the_index;
+
 	diff_setup(&diff_opts);
 	diff_opts.flags.recursive = 1;
 	diff_opts.detect_rename = 0;
@@ -367,6 +377,30 @@ static void unstage_changes_unless_new(struct object_id *cache_tree)
 			continue;
 		}
 		ce = active_cache[pos];
+		if (ce_skip_worktree(ce)) {
+			struct stat st;
+			if (!lstat(ce->name, &st)) {
+				struct strbuf new_path = STRBUF_INIT;
+				int fd;
+
+				strbuf_addf(&new_path,
+					    "%s.stash.XXXXXX", ce->name);
+				fd = xmkstemp(new_path.buf);
+				close(fd);
+				printf(_("WARNING: Untracked file in way of "
+					 "tracked file!  Renaming\n "
+					 "           %s -> %s\n"
+					 "         to make room.\n"),
+				       ce->name, new_path.buf);
+				if (rename(ce->name, new_path.buf))
+					die("Failed to move %s to %s\n",
+					    ce->name, new_path.buf);
+				strbuf_release(&new_path);
+			}
+			checkout_entry(ce, &state, NULL, NULL);
+		}
+
+		ce->ce_flags &= ~CE_SKIP_WORKTREE;
 		if (p->one->oid_valid) {
 			ce = make_cache_entry(&the_index,
 					      p->one->mode,
diff --git a/t/t7012-skip-worktree-writing.sh b/t/t7012-skip-worktree-writing.sh
index a184ee97fb..e5c6a038fb 100755
--- a/t/t7012-skip-worktree-writing.sh
+++ b/t/t7012-skip-worktree-writing.sh
@@ -149,7 +149,7 @@ test_expect_success '--ignore-skip-worktree-entries leaves worktree alone' '
 		--diff-filter=D -- keep-me.t
 '
 
-test_expect_failure 'stash restore in sparse checkout' '
+test_expect_success 'stash restore in sparse checkout' '
 	test_create_repo stash-restore &&
 	(
 		cd stash-restore &&
-- 
gitgitgadget

  parent reply	other threads:[~2020-11-20 17:12 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-20 16:53 [PATCH 0/3] Fix stash apply in sparse checkouts (and a submodule test) Elijah Newren via GitGitGadget
2020-11-20 16:53 ` [PATCH 1/3] t7012: add a testcase demonstrating stash apply bugs in sparse checkouts Elijah Newren via GitGitGadget
2020-11-20 16:53 ` [PATCH 2/3] stash: remove unnecessary process forking Elijah Newren via GitGitGadget
2020-11-20 16:53 ` Elijah Newren via GitGitGadget [this message]
2020-11-21 12:47   ` [PATCH 3/3] stash: fix stash application in sparse-checkouts Chris Torek
2020-11-22  3:47     ` Elijah Newren
2020-11-25 22:14 ` [PATCH 0/3] Fix stash apply in sparse checkouts (and a submodule test) Junio C Hamano
2020-11-26  5:31   ` Elijah Newren
2020-12-01 22:25 ` [PATCH v2 " Elijah Newren via GitGitGadget
2020-12-01 22:25   ` [PATCH v2 1/3] t7012: add a testcase demonstrating stash apply bugs in sparse checkouts Elijah Newren via GitGitGadget
2020-12-01 22:25   ` [PATCH v2 2/3] stash: remove unnecessary process forking Elijah Newren via GitGitGadget
2020-12-01 23:02     ` Junio C Hamano
2020-12-02 16:09       ` Elijah Newren
2020-12-01 22:25   ` [PATCH v2 3/3] stash: fix stash application in sparse-checkouts Elijah Newren via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5143cba7047d25137b3d7f8c7811a875c1931aee.1605891222.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=matheus.bernardino@usp.br \
    --cc=newren@gmail.com \
    --subject='Re: [PATCH 3/3] stash: fix stash application in sparse-checkouts' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).