git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/2] Fix merge restore state
@ 2022-05-19 16:26 Elijah Newren via GitGitGadget
  2022-05-19 16:26 ` [PATCH 1/2] merge: remove unused variable Elijah Newren via GitGitGadget
                   ` (3 more replies)
  0 siblings, 4 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-05-19 16:26 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren

A simple series to fix restore_state() in builtin/merge.c, fixing the issue
reported by ZheNing Hu over here:
https://lore.kernel.org/git/CAOLTT8R7QmpvaFPTRs3xTpxr7eiuxF-ZWtvUUSC0-JOo9Y+SqA@mail.gmail.com/

Elijah Newren (2):
  merge: remove unused variable
  merge: make restore_state() do as its name says

 builtin/merge.c        | 12 ++++++------
 t/t7607-merge-state.sh | 25 +++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 6 deletions(-)
 create mode 100755 t/t7607-merge-state.sh


base-commit: 6cd33dceed60949e2dbc32e3f0f5e67c4c882e1e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1231%2Fnewren%2Ffix-merge-restore-state-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1231/newren/fix-merge-restore-state-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1231
-- 
gitgitgadget


* [PATCH 1/2] merge: remove unused variable
  2022-05-19 16:26 [PATCH 0/2] Fix merge restore state Elijah Newren via GitGitGadget
@ 2022-05-19 16:26 ` Elijah Newren via GitGitGadget
  2022-05-19 17:45   ` Junio C Hamano
  2022-05-19 16:26 ` [PATCH 2/2] merge: make restore_state() do as its name says Elijah Newren via GitGitGadget
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-05-19 16:26 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

restore_state() had a local variable sb that is not used, and in fact,
was never used even in the original commit that introduced it,
1c7b76be7d ("Build in merge", 2008-07-07).  Remove it.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index f178f5a3ee1..00de224a2da 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -375,7 +375,6 @@ static void reset_hard(const struct object_id *oid, int verbose)
 static void restore_state(const struct object_id *head,
 			  const struct object_id *stash)
 {
-	struct strbuf sb = STRBUF_INIT;
 	const char *args[] = { "stash", "apply", NULL, NULL };
 
 	if (is_null_oid(stash))
@@ -391,7 +390,6 @@ static void restore_state(const struct object_id *head,
 	 */
 	run_command_v_opt(args, RUN_GIT_CMD);
 
-	strbuf_release(&sb);
 	refresh_cache(REFRESH_QUIET);
 }
 
-- 
gitgitgadget



* [PATCH 2/2] merge: make restore_state() do as its name says
  2022-05-19 16:26 [PATCH 0/2] Fix merge restore state Elijah Newren via GitGitGadget
  2022-05-19 16:26 ` [PATCH 1/2] merge: remove unused variable Elijah Newren via GitGitGadget
@ 2022-05-19 16:26 ` Elijah Newren via GitGitGadget
  2022-05-19 17:44   ` Junio C Hamano
  2022-06-12  6:58 ` [PATCH 0/2] Fix merge restore state Elijah Newren
  2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
  3 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-05-19 16:26 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Previously, if:

* The user had no local changes before starting the merge
* A merge strategy made changes to the working tree/index but returned
  with exit status 2

Then we'd call restore_state() to clean up the changes and either let
the next merge strategy run (if there is one), or exit telling the user
that no merge strategy could handle the merge.  Unfortunately,
restore_state() did not clean up the changes as expected; that function
was a no-op if the stash was null, and the stash would be null if
there were no local changes before starting the merge.  So, instead of
"Rewinding the tree to pristine..." as the code claimed, restore_state()
would leave garbage around in the index and working tree (possibly
including conflicts) for either the next merge strategy or for the user
after aborting the merge.  And in the case of aborting the merge, the
user would be unable to run "git merge --abort" to get rid of the
unintended leftover conflicts, because the merge control files were not
written as it was presumed that we had restored to a clean state
already.

Fix the main problem by making sure that restore_state() only skips the
stash application if the stash is null rather than skipping the whole
function.

However, there is a secondary problem -- since merge.c forks
subprocesses to do the cleanup, the in-memory index is left out-of-sync.
While there was a refresh_cache(REFRESH_QUIET) call that attempted to
correct that, that function would not handle cases where the previous
merge strategy added conflicted entries.  We need to drop the index and
re-read it to handle such cases.

(Alternatively, we could stop forking subprocesses and instead call some
appropriate function to do the work which would update the in-memory
index automatically.  For now, just do the simple fix.)

Reported-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c        | 10 ++++++----
 t/t7607-merge-state.sh | 25 +++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 4 deletions(-)
 create mode 100755 t/t7607-merge-state.sh

diff --git a/builtin/merge.c b/builtin/merge.c
index 00de224a2da..ae3ee3a996b 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -377,11 +377,11 @@ static void restore_state(const struct object_id *head,
 {
 	const char *args[] = { "stash", "apply", NULL, NULL };
 
-	if (is_null_oid(stash))
-		return;
-
 	reset_hard(head, 1);
 
+	if (is_null_oid(stash))
+		goto refresh_cache;
+
 	args[2] = oid_to_hex(stash);
 
 	/*
@@ -390,7 +390,9 @@ static void restore_state(const struct object_id *head,
 	 */
 	run_command_v_opt(args, RUN_GIT_CMD);
 
-	refresh_cache(REFRESH_QUIET);
+refresh_cache:
+	if (discard_cache() < 0 || read_cache() < 0)
+		die(_("could not read index"));
 }
 
 /* This is called when no merge was necessary. */
diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
new file mode 100755
index 00000000000..655478cd0b3
--- /dev/null
+++ b/t/t7607-merge-state.sh
@@ -0,0 +1,25 @@
+#!/bin/sh
+
+test_description="Test that merge state is as expected after failed merge"
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+. ./test-lib.sh
+
+test_expect_success 'set up custom strategy' '
+	test_commit --no-tag "Initial" base base &&
+	git show-ref &&
+
+	for b in branch1 branch2 branch3
+	do
+		git checkout -b $b main &&
+		test_commit --no-tag "Change on $b" base $b
+	done &&
+
+	git checkout branch1 &&
+	test_must_fail git merge branch2 branch3 &&
+	git diff --exit-code --name-status &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
+test_done
-- 
gitgitgadget
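
Editorial sketch: the scenario from the commit message above can be tried outside the test harness. This is a minimal reproduction under stated assumptions, not part of the patch: the temporary directory, the identity variables, and the file contents are illustrative, and whether leftovers appear depends on whether the installed git already contains this fix, so the script only reports what it finds.

```shell
#!/bin/sh
# Illustrative identity so commits work without global configuration.
GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q . &&
git symbolic-ref HEAD refs/heads/main &&
echo initial >base && git add base && git commit -q -m Initial || exit 1

# Three branches that each change the same file in a different way.
for b in branch1 branch2 branch3
do
	git checkout -q -b "$b" main &&
	echo "change on $b" >base &&
	git commit -q -a -m "Change on $b" || exit 1
done
git checkout -q branch1 || exit 1

# With a clean tree there is nothing to stash, so "git stash create"
# prints nothing; this corresponds to the null-stash case described above.
stash=$(git stash create)

# Octopus cannot handle these conflicting changes and gives up.
if git merge branch2 branch3 >/dev/null 2>&1
then
	merge_failed=no
else
	merge_failed=yes
fi

# Report the state the failed merge left behind.
leftovers=$(git status --porcelain)
if test -e .git/MERGE_HEAD
then
	merge_head=present
else
	merge_head=missing
fi

echo "stash id: '$stash'"
echo "merge failed: $merge_failed"
echo "leftover changes: '$leftovers'"
echo "MERGE_HEAD: $merge_head"
```

If "leftover changes" is non-empty while MERGE_HEAD is missing, the repository is in the state the commit message describes: conflicts remain but "git merge --abort" has nothing to work with.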


* Re: [PATCH 2/2] merge: make restore_state() do as its name says
  2022-05-19 16:26 ` [PATCH 2/2] merge: make restore_state() do as its name says Elijah Newren via GitGitGadget
@ 2022-05-19 17:44   ` Junio C Hamano
  2022-05-19 18:32     ` Junio C Hamano
  0 siblings, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-05-19 17:44 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, ZheNing Hu, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/builtin/merge.c b/builtin/merge.c
> index 00de224a2da..ae3ee3a996b 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -377,11 +377,11 @@ static void restore_state(const struct object_id *head,
>  {
>  	const char *args[] = { "stash", "apply", NULL, NULL };
>  
> -	if (is_null_oid(stash))
> -		return;
> -
>  	reset_hard(head, 1);

When there is only one strategy to be tried, save_state() will never
be called.  Removing the above safety means the hard-reset is
discarding a local change that is not saved anywhere.  The reason
why the merge stopped may be because such a local change has clashed
with the change the merge wanted to bring in, no?

> +	if (is_null_oid(stash))
> +		goto refresh_cache;
> +
> +test_description="Test that merge state is as expected after failed merge"
> +
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +. ./test-lib.sh
> +
> +test_expect_success 'set up custom strategy' '
> +	test_commit --no-tag "Initial" base base &&
> +git show-ref &&
> +
> +	for b in branch1 branch2 branch3
> +	do
> +		git checkout -b $b main &&
> +		test_commit --no-tag "Change on $b" base $b
> +	done &&
> +
> +	git checkout branch1 &&

Here, perhaps we can make two additional test cases, that try with
local change that (1) overlaps with the changes branch2 and branch3
bring in and that (2) does not overlap.  I am worried about the case
(2) losing the local change due to the call to reset_hard().

> +	test_must_fail git merge branch2 branch3 &&
> +	git diff --exit-code --name-status &&
> +	test_path_is_missing .git/MERGE_HEAD
> +'
> +
> +test_done


* Re: [PATCH 1/2] merge: remove unused variable
  2022-05-19 16:26 ` [PATCH 1/2] merge: remove unused variable Elijah Newren via GitGitGadget
@ 2022-05-19 17:45   ` Junio C Hamano
  0 siblings, 0 replies; 87+ messages in thread
From: Junio C Hamano @ 2022-05-19 17:45 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, ZheNing Hu, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> restore_state() had a local variable sb that is not used, and in fact,
> was never used even in the original commit that introduced it,
> 1c7b76be7d ("Build in merge", 2008-07-07).  Remove it.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c | 2 --
>  1 file changed, 2 deletions(-)

Nice.  Thanks.

>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index f178f5a3ee1..00de224a2da 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -375,7 +375,6 @@ static void reset_hard(const struct object_id *oid, int verbose)
>  static void restore_state(const struct object_id *head,
>  			  const struct object_id *stash)
>  {
> -	struct strbuf sb = STRBUF_INIT;
>  	const char *args[] = { "stash", "apply", NULL, NULL };
>  
>  	if (is_null_oid(stash))
> @@ -391,7 +390,6 @@ static void restore_state(const struct object_id *head,
>  	 */
>  	run_command_v_opt(args, RUN_GIT_CMD);
>  
> -	strbuf_release(&sb);
>  	refresh_cache(REFRESH_QUIET);
>  }


* Re: [PATCH 2/2] merge: make restore_state() do as its name says
  2022-05-19 17:44   ` Junio C Hamano
@ 2022-05-19 18:32     ` Junio C Hamano
  0 siblings, 0 replies; 87+ messages in thread
From: Junio C Hamano @ 2022-05-19 18:32 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, ZheNing Hu, Elijah Newren

Junio C Hamano <gitster@pobox.com> writes:

>> +test_expect_success 'set up custom strategy' '
>> +	test_commit --no-tag "Initial" base base &&
>> +git show-ref &&
>> +
>> +	for b in branch1 branch2 branch3
>> +	do
>> +		git checkout -b $b main &&
>> +		test_commit --no-tag "Change on $b" base $b
>> +	done &&
>> +
>> +	git checkout branch1 &&
>
> Here, perhaps we can make two additional test cases, that try with
> local change that (1) overlaps with the changes branch2 and branch3
> bring in and that (2) does not overlap.  I am worried about the case
> (2) losing the local change due to the call to reset_hard().

We do not need a new test to demonstrate the breakage in the
proposed patch, I think.  Here is one place I found that we already
test that merging in a dirty working tree fails.  We only need to
make sure that we do so without losing local changes.

 t/t6424-merge-unrelated-index-changes.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git c/t/t6424-merge-unrelated-index-changes.sh w/t/t6424-merge-unrelated-index-changes.sh
index 89dd544f38..88e0b541a0 100755
--- c/t/t6424-merge-unrelated-index-changes.sh
+++ w/t/t6424-merge-unrelated-index-changes.sh
@@ -171,7 +171,8 @@ test_expect_success 'octopus, unrelated file touched' '
 	touch random_file && git add random_file &&
 
 	test_must_fail git merge C^0 D^0 &&
-	test_path_is_missing .git/MERGE_HEAD
+	test_path_is_missing .git/MERGE_HEAD &&
+	test_path_exists random_file
 '
 
 test_expect_success 'octopus, related file removed' '
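
Editorial sketch: the concern raised in this sub-thread is that a hard reset throws away a staged local change unless a stash was created first. The following standalone script illustrates that with plain git commands; it is an assumption-laden illustration (temporary directory, identity variables, and file names are made up) and does not exercise builtin/merge.c itself.

```shell
#!/bin/sh
# Illustrative identity so commits work without global configuration.
GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q . &&
echo base >a && git add a && git commit -q -m base || exit 1

# A local change that exists only in the index and the working tree.
touch random_file && git add random_file

# Record the change in a stash commit without touching the stash list.
stash=$(git stash create)

# A hard reset removes the staged-but-uncommitted file.
git reset -q --hard HEAD
if test -e random_file
then
	after_reset=present
else
	after_reset=missing
fi

# The change is recoverable only because the stash commit exists.
git stash apply -q "$stash"
if test -e random_file
then
	after_apply=present
else
	after_apply=missing
fi

echo "after reset --hard: $after_reset"
echo "after stash apply:  $after_apply"
```

Without the "git stash create" step there would be nothing to apply after the reset, which is the data-loss case the reply warns about when save_state() is never called.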


* Re: [PATCH 0/2] Fix merge restore state
  2022-05-19 16:26 [PATCH 0/2] Fix merge restore state Elijah Newren via GitGitGadget
  2022-05-19 16:26 ` [PATCH 1/2] merge: remove unused variable Elijah Newren via GitGitGadget
  2022-05-19 16:26 ` [PATCH 2/2] merge: make restore_state() do as its name says Elijah Newren via GitGitGadget
@ 2022-06-12  6:58 ` Elijah Newren
  2022-06-12  8:54   ` ZheNing Hu
  2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
  3 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren @ 2022-06-12  6:58 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Git Mailing List, ZheNing Hu

On Thu, May 19, 2022 at 9:26 AM Elijah Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> A simple series to fix restore_state() in builtin/merge.c, fixing the issue
> reported by ZheNing Hu over here:
> https://lore.kernel.org/git/CAOLTT8R7QmpvaFPTRs3xTpxr7eiuxF-ZWtvUUSC0-JOo9Y+SqA@mail.gmail.com/

I updated this series a few weeks ago, I believe addressing all the
feedback.  But I hesitated to type "/submit" over at
https://github.com/gitgitgadget/git/pull/1231.  I've done virtually no
development on "interesting" (to me) projects for quite some time; I'm
only responding to email questions and review requests on the list.
And the review process takes quite a bit of work at times, and on the
chance there was more feedback to address, I just didn't have it in me
to submit a new round even though it _might_ be complete and certainly
fixes some known issues.  I haven't even gotten to review Dscho's
updates of my patches this whole week (spent the time I had mostly on
the merge-ort issue reported).

Does anyone else want to take this topic over?  There are updated
patches and even an updated cover letter over at
https://github.com/gitgitgadget/git/pull/1231.  It might involve
nothing more than submitting those patches; they might be good enough
already.  (Alternatively, I can type "/submit" to send them in...if
someone else agrees to respond to any feedback.)


* Re: [PATCH 0/2] Fix merge restore state
  2022-06-12  6:58 ` [PATCH 0/2] Fix merge restore state Elijah Newren
@ 2022-06-12  8:54   ` ZheNing Hu
  0 siblings, 0 replies; 87+ messages in thread
From: ZheNing Hu @ 2022-06-12  8:54 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Elijah Newren via GitGitGadget, Git Mailing List

Elijah Newren <newren@gmail.com> wrote on Sunday, June 12, 2022 at 14:58:
>
> On Thu, May 19, 2022 at 9:26 AM Elijah Newren via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > A simple series to fix restore_state() in builtin/merge.c, fixing the issue
> > reported by ZheNing Hu over here:
> > https://lore.kernel.org/git/CAOLTT8R7QmpvaFPTRs3xTpxr7eiuxF-ZWtvUUSC0-JOo9Y+SqA@mail.gmail.com/
>
> I updated this series a few weeks ago, I believe addressing all the
> feedback.  But I hesitated to type "/submit" over at
> https://github.com/gitgitgadget/git/pull/1231.  I've done virtually no
> development on "interesting" (to me) projects for quite some time; I'm
> only responding to email questions and review requests on the list.
> And the review process takes quite a bit of work at times, and on the
> chance there was more feedback to address, I just didn't have it in me
> to submit a new round even though it _might_ be complete and certainly
> fixes some known issues.  I haven't even gotten to review Dscho's
> updates of my patches this whole week (spent the time I had mostly on
> the merge-ort issue reported).
>

Sorry for missing this patch before. I have tried it, and it solves my
original problem perfectly. It now resets to the original state (even when
some files in the work tree have changed).

> Does anyone else want to take this topic over?  There are updated
> patches and even an updated cover letter over at
> https://github.com/gitgitgadget/git/pull/1231.  It might involve
> nothing more than submitting those patches; they might be good enough
> already.  (Alternatively, I can type "/submit" to send them in...if
> someone else agrees to respond to any feedback.)

Yes, I think you can submit it, thanks very much!

--
ZheNing Hu


* [PATCH v2 0/6] Fix merge restore state
  2022-05-19 16:26 [PATCH 0/2] Fix merge restore state Elijah Newren via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-06-12  6:58 ` [PATCH 0/2] Fix merge restore state Elijah Newren
@ 2022-06-19  6:50 ` Elijah Newren via GitGitGadget
  2022-06-19  6:50   ` [PATCH v2 1/6] t6424: make sure a failed merge preserves local changes Junio C Hamano via GitGitGadget
                     ` (6 more replies)
  3 siblings, 7 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-06-19  6:50 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren

MAINTAINER NOTE: Rebased on latest main/master. (In particular, needs
vd/sparse-stash; otherwise, the changes made here regress an
ensure-full-index testcase in t1092). Also, this fixes issues that predate
the v2.37 cycle, so this series can wait until v2.38 opens up.

Other note: If this round needs updates, ZheNing Hu may be the one to
respond and make any necessary updates, as per [1].

----------------------------------------------------------------------------

This is a simple series to fix restore_state() in builtin/merge.c, fixing
the issue reported by ZheNing Hu over here:
https://lore.kernel.org/git/CAOLTT8R7QmpvaFPTRs3xTpxr7eiuxF-ZWtvUUSC0-JOo9Y+SqA@mail.gmail.com/

Changes since v1:

 * Rebased
 * Included Junio's patch providing more testcases from
   https://lore.kernel.org/git/xmqqbkvtnyae.fsf@gitster.g/
 * Added three new patches to fix issues highlighted by Junio's testcases,
   in particular to (a) fix stashing with racy-dirty files present, (b) fix
   restoring staged state in restore_state(), and (c) ensure we can restore
   pre-merge state. All three were long-standing issues that we just hadn't
   noticed yet and thus are useful fixes on their own. However, my fix from
   v1 (which still remains as the final patch) does make it easier to notice
   these issues, and in particular that combined with Junio's new testcases
   unearthed those problems.

[1]
https://lore.kernel.org/git/CAOLTT8RpGGioOyaMw5tkeWXmHpOaBW9UH8JghUvBRQ50ZcDdYQ@mail.gmail.com/

Elijah Newren (5):
  merge: remove unused variable
  merge: fix save_state() to work when there are racy-dirty files
  merge: make restore_state() restore staged state too
  merge: ensure we can actually restore pre-merge state
  merge: do not exit restore_state() prematurely

Junio C Hamano (1):
  t6424: make sure a failed merge preserves local changes

 builtin/merge.c                          | 32 ++++++++++++++----------
 t/t6424-merge-unrelated-index-changes.sh | 32 ++++++++++++++++++++++--
 t/t7607-merge-state.sh                   | 25 ++++++++++++++++++
 3 files changed, 74 insertions(+), 15 deletions(-)
 create mode 100755 t/t7607-merge-state.sh


base-commit: 8ddf593a250e07d388059f7e3f471078e1d2ed5c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1231%2Fnewren%2Ffix-merge-restore-state-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1231/newren/fix-merge-restore-state-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1231

Range-diff vs v1:

 -:  ----------- > 1:  6147e72c309 t6424: make sure a failed merge preserves local changes
 1:  042d624b815 = 2:  230d84f09c8 merge: remove unused variable
 -:  ----------- > 3:  89e5e633241 merge: fix save_state() to work when there are racy-dirty files
 -:  ----------- > 4:  4a8b7c9e06d merge: make restore_state() restore staged state too
 -:  ----------- > 5:  a03075167c1 merge: ensure we can actually restore pre-merge state
 2:  88bdca72a78 ! 6:  0783b48c121 merge: make restore_state() do as its name says
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    merge: make restore_state() do as its name says
     +    merge: do not exit restore_state() prematurely
      
          Previously, if the user:
      
     @@ Commit message
       ## builtin/merge.c ##
      @@ builtin/merge.c: static void restore_state(const struct object_id *head,
       {
     - 	const char *args[] = { "stash", "apply", NULL, NULL };
     + 	const char *args[] = { "stash", "apply", "--index", NULL, NULL };
       
      -	if (is_null_oid(stash))
      -		return;
     @@ builtin/merge.c: static void restore_state(const struct object_id *head,
      +	if (is_null_oid(stash))
      +		goto refresh_cache;
      +
     - 	args[2] = oid_to_hex(stash);
     + 	args[3] = oid_to_hex(stash);
       
       	/*
      @@ builtin/merge.c: static void restore_state(const struct object_id *head,

-- 
gitgitgadget


* [PATCH v2 1/6] t6424: make sure a failed merge preserves local changes
  2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
@ 2022-06-19  6:50   ` Junio C Hamano via GitGitGadget
  2022-06-19  6:50   ` [PATCH v2 2/6] merge: remove unused variable Elijah Newren via GitGitGadget
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 87+ messages in thread
From: Junio C Hamano via GitGitGadget @ 2022-06-19  6:50 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren, Junio C Hamano

From: Junio C Hamano <gitster@pobox.com>

We do make sure that an attempt to merge with various forms of local
changes will "fail", but the point of stopping the merge is so that
we refrain from discarding uncommitted local changes that could be
precious.  Add a few more checks for each case to make sure the
local changes are left intact.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6424-merge-unrelated-index-changes.sh | 32 ++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index 89dd544f388..b6e424a427b 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -71,7 +71,9 @@ test_expect_success 'ff update' '
 	git merge E^0 &&
 
 	test_must_fail git rev-parse HEAD:random_file &&
-	test "$(git diff --name-only --cached E)" = "random_file"
+	test "$(git diff --name-only --cached E)" = "random_file" &&
+	test_path_is_file random_file &&
+	git rev-parse --verify :random_file
 '
 
 test_expect_success 'ff update, important file modified' '
@@ -83,6 +85,8 @@ test_expect_success 'ff update, important file modified' '
 	git add subdir/e &&
 
 	test_must_fail git merge E^0 &&
+	test_path_is_file subdir/e &&
+	git rev-parse --verify :subdir/e &&
 	test_path_is_missing .git/MERGE_HEAD
 '
 
@@ -93,6 +97,8 @@ test_expect_success 'resolve, trivial' '
 	touch random_file && git add random_file &&
 
 	test_must_fail git merge -s resolve C^0 &&
+	test_path_is_file random_file &&
+	git rev-parse --verify :random_file &&
 	test_path_is_missing .git/MERGE_HEAD
 '
 
@@ -103,6 +109,8 @@ test_expect_success 'resolve, non-trivial' '
 	touch random_file && git add random_file &&
 
 	test_must_fail git merge -s resolve D^0 &&
+	test_path_is_file random_file &&
+	git rev-parse --verify :random_file &&
 	test_path_is_missing .git/MERGE_HEAD
 '
 
@@ -113,6 +121,8 @@ test_expect_success 'recursive' '
 	touch random_file && git add random_file &&
 
 	test_must_fail git merge -s recursive C^0 &&
+	test_path_is_file random_file &&
+	git rev-parse --verify :random_file &&
 	test_path_is_missing .git/MERGE_HEAD
 '
 
@@ -145,9 +155,12 @@ test_expect_success 'recursive, when file has staged changes not matching HEAD n
 	mkdir subdir &&
 	test_seq 1 10 >subdir/a &&
 	git add subdir/a &&
+	git rev-parse --verify :subdir/a >expect &&
 
 	# We have staged changes; merge should error out
 	test_must_fail git merge -s recursive E^0 2>err &&
+	git rev-parse --verify :subdir/a >actual &&
+	test_cmp expect actual &&
 	test_i18ngrep "changes to the following files would be overwritten" err
 '
 
@@ -158,9 +171,12 @@ test_expect_success 'recursive, when file has staged changes matching what a mer
 	mkdir subdir &&
 	test_seq 1 11 >subdir/a &&
 	git add subdir/a &&
+	git rev-parse --verify :subdir/a >expect &&
 
 	# We have staged changes; merge should error out
 	test_must_fail git merge -s recursive E^0 2>err &&
+	git rev-parse --verify :subdir/a >actual &&
+	test_cmp expect actual &&
 	test_i18ngrep "changes to the following files would be overwritten" err
 '
 
@@ -171,7 +187,9 @@ test_expect_success 'octopus, unrelated file touched' '
 	touch random_file && git add random_file &&
 
 	test_must_fail git merge C^0 D^0 &&
-	test_path_is_missing .git/MERGE_HEAD
+	test_path_is_missing .git/MERGE_HEAD &&
+	git rev-parse --verify :random_file &&
+	test_path_exists random_file
 '
 
 test_expect_success 'octopus, related file removed' '
@@ -181,6 +199,8 @@ test_expect_success 'octopus, related file removed' '
 	git rm b &&
 
 	test_must_fail git merge C^0 D^0 &&
+	test_path_is_missing b &&
+	test_must_fail git rev-parse --verify :b &&
 	test_path_is_missing .git/MERGE_HEAD
 '
 
@@ -189,8 +209,12 @@ test_expect_success 'octopus, related file modified' '
 	git checkout B^0 &&
 
 	echo 12 >>a && git add a &&
+	git rev-parse --verify :a >expect &&
 
 	test_must_fail git merge C^0 D^0 &&
+	test_path_is_file a &&
+	git rev-parse --verify :a >actual &&
+	test_cmp expect actual &&
 	test_path_is_missing .git/MERGE_HEAD
 '
 
@@ -201,6 +225,8 @@ test_expect_success 'ours' '
 	touch random_file && git add random_file &&
 
 	test_must_fail git merge -s ours C^0 &&
+	test_path_is_file random_file &&
+	git rev-parse --verify :random_file &&
 	test_path_is_missing .git/MERGE_HEAD
 '
 
@@ -211,6 +237,8 @@ test_expect_success 'subtree' '
 	touch random_file && git add random_file &&
 
 	test_must_fail git merge -s subtree E^0 &&
+	test_path_is_file random_file &&
+	git rev-parse --verify :random_file &&
 	test_path_is_missing .git/MERGE_HEAD
 '
 
-- 
gitgitgadget
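
Editorial sketch: the added checks above rely on "git rev-parse --verify :<path>" succeeding only when <path> has an entry in the index. The script below illustrates that behaviour in isolation; the repository layout and identity variables are assumptions made for the example, and it uses plain shell rather than the test-lib.sh helpers.

```shell
#!/bin/sh
# Illustrative identity so commits work without global configuration.
GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q . &&
echo base >b && git add b && git commit -q -m base || exit 1

# Stage a new file: ":random_file" now names its blob in the index.
touch random_file && git add random_file
if git rev-parse -q --verify :random_file >/dev/null
then
	random_in_index=yes
else
	random_in_index=no
fi

# Remove a tracked file from the index and the working tree:
# ":b" no longer resolves, so the lookup fails.
git rm -q b
if git rev-parse -q --verify :b >/dev/null 2>&1
then
	b_in_index=yes
else
	b_in_index=no
fi

echo "random_file in index: $random_in_index"
echo "b in index:           $b_in_index"
```

Pairing this index check with a working-tree check (test_path_is_file or test_path_is_missing in the patch) covers both halves of the local change.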



* [PATCH v2 2/6] merge: remove unused variable
  2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
  2022-06-19  6:50   ` [PATCH v2 1/6] t6424: make sure a failed merge preserves local changes Junio C Hamano via GitGitGadget
@ 2022-06-19  6:50   ` Elijah Newren via GitGitGadget
  2022-07-19 23:14     ` Junio C Hamano
  2022-06-19  6:50   ` [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files Elijah Newren via GitGitGadget
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-06-19  6:50 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

restore_state() had a local variable sb that is not used, and in fact,
was never used even in the original commit that introduced it,
1c7b76be7d ("Build in merge", 2008-07-07).  Remove it.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index f178f5a3ee1..00de224a2da 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -375,7 +375,6 @@ static void reset_hard(const struct object_id *oid, int verbose)
 static void restore_state(const struct object_id *head,
 			  const struct object_id *stash)
 {
-	struct strbuf sb = STRBUF_INIT;
 	const char *args[] = { "stash", "apply", NULL, NULL };
 
 	if (is_null_oid(stash))
@@ -391,7 +390,6 @@ static void restore_state(const struct object_id *head,
 	 */
 	run_command_v_opt(args, RUN_GIT_CMD);
 
-	strbuf_release(&sb);
 	refresh_cache(REFRESH_QUIET);
 }
 
-- 
gitgitgadget



* [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files
  2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
  2022-06-19  6:50   ` [PATCH v2 1/6] t6424: make sure a failed merge preserves local changes Junio C Hamano via GitGitGadget
  2022-06-19  6:50   ` [PATCH v2 2/6] merge: remove unused variable Elijah Newren via GitGitGadget
@ 2022-06-19  6:50   ` Elijah Newren via GitGitGadget
  2022-07-17 16:28     ` ZheNing Hu
  2022-07-19 22:43     ` Junio C Hamano
  2022-06-19  6:50   ` [PATCH v2 4/6] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
                     ` (3 subsequent siblings)
  6 siblings, 2 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-06-19  6:50 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When there are racy-dirty files, but no files are modified,
`git stash create` exits with unsuccessful status.  This causes merge
to fail.  Refresh the index first to avoid this problem.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 00de224a2da..8ce4336dd3f 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -313,8 +313,16 @@ static int save_state(struct object_id *stash)
 	int len;
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf buffer = STRBUF_INIT;
+	struct lock_file lock_file = LOCK_INIT;
+	int fd;
 	int rc = -1;
 
+	fd = repo_hold_locked_index(the_repository, &lock_file, 0);
+	refresh_cache(REFRESH_QUIET);
+	if (0 <= fd)
+		repo_update_index_if_able(the_repository, &lock_file);
+	rollback_lock_file(&lock_file);
+
 	strvec_pushl(&cp.args, "stash", "create", NULL);
 	cp.out = -1;
 	cp.git_cmd = 1;
-- 
gitgitgadget



* [PATCH v2 4/6] merge: make restore_state() restore staged state too
  2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-06-19  6:50   ` [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files Elijah Newren via GitGitGadget
@ 2022-06-19  6:50   ` Elijah Newren via GitGitGadget
  2022-07-17 16:37     ` ZheNing Hu
  2022-07-19 23:14     ` Junio C Hamano
  2022-06-19  6:50   ` [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
                     ` (2 subsequent siblings)
  6 siblings, 2 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-06-19  6:50 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

merge can be invoked with uncommitted changes, including staged changes.
merge is responsible for restoring this state if some of the merge
strategies make changes.  However, it was not restoring staged changes
due to the lack of the "--index" option to "git stash apply".  Add the
option to fix this shortcoming.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 8ce4336dd3f..2dc56fab70b 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -383,14 +383,14 @@ static void reset_hard(const struct object_id *oid, int verbose)
 static void restore_state(const struct object_id *head,
 			  const struct object_id *stash)
 {
-	const char *args[] = { "stash", "apply", NULL, NULL };
+	const char *args[] = { "stash", "apply", "--index", NULL, NULL };
 
 	if (is_null_oid(stash))
 		return;
 
 	reset_hard(head, 1);
 
-	args[2] = oid_to_hex(stash);
+	args[3] = oid_to_hex(stash);
 
 	/*
 	 * It is OK to ignore error here, for example when there was
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state
  2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
                     ` (3 preceding siblings ...)
  2022-06-19  6:50   ` [PATCH v2 4/6] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
@ 2022-06-19  6:50   ` Elijah Newren via GitGitGadget
  2022-07-17 16:41     ` ZheNing Hu
  2022-07-19 22:57     ` Junio C Hamano
  2022-06-19  6:50   ` [PATCH v2 6/6] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
  6 siblings, 2 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-06-19  6:50 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Merge strategies can fail -- not just have conflicts, but give up and
say that they are unable to handle the current type of merge.  However,
they can also make changes to the index and working tree before giving
up; merge-octopus does this, for example.  Currently, we do not expect
the individual strategies to clean up after themselves, but instead
expect builtin/merge.c to do so.  For it to be able to, it needs to save
the state before trying the merge strategy so it can have something to
restore to.  Therefore, remove the shortcut bypassing the save_state()
call.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 2dc56fab70b..aaee8f6a553 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1663,12 +1663,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	 * tree in the index -- this means that the index must be in
 	 * sync with the head commit.  The strategies are responsible
 	 * to ensure this.
+	 *
+	 * Stash away the local changes so that we can try more than one.
 	 */
-	if (use_strategies_nr == 1 ||
-	    /*
-	     * Stash away the local changes so that we can try more than one.
-	     */
-	    save_state(&stash))
+	if (save_state(&stash))
 		oidclr(&stash);
 
 	for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH v2 6/6] merge: do not exit restore_state() prematurely
  2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
                     ` (4 preceding siblings ...)
  2022-06-19  6:50   ` [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
@ 2022-06-19  6:50   ` Elijah Newren via GitGitGadget
  2022-07-17 16:44     ` ZheNing Hu
  2022-07-19 23:13     ` Junio C Hamano
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
  6 siblings, 2 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-06-19  6:50 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Previously, if the user:

* Had no local changes before starting the merge
* A merge strategy makes changes to the working tree/index but returns
  with exit status 2

Then we'd call restore_state() to clean up the changes and either let
the next merge strategy run (if there is one), or exit telling the user
that no merge strategy could handle the merge.  Unfortunately,
restore_state() did not clean up the changes as expected; that function
was a no-op if the stash was a null, and the stash would be null if
there were no local changes before starting the merge.  So, instead of
"Rewinding the tree to pristine..." as the code claimed, restore_state()
would leave garbage around in the index and working tree (possibly
including conflicts) for either the next merge strategy or for the user
after aborting the merge.  And in the case of aborting the merge, the
user would be unable to run "git merge --abort" to get rid of the
unintended leftover conflicts, because the merge control files were not
written as it was presumed that we had restored to a clean state
already.

Fix the main problem by making sure that restore_state() only skips the
stash application if the stash is null rather than skipping the whole
function.

However, there is a secondary problem -- since merge.c forks
subprocesses to do the cleanup, the in-memory index is left out-of-sync.
While there was a refresh_cache(REFRESH_QUIET) call that attempted to
correct that, that function would not handle cases where the previous
merge strategy added conflicted entries.  We need to drop the index and
re-read it to handle such cases.

(Alternatively, we could stop forking subprocesses and instead call some
appropriate function to do the work which would update the in-memory
index automatically.  For now, just do the simple fix.)

Reported-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c        | 10 ++++++----
 t/t7607-merge-state.sh | 25 +++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 4 deletions(-)
 create mode 100755 t/t7607-merge-state.sh

diff --git a/builtin/merge.c b/builtin/merge.c
index aaee8f6a553..a21dece1b55 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -385,11 +385,11 @@ static void restore_state(const struct object_id *head,
 {
 	const char *args[] = { "stash", "apply", "--index", NULL, NULL };
 
-	if (is_null_oid(stash))
-		return;
-
 	reset_hard(head, 1);
 
+	if (is_null_oid(stash))
+		goto refresh_cache;
+
 	args[3] = oid_to_hex(stash);
 
 	/*
@@ -398,7 +398,9 @@ static void restore_state(const struct object_id *head,
 	 */
 	run_command_v_opt(args, RUN_GIT_CMD);
 
-	refresh_cache(REFRESH_QUIET);
+refresh_cache:
+	if (discard_cache() < 0 || read_cache() < 0)
+		die(_("could not read index"));
 }
 
 /* This is called when no merge was necessary. */
diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
new file mode 100755
index 00000000000..655478cd0b3
--- /dev/null
+++ b/t/t7607-merge-state.sh
@@ -0,0 +1,25 @@
+#!/bin/sh
+
+test_description="Test that merge state is as expected after failed merge"
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+. ./test-lib.sh
+
+test_expect_success 'set up custom strategy' '
+	test_commit --no-tag "Initial" base base &&
+git show-ref &&
+
+	for b in branch1 branch2 branch3
+	do
+		git checkout -b $b main &&
+		test_commit --no-tag "Change on $b" base $b
+	done &&
+
+	git checkout branch1 &&
+	test_must_fail git merge branch2 branch3 &&
+	git diff --exit-code --name-status &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
+test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files
  2022-06-19  6:50   ` [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files Elijah Newren via GitGitGadget
@ 2022-07-17 16:28     ` ZheNing Hu
  2022-07-19 22:49       ` Junio C Hamano
  2022-07-19 22:43     ` Junio C Hamano
  1 sibling, 1 reply; 87+ messages in thread
From: ZheNing Hu @ 2022-07-17 16:28 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Git List, Elijah Newren

On Sun, Jun 19, 2022 at 2:50 PM Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote:
>
> From: Elijah Newren <newren@gmail.com>
>
> When there are racy-dirty files, but no files are modified,
> `git stash create` exits with unsuccessful status.  This causes merge
> to fail.  Refresh the index first to avoid this problem.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 00de224a2da..8ce4336dd3f 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -313,8 +313,16 @@ static int save_state(struct object_id *stash)
>         int len;
>         struct child_process cp = CHILD_PROCESS_INIT;
>         struct strbuf buffer = STRBUF_INIT;
> +       struct lock_file lock_file = LOCK_INIT;
> +       int fd;
>         int rc = -1;
>
> +       fd = repo_hold_locked_index(the_repository, &lock_file, 0);
> +       refresh_cache(REFRESH_QUIET);
> +       if (0 <= fd)
> +               repo_update_index_if_able(the_repository, &lock_file);
> +       rollback_lock_file(&lock_file);
> +
>         strvec_pushl(&cp.args, "stash", "create", NULL);
>         cp.out = -1;
>         cp.git_cmd = 1;
> --
> gitgitgadget
>

I just want to show a scenario that triggers this error:

1. touch file
2. git add file
3. git stash push (the user may do this before git merge)
4. touch file (updates the timestamp but not the content)
5. git merge (calls git stash create, which returns 1)

So now I understand what this patch is for:

If the user runs git stash manually and then updates a file's
timestamp before git merge, the next git stash may fail.
Refreshing the index brings the index entries back in sync
with the status of the working tree files.
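
A rough way to observe the stat-dirty condition from the shell is
sketched below.  It is not the exact recipe above: it uses a throwaway
repository, a made-up file name and identity, and `git diff-files` as a
stand-in for the check that trips.  Whether `git stash create` itself
fails in this state depends on the Git version, so that part is left
out.

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical file name and identity.
GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q . &&
echo content >file &&
git add file &&
git commit -q -m initial || exit 1

# Change only the timestamp; the content stays identical.
touch -t 200001010000 file

# Plumbing does not refresh the index, so the entry looks modified.
if git diff-files --quiet; then before=clean; else before=dirty; fi

# Refreshing re-checks the content and updates the cached stat data.
git update-index -q --refresh

if git diff-files --quiet; then after=clean; else after=dirty; fi
echo "before refresh: $before, after refresh: $after"
```

The entry should show up as dirty before the refresh and clean
afterwards, which is the same effect the refresh added to save_state()
is meant to have before "stash create" runs.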

Thanks.

ZheNing Hu

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 4/6] merge: make restore_state() restore staged state too
  2022-06-19  6:50   ` [PATCH v2 4/6] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
@ 2022-07-17 16:37     ` ZheNing Hu
  2022-07-19 23:14     ` Junio C Hamano
  1 sibling, 0 replies; 87+ messages in thread
From: ZheNing Hu @ 2022-07-17 16:37 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Git List, Elijah Newren

On Sun, Jun 19, 2022 at 2:50 PM Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote:
>
> From: Elijah Newren <newren@gmail.com>
>
> merge can be invoked with uncommitted changes, including staged changes.
> merge is responsible for restoring this state if some of the merge
> strategies make changes.  However, it was not restoring staged changes
> due to the lack of the "--index" option to "git stash apply".  Add the
> option to fix this shortcoming.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 8ce4336dd3f..2dc56fab70b 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -383,14 +383,14 @@ static void reset_hard(const struct object_id *oid, int verbose)
>  static void restore_state(const struct object_id *head,
>                           const struct object_id *stash)
>  {
> -       const char *args[] = { "stash", "apply", NULL, NULL };
> +       const char *args[] = { "stash", "apply", "--index", NULL, NULL };
>
>         if (is_null_oid(stash))
>                 return;
>
>         reset_hard(head, 1);
>
> -       args[2] = oid_to_hex(stash);
> +       args[3] = oid_to_hex(stash);
>
>         /*
>          * It is OK to ignore error here, for example when there was
> --
> gitgitgadget
>

Now git merge (with any strategy) can restore the original index state.
LGTM.

ZheNing Hu

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state
  2022-06-19  6:50   ` [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
@ 2022-07-17 16:41     ` ZheNing Hu
  2022-07-19 22:57     ` Junio C Hamano
  1 sibling, 0 replies; 87+ messages in thread
From: ZheNing Hu @ 2022-07-17 16:41 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Git List, Elijah Newren

On Sun, Jun 19, 2022 at 2:50 PM Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote:
>
> From: Elijah Newren <newren@gmail.com>
>
> Merge strategies can fail -- not just have conflicts, but give up and
> say that they are unable to handle the current type of merge.  However,
> they can also make changes to the index and working tree before giving
> up; merge-octopus does this, for example.  Currently, we do not expect
> the individual strategies to clean up after themselves, but instead
> expect builtin/merge.c to do so.  For it to be able to, it needs to save
> the state before trying the merge strategy so it can have something to
> restore to.  Therefore, remove the shortcut bypassing the save_state()
> call.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 2dc56fab70b..aaee8f6a553 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -1663,12 +1663,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>          * tree in the index -- this means that the index must be in
>          * sync with the head commit.  The strategies are responsible
>          * to ensure this.
> +        *
> +        * Stash away the local changes so that we can try more than one.
>          */
> -       if (use_strategies_nr == 1 ||
> -           /*
> -            * Stash away the local changes so that we can try more than one.
> -            */
> -           save_state(&stash))
> +       if (save_state(&stash))
>                 oidclr(&stash);
>
>         for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
> --
> gitgitgadget
>

So now "stash" is no longer left empty even when we are using
only one merge strategy (e.g. octopus), and we can reset to
the original state correctly.

Good.

ZheNing Hu

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 6/6] merge: do not exit restore_state() prematurely
  2022-06-19  6:50   ` [PATCH v2 6/6] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
@ 2022-07-17 16:44     ` ZheNing Hu
  2022-07-19 23:13     ` Junio C Hamano
  1 sibling, 0 replies; 87+ messages in thread
From: ZheNing Hu @ 2022-07-17 16:44 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Git List, Elijah Newren

On Sun, Jun 19, 2022 at 2:50 PM Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote:
>
> From: Elijah Newren <newren@gmail.com>
>
> @@ -398,7 +398,9 @@ static void restore_state(const struct object_id *head,
>          */
>         run_command_v_opt(args, RUN_GIT_CMD);
>
> -       refresh_cache(REFRESH_QUIET);
> +refresh_cache:
> +       if (discard_cache() < 0 || read_cache() < 0)
> +               die(_("could not read index"));
>  }
>

We don't need to check the return value of discard_cache();
it always returns zero.

>  /* This is called when no merge was necessary. */
> diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
> new file mode 100755
> index 00000000000..655478cd0b3
> --- /dev/null
> +++ b/t/t7607-merge-state.sh
> @@ -0,0 +1,25 @@
> +#!/bin/sh
> +
> +test_description="Test that merge state is as expected after failed merge"
> +
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +. ./test-lib.sh
> +
> +test_expect_success 'set up custom strategy' '
> +       test_commit --no-tag "Initial" base base &&
> +git show-ref &&
> +
> +       for b in branch1 branch2 branch3
> +       do
> +               git checkout -b $b main &&
> +               test_commit --no-tag "Change on $b" base $b
> +       done &&
> +
> +       git checkout branch1 &&
> +       test_must_fail git merge branch2 branch3 &&
> +       git diff --exit-code --name-status &&
> +       test_path_is_missing .git/MERGE_HEAD
> +'
> +

Small nit: is a tab missing before "git show-ref"?

> +test_done
> --
> gitgitgadget

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files
  2022-06-19  6:50   ` [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files Elijah Newren via GitGitGadget
  2022-07-17 16:28     ` ZheNing Hu
@ 2022-07-19 22:43     ` Junio C Hamano
  1 sibling, 0 replies; 87+ messages in thread
From: Junio C Hamano @ 2022-07-19 22:43 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, ZheNing Hu, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> When there are racy-dirty files, but no files are modified,
> `git stash create` exits with unsuccessful status.  This causes merge
> to fail.  Refresh the index first to avoid this problem.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 00de224a2da..8ce4336dd3f 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -313,8 +313,16 @@ static int save_state(struct object_id *stash)
>  	int len;
>  	struct child_process cp = CHILD_PROCESS_INIT;
>  	struct strbuf buffer = STRBUF_INIT;
> +	struct lock_file lock_file = LOCK_INIT;
> +	int fd;
>  	int rc = -1;
>  
> +	fd = repo_hold_locked_index(the_repository, &lock_file, 0);
> +	refresh_cache(REFRESH_QUIET);
> +	if (0 <= fd)
> +		repo_update_index_if_able(the_repository, &lock_file);
> +	rollback_lock_file(&lock_file);

I might have added "else" but rolling back a lock file that was
already committed or rolled back is a safe no-op, so this is OK.
The pattern already appears elsewhere twice, anyway.

Is it sufficient to be opportunistic?  IOW, if we fail to refresh
the index or write the refreshed result to disk, can we be silent
here and rely on "stash create" and things that follow to safely
fail as necessary, or should we also be detecting errors?

>  	strvec_pushl(&cp.args, "stash", "create", NULL);
>  	cp.out = -1;
>  	cp.git_cmd = 1;

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files
  2022-07-17 16:28     ` ZheNing Hu
@ 2022-07-19 22:49       ` Junio C Hamano
  2022-07-21  1:09         ` Elijah Newren
  0 siblings, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-07-19 22:49 UTC (permalink / raw)
  To: ZheNing Hu; +Cc: Elijah Newren via GitGitGadget, Git List, Elijah Newren

ZheNing Hu <adlternative@gmail.com> writes:

> On Sun, Jun 19, 2022 at 2:50 PM Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote:
>>
>> From: Elijah Newren <newren@gmail.com>
>>
>> When there are racy-dirty files, but no files are modified,
>> `git stash create` exits with unsuccessful status.  This causes merge
>> to fail.  Refresh the index first to avoid this problem.

Racily dirty?  Or just being stat-dirty is sufficient to cause the
"stash create" to fail?

> I just want to show a scenario that triggers this error:
>
> 1. touch file
> 2. git add file
> 3. git stash push (the user may do this before git merge)
> 4. touch file (updates the timestamp but not the content)
> 5. git merge (calls git stash create, which returns 1)

I think, from the above reproduction recipe, that the breakage does
not depend on racily-clean index entries (i.e. file touched within
the same timestamp as the last write of the index without changing
their size).  So s/racy-dirty/stat-dirty/ (both on the title and the
body) would be a sufficient fix.

Thanks.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state
  2022-06-19  6:50   ` [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
  2022-07-17 16:41     ` ZheNing Hu
@ 2022-07-19 22:57     ` Junio C Hamano
  2022-07-21  2:03       ` Elijah Newren
  1 sibling, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-07-19 22:57 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, ZheNing Hu, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> Merge strategies can fail -- not just have conflicts, but give up and
> say that they are unable to handle the current type of merge.  However,
> they can also make changes to the index and working tree before giving
> up; merge-octopus does this, for example.  Currently, we do not expect
> the individual strategies to clean up after themselves, but instead
> expect builtin/merge.c to do so.  For it to be able to, it needs to save
> the state before trying the merge strategy so it can have something to
> restore to.  Therefore, remove the shortcut bypassing the save_state()
> call.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 2dc56fab70b..aaee8f6a553 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -1663,12 +1663,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>  	 * tree in the index -- this means that the index must be in
>  	 * sync with the head commit.  The strategies are responsible
>  	 * to ensure this.
> +	 *
> +	 * Stash away the local changes so that we can try more than one.
>  	 */

The comment explains why we limited the save_state() to avoid wasted
cycles and SSD wear and tear by looking at the number of strategies.
But because we are removing the restriction (which I am not 100%
sure is a good idea), "so that we can try more than one" no longer
applies as the reason why we run save_state() here.

> -	if (use_strategies_nr == 1 ||
> -	    /*
> -	     * Stash away the local changes so that we can try more than one.
> -	     */
> -	    save_state(&stash))
> +	if (save_state(&stash))
>  		oidclr(&stash);
>  
>  	for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 6/6] merge: do not exit restore_state() prematurely
  2022-06-19  6:50   ` [PATCH v2 6/6] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
  2022-07-17 16:44     ` ZheNing Hu
@ 2022-07-19 23:13     ` Junio C Hamano
  2022-07-20  0:09       ` Eric Sunshine
  2022-07-21  3:27       ` Elijah Newren
  1 sibling, 2 replies; 87+ messages in thread
From: Junio C Hamano @ 2022-07-19 23:13 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, ZheNing Hu, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Fix the main problem by making sure that restore_state() only skips the
> stash application if the stash is null rather than skipping the whole
> function.

OK.


> However, there is a secondary problem -- since merge.c forks
> subprocesses to do the cleanup, the in-memory index is left out-of-sync.
> While there was a refresh_cache(REFRESH_QUIET) call that attempted to
> correct that, that function would not handle cases where the previous
> merge strategy added conflicted entries.  We need to drop the index and
> re-read it to handle such cases.

Absolutely right.

> diff --git a/builtin/merge.c b/builtin/merge.c
> index aaee8f6a553..a21dece1b55 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -385,11 +385,11 @@ static void restore_state(const struct object_id *head,
>  {
>  	const char *args[] = { "stash", "apply", "--index", NULL, NULL };
>  
> -	if (is_null_oid(stash))
> -		return;
> -
>  	reset_hard(head, 1);
>  
> +	if (is_null_oid(stash))
> +		goto refresh_cache;
> +
>  	args[3] = oid_to_hex(stash);
>  
>  	/*
> @@ -398,7 +398,9 @@ static void restore_state(const struct object_id *head,
>  	 */
>  	run_command_v_opt(args, RUN_GIT_CMD);
>  
> -	refresh_cache(REFRESH_QUIET);
> +refresh_cache:
> +	if (discard_cache() < 0 || read_cache() < 0)
> +		die(_("could not read index"));

Don't we need refresh_cache() after re-reading the on-disk index, or
do we have nothing to do further after restore_state() returns and
the stat-info being stale does not matter?  Given that [3/6] exists,
I suspect that we do want to make sure the in-core index is refreshed
before we go ahead and run the next merge, no?

>  }
>  
>  /* This is called when no merge was necessary. */

> diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
> new file mode 100755

As long as we are adding a brand-new script for new tests, probably we
should add tests for other steps (like [4/6]) here, perhaps?

> index 00000000000..655478cd0b3
> --- /dev/null
> +++ b/t/t7607-merge-state.sh
> @@ -0,0 +1,25 @@
> +#!/bin/sh
> +
> +test_description="Test that merge state is as expected after failed merge"
> +
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +. ./test-lib.sh
> +
> +test_expect_success 'set up custom strategy' '
> +	test_commit --no-tag "Initial" base base &&
> +git show-ref &&

Is this part of the test, or a leftover debugging aid?

> +
> +	for b in branch1 branch2 branch3
> +	do
> +		git checkout -b $b main &&
> +		test_commit --no-tag "Change on $b" base $b
> +	done &&
> +
> +	git checkout branch1 &&
> +	test_must_fail git merge branch2 branch3 &&
> +	git diff --exit-code --name-status &&
> +	test_path_is_missing .git/MERGE_HEAD
> +'

Hmph, I am not sure if the new behaviour is not too pessimistic.
When octopus fails after successfully merging branch2 and then
failing the merge of branch3 (i.e. the last one) due to conflict,
I think Octopus users are used to being able to resolve it manually
and make a commit.  Are we making it impossible by doing the
reset-restore dance here?

I do not use, and more importantly, I do not recommend others to
use, Octopus anymore, and from that point of view, it is a good move
to make Octopus harder to use on any non-trivial merge, but those
who still like Octopus may disagree.

Thanks.

> +test_done

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 4/6] merge: make restore_state() restore staged state too
  2022-06-19  6:50   ` [PATCH v2 4/6] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
  2022-07-17 16:37     ` ZheNing Hu
@ 2022-07-19 23:14     ` Junio C Hamano
  2022-07-19 23:28       ` Junio C Hamano
  1 sibling, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-07-19 23:14 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, ZheNing Hu, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> merge can be invoked with uncommitted changes, including staged changes.
> merge is responsible for restoring this state if some of the merge
> strategies make changes.  However, it was not restoring staged changes
> due to the lack of the "--index" option to "git stash apply".  Add the
> option to fix this shortcoming.

Shouldn't this be testable?

> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 8ce4336dd3f..2dc56fab70b 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -383,14 +383,14 @@ static void reset_hard(const struct object_id *oid, int verbose)
>  static void restore_state(const struct object_id *head,
>  			  const struct object_id *stash)
>  {
> -	const char *args[] = { "stash", "apply", NULL, NULL };
> +	const char *args[] = { "stash", "apply", "--index", NULL, NULL };
>  
>  	if (is_null_oid(stash))
>  		return;
>  
>  	reset_hard(head, 1);
>  
> -	args[2] = oid_to_hex(stash);
> +	args[3] = oid_to_hex(stash);
>  
>  	/*
>  	 * It is OK to ignore error here, for example when there was

OK.



^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 2/6] merge: remove unused variable
  2022-06-19  6:50   ` [PATCH v2 2/6] merge: remove unused variable Elijah Newren via GitGitGadget
@ 2022-07-19 23:14     ` Junio C Hamano
  0 siblings, 0 replies; 87+ messages in thread
From: Junio C Hamano @ 2022-07-19 23:14 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, ZheNing Hu, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> restore_state() had a local variable sb that is not used, and in fact,
> was never used even in the original commit that introduced it,
> 1c7b76be7d ("Build in merge", 2008-07-07).  Remove it.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index f178f5a3ee1..00de224a2da 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -375,7 +375,6 @@ static void reset_hard(const struct object_id *oid, int verbose)
>  static void restore_state(const struct object_id *head,
>  			  const struct object_id *stash)
>  {
> -	struct strbuf sb = STRBUF_INIT;
>  	const char *args[] = { "stash", "apply", NULL, NULL };
>  
>  	if (is_null_oid(stash))
> @@ -391,7 +390,6 @@ static void restore_state(const struct object_id *head,
>  	 */
>  	run_command_v_opt(args, RUN_GIT_CMD);
>  
> -	strbuf_release(&sb);
>  	refresh_cache(REFRESH_QUIET);
>  }

Obviously correct ;-)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v2 4/6] merge: make restore_state() restore staged state too
  2022-07-19 23:14     ` Junio C Hamano
@ 2022-07-19 23:28       ` Junio C Hamano
  2022-07-21  1:37         ` Elijah Newren
  0 siblings, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-07-19 23:28 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, ZheNing Hu, Elijah Newren

Junio C Hamano <gitster@pobox.com> writes:

> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Elijah Newren <newren@gmail.com>
>>
>> merge can be invoked with uncommitted changes, including staged changes.
>> merge is responsible for restoring this state if some of the merge
>> strategies make changes.  However, it was not restoring staged changes
>> due to the lack of the "--index" option to "git stash apply".  Add the
>> option to fix this shortcoming.
>
> Shouldn't this be testable?

I actually take this part (which implied that the change is a good
idea) back.  I think we have clearly documented for the past 17
years that you can have local changes but your index must match the
HEAD before you start your merge.

If "stash apply" vs "stash apply --index" makes any difference,
there is something wrong.  We should be aborting the "git merge"
before we even start mucking with the working tree and the
index with strategies, no?  If this change makes any difference,
I think that is the bug to be fixed---we shouldn't be proceeding
to even create a stash with index changes to begin with.
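
For reference, the difference between the two invocations can be seen
outside of "git merge" with a sketch like the one below.  It assumes a
throwaway repository with a made-up file name and identity, and it only
shows what the flag does to a staged modification of a tracked file; it
says nothing about whether merge should have let such a state through.

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical file name and identity.
GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q . &&
echo one >file &&
git add file &&
git commit -q -m initial || exit 1

# Stage a modification, then record it without touching the work tree,
# in the same way save_state() runs "stash create".
echo two >file &&
git add file &&
stash=$(git stash create) || exit 1

# Plain apply: the change comes back in the working tree only.
git reset -q --hard &&
git stash apply "$stash" >/dev/null 2>&1
if git diff --cached --quiet; then plain=unstaged; else plain=staged; fi

# With --index: the staged state is restored as well.
git reset -q --hard &&
git stash apply --index "$stash" >/dev/null 2>&1
if git diff --cached --quiet; then with_index=unstaged; else with_index=staged; fi

echo "apply: $plain, apply --index: $with_index"
```
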

>
>> Signed-off-by: Elijah Newren <newren@gmail.com>
>> ---
>>  builtin/merge.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/builtin/merge.c b/builtin/merge.c
>> index 8ce4336dd3f..2dc56fab70b 100644
>> --- a/builtin/merge.c
>> +++ b/builtin/merge.c
>> @@ -383,14 +383,14 @@ static void reset_hard(const struct object_id *oid, int verbose)
>>  static void restore_state(const struct object_id *head,
>>  			  const struct object_id *stash)
>>  {
>> -	const char *args[] = { "stash", "apply", NULL, NULL };
>> +	const char *args[] = { "stash", "apply", "--index", NULL, NULL };
>>  
>>  	if (is_null_oid(stash))
>>  		return;
>>  
>>  	reset_hard(head, 1);
>>  
>> -	args[2] = oid_to_hex(stash);
>> +	args[3] = oid_to_hex(stash);
>>  
>>  	/*
>>  	 * It is OK to ignore error here, for example when there was
>
> OK.

* Re: [PATCH v2 6/6] merge: do not exit restore_state() prematurely
  2022-07-19 23:13     ` Junio C Hamano
@ 2022-07-20  0:09       ` Eric Sunshine
  2022-07-21  2:03         ` Elijah Newren
  2022-07-21  3:27       ` Elijah Newren
  1 sibling, 1 reply; 87+ messages in thread
From: Eric Sunshine @ 2022-07-20  0:09 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git List, ZheNing Hu,
	Junio C Hamano

[replying to Junio's email since I don't have the original available...]

On Tue, Jul 19, 2022 at 7:22 PM Junio C Hamano <gitster@pobox.com> wrote:
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > +     for b in branch1 branch2 branch3
> > +     do
> > +             git checkout -b $b main &&
> > +             test_commit --no-tag "Change on $b" base $b
> > +     done &&

Let's break out of the loop with `|| return 1` if something in the
loop body fails.

    for b in branch1 branch2 branch3
    do
        git checkout -b $b main &&
        test_commit --no-tag "Change on $b" base $b || return 1
    done &&
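
A minimal sketch of why the `|| return 1` matters, in plain POSIX shell outside the test harness. The `step` function is a hypothetical stand-in for the loop body, and the sketch assumes, as in git's test framework, that the body runs inside a function so that `return` is legal:

```shell
#!/bin/sh
# Hypothetical stand-in for a test step: fails only for "branch1".
step () {
	test "$1" != branch1
}

# Without "|| return 1": the loop's exit status is that of the last
# iteration, so the failure on branch1 is silently lost.
loop_unchecked () {
	for b in branch1 branch2 branch3
	do
		step "$b"
	done &&
	echo "continued after loop"
}

# With "|| return 1": the function stops at the first failing iteration.
loop_checked () {
	for b in branch1 branch2 branch3
	do
		step "$b" || return 1
	done &&
	echo "continued after loop"
}

loop_unchecked
unchecked_status=$?
loop_checked
checked_status=$?
echo "unchecked=$unchecked_status checked=$checked_status"
```

Here the unchecked loop reports success even though its first iteration failed, while the checked loop reports the failure.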

* Re: [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files
  2022-07-19 22:49       ` Junio C Hamano
@ 2022-07-21  1:09         ` Elijah Newren
  0 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-21  1:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: ZheNing Hu, Elijah Newren via GitGitGadget, Git List

On Tue, Jul 19, 2022 at 3:49 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> > Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote on Sun, Jun 19, 2022 at 14:50:
> >>
> >> From: Elijah Newren <newren@gmail.com>
> >>
> >> When there are racy-dirty files, but no files are modified,
> >> `git stash create` exits with unsuccessful status.  This causes merge
> >> to fail.  Refresh the index first to avoid this problem.
>
> Racily dirty?  Or just being stat-dirty is sufficient to cause the
> "stash create" to fail?
>
> > I just want to show a scenario that triggers this error:
> >
> > 1. touch file
> > 2. git add file
> > 3. git stash push (the user may do this before git merge)
> > 4. touch file (updates the file's timestamp but not its content)
> > 5. git merge (calls git stash create, which returns 1)
>
> I think, from the above reproduction recipe, that the breakage does
> not depend on racily-clean index entries (i.e. file touched within
> the same timestamp as the last write of the index without changing
> their size).  So s/racy-dirty/stat-dirty/ (both on the title and the
> body) would be a sufficient fix.

Yep, stat-dirty.  I'll fix up the title and body; thanks.

* Re: [PATCH v2 4/6] merge: make restore_state() restore staged state too
  2022-07-19 23:28       ` Junio C Hamano
@ 2022-07-21  1:37         ` Elijah Newren
  0 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-21  1:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu

On Tue, Jul 19, 2022 at 4:28 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> >> From: Elijah Newren <newren@gmail.com>
> >>
> >> merge can be invoked with uncommitted changes, including staged changes.
> >> merge is responsible for restoring this state if some of the merge
> >> strategies make changes.  However, it was not restoring staged changes
> >> due to the lack of the "--index" option to "git stash apply".  Add the
> >> option to fix this shortcoming.
> >
> > Shouldn't this be testable?

Yes, I will add a test.

> I actually take this part (which implied that the change is a good
> idea) back. I think we have clearly documented for the past 17
> years that you can have local changes but your index must match the
> HEAD before you start your merge.

Actually, we don't enforce that the index must match HEAD in all
cases, as noted in commit 55f39cf755 ("merge: fix misleading pre-merge
check documentation", 2018-06-30).  That commit also pointed out how
the documentation was a bit unclear in this area.

We also apparently fail to enforce the condition in at least two cases
that weren't a valid exception, which I just found while working on a
testcase for this patch.  (Thus, we have one more sordid tale to add
to the saga in commit 9822175d2b ("Ensure index matches head before
invoking merge machinery, round N", 2019-08-17))

However, the failed enforcement and the "valid" special exceptions
aren't too relevant here, so...

> If "stash apply" vs "stash apply --index" makes any difference,
> there is something wrong.  We should be aborting the "git merge"
> even before we even start mucking with the working tree and the
> index with strategies, no?  I think it is the bug, if this change
> makes any difference, to be fixed---we shouldn't be proceeding to
> even create a stash with index changes to begin with.

I agree with you that generally if the index does not match HEAD, then
(A) we should abort the merge, and (B) the working tree and index need
to be left intact when the merge aborts.

But I don't think your conclusion follows from those two items,
because of the last sentence of this comment:

   /*
    * At this point, we need a real merge.  No matter what strategy
    * we use, it would operate on the index, possibly affecting the
    * working tree, and when resolved cleanly, have the desired
    * tree in the index -- this means that the index must be in
    * sync with the head commit.  The strategies are responsible
    * to ensure this.
    */

Due to this requirement, if a user has staged changes before starting
the merge, builtin/merge.c will:

   * stash the changes
   * try all the merge strategies in turn, each of which reports that
     it cannot function due to the index not matching HEAD
   * restore the changes via "git stash apply"

This sequence has the net effect of not quite cleanly aborting the
merge -- it also unstages the user's changes.

One way to fix this problem is the simple patch I proposed.  An
alternative fix would be to rip out the extra code from all the merge
strategies that enforces the index-matches-HEAD requirement, and then
add enforcement of that condition early in builtin/merge.c.  That
alternative fix probably would have saved us from a lot of the
headache detailed in commit 9822175d2b above, but it may also make
recursive and ort a bit slower (which had relied on unpack-trees to do
some of this checking, and thus they'd have some redundant checks).
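
The difference between the two restore commands can be sketched in a throwaway repository. This is only an illustration and assumes a `git` binary on the PATH; the file name `a`, the `demo` identity, and the variable names are made up for the example:

```shell
#!/bin/sh
# Identity supplied via the environment so no git config is required.
GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d) && cd "$repo" && git init -q . &&
echo one >a && git add a && git commit -q -m initial &&

# Stage a change, then snapshot it with "git stash create", the command
# the thread says save_state() runs.
echo changed >a && git add a &&
stash=$(git stash create) &&

# Plain "stash apply": the change comes back only in the working tree.
git reset -q --hard &&
git stash apply -q "$stash" &&
if git diff --cached --quiet; then plain=unstaged; else plain=staged; fi

# "stash apply --index": the staged state is restored as well.
git reset -q --hard &&
git stash apply -q --index "$stash" &&
if git diff --cached --quiet; then with_index=unstaged; else with_index=staged; fi

echo "plain=$plain with_index=$with_index"
```

`git diff --cached --quiet` exits non-zero when the index differs from HEAD, so it serves here as the "are changes still staged?" probe.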

* Re: [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state
  2022-07-19 22:57     ` Junio C Hamano
@ 2022-07-21  2:03       ` Elijah Newren
  0 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-21  2:03 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu

On Tue, Jul 19, 2022 at 3:57 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > Merge strategies can fail -- not just have conflicts, but give up and
> > say that they are unable to handle the current type of merge.  However,
> > they can also make changes to the index and working tree before giving
> > up; merge-octopus does this, for example.  Currently, we do not expect
> > the individual strategies to clean up after themselves, but instead
> > expect builtin/merge.c to do so.  For it to be able to, it needs to save
> > the state before trying the merge strategy so it can have something to
> > restore to.  Therefore, remove the shortcut bypassing the save_state()
> > call.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  builtin/merge.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/builtin/merge.c b/builtin/merge.c
> > index 2dc56fab70b..aaee8f6a553 100644
> > --- a/builtin/merge.c
> > +++ b/builtin/merge.c
> > @@ -1663,12 +1663,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
> >        * tree in the index -- this means that the index must be in
> >        * sync with the head commit.  The strategies are responsible
> >        * to ensure this.
> > +      *
> > +      * Stash away the local changes so that we can try more than one.
> >        */
>
> The comment explains why we limited the save_state() to avoid wasted
> cycles and SSD wear and tear by looking at the number of strategies.
> But because we are removing the restriction (which I am not 100%
> sure is a good idea), "so that we can try more than one" no longer
> applies as the reason why we run save_state() here.

I should probably change it to "Stash away the local changes so that
we can try more than one and/or recover from merge strategies
bailing".

In regards to the "good idea" side, I don't really like it either, but:

  1. Merge strategies are allowed to make whatever changes they want
to the index and working tree.  They have no requirement to clean up
after themselves.
  2. Merge strategies are allowed to bail and say, "Nevermind, not
only can I not successfully merge this, I can't even leave conflicts
for the users to resolve; I just can't handle it at all.  Try some
other strategy."  (See the "exit 2" codepath of commit 98efc8f3d8
("octopus: allow manual resolve on the last round.", 2006-01-13), for
example)
  3. Merge strategies can bail with the "try some other strategy"
response _after_ mucking with the index and working tree.  octopus
does (again, see the commit mentioned above)
  4. builtin/merge.c previously would _only_ cleanup for merge
strategies if there was more than 1.

This combination is clearly broken.  We need to fix either item 1 or
item 4 (or maybe item 3?).  Since 1 might run into issues with
user-custom merge strategies which may have been written to mimic
octopus and thus rely on the assurances in items 1-3 above, I figured
fixing item 4 was the easiest.
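
The interaction of items 1-4 can be sketched as a pure-shell simulation. This is not the logic in builtin/merge.c: a single variable stands in for the index and working tree, the strategy functions are hypothetical, and the exit codes (0 clean, 1 conflicts, 2 cannot handle) are assumed from the "exit 2" code path mentioned above:

```shell
#!/bin/sh
# Simulated state: one variable standing in for index + working tree.
state=clean

# Hypothetical strategies.  The first pollutes the state and then bails
# with "try some other strategy"; the second leaves conflicts for the user.
strategy_bails_dirty () { state=polluted; return 2; }
strategy_conflicts () { state=conflicted; return 1; }

try_strategies () {
	saved=$state		# stand-in for save_state(), done unconditionally
	for s in "$@"
	do
		"$s"
		ret=$?
		if test "$ret" -lt 2
		then
			return "$ret"	# the strategy handled the merge
		fi
		state=$saved	# stand-in for restore_state() after a bail
	done
	return 2		# no strategy handled the merge
}

# A single strategy that bails: the state must still be restored.
try_strategies strategy_bails_dirty
single_ret=$?
single_state=$state

# Two strategies: cleanup after the first, then the second one runs.
state=clean
try_strategies strategy_bails_dirty strategy_conflicts
multi_ret=$?
multi_state=$state

echo "single: ret=$single_ret state=$single_state"
echo "multi:  ret=$multi_ret state=$multi_state"
```

Saving only when more than one strategy is given (item 4) would leave the single-strategy case with the polluted state, which is the breakage described above.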

* Re: [PATCH v2 6/6] merge: do not exit restore_state() prematurely
  2022-07-20  0:09       ` Eric Sunshine
@ 2022-07-21  2:03         ` Elijah Newren
  0 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-21  2:03 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Elijah Newren via GitGitGadget, Git List, ZheNing Hu,
	Junio C Hamano

On Tue, Jul 19, 2022 at 5:09 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> [replying to Junio's email since I don't have the original available...]
>
> On Tue, Jul 19, 2022 at 7:22 PM Junio C Hamano <gitster@pobox.com> wrote:
> > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > > +     for b in branch1 branch2 branch3
> > > +     do
> > > +             git checkout -b $b main &&
> > > +             test_commit --no-tag "Change on $b" base $b
> > > +     done &&
>
> Let's break out of the loop with `|| return 1` if something in the
> loop body fails.
>
>     for b in branch1 branch2 branch3
>     do
>         git checkout -b $b main &&
>         test_commit --no-tag "Change on $b" base $b || return 1
>     done &&

Okay, will do.

* Re: [PATCH v2 6/6] merge: do not exit restore_state() prematurely
  2022-07-19 23:13     ` Junio C Hamano
  2022-07-20  0:09       ` Eric Sunshine
@ 2022-07-21  3:27       ` Elijah Newren
  1 sibling, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-21  3:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu

On Tue, Jul 19, 2022 at 4:13 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Fix the main problem by making sure that restore_state() only skips the
> > stash application if the stash is null rather than skipping the whole
> > function.
>
> OK.
>
>
> > However, there is a secondary problem -- since merge.c forks
> > subprocesses to do the cleanup, the in-memory index is left out-of-sync.
> > While there was a refresh_cache(REFRESH_QUIET) call that attempted to
> > correct that, that function would not handle cases where the previous
> > merge strategy added conflicted entries.  We need to drop the index and
> > re-read it to handle such cases.
>
> Absolutely right.
>
> > diff --git a/builtin/merge.c b/builtin/merge.c
> > index aaee8f6a553..a21dece1b55 100644
> > --- a/builtin/merge.c
> > +++ b/builtin/merge.c
> > @@ -385,11 +385,11 @@ static void restore_state(const struct object_id *head,
> >  {
> >       const char *args[] = { "stash", "apply", "--index", NULL, NULL };
> >
> > -     if (is_null_oid(stash))
> > -             return;
> > -
> >       reset_hard(head, 1);
> >
> > +     if (is_null_oid(stash))
> > +             goto refresh_cache;
> > +
> >       args[3] = oid_to_hex(stash);
> >
> >       /*
> > @@ -398,7 +398,9 @@ static void restore_state(const struct object_id *head,
> >        */
> >       run_command_v_opt(args, RUN_GIT_CMD);
> >
> > -     refresh_cache(REFRESH_QUIET);
> > +refresh_cache:
> > +     if (discard_cache() < 0 || read_cache() < 0)
> > +             die(_("could not read index"));
>
> Don't we need refresh_cache() after re-reading the on-disk index, or
> do we have nothing to do further after restore_state() returns and
> the stat-info being stale does not matter?  Given that [3/6] exists,
> I suspect that we do want to make sure the in-core index is refreshed
> before we go ahead and run the next merge, no?

I don't think so; the situation for [3/6] is different.  The basic
timeline is as follows:
    1. <User does lots of stuff over weeks and months>
    2. User decides to merge one or more branches
    3. merge does save_state() [i.e. "git stash create"]
    4. The first strategy fails
    5. We restore the state before trying the next strategy

The current code is dealing with step 5.  Patch [3/6] was to prevent
failures in step 3 from users creating stat-dirty files in step 1.
Once step 3 runs, the only way to become stat-dirty again is if the
user simultaneously messes with their checkout while the "git merge"
command is running.  Attempting to preventatively handle users
modifying the working tree simultaneously with concurrent git commands
like `git merge` seems like a losing proposition to me; it'd be a huge
can of worms and have a million holes.  I don't think that's worth it.

> >  }
> >
> >  /* This is called when no merge was necessary. */
>
> > diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
> > new file mode 100755
>
> As long we are adding a brand-new script for new tests, probably we
> should add tests for other steps (like [4/6]) here, perhaps?

Yes.

>
> > index 00000000000..655478cd0b3
> > --- /dev/null
> > +++ b/t/t7607-merge-state.sh
> > @@ -0,0 +1,25 @@
> > +#!/bin/sh
> > +
> > +test_description="Test that merge state is as expected after failed merge"
> > +
> > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'set up custom strategy' '
> > +     test_commit --no-tag "Initial" base base &&
> > +git show-ref &&
>
> Is this part of the test, or a leftover debugging aid?

Looks like part of a leftover debugging aid; sorry about that.  Will clean up.

> > +
> > +     for b in branch1 branch2 branch3
> > +     do
> > +             git checkout -b $b main &&
> > +             test_commit --no-tag "Change on $b" base $b
> > +     done &&
> > +
> > +     git checkout branch1 &&
> > +     test_must_fail git merge branch2 branch3 &&
> > +     git diff --exit-code --name-status &&
> > +     test_path_is_missing .git/MERGE_HEAD
> > +'
>
> Hmph, I am not sure if the new behaviour is not too pessimistic.
> When octopus fails after successfully merging branch2 and then
> failing the merge of branch3 (i.e. the last one) due to conflict,

That's not what's happening here.  It is not failing due to conflict,
octopus is reporting that it cannot even leave things in a conflicted
state for the user, and is actually incapable of handling this
particular type of merge.  Part of the output seen when attempting
this merge includes:
    fatal: merge program failed
    Should not be doing an octopus.

See previous discussion at
https://lore.kernel.org/git/xmqq35hdd205.fsf@gitster.g/.  To make this
clearer, perhaps I should use "test_expect_code 2" instead of
test_must_fail, and also grep the output/error for the above messages.
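
A rough sketch of what the stricter check buys, in plain shell. `fake_merge` is a hypothetical stand-in for the failing `git merge branch2 branch3`, and `expect_code` is only a minimal imitation of the test-lib helper, not its real implementation:

```shell
#!/bin/sh
# Hypothetical stand-in for the merge in the scenario above.
fake_merge () {
	echo "fatal: merge program failed" >&2
	echo "Should not be doing an octopus."
	return 2
}

# Minimal imitation of test_expect_code: succeed only on the exact status.
expect_code () {
	want=$1
	shift
	"$@"
	test "$?" -eq "$want"
}

output=$(mktemp)

# Strict check: exact exit status plus both diagnostic messages.
expect_code 2 fake_merge >"$output" 2>&1 &&
grep "fatal: merge program failed" "$output" >/dev/null &&
grep "Should not be doing an octopus" "$output" >/dev/null
strict_status=$?

# Expecting the "conflicts" status (1) instead does not match a bail-out.
expect_code 1 fake_merge >/dev/null 2>&1
conflict_status=$?

rm -f "$output"
echo "strict=$strict_status conflict=$conflict_status"
```

A bare "must fail" check would accept both status 1 and status 2, so it could not distinguish "left conflicts" from "cannot handle this merge at all".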

> I think octopus users are used to being able to resolve it manually
> and make a commit.  Are we making it impossible by doing the
> reset-restore dance here?

No, we are not changing what octopus can handle here.  The code above
is not triggered when octopus returns that it ran into conflicts for
the user to resolve.  It is only triggered when octopus says it cannot
handle the merge in question.  See your commit 98efc8f3d8 ("octopus:
allow manual resolve on the last round.", 2006-01-13), and note that
the "exit 2" code path is the one that we are hitting.  I'll add
some comments to the testcase and rewrite the commit message to try to
make this clearer.

* [PATCH v3 0/7] Fix merge restore state
  2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
                     ` (5 preceding siblings ...)
  2022-06-19  6:50   ` [PATCH v2 6/6] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
@ 2022-07-21  8:16   ` Elijah Newren via GitGitGadget
  2022-07-21  8:16     ` [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
                       ` (7 more replies)
  6 siblings, 8 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-21  8:16 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren

NOTE: Rebased on master, yet again, because (1) Junio merged his commit to
master separately, and (2) Ævar's intentional duplication of my second
patch[1] later conflicted with other changes I had to make in the same area.
(Which isn't a big deal, but for future reference, it would be nicer to
avoid conflicts by omitting the fixup I had already submitted[2] instead of
intentionally duplicating it).

This started as a simple series to fix restore_state() in builtin/merge.c,
fixing an issue reported by ZheNing Hu[3]. It's grown so much it's hard to
call it simple. Anyway...

Changes since v2:

 * Removed the first two patches, as noted above in the comment about
   rebasing.
 * Inserted new patches 3, 4, and 5 to fix some related bugs. Folks are more
   likely to object to patch 5 than the others; people should probably take
   a look at that one if they have limited time.
 * Dramatically reworded commit messages given the misunderstandings of what
   was being addressed and done. Hopefully it is much clearer what the last
   three patches are doing and what they are not doing, and why.
 * Added several new testcases

[1] https://lore.kernel.org/git/patch-1.1-7d90f26b73f-20220520T115426Z-avarab@gmail.com/
[2] https://lore.kernel.org/git/xmqqedyyghsc.fsf@gitster.g/
[3] https://lore.kernel.org/git/CAOLTT8R7QmpvaFPTRs3xTpxr7eiuxF-ZWtvUUSC0-JOo9Y+SqA@mail.gmail.com/

Elijah Newren (7):
  merge-ort-wrappers: make printed message match the one from recursive
  merge-resolve: abort if index does not match HEAD
  merge: do not abort early if one strategy fails to handle the merge
  merge: fix save_state() to work when there are stat-dirty files
  merge: make restore_state() restore staged state too
  merge: ensure we can actually restore pre-merge state
  merge: do not exit restore_state() prematurely

 builtin/merge.c                          | 59 ++++++++++++++++++------
 git-merge-resolve.sh                     | 10 ++++
 merge-ort-wrappers.c                     |  7 ++-
 t/t6402-merge-rename.sh                  |  2 +-
 t/t6424-merge-unrelated-index-changes.sh | 58 +++++++++++++++++++++++
 t/t6439-merge-co-error-msgs.sh           |  1 +
 t/t7607-merge-state.sh                   | 32 +++++++++++++
 7 files changed, 154 insertions(+), 15 deletions(-)
 create mode 100755 t/t7607-merge-state.sh


base-commit: e72d93e88cb20b06e88e6e7d81bd1dc4effe453f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1231%2Fnewren%2Ffix-merge-restore-state-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1231/newren/fix-merge-restore-state-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1231

Range-diff vs v2:

 1:  6147e72c309 < -:  ----------- t6424: make sure a failed merge preserves local changes
 2:  230d84f09c8 < -:  ----------- merge: remove unused variable
 -:  ----------- > 1:  e39b2e15ece merge-ort-wrappers: make printed message match the one from recursive
 -:  ----------- > 2:  2810dec7608 merge-resolve: abort if index does not match HEAD
 -:  ----------- > 3:  b41853e3f99 merge: do not abort early if one strategy fails to handle the merge
 3:  89e5e633241 ! 4:  64700338a28 merge: fix save_state() to work when there are racy-dirty files
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    merge: fix save_state() to work when there are racy-dirty files
     +    merge: fix save_state() to work when there are stat-dirty files
      
     -    When there are racy-dirty files, but no files are modified,
     +    When there are stat-dirty files, but no files are modified,
          `git stash create` exits with unsuccessful status.  This causes merge
     -    to fail.  Refresh the index first to avoid this problem.
     +    to fail.  Copy some code from sequencer.c's create_autostash to refresh
     +    the index first to avoid this problem.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ builtin/merge.c: static int save_state(struct object_id *stash)
       	strvec_pushl(&cp.args, "stash", "create", NULL);
       	cp.out = -1;
       	cp.git_cmd = 1;
     +
     + ## t/t6424-merge-unrelated-index-changes.sh ##
     +@@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'subtree' '
     + 	test_path_is_missing .git/MERGE_HEAD
     + '
     + 
     ++test_expect_success 'avoid failure due to stat-dirty files' '
     ++	git reset --hard &&
     ++	git checkout B^0 &&
     ++
     ++	# Make "a" be stat-dirty
     ++	test-tool chmtime =+1 a &&
     ++
     ++	# stat-dirty file should not prevent stash creation in builtin/merge.c
     ++	git merge -s resolve -s recursive D^0
     ++'
     ++
     + test_expect_success 'resolve && recursive && ort' '
     + 	git reset --hard &&
     + 	git checkout B^0 &&
 4:  4a8b7c9e06d ! 5:  91c495c770e merge: make restore_state() restore staged state too
     @@ Metadata
       ## Commit message ##
          merge: make restore_state() restore staged state too
      
     -    merge can be invoked with uncommitted changes, including staged changes.
     -    merge is responsible for restoring this state if some of the merge
     -    strategies make changes.  However, it was not restoring staged changes
     -    due to the lack of the "--index" option to "git stash apply".  Add the
     -    option to fix this shortcoming.
     +    There are multiple issues at play here:
     +
     +      1) If `git merge` is invoked with staged changes, it should abort
     +         without doing any merging, and the user's working tree and index
     +         should be the same as before merge was invoked.
     +      2) Merge strategies are responsible for enforcing the index == HEAD
     +         requirement. (See 9822175d2b ("Ensure index matches head before
     +         invoking merge machinery, round N", 2019-08-17) for some history
     +         around this.)
     +      3) Merge strategies can bail saying they are not an appropriate
     +         handler for the merge in question (possibly allowing other
     +         strategies to be used instead).
     +      4) Merge strategies can make changes to the index and working tree,
     +         and have no expectation to clean up after themselves, *even* if
     +         they bail out and say they are not an appropriate handler for
     +         the merge in question.  (The `octopus` merge strategy does this,
     +         for example.)
     +      5) Because of (3) and (4), builtin/merge.c stashes state before
     +         trying merge strategies and restores it afterward.
     +
     +    Unfortunately, if users had staged changes before calling `git merge`,
     +    builtin/merge.c could do the following:
     +
     +       * stash the changes, in order to clean up after the strategies
     +       * try all the merge strategies in turn, each of which report they
     +         cannot function due to the index not matching HEAD
     +       * restore the changes via "git stash apply"
     +
     +    But that last step would have the net effect of unstaging the user's
     +    changes.  Fix this by adding the "--index" option to "git stash apply".
     +    While at it, also squelch the stash apply output; we already report
     +    "Rewinding the tree to pristine..." and don't need a detailed `git
     +    status` report afterwards.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ builtin/merge.c: static void reset_hard(const struct object_id *oid, int verbose
       			  const struct object_id *stash)
       {
      -	const char *args[] = { "stash", "apply", NULL, NULL };
     -+	const char *args[] = { "stash", "apply", "--index", NULL, NULL };
     ++	const char *args[] = { "stash", "apply", "--index", "--quiet",
     ++			       NULL, NULL };
       
       	if (is_null_oid(stash))
       		return;
     @@ builtin/merge.c: static void reset_hard(const struct object_id *oid, int verbose
       	reset_hard(head, 1);
       
      -	args[2] = oid_to_hex(stash);
     -+	args[3] = oid_to_hex(stash);
     ++	args[4] = oid_to_hex(stash);
       
       	/*
       	 * It is OK to ignore error here, for example when there was
     +
     + ## t/t6424-merge-unrelated-index-changes.sh ##
     +@@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'resolve && recursive && ort' '
     + 
     + 	test_seq 0 10 >a &&
     + 	git add a &&
     ++	git rev-parse :a >expect &&
     + 
     + 	sane_unset GIT_TEST_MERGE_ALGORITHM &&
     + 	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
     +@@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'resolve && recursive && ort' '
     + 	grep "Trying merge strategy resolve..." output &&
     + 	grep "Trying merge strategy recursive..." output &&
     + 	grep "Trying merge strategy ort..." output &&
     +-	grep "No merge strategy handled the merge." output
     ++	grep "No merge strategy handled the merge." output &&
     ++
     ++	# Changes to "a" should remain staged
     ++	git rev-parse :a >actual &&
     ++	test_cmp expect actual
     + '
     + 
     + test_done
 5:  a03075167c1 ! 6:  887967c1f3f merge: ensure we can actually restore pre-merge state
     @@ Metadata
       ## Commit message ##
          merge: ensure we can actually restore pre-merge state
      
     -    Merge strategies can fail -- not just have conflicts, but give up and
     -    say that they are unable to handle the current type of merge.  However,
     -    they can also make changes to the index and working tree before giving
     -    up; merge-octopus does this, for example.  Currently, we do not expect
     -    the individual strategies to clean up after themselves, but instead
     -    expect builtin/merge.c to do so.  For it to be able to, it needs to save
     -    the state before trying the merge strategy so it can have something to
     -    restore to.  Therefore, remove the shortcut bypassing the save_state()
     -    call.
     +    Merge strategies can:
     +      * succeed with a clean merge
     +      * succeed with a conflicted merge
     +      * fail to handle the given type of merge
     +
     +    If one is thinking in terms of automatic mergeability, they would use
     +    the word "fail" instead of "succeed" for the second bullet, but I am
     +    focusing here on ability of the merge strategy to handle the given
     +    inputs, not on whether the given inputs are mergeable.  The third
     +    category is about the merge strategy failing to know how to handle the
     +    given data; examples include:
     +
     +      * Passing more than 2 branches to 'recursive' or 'ort'
     +      * Passing 2 or fewer branches to 'octopus'
     +      * Trying to do more complicated merges with 'resolve' (I believe
     +        directory/file conflicts will cause it to bail.)
     +      * Octopus running into a merge conflict for any branch OTHER than
     +        the final one (see the "exit 2" codepath of commit 98efc8f3d8
     +        ("octopus: allow manual resolve on the last round.", 2006-01-13))
     +
     +    That final one is particularly interesting, because it shows that the
     +    merge strategy can muck with the index and working tree, and THEN bail
     +    and say "sorry, this strategy cannot handle this type of merge; use
     +    something else".
     +
     +    Further, we do not currently expect the individual strategies to clean
     +    up after themselves, but instead expect builtin/merge.c to do so.  For
     +    it to be able to, it needs to save the state before trying the merge
     +    strategy so it can have something to restore to.  Therefore, remove the
     +    shortcut bypassing the save_state() call.
     +
     +    There is another bug on the restore_state() side of things, so no
     +    testcase will be added until the next commit when we have addressed that
     +    issue as well.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ builtin/merge.c: int cmd_merge(int argc, const char **argv, const char *prefix)
       	 * sync with the head commit.  The strategies are responsible
       	 * to ensure this.
      +	 *
     -+	 * Stash away the local changes so that we can try more than one.
     ++	 * Stash away the local changes so that we can try more than one
     ++	 * and/or recover from merge strategies bailing while leaving the
     ++	 * index and working tree polluted.
       	 */
      -	if (use_strategies_nr == 1 ||
      -	    /*
 6:  0783b48c121 ! 7:  81c40492a62 merge: do not exit restore_state() prematurely
     @@ Commit message
          appropriate function to do the work which would update the in-memory
          index automatically.  For now, just do the simple fix.)
      
     +    Also, add a testcase checking this, one for which the octopus strategy
     +    fails on the first commit it attempts to merge, and thus which it
     +    cannot handle at all and must completely bail on (as per the "exit 2"
     +    code path of commit 98efc8f3d8 ("octopus: allow manual resolve on the
     +    last round.", 2006-01-13)).
     +
          Reported-by: ZheNing Hu <adlternative@gmail.com>
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
       ## builtin/merge.c ##
      @@ builtin/merge.c: static void restore_state(const struct object_id *head,
     - {
     - 	const char *args[] = { "stash", "apply", "--index", NULL, NULL };
     + 	const char *args[] = { "stash", "apply", "--index", "--quiet",
     + 			       NULL, NULL };
       
      -	if (is_null_oid(stash))
      -		return;
     @@ builtin/merge.c: static void restore_state(const struct object_id *head,
      +	if (is_null_oid(stash))
      +		goto refresh_cache;
      +
     - 	args[3] = oid_to_hex(stash);
     + 	args[4] = oid_to_hex(stash);
       
       	/*
      @@ builtin/merge.c: static void restore_state(const struct object_id *head,
     @@ t/t7607-merge-state.sh (new)
      +
      +test_expect_success 'set up custom strategy' '
      +	test_commit --no-tag "Initial" base base &&
     -+git show-ref &&
      +
      +	for b in branch1 branch2 branch3
      +	do
      +		git checkout -b $b main &&
     -+		test_commit --no-tag "Change on $b" base $b
     ++		test_commit --no-tag "Change on $b" base $b || return 1
      +	done &&
      +
      +	git checkout branch1 &&
     -+	test_must_fail git merge branch2 branch3 &&
     ++	# This is a merge that octopus cannot handle.  Note, that it does not
     ++	# just hit conflicts, it completely fails and says that it cannot
     ++	# handle this type of merge.
     ++	test_expect_code 2 git merge branch2 branch3 >output 2>&1 &&
     ++	grep "fatal: merge program failed" output &&
     ++	grep "Should not be doing an octopus" output &&
     ++
     ++	# Make sure we did not leave stray changes around when no appropriate
     ++	# merge strategy was found
      +	git diff --exit-code --name-status &&
      +	test_path_is_missing .git/MERGE_HEAD
      +'

-- 
gitgitgadget


* [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
@ 2022-07-21  8:16     ` Elijah Newren via GitGitGadget
  2022-07-21 15:47       ` Junio C Hamano
  2022-07-21  8:16     ` [PATCH v3 2/7] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
                       ` (6 subsequent siblings)
  7 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-21  8:16 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When the index does not match HEAD, the merge strategies are responsible
for detecting that condition and aborting.  The merge-ort-wrappers had code to
implement this and meant to copy the error message from merge-recursive
but deviated in two ways, both due to the message in merge-recursive
being processed by another function that made additional changes:
  * It added an implicit "error: " prefix
  * It added an implicit trailing newline

Add these things, but do so in a couple extra steps to avoid having
translators need to translate another not-quite-identical string.
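
As an illustrative aside, the three-step assembly described above can be
sketched in plain C.  This is only a sketch under stated assumptions: it
does not use git's strbuf API, and format_unclean_error() and the _()
stand-in are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for gettext's _(); assumed here to return its argument as-is. */
#define _(msgid) (msgid)

/*
 * Hypothetical helper: write "error: " + translated message + "\n" into
 * buf.  Only the middle piece is marked for translation, so translators
 * keep seeing the string that already exists elsewhere.
 * Returns the length written, or -1 if buf is too small.
 */
static int format_unclean_error(char *buf, size_t size, const char *files)
{
	size_t len = 0;
	int n;

	assert(buf && files);

	/* Step 1: untranslated prefix. */
	n = snprintf(buf, size, "error: ");
	if (n < 0 || (size_t)n >= size)
		return -1;
	len += (size_t)n;

	/* Step 2: the translatable message, unchanged. */
	n = snprintf(buf + len, size - len,
		     _("Your local changes to the following files would be overwritten by merge:\n  %s"),
		     files);
	if (n < 0 || (size_t)n >= size - len)
		return -1;
	len += (size_t)n;

	/* Step 3: trailing newline. */
	n = snprintf(buf + len, size - len, "\n");
	if (n < 0 || (size_t)n >= size - len)
		return -1;
	len += (size_t)n;

	return (int)len;
}
```

A caller would pass a buffer and the list of changed files, then write the
result to stderr in one call.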

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort-wrappers.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/merge-ort-wrappers.c b/merge-ort-wrappers.c
index ad041061695..d2c416bb5c0 100644
--- a/merge-ort-wrappers.c
+++ b/merge-ort-wrappers.c
@@ -10,8 +10,13 @@ static int unclean(struct merge_options *opt, struct tree *head)
 	struct strbuf sb = STRBUF_INIT;
 
 	if (head && repo_index_has_changes(opt->repo, head, &sb)) {
-		fprintf(stderr, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
+		struct strbuf err = STRBUF_INIT;
+		strbuf_addstr(&err, "error: ");
+		strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
 		    sb.buf);
+		strbuf_addch(&err, '\n');
+		fputs(err.buf, stderr);
+		strbuf_release(&err);
 		strbuf_release(&sb);
 		return -1;
 	}
-- 
gitgitgadget



* [PATCH v3 2/7] merge-resolve: abort if index does not match HEAD
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
  2022-07-21  8:16     ` [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
@ 2022-07-21  8:16     ` Elijah Newren via GitGitGadget
  2022-07-21  8:16     ` [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
                       ` (5 subsequent siblings)
  7 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-21  8:16 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

As noted in commit 9822175d2b ("Ensure index matches head before
invoking merge machinery, round N", 2019-08-17), we have had a very
long history of problems with failing to enforce the requirement that
index matches HEAD when starting a merge.  One of the commits
referenced in the long tale of issues arising from lax enforcement of
this requirement was commit 55f39cf755 ("merge: fix misleading
pre-merge check documentation", 2018-06-30), which tried to document
the requirement and noted there were some exceptions.  As mentioned in
that commit message, the `resolve` strategy was the one strategy that
did not have an explicit index matching HEAD check, and the reason it
didn't was that I wasn't able to discover any cases where the
implementation would fail to catch the problem and abort, and didn't
want to introduce unnecessary performance overhead of adding another
check.

Well, today I discovered a testcase where the implementation does not
catch the problem and so an explicit check is needed.  Add a testcase
that previously would have failed, and update git-merge-resolve.sh to
have an explicit check.  Note that the code is copied from 3ec62ad9ff
("merge-octopus: abort if index does not match HEAD", 2016-04-09), so
that we reuse the same message and avoid making translators need to
translate some new message.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          | 20 ++++++++++++++++++
 git-merge-resolve.sh                     | 10 +++++++++
 t/t6424-merge-unrelated-index-changes.sh | 26 ++++++++++++++++++++++++
 3 files changed, 56 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 23170f2d2a6..13884b8e836 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1599,6 +1599,26 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 		 */
 		refresh_cache(REFRESH_QUIET);
 		if (allow_trivial && fast_forward != FF_ONLY) {
+			/*
+			 * Must first ensure that index matches HEAD before
+			 * attempting a trivial merge.
+			 */
+			struct tree *head_tree = get_commit_tree(head_commit);
+			struct strbuf sb = STRBUF_INIT;
+
+			if (repo_index_has_changes(the_repository, head_tree,
+						   &sb)) {
+				struct strbuf err = STRBUF_INIT;
+				strbuf_addstr(&err, "error: ");
+				strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
+					    sb.buf);
+				strbuf_addch(&err, '\n');
+				fputs(err.buf, stderr);
+				strbuf_release(&err);
+				strbuf_release(&sb);
+				return -1;
+			}
+
 			/* See if it is really trivial. */
 			git_committer_info(IDENT_STRICT);
 			printf(_("Trying really trivial in-index merge...\n"));
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
index 343fe7bccd0..77e93121bf8 100755
--- a/git-merge-resolve.sh
+++ b/git-merge-resolve.sh
@@ -5,6 +5,16 @@
 #
 # Resolve two trees, using enhanced multi-base read-tree.
 
+. git-sh-setup
+
+# Abort if index does not match HEAD
+if ! git diff-index --quiet --cached HEAD --
+then
+    gettextln "Error: Your local changes to the following files would be overwritten by merge"
+    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
+    exit 2
+fi
+
 # The first parameters up to -- are merge bases; the rest are heads.
 bases= head= remotes= sep_seen=
 for arg
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index b6e424a427b..f35d3182b86 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -114,6 +114,32 @@ test_expect_success 'resolve, non-trivial' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'resolve, trivial, related file removed' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	git rm a &&
+	test_path_is_missing a &&
+
+	test_must_fail git merge -s resolve C^0 &&
+
+	test_path_is_missing a &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
+test_expect_success 'resolve, non-trivial, related file removed' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	git rm a &&
+	test_path_is_missing a &&
+
+	test_must_fail git merge -s resolve D^0 &&
+
+	test_path_is_missing a &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
 test_expect_success 'recursive' '
 	git reset --hard &&
 	git checkout B^0 &&
-- 
gitgitgadget



* [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
  2022-07-21  8:16     ` [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
  2022-07-21  8:16     ` [PATCH v3 2/7] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
@ 2022-07-21  8:16     ` Elijah Newren via GitGitGadget
  2022-07-21 16:09       ` Junio C Hamano
  2022-07-25 10:38       ` Ævar Arnfjörð Bjarmason
  2022-07-21  8:16     ` [PATCH v3 4/7] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
                       ` (4 subsequent siblings)
  7 siblings, 2 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-21  8:16 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

builtin/merge is set up to allow multiple strategies to be specified,
and it will find the "best" result and use it.  This is defeated if
some of the merge strategies abort early when they cannot handle the
merge.  Fix the logic that calls recursive and ort to not do such an
early abort, but instead return "2" or "unhandled" so that the next
strategy can try to handle the merge.

Coming up with a testcase for this is somewhat difficult, since
recursive and ort both handle nearly any two-headed merge (there is
a separate code path that checks for non-two-headed merges and
already returns "2" for them).  So use a somewhat synthetic testcase
of having the index not match HEAD before the merge starts, since all
merge strategies will abort for that.
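
The intended control flow can be sketched roughly as follows.  This is a
simplified model, not the code in builtin/merge.c: the enum names, the
strategy_fn type, and the two toy strategies are assumptions made up for
illustration, and the real loop also compares conflicted results to pick
the "best" one, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes assumed for this sketch, following the message above. */
enum merge_result {
	MERGE_CLEAN = 0,
	MERGE_CONFLICTS = 1,
	MERGE_UNHANDLED = 2
};

typedef enum merge_result (*strategy_fn)(void);

/* Hypothetical strategies used only for illustration. */
static enum merge_result bailing_strategy(void) { return MERGE_UNHANDLED; }
static enum merge_result clean_strategy(void) { return MERGE_CLEAN; }

/*
 * Try strategies in order.  A strategy returning MERGE_UNHANDLED does not
 * end the loop (and does not exit the process); the next one gets a
 * chance.  Returns the index of the first strategy that handled the
 * merge, or -1 if none did.  *tried is set to the number attempted.
 */
static int try_strategies(const strategy_fn *strategies, size_t nr,
			  size_t *tried)
{
	size_t i;

	assert(strategies && tried);
	*tried = 0;
	for (i = 0; i < nr; i++) {
		enum merge_result ret = strategies[i]();

		(*tried)++;
		if (ret != MERGE_UNHANDLED)
			return (int)i;
	}
	return -1;
}
```

Had bailing_strategy() called exit(128) instead of returning 2, later
entries in the list would never have been attempted.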

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          |  6 ++++--
 t/t6402-merge-rename.sh                  |  2 +-
 t/t6424-merge-unrelated-index-changes.sh | 16 ++++++++++++++++
 t/t6439-merge-co-error-msgs.sh           |  1 +
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 13884b8e836..dec7375bf2a 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -754,8 +754,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 		else
 			clean = merge_recursive(&o, head, remoteheads->item,
 						reversed, &result);
-		if (clean < 0)
-			exit(128);
+		if (clean < 0) {
+			rollback_lock_file(&lock);
+			return 2;
+		}
 		if (write_locked_index(&the_index, &lock,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
diff --git a/t/t6402-merge-rename.sh b/t/t6402-merge-rename.sh
index 3a32b1a45cf..772238e582c 100755
--- a/t/t6402-merge-rename.sh
+++ b/t/t6402-merge-rename.sh
@@ -210,7 +210,7 @@ test_expect_success 'updated working tree file should prevent the merge' '
 	echo >>M one line addition &&
 	cat M >M.saved &&
 	git update-index M &&
-	test_expect_code 128 git pull --no-rebase . yellow &&
+	test_expect_code 2 git pull --no-rebase . yellow &&
 	test_cmp M M.saved &&
 	rm -f M.saved
 '
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index f35d3182b86..8b749e19083 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -268,4 +268,20 @@ test_expect_success 'subtree' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'resolve && recursive && ort' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	test_seq 0 10 >a &&
+	git add a &&
+
+	sane_unset GIT_TEST_MERGE_ALGORITHM &&
+	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
+
+	grep "Trying merge strategy resolve..." output &&
+	grep "Trying merge strategy recursive..." output &&
+	grep "Trying merge strategy ort..." output &&
+	grep "No merge strategy handled the merge." output
+'
+
 test_done
diff --git a/t/t6439-merge-co-error-msgs.sh b/t/t6439-merge-co-error-msgs.sh
index 5bfb027099a..52cf0c87690 100755
--- a/t/t6439-merge-co-error-msgs.sh
+++ b/t/t6439-merge-co-error-msgs.sh
@@ -47,6 +47,7 @@ test_expect_success 'untracked files overwritten by merge (fast and non-fast for
 		export GIT_MERGE_VERBOSITY &&
 		test_must_fail git merge branch 2>out2
 	) &&
+	echo "Merge with strategy ${GIT_TEST_MERGE_ALGORITHM:-ort} failed." >>expect &&
 	test_cmp out2 expect &&
 	git reset --hard HEAD^
 '
-- 
gitgitgadget



* [PATCH v3 4/7] merge: fix save_state() to work when there are stat-dirty files
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-07-21  8:16     ` [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
@ 2022-07-21  8:16     ` Elijah Newren via GitGitGadget
  2022-07-21  8:16     ` [PATCH v3 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
                       ` (3 subsequent siblings)
  7 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-21  8:16 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When there are stat-dirty files, but no files are modified,
`git stash create` exits with unsuccessful status.  This causes merge
to fail.  Copy some code from sequencer.c's create_autostash to refresh
the index first to avoid this problem.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          |  8 ++++++++
 t/t6424-merge-unrelated-index-changes.sh | 11 +++++++++++
 2 files changed, 19 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index dec7375bf2a..4170c30317e 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -313,8 +313,16 @@ static int save_state(struct object_id *stash)
 	int len;
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf buffer = STRBUF_INIT;
+	struct lock_file lock_file = LOCK_INIT;
+	int fd;
 	int rc = -1;
 
+	fd = repo_hold_locked_index(the_repository, &lock_file, 0);
+	refresh_cache(REFRESH_QUIET);
+	if (0 <= fd)
+		repo_update_index_if_able(the_repository, &lock_file);
+	rollback_lock_file(&lock_file);
+
 	strvec_pushl(&cp.args, "stash", "create", NULL);
 	cp.out = -1;
 	cp.git_cmd = 1;
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index 8b749e19083..3019d030e07 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -268,6 +268,17 @@ test_expect_success 'subtree' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'avoid failure due to stat-dirty files' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	# Make "a" be stat-dirty
+	test-tool chmtime =+1 a &&
+
+	# stat-dirty file should not prevent stash creation in builtin/merge.c
+	git merge -s resolve -s recursive D^0
+'
+
 test_expect_success 'resolve && recursive && ort' '
 	git reset --hard &&
 	git checkout B^0 &&
-- 
gitgitgadget



* [PATCH v3 5/7] merge: make restore_state() restore staged state too
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-07-21  8:16     ` [PATCH v3 4/7] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
@ 2022-07-21  8:16     ` Elijah Newren via GitGitGadget
  2022-07-21 16:16       ` Junio C Hamano
  2022-07-21 16:24       ` Junio C Hamano
  2022-07-21  8:16     ` [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
                       ` (2 subsequent siblings)
  7 siblings, 2 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-21  8:16 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

There are multiple issues at play here:

  1) If `git merge` is invoked with staged changes, it should abort
     without doing any merging, and the user's working tree and index
     should be the same as before merge was invoked.
  2) Merge strategies are responsible for enforcing the index == HEAD
     requirement. (See 9822175d2b ("Ensure index matches head before
     invoking merge machinery, round N", 2019-08-17) for some history
     around this.)
  3) Merge strategies can bail saying they are not an appropriate
     handler for the merge in question (possibly allowing other
     strategies to be used instead).
  4) Merge strategies can make changes to the index and working tree,
     and have no expectation to clean up after themselves, *even* if
     they bail out and say they are not an appropriate handler for
     the merge in question.  (The `octopus` merge strategy does this,
     for example.)
  5) Because of (3) and (4), builtin/merge.c stashes state before
     trying merge strategies and restores it afterward.

Unfortunately, if users had staged changes before calling `git merge`,
builtin/merge.c could do the following:

   * stash the changes, in order to clean up after the strategies
   * try all the merge strategies in turn, each of which reports that it
     cannot function due to the index not matching HEAD
   * restore the changes via "git stash apply"

But that last step would have the net effect of unstaging the user's
changes.  Fix this by adding the "--index" option to "git stash apply".
While at it, also squelch the stash apply output; we already report
"Rewinding the tree to pristine..." and don't need a detailed `git
status` report afterwards.
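
To illustrate the unstaging effect, here is a deliberately tiny model in C.
It assumes a single tracked file whose content is reduced to an int; the
toy_* names are hypothetical and do not correspond to anything in git.
The with_index flag stands in for passing "--index" to "git stash apply".

```c
#include <assert.h>

/* Toy model: one tracked file, its content reduced to an int. */
struct toy_state {
	int head;      /* content recorded in HEAD */
	int index;     /* staged content */
	int worktree;  /* content on disk */
};

struct toy_stash {
	int index;
	int worktree;
};

/* Record both the staged and the on-disk content. */
static struct toy_stash toy_stash_create(const struct toy_state *s)
{
	struct toy_stash st;

	assert(s);
	st.index = s->index;
	st.worktree = s->worktree;
	return st;
}

/* Make index and working tree match HEAD again. */
static void toy_reset_hard(struct toy_state *s)
{
	assert(s);
	s->index = s->head;
	s->worktree = s->head;
}

/*
 * Re-apply a stash.  Without with_index only the working tree content
 * comes back; with it, the staged content is restored as well.
 */
static void toy_stash_apply(struct toy_state *s, const struct toy_stash *st,
			    int with_index)
{
	assert(s && st);
	s->worktree = st->worktree;
	if (with_index)
		s->index = st->index;
}
```

In this model a change staged before the merge comes back staged only on
the with_index path; otherwise it survives only in the working tree.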

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          | 5 +++--
 t/t6424-merge-unrelated-index-changes.sh | 7 ++++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 4170c30317e..f807bf335bd 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -383,14 +383,15 @@ static void reset_hard(const struct object_id *oid, int verbose)
 static void restore_state(const struct object_id *head,
 			  const struct object_id *stash)
 {
-	const char *args[] = { "stash", "apply", NULL, NULL };
+	const char *args[] = { "stash", "apply", "--index", "--quiet",
+			       NULL, NULL };
 
 	if (is_null_oid(stash))
 		return;
 
 	reset_hard(head, 1);
 
-	args[2] = oid_to_hex(stash);
+	args[4] = oid_to_hex(stash);
 
 	/*
 	 * It is OK to ignore error here, for example when there was
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index 3019d030e07..c96649448fa 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -285,6 +285,7 @@ test_expect_success 'resolve && recursive && ort' '
 
 	test_seq 0 10 >a &&
 	git add a &&
+	git rev-parse :a >expect &&
 
 	sane_unset GIT_TEST_MERGE_ALGORITHM &&
 	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
@@ -292,7 +293,11 @@ test_expect_success 'resolve && recursive && ort' '
 	grep "Trying merge strategy resolve..." output &&
 	grep "Trying merge strategy recursive..." output &&
 	grep "Trying merge strategy ort..." output &&
-	grep "No merge strategy handled the merge." output
+	grep "No merge strategy handled the merge." output &&
+
+	# Changes to "a" should remain staged
+	git rev-parse :a >actual &&
+	test_cmp expect actual
 '
 
 test_done
-- 
gitgitgadget



* [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                       ` (4 preceding siblings ...)
  2022-07-21  8:16     ` [PATCH v3 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
@ 2022-07-21  8:16     ` Elijah Newren via GitGitGadget
  2022-07-21 16:31       ` Junio C Hamano
  2022-07-21  8:16     ` [PATCH v3 7/7] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
  7 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-21  8:16 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Merge strategies can:
  * succeed with a clean merge
  * succeed with a conflicted merge
  * fail to handle the given type of merge

If one is thinking in terms of automatic mergeability, they would use
the word "fail" instead of "succeed" for the second bullet, but I am
focusing here on the ability of the merge strategy to handle the given
inputs, not on whether the given inputs are mergeable.  The third
category is about the merge strategy failing to know how to handle the
given data; examples include:

  * Passing more than 2 branches to 'recursive' or 'ort'
  * Passing 2 or fewer branches to 'octopus'
  * Trying to do more complicated merges with 'resolve' (I believe
    directory/file conflicts will cause it to bail.)
  * Octopus running into a merge conflict for any branch OTHER than
    the final one (see the "exit 2" codepath of commit 98efc8f3d8
    ("octopus: allow manual resolve on the last round.", 2006-01-13))

That final one is particularly interesting, because it shows that the
merge strategy can muck with the index and working tree, and THEN bail
and say "sorry, this strategy cannot handle this type of merge; use
something else".

Further, we do not currently expect the individual strategies to clean
up after themselves, but instead expect builtin/merge.c to do so.  For
it to be able to, it needs to save the state before trying the merge
strategy so it can have something to restore to.  Therefore, remove the
shortcut bypassing the save_state() call.

There is another bug on the restore_state() side of things, so no
testcase will be added until the next commit when we have addressed that
issue as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index f807bf335bd..11bb4bab0a1 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1686,12 +1686,12 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	 * tree in the index -- this means that the index must be in
 	 * sync with the head commit.  The strategies are responsible
 	 * to ensure this.
+	 *
+	 * Stash away the local changes so that we can try more than one
+	 * and/or recover from merge strategies bailing while leaving the
+	 * index and working tree polluted.
 	 */
-	if (use_strategies_nr == 1 ||
-	    /*
-	     * Stash away the local changes so that we can try more than one.
-	     */
-	    save_state(&stash))
+	if (save_state(&stash))
 		oidclr(&stash);
 
 	for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
-- 
gitgitgadget



* [PATCH v3 7/7] merge: do not exit restore_state() prematurely
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                       ` (5 preceding siblings ...)
  2022-07-21  8:16     ` [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
@ 2022-07-21  8:16     ` Elijah Newren via GitGitGadget
  2022-07-21 16:34       ` Junio C Hamano
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
  7 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-21  8:16 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Previously, if:

* The user had no local changes before starting the merge
* A merge strategy made changes to the working tree/index but returned
  with exit status 2

Then we'd call restore_state() to clean up the changes and either let
the next merge strategy run (if there is one), or exit telling the user
that no merge strategy could handle the merge.  Unfortunately,
restore_state() did not clean up the changes as expected; that function
was a no-op if the stash was null, and the stash would be null if
there were no local changes before starting the merge.  So, instead of
"Rewinding the tree to pristine..." as the code claimed, restore_state()
would leave garbage around in the index and working tree (possibly
including conflicts) for either the next merge strategy or for the user
after aborting the merge.  And in the case of aborting the merge, the
user would be unable to run "git merge --abort" to get rid of the
unintended leftover conflicts, because the merge control files were not
written as it was presumed that we had restored to a clean state
already.

Fix the main problem by making sure that restore_state() only skips the
stash application if the stash is null rather than skipping the whole
function.

However, there is a secondary problem -- since merge.c forks
subprocesses to do the cleanup, the in-memory index is left out-of-sync.
While there was a refresh_cache(REFRESH_QUIET) call that attempted to
correct that, that function would not handle cases where the previous
merge strategy added conflicted entries.  We need to drop the index and
re-read it to handle such cases.

(Alternatively, we could stop forking subprocesses and instead call some
appropriate function to do the work which would update the in-memory
index automatically.  For now, just do the simple fix.)

Also, add a testcase checking this, one for which the octopus strategy
fails on the first commit it attempts to merge, and thus which it
cannot handle at all and must completely bail on (as per the "exit 2"
code path of commit 98efc8f3d8 ("octopus: allow manual resolve on the
last round.", 2006-01-13)).
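
The difference in control flow can be modeled with a small sketch.  This
is an assumption-laden toy, not the real function: struct toy_tree and the
restore_state_old()/restore_state_new() names are invented here, a NULL
pointer plays the role of the null stash, and the index re-read that the
real fix also performs is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: one tracked file, its content reduced to an int. */
struct toy_tree {
	int head;      /* content recorded in HEAD */
	int index;     /* staged content */
	int worktree;  /* content on disk */
};

static void toy_reset_hard(struct toy_tree *t)
{
	t->index = t->head;
	t->worktree = t->head;
}

/* Old shape: a null stash skips the whole function, reset included. */
static void restore_state_old(struct toy_tree *t, const int *stash)
{
	assert(t);
	if (!stash)
		return;
	toy_reset_hard(t);
	t->index = *stash;
	t->worktree = *stash;
}

/* New shape: always reset; a null stash only skips the stash apply. */
static void restore_state_new(struct toy_tree *t, const int *stash)
{
	assert(t);
	toy_reset_hard(t);
	if (!stash)
		return;
	t->index = *stash;
	t->worktree = *stash;
}
```

With a null stash and a tree that a bailing strategy has modified, only
the second variant returns the tree to the HEAD content.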

Reported-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c        | 10 ++++++----
 t/t7607-merge-state.sh | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100755 t/t7607-merge-state.sh

diff --git a/builtin/merge.c b/builtin/merge.c
index 11bb4bab0a1..7fb4414ebb7 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -386,11 +386,11 @@ static void restore_state(const struct object_id *head,
 	const char *args[] = { "stash", "apply", "--index", "--quiet",
 			       NULL, NULL };
 
-	if (is_null_oid(stash))
-		return;
-
 	reset_hard(head, 1);
 
+	if (is_null_oid(stash))
+		goto refresh_cache;
+
 	args[4] = oid_to_hex(stash);
 
 	/*
@@ -399,7 +399,9 @@ static void restore_state(const struct object_id *head,
 	 */
 	run_command_v_opt(args, RUN_GIT_CMD);
 
-	refresh_cache(REFRESH_QUIET);
+refresh_cache:
+	if (discard_cache() < 0 || read_cache() < 0)
+		die(_("could not read index"));
 }
 
 /* This is called when no merge was necessary. */
diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
new file mode 100755
index 00000000000..fc33d57357b
--- /dev/null
+++ b/t/t7607-merge-state.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description="Test that merge state is as expected after failed merge"
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+. ./test-lib.sh
+
+test_expect_success 'set up custom strategy' '
+	test_commit --no-tag "Initial" base base &&
+
+	for b in branch1 branch2 branch3
+	do
+		git checkout -b $b main &&
+		test_commit --no-tag "Change on $b" base $b || return 1
+	done &&
+
+	git checkout branch1 &&
+	# This is a merge that octopus cannot handle.  Note, that it does not
+	# just hit conflicts, it completely fails and says that it cannot
+	# handle this type of merge.
+	test_expect_code 2 git merge branch2 branch3 >output 2>&1 &&
+	grep "fatal: merge program failed" output &&
+	grep "Should not be doing an octopus" output &&
+
+	# Make sure we did not leave stray changes around when no appropriate
+	# merge strategy was found
+	git diff --exit-code --name-status &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
+test_done
-- 
gitgitgadget


* Re: [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive
  2022-07-21  8:16     ` [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
@ 2022-07-21 15:47       ` Junio C Hamano
  2022-07-21 19:51         ` Elijah Newren
  0 siblings, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-07-21 15:47 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  	if (head && repo_index_has_changes(opt->repo, head, &sb)) {
> -		fprintf(stderr, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
> +		struct strbuf err = STRBUF_INIT;
> +		strbuf_addstr(&err, "error: ");
> +		strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
>  		    sb.buf);
> +		strbuf_addch(&err, '\n');
> +		fputs(err.buf, stderr);
> +		strbuf_release(&err);

Makes me wonder why this is not a mere

	error(_("Your local changes ... by merge:\n  %s"), sb.buf);

that reuses the exact string.  The err() function in merge-recursive.c 
is strangely complex (and probably buggy---if it is not buffering
output, it adds "error: " prefix to opt->obuf before calling vaddf
to add the message, and then sends that to error() to give it
another "error: " prefix), but all the above does is to send a
message to standard error stream.

>  		strbuf_release(&sb);
>  		return -1;
>  	}


* Re: [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge
  2022-07-21  8:16     ` [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
@ 2022-07-21 16:09       ` Junio C Hamano
  2022-07-25 10:38       ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 87+ messages in thread
From: Junio C Hamano @ 2022-07-21 16:09 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> @@ -754,8 +754,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
>  		else
>  			clean = merge_recursive(&o, head, remoteheads->item,
>  						reversed, &result);
> -		if (clean < 0)
> -			exit(128);
> +		if (clean < 0) {
> +			rollback_lock_file(&lock);
> +			return 2;
> +		}

Very good find.  I however wonder if negative returns are signaling
a situation that they cannot cleanly recover from (but even if that
is the case, if we are willing to do the save-restore dance, then it
is probably OK).
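
To make the "return 2 instead of exiting" idea concrete, here is a minimal
sketch of a caller that moves on to the next strategy when one reports that
it cannot handle the merge.  The enum names, the strategy_fn type, and the
toy strategies are hypothetical; the sketch only stops at the first clean
result, whereas the real code in builtin/merge.c also keeps track of the
"best" conflicted result.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed result codes, following the "2 means unhandled" convention. */
enum { MERGE_CLEAN = 0, MERGE_CONFLICTS = 1, MERGE_UNHANDLED = 2 };

/* Hypothetical strategy callback: returns one of the codes above. */
typedef int (*strategy_fn)(void);

/*
 * Try each strategy in order and return the index of the first one that
 * merges cleanly, or -1 if none does.  A strategy that returns
 * MERGE_UNHANDLED (or conflicts) no longer ends the whole process; the
 * loop simply continues with the next candidate.
 */
int first_clean_strategy(const strategy_fn *strategies, size_t nr)
{
	size_t i;

	assert(strategies || nr == 0);
	for (i = 0; i < nr; i++) {
		if (strategies[i]() == MERGE_CLEAN)
			return (int)i;
		/* A real caller would restore the pre-merge state here. */
	}
	return -1;
}

/* Toy strategies used only for illustration. */
int refuses(void) { return MERGE_UNHANDLED; }
int succeeds(void) { return MERGE_CLEAN; }
```

With an early exit(128) inside the first strategy, the second entry in a
list such as { refuses, succeeds } would never get a chance to run.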

> diff --git a/t/t6402-merge-rename.sh b/t/t6402-merge-rename.sh
> index 3a32b1a45cf..772238e582c 100755
> --- a/t/t6402-merge-rename.sh
> +++ b/t/t6402-merge-rename.sh
> @@ -210,7 +210,7 @@ test_expect_success 'updated working tree file should prevent the merge' '
>  	echo >>M one line addition &&
>  	cat M >M.saved &&
>  	git update-index M &&
> -	test_expect_code 128 git pull --no-rebase . yellow &&
> +	test_expect_code 2 git pull --no-rebase . yellow &&
>  	test_cmp M M.saved &&
>  	rm -f M.saved
>  '

Understandable.

> diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
> index f35d3182b86..8b749e19083 100755
> --- a/t/t6424-merge-unrelated-index-changes.sh
> +++ b/t/t6424-merge-unrelated-index-changes.sh
> @@ -268,4 +268,20 @@ test_expect_success 'subtree' '
>  	test_path_is_missing .git/MERGE_HEAD
>  '
>  
> +test_expect_success 'resolve && recursive && ort' '
> +	git reset --hard &&
> +	git checkout B^0 &&
> +
> +	test_seq 0 10 >a &&
> +	git add a &&
> +
> +	sane_unset GIT_TEST_MERGE_ALGORITHM &&
> +	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
> +
> +	grep "Trying merge strategy resolve..." output &&
> +	grep "Trying merge strategy recursive..." output &&
> +	grep "Trying merge strategy ort..." output &&
> +	grep "No merge strategy handled the merge." output
> +'

Makes sense.

>  test_done
> diff --git a/t/t6439-merge-co-error-msgs.sh b/t/t6439-merge-co-error-msgs.sh
> index 5bfb027099a..52cf0c87690 100755
> --- a/t/t6439-merge-co-error-msgs.sh
> +++ b/t/t6439-merge-co-error-msgs.sh
> @@ -47,6 +47,7 @@ test_expect_success 'untracked files overwritten by merge (fast and non-fast for
>  		export GIT_MERGE_VERBOSITY &&
>  		test_must_fail git merge branch 2>out2
>  	) &&
> +	echo "Merge with strategy ${GIT_TEST_MERGE_ALGORITHM:-ort} failed." >>expect &&
>  	test_cmp out2 expect &&
>  	git reset --hard HEAD^
>  '

* Re: [PATCH v3 5/7] merge: make restore_state() restore staged state too
  2022-07-21  8:16     ` [PATCH v3 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
@ 2022-07-21 16:16       ` Junio C Hamano
  2022-07-21 16:24       ` Junio C Hamano
  1 sibling, 0 replies; 87+ messages in thread
From: Junio C Hamano @ 2022-07-21 16:16 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

>   4) Merge strategies can make changes to the index and working tree,
>      and have no expectation to clean up after themselves, *even* if
>      they bail out and say they are not an appropriate handler for
>      the merge in question.  (The `octopus` merge strategy does this,
>      for example.)

I personally consider this a bug in `octopus` (and I can say so
without offending anybody, as `octopus` was my own work), but the
point of having pluggable merge strategies is to allow end users
and third parties to write their own.  So the save-and-restore dance
would be a more prudent approach to the issue than forbidding this
"buggy" behaviour.

> Unfortunately, if users had staged changes before calling `git merge`,
> builtin/merge.c could do the following:
>
>    * stash the changes, in order to clean up after the strategies
>    * try all the merge strategies in turn, each of which report they
>      cannot function due to the index not matching HEAD
>    * restore the changes via "git stash apply"

>  	test_seq 0 10 >a &&
>  	git add a &&
> +	git rev-parse :a >expect &&
> ...
> +	# Changes to "a" should remain staged
> +	git rev-parse :a >actual &&
> +	test_cmp expect actual

Makes sense.

Thanks.

* Re: [PATCH v3 5/7] merge: make restore_state() restore staged state too
  2022-07-21  8:16     ` [PATCH v3 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
  2022-07-21 16:16       ` Junio C Hamano
@ 2022-07-21 16:24       ` Junio C Hamano
  2022-07-21 19:52         ` Elijah Newren
  1 sibling, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-07-21 16:24 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

>    * stash the changes, in order to clean up after the strategies
>    * try all the merge strategies in turn, each of which report they
>      cannot function due to the index not matching HEAD
>    * restore the changes via "git stash apply"

A tangent that does not make much difference in the end, but I am
finding these lines curious and somewhat annoying.  Why do we have
&nbsp; sandwiching the plain whitespace only on lines that begin
with an asterisk?

* Re: [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state
  2022-07-21  8:16     ` [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
@ 2022-07-21 16:31       ` Junio C Hamano
  2023-03-02  7:17         ` Ben Humphreys
  0 siblings, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-07-21 16:31 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/builtin/merge.c b/builtin/merge.c
> index f807bf335bd..11bb4bab0a1 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -1686,12 +1686,12 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>  	 * tree in the index -- this means that the index must be in
>  	 * sync with the head commit.  The strategies are responsible
>  	 * to ensure this.
> +	 *
> +	 * Stash away the local changes so that we can try more than one
> +	 * and/or recover from merge strategies bailing while leaving the
> +	 * index and working tree polluted.
>  	 */

Makes sense.  We may want to special-case strategies that are known
not to have the buggy "leave contaminated tree when bailing out"
behaviour to avoid waste.  I expect that more than 99.99% of the
time people are feeding a single other commit to ort or recursive,
and if these are known to be safe, a lot will be saved by not saving
"just in case".  But that can be left for later, after the series
solidifies.

Thanks.
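
The special-casing floated above could look roughly like the following
sketch.  Everything here is hypothetical: the function name, the
allow-list, and above all the assumption that "ort" and "recursive" never
leave the index or working tree polluted when they bail out, which is
exactly the property that would have to be established first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allow-list of strategies assumed to clean up after themselves. */
static const char *const assumed_safe[] = { "ort", "recursive" };

/*
 * Decide whether the pre-merge state needs to be stashed.  With more than
 * one strategy we always save, so that each attempt starts from the same
 * state.  With a single strategy we skip the save only if that strategy is
 * on the assumed-safe list.
 */
bool need_save_state(const char *const *strategies, size_t nr)
{
	size_t i;

	assert(strategies || nr == 0);
	if (nr != 1)
		return true;
	for (i = 0; i < sizeof(assumed_safe) / sizeof(assumed_safe[0]); i++)
		if (!strcmp(strategies[0], assumed_safe[i]))
			return false;
	return true;
}
```

The default is deliberately conservative: an unknown or third-party
strategy always gets the save, and only names on the list skip it.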

> -	if (use_strategies_nr == 1 ||
> -	    /*
> -	     * Stash away the local changes so that we can try more than one.
> -	     */
> -	    save_state(&stash))
> +	if (save_state(&stash))
>  		oidclr(&stash);
>  
>  	for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {

* Re: [PATCH v3 7/7] merge: do not exit restore_state() prematurely
  2022-07-21  8:16     ` [PATCH v3 7/7] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
@ 2022-07-21 16:34       ` Junio C Hamano
  0 siblings, 0 replies; 87+ messages in thread
From: Junio C Hamano @ 2022-07-21 16:34 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> @@ -386,11 +386,11 @@ static void restore_state(const struct object_id *head,
>  	const char *args[] = { "stash", "apply", "--index", "--quiet",
>  			       NULL, NULL };
>  
> -	if (is_null_oid(stash))
> -		return;
> -
>  	reset_hard(head, 1);
>  
> +	if (is_null_oid(stash))
> +		goto refresh_cache;
> +

OK, so the idea is that we can call restore_state() without having
anything worth "restoring" in the stash, and what it means is that
we are restoring to HEAD.  As the current state does not necessarily
match HEAD, we should do the "reset --hard" part even if there was
nothing to stash.

Makes sense.
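
The reordering can be modelled with a small sketch.  The struct and the
function below are hypothetical stand-ins, not the code in builtin/merge.c:
the two booleans merely represent "a failed strategy left debris behind"
and "the user's pre-merge changes are present".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the index and working tree. */
struct tree_state {
	bool polluted;         /* leftovers from a strategy that bailed out */
	bool has_user_changes; /* the user's pre-merge changes are present */
};

/*
 * Model of the patched control flow: the reset always happens, and only
 * the "stash apply" step is skipped when nothing was stashed.
 */
void restore_state_sketch(struct tree_state *st, bool have_stash)
{
	assert(st);

	/* Stand-in for reset_hard(head, 1): runs unconditionally. */
	st->polluted = false;
	st->has_user_changes = false;

	if (!have_stash)
		return; /* the real patch jumps to a cache refresh step here */

	/* Stand-in for "git stash apply --index --quiet". */
	st->has_user_changes = true;
}
```

Before the patch, the early return sat above the reset, so a polluted tree
with an empty stash would have stayed polluted.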


* Re: [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive
  2022-07-21 15:47       ` Junio C Hamano
@ 2022-07-21 19:51         ` Elijah Newren
  2022-07-21 20:05           ` Junio C Hamano
  0 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren @ 2022-07-21 19:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Ævar Arnfjörð Bjarmason

On Thu, Jul 21, 2022 at 8:47 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >       if (head && repo_index_has_changes(opt->repo, head, &sb)) {
> > -             fprintf(stderr, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
> > +             struct strbuf err = STRBUF_INIT;
> > +             strbuf_addstr(&err, "error: ");
> > +             strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
> >                   sb.buf);
> > +             strbuf_addch(&err, '\n');
> > +             fputs(err.buf, stderr);
> > +             strbuf_release(&err);
>
> Makes me wonder why this is not a mere
>
>         error(_("Your local changes ... by merge:\n  %s"), sb.buf);
>
> that reuses the exact string.  The err() function in merge-recursive.c
> is strangely complex (and probably buggy---if it is not buffering
> output, it adds "error: " prefix to opt->obuf before calling vaddf
> to add the message, and then sends that to error() to give it
> another "error: " prefix), but all the above does is to send a
> message to standard error stream.

Ah, that would be nicer; thanks for the pointer.  I would still need
to prefix it with an
    strbuf_addch(&sb, '\n');
but two lines certainly beats six.

>
> >               strbuf_release(&sb);
> >               return -1;
> >       }

* Re: [PATCH v3 5/7] merge: make restore_state() restore staged state too
  2022-07-21 16:24       ` Junio C Hamano
@ 2022-07-21 19:52         ` Elijah Newren
  0 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-21 19:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Ævar Arnfjörð Bjarmason

On Thu, Jul 21, 2022 at 9:24 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >    * stash the changes, in order to clean up after the strategies
> >    * try all the merge strategies in turn, each of which report they
> >      cannot function due to the index not matching HEAD
> >    * restore the changes via "git stash apply"
>
> A tangent that does not make much difference in the end, but I am
> finding these lines curious and somewhat annoying.  Why do we have
> &nbsp; sandwiching the plain whitespace only on lines that begin
> with an asterisk?

Eek!  That's nasty.  I was switching back and forth between responding
to emails and coding up fixes, and copy-pasted part of my email into
the commit message.  So, apparently gmail hates me, and I just didn't
notice.

I'll clean these up; I'm rerolling for the "error()" cleanup you
mentioned anyway.

* Re: [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive
  2022-07-21 19:51         ` Elijah Newren
@ 2022-07-21 20:05           ` Junio C Hamano
  2022-07-21 21:14             ` Elijah Newren
  0 siblings, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-07-21 20:05 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Ævar Arnfjörð Bjarmason

Elijah Newren <newren@gmail.com> writes:

> On Thu, Jul 21, 2022 at 8:47 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>> >       if (head && repo_index_has_changes(opt->repo, head, &sb)) {
>> > -             fprintf(stderr, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
>> > +             struct strbuf err = STRBUF_INIT;
>> > +             strbuf_addstr(&err, "error: ");
>> > +             strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
>> >                   sb.buf);
>> > +             strbuf_addch(&err, '\n');
>> > +             fputs(err.buf, stderr);
>> > +             strbuf_release(&err);
>>
>> Makes me wonder why this is not a mere
>>
>>         error(_("Your local changes ... by merge:\n  %s"), sb.buf);
>>
>> that reuses the exact string.  The err() function in merge-recursive.c
>> is strangely complex (and probably buggy---if it is not buffering
>> output, it adds "error: " prefix to opt->obuf before calling vaddf
>> to add the message, and then sends that to error() to give it
>> another "error: " prefix), but all the above does is to send a
>> message to standard error stream.
>
> Ah, that would be nicer; thanks for the pointer.  I would still need
> to prefix it with an
>     strbuf_addch(&sb, '\n');
> but two lines certainly beats six.

Your "strbuf" version uses the same format string as my error()
thing and then manually adds one LF at the end, before sending it to
fputs(), which, unlike puts(), does not add any extra LF at the end.

error() gives a terminating newline at the end.

Do you still need to add one more?

* Re: [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive
  2022-07-21 20:05           ` Junio C Hamano
@ 2022-07-21 21:14             ` Elijah Newren
  0 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-21 21:14 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Ævar Arnfjörð Bjarmason

On Thu, Jul 21, 2022 at 1:05 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > On Thu, Jul 21, 2022 at 8:47 AM Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >>
> >> >       if (head && repo_index_has_changes(opt->repo, head, &sb)) {
> >> > -             fprintf(stderr, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
> >> > +             struct strbuf err = STRBUF_INIT;
> >> > +             strbuf_addstr(&err, "error: ");
> >> > +             strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
> >> >                   sb.buf);
> >> > +             strbuf_addch(&err, '\n');
> >> > +             fputs(err.buf, stderr);
> >> > +             strbuf_release(&err);
> >>
> >> Makes me wonder why this is not a mere
> >>
> >>         error(_("Your local changes ... by merge:\n  %s"), sb.buf);
> >>
> >> that reuses the exact string.  The err() function in merge-recursive.c
> >> is strangely complex (and probably buggy---if it is not buffering
> >> output, it adds "error: " prefix to opt->obuf before calling vaddf
> >> to add the message, and then sends that to error() to give it
> >> another "error: " prefix), but all the above does is to send a
> >> message to standard error stream.
> >
> > Ah, that would be nicer; thanks for the pointer.  I would still need
> > to prefix it with an
> >     strbuf_addch(&sb, '\n');
> > but two lines certainly beats six.
>
> Your "strbuf" version uses the same format string as my error()
> thing and then manually adds one LF at the end, before sending it to
> fputs(), which, unlike puts(), does not add any extra LF at the end.
>
> error() gives a terminating newline at the end.
>
> Do you still need to add one more?

Ah, sorry, my mistake.  I somehow thought error() just added the
"error: " prefix.  So, indeed, this is just a change from
fprintf(stderr, ...) to error(...).  No second newline needed.

Thanks!
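
For readers following along, the behaviour settled on above (an "error: "
prefix plus exactly one trailing newline) can be illustrated with a small
stand-in.  format_error() is a hypothetical helper that writes into a
caller-supplied buffer so the result can be inspected; it is not git's
error() implementation, which reports to the standard error stream.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for an error()-style reporter.  Formats the
 * message into `out`, adding an "error: " prefix and one trailing newline
 * (the newline is dropped if the buffer is too small to hold it).
 * Returns -1 so a caller can write "return format_error(...)".
 */
int format_error(char *out, size_t outlen, const char *fmt, ...)
{
	va_list ap;
	size_t len;
	int n;

	assert(out && outlen > 0);

	n = snprintf(out, outlen, "error: ");
	if (n < 0 || (size_t)n >= outlen)
		return -1;

	va_start(ap, fmt);
	vsnprintf(out + n, outlen - (size_t)n, fmt, ap);
	va_end(ap);

	len = strlen(out);
	if (len + 1 < outlen) {
		out[len] = '\n';
		out[len + 1] = '\0';
	}
	return -1;
}
```

Because the helper supplies both the prefix and the newline, the caller
passes the same format string that was previously handed to
fprintf(stderr, ...) and adds nothing by hand.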

* [PATCH v4 0/7] Fix merge restore state
  2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                       ` (6 preceding siblings ...)
  2022-07-21  8:16     ` [PATCH v3 7/7] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
@ 2022-07-22  5:15     ` Elijah Newren via GitGitGadget
  2022-07-22  5:15       ` [PATCH v4 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
                         ` (7 more replies)
  7 siblings, 8 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-22  5:15 UTC (permalink / raw)
  To: git; +Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Elijah Newren

This started as a simple series to fix restore_state() in builtin/merge.c,
fixing an issue reported by ZheNing Hu[1]. It's grown so much it's hard to
call it simple. Anyway...

Changes since v3:

 * Removed some accidental &nbsp; characters from a commit message
 * Made use of the error() function to simplify the first patch

[1]
https://lore.kernel.org/git/CAOLTT8R7QmpvaFPTRs3xTpxr7eiuxF-ZWtvUUSC0-JOo9Y+SqA@mail.gmail.com/

Elijah Newren (7):
  merge-ort-wrappers: make printed message match the one from recursive
  merge-resolve: abort if index does not match HEAD
  merge: do not abort early if one strategy fails to handle the merge
  merge: fix save_state() to work when there are stat-dirty files
  merge: make restore_state() restore staged state too
  merge: ensure we can actually restore pre-merge state
  merge: do not exit restore_state() prematurely

 builtin/merge.c                          | 59 ++++++++++++++++++------
 git-merge-resolve.sh                     | 10 ++++
 merge-ort-wrappers.c                     |  4 +-
 t/t6402-merge-rename.sh                  |  2 +-
 t/t6424-merge-unrelated-index-changes.sh | 58 +++++++++++++++++++++++
 t/t6439-merge-co-error-msgs.sh           |  1 +
 t/t7607-merge-state.sh                   | 32 +++++++++++++
 7 files changed, 150 insertions(+), 16 deletions(-)
 create mode 100755 t/t7607-merge-state.sh


base-commit: e72d93e88cb20b06e88e6e7d81bd1dc4effe453f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1231%2Fnewren%2Ffix-merge-restore-state-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1231/newren/fix-merge-restore-state-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1231

Range-diff vs v3:

 1:  e39b2e15ece ! 1:  bd36d16c8d9 merge-ort-wrappers: make printed message match the one from recursive
     @@ Commit message
          being processed by another function that made additional changes:
            * It added an implicit "error: " prefix
            * It added an implicit trailing newline
     -
     -    Add these things, but do so in a couple extra steps to avoid having
     -    translators need to translate another not-quite-identical string.
     +    We can get these things by making use of the error() function.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ merge-ort-wrappers.c: static int unclean(struct merge_options *opt, struct tree
       
       	if (head && repo_index_has_changes(opt->repo, head, &sb)) {
      -		fprintf(stderr, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
     -+		struct strbuf err = STRBUF_INIT;
     -+		strbuf_addstr(&err, "error: ");
     -+		strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
     - 		    sb.buf);
     -+		strbuf_addch(&err, '\n');
     -+		fputs(err.buf, stderr);
     -+		strbuf_release(&err);
     +-		    sb.buf);
     ++		error(_("Your local changes to the following files would be overwritten by merge:\n  %s"),
     ++		      sb.buf);
       		strbuf_release(&sb);
       		return -1;
       	}
 2:  2810dec7608 = 2:  b79f44e54b9 merge-resolve: abort if index does not match HEAD
 3:  b41853e3f99 = 3:  02930448ea1 merge: do not abort early if one strategy fails to handle the merge
 4:  64700338a28 = 4:  daf8d224160 merge: fix save_state() to work when there are stat-dirty files
 5:  91c495c770e ! 5:  f401bd5ad0d merge: make restore_state() restore staged state too
     @@ Commit message
          Unfortunately, if users had staged changes before calling `git merge`,
          builtin/merge.c could do the following:
      
     -       * stash the changes, in order to clean up after the strategies
     -       * try all the merge strategies in turn, each of which report they
     +       * stash the changes, in order to clean up after the strategies
     +       * try all the merge strategies in turn, each of which report they
               cannot function due to the index not matching HEAD
     -       * restore the changes via "git stash apply"
     +       * restore the changes via "git stash apply"
      
          But that last step would have the net effect of unstaging the user's
          changes.  Fix this by adding the "--index" option to "git stash apply".
 6:  887967c1f3f = 6:  ad5354c219c merge: ensure we can actually restore pre-merge state
 7:  81c40492a62 = 7:  6212d572604 merge: do not exit restore_state() prematurely

-- 
gitgitgadget

* [PATCH v4 1/7] merge-ort-wrappers: make printed message match the one from recursive
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
@ 2022-07-22  5:15       ` Elijah Newren via GitGitGadget
  2022-07-22  5:15       ` [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-22  5:15 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When the index does not match HEAD, the merge strategies are responsible
for detecting that condition and aborting.  The merge-ort-wrappers had code to
implement this and meant to copy the error message from merge-recursive
but deviated in two ways, both due to the message in merge-recursive
being processed by another function that made additional changes:
  * It added an implicit "error: " prefix
  * It added an implicit trailing newline
We can get these things by making use of the error() function.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort-wrappers.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/merge-ort-wrappers.c b/merge-ort-wrappers.c
index ad041061695..748924a69ba 100644
--- a/merge-ort-wrappers.c
+++ b/merge-ort-wrappers.c
@@ -10,8 +10,8 @@ static int unclean(struct merge_options *opt, struct tree *head)
 	struct strbuf sb = STRBUF_INIT;
 
 	if (head && repo_index_has_changes(opt->repo, head, &sb)) {
-		fprintf(stderr, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
-		    sb.buf);
+		error(_("Your local changes to the following files would be overwritten by merge:\n  %s"),
+		      sb.buf);
 		strbuf_release(&sb);
 		return -1;
 	}
-- 
gitgitgadget


* [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
  2022-07-22  5:15       ` [PATCH v4 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
@ 2022-07-22  5:15       ` Elijah Newren via GitGitGadget
  2022-07-22 10:27         ` Ævar Arnfjörð Bjarmason
  2022-07-22  5:15       ` [PATCH v4 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
                         ` (5 subsequent siblings)
  7 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-22  5:15 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

As noted in commit 9822175d2b ("Ensure index matches head before
invoking merge machinery, round N", 2019-08-17), we have had a very
long history of problems with failing to enforce the requirement that
index matches HEAD when starting a merge.  One of the commits
referenced in the long tale of issues arising from lax enforcement of
this requirement was commit 55f39cf755 ("merge: fix misleading
pre-merge check documentation", 2018-06-30), which tried to document
the requirement and noted there were some exceptions.  As mentioned in
that commit message, the `resolve` strategy was the one strategy that
did not have an explicit index matching HEAD check, and the reason it
didn't was that I wasn't able to discover any cases where the
implementation would fail to catch the problem and abort, and didn't
want to introduce unnecessary performance overhead of adding another
check.

Well, today I discovered a testcase where the implementation does not
catch the problem and so an explicit check is needed.  Add a testcase
that previously would have failed, and update git-merge-resolve.sh to
have an explicit check.  Note that the code is copied from 3ec62ad9ff
("merge-octopus: abort if index does not match HEAD", 2016-04-09), so
that we reuse the same message and avoid making translators need to
translate some new message.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          | 20 ++++++++++++++++++
 git-merge-resolve.sh                     | 10 +++++++++
 t/t6424-merge-unrelated-index-changes.sh | 26 ++++++++++++++++++++++++
 3 files changed, 56 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 23170f2d2a6..13884b8e836 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1599,6 +1599,26 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 		 */
 		refresh_cache(REFRESH_QUIET);
 		if (allow_trivial && fast_forward != FF_ONLY) {
+			/*
+			 * Must first ensure that index matches HEAD before
+			 * attempting a trivial merge.
+			 */
+			struct tree *head_tree = get_commit_tree(head_commit);
+			struct strbuf sb = STRBUF_INIT;
+
+			if (repo_index_has_changes(the_repository, head_tree,
+						   &sb)) {
+				struct strbuf err = STRBUF_INIT;
+				strbuf_addstr(&err, "error: ");
+				strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
+					    sb.buf);
+				strbuf_addch(&err, '\n');
+				fputs(err.buf, stderr);
+				strbuf_release(&err);
+				strbuf_release(&sb);
+				return -1;
+			}
+
 			/* See if it is really trivial. */
 			git_committer_info(IDENT_STRICT);
 			printf(_("Trying really trivial in-index merge...\n"));
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
index 343fe7bccd0..77e93121bf8 100755
--- a/git-merge-resolve.sh
+++ b/git-merge-resolve.sh
@@ -5,6 +5,16 @@
 #
 # Resolve two trees, using enhanced multi-base read-tree.
 
+. git-sh-setup
+
+# Abort if index does not match HEAD
+if ! git diff-index --quiet --cached HEAD --
+then
+    gettextln "Error: Your local changes to the following files would be overwritten by merge"
+    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
+    exit 2
+fi
+
 # The first parameters up to -- are merge bases; the rest are heads.
 bases= head= remotes= sep_seen=
 for arg
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index b6e424a427b..f35d3182b86 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -114,6 +114,32 @@ test_expect_success 'resolve, non-trivial' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'resolve, trivial, related file removed' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	git rm a &&
+	test_path_is_missing a &&
+
+	test_must_fail git merge -s resolve C^0 &&
+
+	test_path_is_missing a &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
+test_expect_success 'resolve, non-trivial, related file removed' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	git rm a &&
+	test_path_is_missing a &&
+
+	test_must_fail git merge -s resolve D^0 &&
+
+	test_path_is_missing a &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
 test_expect_success 'recursive' '
 	git reset --hard &&
 	git checkout B^0 &&
-- 
gitgitgadget


* [PATCH v4 3/7] merge: do not abort early if one strategy fails to handle the merge
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
  2022-07-22  5:15       ` [PATCH v4 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
  2022-07-22  5:15       ` [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
@ 2022-07-22  5:15       ` Elijah Newren via GitGitGadget
  2022-07-22 10:47         ` Ævar Arnfjörð Bjarmason
  2022-07-22  5:15       ` [PATCH v4 4/7] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
                         ` (4 subsequent siblings)
  7 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-22  5:15 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

builtin/merge is set up to allow multiple strategies to be specified,
and it will find the "best" result and use it.  This is defeated if
some of the merge strategies abort early when they cannot handle the
merge.  Fix the logic that calls recursive and ort to not do such an
early abort, but instead return "2" or "unhandled" so that the next
strategy can try to handle the merge.

Coming up with a testcase for this is somewhat difficult, since
recursive and ort both handle nearly any two-headed merge (there is
a separate code path that checks for non-two-headed merges and
already returns "2" for them).  So use a somewhat synthetic testcase
of having the index not match HEAD before the merge starts, since all
merge strategies will abort for that.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          |  6 ++++--
 t/t6402-merge-rename.sh                  |  2 +-
 t/t6424-merge-unrelated-index-changes.sh | 16 ++++++++++++++++
 t/t6439-merge-co-error-msgs.sh           |  1 +
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 13884b8e836..dec7375bf2a 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -754,8 +754,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 		else
 			clean = merge_recursive(&o, head, remoteheads->item,
 						reversed, &result);
-		if (clean < 0)
-			exit(128);
+		if (clean < 0) {
+			rollback_lock_file(&lock);
+			return 2;
+		}
 		if (write_locked_index(&the_index, &lock,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
diff --git a/t/t6402-merge-rename.sh b/t/t6402-merge-rename.sh
index 3a32b1a45cf..772238e582c 100755
--- a/t/t6402-merge-rename.sh
+++ b/t/t6402-merge-rename.sh
@@ -210,7 +210,7 @@ test_expect_success 'updated working tree file should prevent the merge' '
 	echo >>M one line addition &&
 	cat M >M.saved &&
 	git update-index M &&
-	test_expect_code 128 git pull --no-rebase . yellow &&
+	test_expect_code 2 git pull --no-rebase . yellow &&
 	test_cmp M M.saved &&
 	rm -f M.saved
 '
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index f35d3182b86..8b749e19083 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -268,4 +268,20 @@ test_expect_success 'subtree' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'resolve && recursive && ort' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	test_seq 0 10 >a &&
+	git add a &&
+
+	sane_unset GIT_TEST_MERGE_ALGORITHM &&
+	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
+
+	grep "Trying merge strategy resolve..." output &&
+	grep "Trying merge strategy recursive..." output &&
+	grep "Trying merge strategy ort..." output &&
+	grep "No merge strategy handled the merge." output
+'
+
 test_done
diff --git a/t/t6439-merge-co-error-msgs.sh b/t/t6439-merge-co-error-msgs.sh
index 5bfb027099a..52cf0c87690 100755
--- a/t/t6439-merge-co-error-msgs.sh
+++ b/t/t6439-merge-co-error-msgs.sh
@@ -47,6 +47,7 @@ test_expect_success 'untracked files overwritten by merge (fast and non-fast for
 		export GIT_MERGE_VERBOSITY &&
 		test_must_fail git merge branch 2>out2
 	) &&
+	echo "Merge with strategy ${GIT_TEST_MERGE_ALGORITHM:-ort} failed." >>expect &&
 	test_cmp out2 expect &&
 	git reset --hard HEAD^
 '
-- 
gitgitgadget



* [PATCH v4 4/7] merge: fix save_state() to work when there are stat-dirty files
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-07-22  5:15       ` [PATCH v4 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
@ 2022-07-22  5:15       ` Elijah Newren via GitGitGadget
  2022-07-22  5:15       ` [PATCH v4 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
                         ` (3 subsequent siblings)
  7 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-22  5:15 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When there are stat-dirty files, but no files are modified,
`git stash create` exits with unsuccessful status.  This causes merge
to fail.  Copy some code from sequencer.c's create_autostash to refresh
the index first to avoid this problem.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          |  8 ++++++++
 t/t6424-merge-unrelated-index-changes.sh | 11 +++++++++++
 2 files changed, 19 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index dec7375bf2a..4170c30317e 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -313,8 +313,16 @@ static int save_state(struct object_id *stash)
 	int len;
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf buffer = STRBUF_INIT;
+	struct lock_file lock_file = LOCK_INIT;
+	int fd;
 	int rc = -1;
 
+	fd = repo_hold_locked_index(the_repository, &lock_file, 0);
+	refresh_cache(REFRESH_QUIET);
+	if (0 <= fd)
+		repo_update_index_if_able(the_repository, &lock_file);
+	rollback_lock_file(&lock_file);
+
 	strvec_pushl(&cp.args, "stash", "create", NULL);
 	cp.out = -1;
 	cp.git_cmd = 1;
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index 8b749e19083..3019d030e07 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -268,6 +268,17 @@ test_expect_success 'subtree' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'avoid failure due to stat-dirty files' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	# Make "a" be stat-dirty
+	test-tool chmtime =+1 a &&
+
+	# stat-dirty file should not prevent stash creation in builtin/merge.c
+	git merge -s resolve -s recursive D^0
+'
+
 test_expect_success 'resolve && recursive && ort' '
 	git reset --hard &&
 	git checkout B^0 &&
-- 
gitgitgadget



* [PATCH v4 5/7] merge: make restore_state() restore staged state too
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                         ` (3 preceding siblings ...)
  2022-07-22  5:15       ` [PATCH v4 4/7] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
@ 2022-07-22  5:15       ` Elijah Newren via GitGitGadget
  2022-07-22 10:53         ` Ævar Arnfjörð Bjarmason
  2022-07-22  5:15       ` [PATCH v4 6/7] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
                         ` (2 subsequent siblings)
  7 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-22  5:15 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

There are multiple issues at play here:

  1) If `git merge` is invoked with staged changes, it should abort
     without doing any merging, and the user's working tree and index
     should be the same as before merge was invoked.
  2) Merge strategies are responsible for enforcing the index == HEAD
     requirement. (See 9822175d2b ("Ensure index matches head before
     invoking merge machinery, round N", 2019-08-17) for some history
     around this.)
  3) Merge strategies can bail saying they are not an appropriate
     handler for the merge in question (possibly allowing other
     strategies to be used instead).
  4) Merge strategies can make changes to the index and working tree,
     and have no expectation to clean up after themselves, *even* if
     they bail out and say they are not an appropriate handler for
     the merge in question.  (The `octopus` merge strategy does this,
     for example.)
  5) Because of (3) and (4), builtin/merge.c stashes state before
     trying merge strategies and restores it afterward.

Unfortunately, if users had staged changes before calling `git merge`,
builtin/merge.c could do the following:

   * stash the changes, in order to clean up after the strategies
   * try all the merge strategies in turn, each of which reports that it
     cannot function due to the index not matching HEAD
   * restore the changes via "git stash apply"

But that last step would have the net effect of unstaging the user's
changes.  Fix this by adding the "--index" option to "git stash apply".
While at it, also squelch the stash apply output; we already report
"Rewinding the tree to pristine..." and don't need a detailed `git
status` report afterwards.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          | 5 +++--
 t/t6424-merge-unrelated-index-changes.sh | 7 ++++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 4170c30317e..f807bf335bd 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -383,14 +383,15 @@ static void reset_hard(const struct object_id *oid, int verbose)
 static void restore_state(const struct object_id *head,
 			  const struct object_id *stash)
 {
-	const char *args[] = { "stash", "apply", NULL, NULL };
+	const char *args[] = { "stash", "apply", "--index", "--quiet",
+			       NULL, NULL };
 
 	if (is_null_oid(stash))
 		return;
 
 	reset_hard(head, 1);
 
-	args[2] = oid_to_hex(stash);
+	args[4] = oid_to_hex(stash);
 
 	/*
 	 * It is OK to ignore error here, for example when there was
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index 3019d030e07..c96649448fa 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -285,6 +285,7 @@ test_expect_success 'resolve && recursive && ort' '
 
 	test_seq 0 10 >a &&
 	git add a &&
+	git rev-parse :a >expect &&
 
 	sane_unset GIT_TEST_MERGE_ALGORITHM &&
 	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
@@ -292,7 +293,11 @@ test_expect_success 'resolve && recursive && ort' '
 	grep "Trying merge strategy resolve..." output &&
 	grep "Trying merge strategy recursive..." output &&
 	grep "Trying merge strategy ort..." output &&
-	grep "No merge strategy handled the merge." output
+	grep "No merge strategy handled the merge." output &&
+
+	# Changes to "a" should remain staged
+	git rev-parse :a >actual &&
+	test_cmp expect actual
 '
 
 test_done
-- 
gitgitgadget



* [PATCH v4 6/7] merge: ensure we can actually restore pre-merge state
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                         ` (4 preceding siblings ...)
  2022-07-22  5:15       ` [PATCH v4 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
@ 2022-07-22  5:15       ` Elijah Newren via GitGitGadget
  2022-07-22  5:15       ` [PATCH v4 7/7] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
  7 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-22  5:15 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Merge strategies can:
  * succeed with a clean merge
  * succeed with a conflicted merge
  * fail to handle the given type of merge

If one is thinking in terms of automatic mergeability, they would use
the word "fail" instead of "succeed" for the second bullet, but I am
focusing here on the ability of the merge strategy to handle the given
inputs, not on whether the given inputs are mergeable.  The third
category is about the merge strategy failing to know how to handle the
given data; examples include:

  * Passing more than 2 branches to 'recursive' or 'ort'
  * Passing 2 or fewer branches to 'octopus'
  * Trying to do more complicated merges with 'resolve' (I believe
    directory/file conflicts will cause it to bail.)
  * Octopus running into a merge conflict for any branch OTHER than
    the final one (see the "exit 2" codepath of commit 98efc8f3d8
    ("octopus: allow manual resolve on the last round.", 2006-01-13))

That final one is particularly interesting, because it shows that the
merge strategy can muck with the index and working tree, and THEN bail
and say "sorry, this strategy cannot handle this type of merge; use
something else".

Further, we do not currently expect the individual strategies to clean
up after themselves, but instead expect builtin/merge.c to do so.  For
it to be able to, it needs to save the state before trying the merge
strategy so it can have something to restore to.  Therefore, remove the
shortcut bypassing the save_state() call.

There is another bug on the restore_state() side of things, so no
testcase will be added until the next commit when we have addressed that
issue as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
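Reader's note: the effect of the removed shortcut can be sketched in
shell.  This is only a model under stated assumptions: save_state here
is a hypothetical stand-in that always succeeds, and "! save_state"
plays the role of the C expression save_state(&stash), which is
non-zero on failure.

```shell
# Hypothetical stand-in for save_state(): notes that it ran and succeeds.
save_state () {
	saved=yes
}

# Old condition: "nr == 1 || save_state failed".  With one strategy the
# left side is already true, so save_state is never called.
old_condition () {
	saved=no
	if test "$1" -eq 1 || ! save_state
	then
		: # the stash ID would be cleared here
	fi
	echo "$saved"
}

# New condition: save_state always runs, whatever the strategy count.
new_condition () {
	saved=no
	if ! save_state
	then
		: # the stash ID would be cleared here
	fi
	echo "$saved"
}
```

Calling old_condition with a count of 1 reports that nothing was
saved, while new_condition saves state regardless of the count.
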
 builtin/merge.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index f807bf335bd..11bb4bab0a1 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1686,12 +1686,12 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	 * tree in the index -- this means that the index must be in
 	 * sync with the head commit.  The strategies are responsible
 	 * to ensure this.
+	 *
+	 * Stash away the local changes so that we can try more than one
+	 * and/or recover from merge strategies bailing while leaving the
+	 * index and working tree polluted.
 	 */
-	if (use_strategies_nr == 1 ||
-	    /*
-	     * Stash away the local changes so that we can try more than one.
-	     */
-	    save_state(&stash))
+	if (save_state(&stash))
 		oidclr(&stash);
 
 	for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
-- 
gitgitgadget



* [PATCH v4 7/7] merge: do not exit restore_state() prematurely
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                         ` (5 preceding siblings ...)
  2022-07-22  5:15       ` [PATCH v4 6/7] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
@ 2022-07-22  5:15       ` Elijah Newren via GitGitGadget
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
  7 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-22  5:15 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Previously, if:

* The user had no local changes before starting the merge
* A merge strategy made changes to the working tree/index but returned
  with exit status 2

then we'd call restore_state() to clean up the changes and either let
the next merge strategy run (if there is one), or exit telling the user
that no merge strategy could handle the merge.  Unfortunately,
restore_state() did not clean up the changes as expected; that function
was a no-op if the stash was null, and the stash would be null if
there were no local changes before starting the merge.  So, instead of
"Rewinding the tree to pristine..." as the code claimed, restore_state()
would leave garbage around in the index and working tree (possibly
including conflicts) for either the next merge strategy or for the user
after aborting the merge.  And in the case of aborting the merge, the
user would be unable to run "git merge --abort" to get rid of the
unintended leftover conflicts, because the merge control files were not
written as it was presumed that we had restored to a clean state
already.

Fix the main problem by making sure that restore_state() only skips the
stash application if the stash is null rather than skipping the whole
function.

However, there is a secondary problem -- since merge.c forks
subprocesses to do the cleanup, the in-memory index is left out-of-sync.
While there was a refresh_cache(REFRESH_QUIET) call that attempted to
correct that, that function would not handle cases where the previous
merge strategy added conflicted entries.  We need to drop the index and
re-read it to handle such cases.

(Alternatively, we could stop forking subprocesses and instead call some
appropriate function to do the work which would update the in-memory
index automatically.  For now, just do the simple fix.)

Also, add a testcase checking this, one for which the octopus strategy
fails on the first commit it attempts to merge, and thus which it
cannot handle at all and must completely bail on (as per the "exit 2"
code path of commit 98efc8f3d8 ("octopus: allow manual resolve on the
last round.", 2006-01-13)).

Reported-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---
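Reader's note: the change in control flow can be sketched as a shell
model that only prints the steps taken.  The step names ("reset-hard",
"stash-apply", and so on) are placeholders invented for illustration,
and an empty second argument stands in for a null stash; neither
function runs any git command.

```shell
# "$1" is the commit to reset to; "$2" is the stash ID, empty when
# there were no local changes to stash.

# Old flow: a null stash returned before the reset happened.
restore_state_old () {
	test -n "$2" || return 0
	echo "reset-hard $1"
	echo "stash-apply $2"
	echo "refresh-index"
}

# New flow: always reset; only the stash application is skipped.
restore_state_new () {
	echo "reset-hard $1"
	if test -n "$2"
	then
		echo "stash-apply $2"
	fi
	echo "reread-index"
}
```

With an empty stash ID, restore_state_old prints nothing at all, while
restore_state_new still prints the reset and index re-read steps.
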
 builtin/merge.c        | 10 ++++++----
 t/t7607-merge-state.sh | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100755 t/t7607-merge-state.sh

diff --git a/builtin/merge.c b/builtin/merge.c
index 11bb4bab0a1..7fb4414ebb7 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -386,11 +386,11 @@ static void restore_state(const struct object_id *head,
 	const char *args[] = { "stash", "apply", "--index", "--quiet",
 			       NULL, NULL };
 
-	if (is_null_oid(stash))
-		return;
-
 	reset_hard(head, 1);
 
+	if (is_null_oid(stash))
+		goto refresh_cache;
+
 	args[4] = oid_to_hex(stash);
 
 	/*
@@ -399,7 +399,9 @@ static void restore_state(const struct object_id *head,
 	 */
 	run_command_v_opt(args, RUN_GIT_CMD);
 
-	refresh_cache(REFRESH_QUIET);
+refresh_cache:
+	if (discard_cache() < 0 || read_cache() < 0)
+		die(_("could not read index"));
 }
 
 /* This is called when no merge was necessary. */
diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
new file mode 100755
index 00000000000..fc33d57357b
--- /dev/null
+++ b/t/t7607-merge-state.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description="Test that merge state is as expected after failed merge"
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+. ./test-lib.sh
+
+test_expect_success 'set up custom strategy' '
+	test_commit --no-tag "Initial" base base &&
+
+	for b in branch1 branch2 branch3
+	do
+		git checkout -b $b main &&
+		test_commit --no-tag "Change on $b" base $b || return 1
+	done &&
+
+	git checkout branch1 &&
+	# This is a merge that octopus cannot handle.  Note, that it does not
+	# just hit conflicts, it completely fails and says that it cannot
+	# handle this type of merge.
+	test_expect_code 2 git merge branch2 branch3 >output 2>&1 &&
+	grep "fatal: merge program failed" output &&
+	grep "Should not be doing an octopus" output &&
+
+	# Make sure we did not leave stray changes around when no appropriate
+	# merge strategy was found
+	git diff --exit-code --name-status &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
+test_done
-- 
gitgitgadget


* Re: [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD
  2022-07-22  5:15       ` [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
@ 2022-07-22 10:27         ` Ævar Arnfjörð Bjarmason
  2022-07-23  0:28           ` Elijah Newren
  0 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-22 10:27 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren


On Fri, Jul 22 2022, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
>
> As noted in commit 9822175d2b ("Ensure index matches head before
> invoking merge machinery, round N", 2019-08-17), we have had a very
> long history of problems with failing to enforce the requirement that
> index matches HEAD when starting a merge.  One of the commits
> referenced in the long tale of issues arising from lax enforcement of
> this requirement was commit 55f39cf755 ("merge: fix misleading
> pre-merge check documentation", 2018-06-30), which tried to document
> the requirement and noted there were some exceptions.  As mentioned in
> that commit message, the `resolve` strategy was the one strategy that
> did not have an explicit index matching HEAD check, and the reason it
> didn't was that I wasn't able to discover any cases where the
> implementation would fail to catch the problem and abort, and didn't
> want to introduce unnecessary performance overhead of adding another
> check.
>
> Well, today I discovered a testcase where the implementation does not
> catch the problem and so an explicit check is needed.  Add a testcase
> that previously would have failed, and update git-merge-resolve.sh to
> have an explicit check.  Note that the code is copied from 3ec62ad9ff
> ("merge-octopus: abort if index does not match HEAD", 2016-04-09), so
> that we reuse the same message and avoid making translators need to
> translate some new message.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c                          | 20 ++++++++++++++++++
>  git-merge-resolve.sh                     | 10 +++++++++
>  t/t6424-merge-unrelated-index-changes.sh | 26 ++++++++++++++++++++++++
>  3 files changed, 56 insertions(+)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 23170f2d2a6..13884b8e836 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -1599,6 +1599,26 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>  		 */
>  		refresh_cache(REFRESH_QUIET);
>  		if (allow_trivial && fast_forward != FF_ONLY) {
> +			/*
> +			 * Must first ensure that index matches HEAD before
> +			 * attempting a trivial merge.
> +			 */
> +			struct tree *head_tree = get_commit_tree(head_commit);
> +			struct strbuf sb = STRBUF_INIT;
> +
> +			if (repo_index_has_changes(the_repository, head_tree,
> +						   &sb)) {
> +				struct strbuf err = STRBUF_INIT;
> +				strbuf_addstr(&err, "error: ");
> +				strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
> +					    sb.buf);
> +				strbuf_addch(&err, '\n');

At first glance I was expecting this to construct an error message to
emit it somewhere other than stderr, so I wondered if you couldn't use
the "error_routine" facility to avoid re-inventing "error: " etc.,
but...

> +				fputs(err.buf, stderr);

...we emit it to stderr anyway...?

> +				strbuf_release(&err);
> +				strbuf_release(&sb);
> +				return -1;
> +			}
> +
>  			/* See if it is really trivial. */
>  			git_committer_info(IDENT_STRICT);
>  			printf(_("Trying really trivial in-index merge...\n"));
> diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
> index 343fe7bccd0..77e93121bf8 100755
> --- a/git-merge-resolve.sh
> +++ b/git-merge-resolve.sh
> @@ -5,6 +5,16 @@
>  #
>  # Resolve two trees, using enhanced multi-base read-tree.
>  
> +. git-sh-setup
> +
> +# Abort if index does not match HEAD
> +if ! git diff-index --quiet --cached HEAD --
> +then
> +    gettextln "Error: Your local changes to the following files would be overwritten by merge"
> +    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
> +    exit 2
> +fi

(The "..." continued below)

Just in trying to poke holes in this I made this an "exit 0", and
neither of the tests you added failed, but the last one ("resolve &&
recursive && ort") in the t6424*.sh will fail, is that intentional?

I don't know enough about the context here, but given our *.sh->C
migration elsewhere it's a bit unfortunate to see more *.sh code added
back. We have "git merge" driving this; isn't it OK to have it make this
check before invoking "resolve"? (This may be a stupid question.)

For this code in particular it:

 * Uses spaces, not tabs
 * We lose the diff-index .. --name-only exit code (segfault), but so
   did the older version
 * I wonder if bending over backwards to emit the exact message we
   emitted before is worth it
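
To illustrate the point about the lost exit code, here is a minimal
sketch; list_files is a made-up stand-in for a command that prints
some output and then fails, and none of this is taken from
git-merge-resolve.sh itself:

```shell
# Hypothetical stand-in for a command that prints a file name and then
# fails, as a crashing "git diff-index" might.
list_files () {
	echo "a.txt"
	return 3
}

# The status of a pipeline is that of its last command (unless the
# shell's pipefail option is on), so the failure above is hidden.
indented_listing () {
	list_files | sed -e 's/^/    /'
}

# Capturing the output first keeps the producer's exit status.
checked_listing () {
	out=$(list_files) || return $?
	printf '%s\n' "$out" | sed -e 's/^/    /'
}
```

Here checked_listing returns 3 without printing anything, whereas
indented_listing prints the indented name and reports sed's status.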

If you just make this something like (untested):

	{
		gettext "error: " &&
		gettextln "Your local..."
	}

You could re-use the translation from the *.c one (and the "error: " one
we'll get from usage.c).

That leaves "\n %s" as the difference, but we could just remove that
from the _() and emit it unconditionally, no?


>  # The first parameters up to -- are merge bases; the rest are heads.
>  bases= head= remotes= sep_seen=
>  for arg
> diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
> index b6e424a427b..f35d3182b86 100755
> --- a/t/t6424-merge-unrelated-index-changes.sh
> +++ b/t/t6424-merge-unrelated-index-changes.sh
> @@ -114,6 +114,32 @@ test_expect_success 'resolve, non-trivial' '
>  	test_path_is_missing .git/MERGE_HEAD
>  '
>  
> +test_expect_success 'resolve, trivial, related file removed' '
> +	git reset --hard &&
> +	git checkout B^0 &&
> +
> +	git rm a &&
> +	test_path_is_missing a &&
> +
> +	test_must_fail git merge -s resolve C^0 &&
> +
> +	test_path_is_missing a &&
> +	test_path_is_missing .git/MERGE_HEAD
> +'
> +
> +test_expect_success 'resolve, non-trivial, related file removed' '
> +	git reset --hard &&
> +	git checkout B^0 &&
> +
> +	git rm a &&
> +	test_path_is_missing a &&
> +
> +	test_must_fail git merge -s resolve D^0 &&
> +
> +	test_path_is_missing a &&
> +	test_path_is_missing .git/MERGE_HEAD
> +'
> +
>  test_expect_success 'recursive' '
>  	git reset --hard &&
>  	git checkout B^0 &&

...I tried with this change on top, it seems to me like you'd want this
in any case, it passes the tests both with & without the C code change,
so can't we just use error() here?
	
	diff --git a/builtin/merge.c b/builtin/merge.c
	index 7fb4414ebb7..64def49734a 100644
	--- a/builtin/merge.c
	+++ b/builtin/merge.c
	@@ -1621,13 +1621,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
	 
	 			if (repo_index_has_changes(the_repository, head_tree,
	 						   &sb)) {
	-				struct strbuf err = STRBUF_INIT;
	-				strbuf_addstr(&err, "error: ");
	-				strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
	-					    sb.buf);
	-				strbuf_addch(&err, '\n');
	-				fputs(err.buf, stderr);
	-				strbuf_release(&err);
	+				error(_("Your local changes to the following files would be overwritten by merge:\n  %s"),
	+				      sb.buf);
	 				strbuf_release(&sb);
	 				return -1;
	 			}
	diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
	index c96649448fa..1df130b9ee6 100755
	--- a/t/t6424-merge-unrelated-index-changes.sh
	+++ b/t/t6424-merge-unrelated-index-changes.sh
	@@ -96,7 +96,12 @@ test_expect_success 'resolve, trivial' '
	 
	 	touch random_file && git add random_file &&
	 
	-	test_must_fail git merge -s resolve C^0 &&
	+	sed -e "s/^> //g" >expect <<-\EOF &&
	+	> error: Your local changes to the following files would be overwritten by merge:
	+	>   random_file
	+	EOF
	+	test_must_fail git merge -s resolve C^0 2>actual &&
	+	test_cmp expect actual &&
	 	test_path_is_file random_file &&
	 	git rev-parse --verify :random_file &&
	 	test_path_is_missing .git/MERGE_HEAD
	


* Re: [PATCH v4 3/7] merge: do not abort early if one strategy fails to handle the merge
  2022-07-22  5:15       ` [PATCH v4 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
@ 2022-07-22 10:47         ` Ævar Arnfjörð Bjarmason
  2022-07-23  0:36           ` Elijah Newren
  0 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-22 10:47 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren


On Fri, Jul 22 2022, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
>
> builtin/merge is set up to allow multiple strategies to be specified,
> and it will find the "best" result and use it.  This is defeated if
> some of the merge strategies abort early when they cannot handle the
> merge.  Fix the logic that calls recursive and ort to not do such an
> early abort, but instead return "2" or "unhandled" so that the next
> strategy can try to handle the merge.
>
> Coming up with a testcase for this is somewhat difficult, since
> recursive and ort both handle nearly any two-headed merge (there is
> a separate code path that checks for non-two-headed merges and
> already returns "2" for them).  So use a somewhat synthetic testcase
> of having the index not match HEAD before the merge starts, since all
> merge strategies will abort for that.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c                          |  6 ++++--
>  t/t6402-merge-rename.sh                  |  2 +-
>  t/t6424-merge-unrelated-index-changes.sh | 16 ++++++++++++++++
>  t/t6439-merge-co-error-msgs.sh           |  1 +
>  4 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 13884b8e836..dec7375bf2a 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -754,8 +754,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
>  		else
>  			clean = merge_recursive(&o, head, remoteheads->item,
>  						reversed, &result);
> -		if (clean < 0)
> -			exit(128);
> +		if (clean < 0) {
> +			rollback_lock_file(&lock);
> +			return 2;
> +		}
>  		if (write_locked_index(&the_index, &lock,
>  				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
>  			die(_("unable to write %s"), get_index_file());
> diff --git a/t/t6402-merge-rename.sh b/t/t6402-merge-rename.sh
> index 3a32b1a45cf..772238e582c 100755
> --- a/t/t6402-merge-rename.sh
> +++ b/t/t6402-merge-rename.sh
> @@ -210,7 +210,7 @@ test_expect_success 'updated working tree file should prevent the merge' '
>  	echo >>M one line addition &&
>  	cat M >M.saved &&
>  	git update-index M &&
> -	test_expect_code 128 git pull --no-rebase . yellow &&
> +	test_expect_code 2 git pull --no-rebase . yellow &&
>  	test_cmp M M.saved &&
>  	rm -f M.saved
>  '
> diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
> index f35d3182b86..8b749e19083 100755
> --- a/t/t6424-merge-unrelated-index-changes.sh
> +++ b/t/t6424-merge-unrelated-index-changes.sh
> @@ -268,4 +268,20 @@ test_expect_success 'subtree' '
>  	test_path_is_missing .git/MERGE_HEAD
>  '
>  
> +test_expect_success 'resolve && recursive && ort' '
> +	git reset --hard &&
> +	git checkout B^0 &&
> +
> +	test_seq 0 10 >a &&
> +	git add a &&
> +
> +	sane_unset GIT_TEST_MERGE_ALGORITHM &&
> +	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
> +
> +	grep "Trying merge strategy resolve..." output &&
> +	grep "Trying merge strategy recursive..." output &&
> +	grep "Trying merge strategy ort..." output &&
> +	grep "No merge strategy handled the merge." output
> +'

Ah, re my feedback on 2/7 I hadn't read ahead. This is the test I
mentioned as failing with the code added in 2/7 if it's tweaked to be
s/exit 2/exit 0/.

So it's a bit odd to have code added in 2/7 that's tested in 3/7. I
think this would be much easier to understand if these tests came before
all these code changes, so then as the changes are made we can see how
the behavior changes.

But short of that at least having the relevant part of this for 2/7 in
that commit would be better, i.e. the thing that tests that new
"diff-index" check in some way...



* Re: [PATCH v4 5/7] merge: make restore_state() restore staged state too
  2022-07-22  5:15       ` [PATCH v4 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
@ 2022-07-22 10:53         ` Ævar Arnfjörð Bjarmason
  2022-07-23  1:56           ` Elijah Newren
  0 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-22 10:53 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren


On Fri, Jul 22 2022, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
>
> There are multiple issues at play here:
>
>   1) If `git merge` is invoked with staged changes, it should abort
>      without doing any merging, and the user's working tree and index
>      should be the same as before merge was invoked.
>   2) Merge strategies are responsible for enforcing the index == HEAD
>      requirement. (See 9822175d2b ("Ensure index matches head before
>      invoking merge machinery, round N", 2019-08-17) for some history
>      around this.)
>   3) Merge strategies can bail saying they are not an appropriate
>      handler for the merge in question (possibly allowing other
>      strategies to be used instead).
>   4) Merge strategies can make changes to the index and working tree,
>      and have no expectation to clean up after themselves, *even* if
>      they bail out and say they are not an appropriate handler for
>      the merge in question.  (The `octopus` merge strategy does this,
>      for example.)
>   5) Because of (3) and (4), builtin/merge.c stashes state before
>      trying merge strategies and restores it afterward.
>
> Unfortunately, if users had staged changes before calling `git merge`,
> builtin/merge.c could do the following:
>
>    * stash the changes, in order to clean up after the strategies
>    * try all the merge strategies in turn, each of which report they
>      cannot function due to the index not matching HEAD
>    * restore the changes via "git stash apply"
>
> But that last step would have the net effect of unstaging the user's
> changes.  Fix this by adding the "--index" option to "git stash apply".
> While at it, also squelch the stash apply output; we already report
> "Rewinding the tree to pristine..." and don't need a detailed `git
> status` report afterwards.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c                          | 5 +++--
>  t/t6424-merge-unrelated-index-changes.sh | 7 ++++++-
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 4170c30317e..f807bf335bd 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -383,14 +383,15 @@ static void reset_hard(const struct object_id *oid, int verbose)
>  static void restore_state(const struct object_id *head,
>  			  const struct object_id *stash)
>  {
> -	const char *args[] = { "stash", "apply", NULL, NULL };
> +	const char *args[] = { "stash", "apply", "--index", "--quiet",
> +			       NULL, NULL };
>  
>  	if (is_null_oid(stash))
>  		return;
>  
>  	reset_hard(head, 1);
>  
> -	args[2] = oid_to_hex(stash);
> +	args[4] = oid_to_hex(stash);
>  
>  	/*
>  	 * It is OK to ignore error here, for example when there was

Just a nit/side comment: This is one of these older-style arg
constructions that we've replaced with strvec in most other places.

Let's leave this alone for now (especially in a v4), but FWIW I wouldn't
mind if this sort of code were converted to strvec while at it:
	
	diff --git a/builtin/merge.c b/builtin/merge.c
	index 64def49734a..c3a3a1fde50 100644
	--- a/builtin/merge.c
	+++ b/builtin/merge.c
	@@ -383,21 +383,23 @@ static void reset_hard(const struct object_id *oid, int verbose)
	 static void restore_state(const struct object_id *head,
	 			  const struct object_id *stash)
	 {
	-	const char *args[] = { "stash", "apply", "--index", "--quiet",
	-			       NULL, NULL };
	+	struct strvec args = STRVEC_INIT;
	+
	+	strvec_pushl(&args, "stash", "apply", "--index", "--quiet", NULL);
	 
	 	reset_hard(head, 1);
	 
	 	if (is_null_oid(stash))
	 		goto refresh_cache;
	 
	-	args[4] = oid_to_hex(stash);
	+	strvec_push(&args, oid_to_hex(stash));
	 
	 	/*
	 	 * It is OK to ignore error here, for example when there was
	 	 * nothing to restore.
	 	 */
	-	run_command_v_opt(args, RUN_GIT_CMD);
	+	run_command_v_opt(args.v, RUN_GIT_CMD);
	+	strvec_clear(&args);
	 
	 refresh_cache:
	 	if (discard_cache() < 0 || read_cache() < 0)

I.e. it takes about as much mental energy to review that as counting the
args elements and seeing that 2 to 4 is correct :)
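
The effect of "--index" described in the quoted commit message can be
reproduced by hand. The following is a minimal sketch, not code from the
series: show_restore_result is a hypothetical helper, the throwaway
repository and the "demo" identity are made up for illustration, and using
"git stash create" to stand in for what save_state() records is an
assumption about the closest command-line equivalent.

```shell
#!/bin/sh
# Sketch only: show_restore_result is a hypothetical helper, not part of git.
# Usage: show_restore_result [--index]
# Prints "staged" if the user's change is still staged after the
# stash/reset/apply round trip, and "unstaged" otherwise.
show_restore_result () {
	repo=$(mktemp -d) || return 1
	(
		cd "$repo" &&
		# Made-up identity so commits work in a bare environment.
		GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com &&
		GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com &&
		export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL &&
		export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL &&
		git init -q . &&
		echo base >file &&
		git add file &&
		git commit -q -m base &&
		echo changed >file &&
		git add file &&               # the user's staged change
		stash=$(git stash create) &&  # assumed stand-in for save_state()
		git reset -q --hard &&        # "Rewinding the tree to pristine..."
		git stash apply --quiet "$@" "$stash" &&
		if git diff-index --quiet --cached HEAD --
		then
			echo unstaged
		else
			echo staged
		fi
	)
	status=$?
	rm -rf "$repo"
	return $status
}
```

Calling the helper with no argument mimics the old restore_state()
invocation, and calling it with "--index" mimics the patched one; comparing
the two outputs shows whether the staged change survived.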

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD
  2022-07-22 10:27         ` Ævar Arnfjörð Bjarmason
@ 2022-07-23  0:28           ` Elijah Newren
  2022-07-23  5:44             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren @ 2022-07-23  0:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Junio C Hamano

On Fri, Jul 22, 2022 at 3:46 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Fri, Jul 22 2022, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > As noted in commit 9822175d2b ("Ensure index matches head before
> > invoking merge machinery, round N", 2019-08-17), we have had a very
> > long history of problems with failing to enforce the requirement that
> > index matches HEAD when starting a merge.  One of the commits
> > referenced in the long tale of issues arising from lax enforcement of
> > this requirement was commit 55f39cf755 ("merge: fix misleading
> > pre-merge check documentation", 2018-06-30), which tried to document
> > the requirement and noted there were some exceptions.  As mentioned in
> > that commit message, the `resolve` strategy was the one strategy that
> > did not have an explicit index matching HEAD check, and the reason it
> > didn't was that I wasn't able to discover any cases where the
> > implementation would fail to catch the problem and abort, and didn't
> > want to introduce unnecessary performance overhead of adding another
> > check.
> >
> > Well, today I discovered a testcase where the implementation does not
> > catch the problem and so an explicit check is needed.  Add a testcase
> > that previously would have failed, and update git-merge-resolve.sh to
> > have an explicit check.  Note that the code is copied from 3ec62ad9ff
> > ("merge-octopus: abort if index does not match HEAD", 2016-04-09), so
> > that we reuse the same message and avoid making translators need to
> > translate some new message.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  builtin/merge.c                          | 20 ++++++++++++++++++
> >  git-merge-resolve.sh                     | 10 +++++++++
> >  t/t6424-merge-unrelated-index-changes.sh | 26 ++++++++++++++++++++++++
> >  3 files changed, 56 insertions(+)
> >
> > diff --git a/builtin/merge.c b/builtin/merge.c
> > index 23170f2d2a6..13884b8e836 100644
> > --- a/builtin/merge.c
> > +++ b/builtin/merge.c
> > @@ -1599,6 +1599,26 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
> >                */
> >               refresh_cache(REFRESH_QUIET);
> >               if (allow_trivial && fast_forward != FF_ONLY) {
> > +                     /*
> > +                      * Must first ensure that index matches HEAD before
> > +                      * attempting a trivial merge.
> > +                      */
> > +                     struct tree *head_tree = get_commit_tree(head_commit);
> > +                     struct strbuf sb = STRBUF_INIT;
> > +
> > +                     if (repo_index_has_changes(the_repository, head_tree,
> > +                                                &sb)) {
> > +                             struct strbuf err = STRBUF_INIT;
> > +                             strbuf_addstr(&err, "error: ");
> > +                             strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
> > +                                         sb.buf);
> > +                             strbuf_addch(&err, '\n');
>
> At first glance I was expecting this to construct an error message to
> emit it somewhere else than stderr, so I wondered if you couldn't use
> the "error_routine" facility to avoid re-inventing "error: " etc.,
> but...
>
> > +                             fputs(err.buf, stderr);
>
> ...we emit it to stderr anyway...?
>
> > +                             strbuf_release(&err);
> > +                             strbuf_release(&sb);
> > +                             return -1;
> > +                     }
> > +

UGH!  I fixed the other one of these in my reroll yesterday[1].  I
_knew_ I had copied that code somewhere else, but for some reason I
thought it was in a different series and went searching for it.  Don't
know why I couldn't remember that it was in the same series, and I'm
not sure how I missed it when I went looking.  I mean, I know I was
tired yesterday, but that's still kinda bad.

Anyway, thanks for catching; I'll fix this one too.

[1] https://lore.kernel.org/git/xmqqsfmulb6w.fsf@gitster.g/

> >                       /* See if it is really trivial. */
> >                       git_committer_info(IDENT_STRICT);
> >                       printf(_("Trying really trivial in-index merge...\n"));
> > diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
> > index 343fe7bccd0..77e93121bf8 100755
> > --- a/git-merge-resolve.sh
> > +++ b/git-merge-resolve.sh
> > @@ -5,6 +5,16 @@
> >  #
> >  # Resolve two trees, using enhanced multi-base read-tree.
> >
> > +. git-sh-setup
> > +
> > +# Abort if index does not match HEAD
> > +if ! git diff-index --quiet --cached HEAD --
> > +then
> > +    gettextln "Error: Your local changes to the following files would be overwritten by merge"
> > +    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
> > +    exit 2
> > +fi
>
> (The "..." continued below)
>
> Just in trying to poke holes in this I made this an "exit 0", and
> neither of the tests you added failed, but the last one ("resolve &&
> recursive && ort") in the t6424*.sh will fail, is that intentional?

Nope it's not intentional.  I had tested one fix (for these
git-merge-resolve.sh changes) and verified they were good (and
necessary), then found another bug (the one fixed by the
builtin/merge.c changes) and added a test for it, and just decided to
amend it into the same commit.  Turns out the builtin/merge.c changes
mask the fix for git-merge-resolve.sh here and makes that code go
unexercised.  I'll split these two bugfixes into separate patches, and
tweak one of the two testcases to make sure it continues to exercise
the new codepath added to git-merge-resolve.sh.

> I don't know enough about the context here, but given our *.sh->C
> migration elsewhere it's a bit unfortunate to see more *.sh code added
> back.

This seems like a curious objection.  "We are trying to get rid of
shell scripts, so don't even fix bugs in any of the existing ones." ?

> We have "git merge" driving this, isn't it OK to have it make this
> check before invoking "resolve" (may be a stupid question).

Ah, I can kind of see where you're coming from now, but that seems to
me to be bending over backwards in attempting to fix a component
written in shell without actually modifying the shell.
builtin/merge.c is some glue code that can call multiple different
strategies, but isn't the place for the implementation of the
strategies themselves, and I'd hate to see us put half the
implementation in one place and half in another.  In addition, besides
the separation of concerns issue:

   * We document that users can add their own merge strategies (a
     shell script or executable named git-merge-$USERNAME and "git
     merge -s $USERNAME" will call them)
   * git-merge-resolve and git-merge-octopus serve as examples
   * Our examples should demonstrate correct behavior and perform
     documented, required steps.  This particular check is important:

    /*
     * At this point, we need a real merge.  No matter what strategy
     * we use, it would operate on the index, possibly affecting the
     * working tree, and when resolved cleanly, have the desired
     * tree in the index -- this means that the index must be in
     * sync with the head commit.  The strategies are responsible
     * to ensure this.
     */

So, even if someone were to reimplement git-merge-resolve.sh in C, and
start the deprecation process with some merge.useBuiltinResolve config
setting (similar to rebase.useBuiltin), I'd still want this shell fix
added to git-merge-resolve.sh in the meantime, both as an important
bugfix, and so that people looking for merge strategy examples who
find this script hopefully find a new enough version with this
important check included.
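
For anyone writing such a strategy, the required check can be sketched as a
small function. This is an illustration under stated assumptions rather
than the code from the patch: require_index_matches_head is a hypothetical
name, plain echo replaces gettextln so the snippet does not depend on
git-sh-setup, and sending the message to stderr is a choice made here, not
something the patch does.

```shell
#!/bin/sh
# Sketch only: require_index_matches_head is a hypothetical helper that a
# custom strategy script (git-merge-<name>) could call before doing any work.
# Returns 0 when the index matches HEAD, and 2 ("strategy does not handle
# this merge") when there are staged changes.
require_index_matches_head () {
	if ! git diff-index --quiet --cached HEAD --
	then
		# Untranslated message; the real scripts use gettextln.
		echo "Error: Your local changes to the following files would be overwritten by merge" >&2
		# List the offending paths, indented for readability.
		git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /' >&2
		return 2
	fi
}
```

A strategy script would then start with something like
"require_index_matches_head || exit 2" before touching the index or the
working tree.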

In general, if merge strategies do not perform this check, we have
observed that they often will either (a) discard users' staged changes
(best case) or (b) smash staged changes into the created commit and
thus create some kind of evil merge (making it look like they created
a merge normally, and then amended the merge with additional changes).

We're lucky that the way resolve was implemented, other git calls
would usually incidentally catch such issues for us without an
explicit check.  We were also lucky that the observed behavior was
'(a)' rather than '(b)' for resolve.  But the issue should still be
fixed.

> For this code in particular it:
>
>  * Uses spaces, not tabs

Yes, that's fair, but as I mentioned in the commit message, it was
copied from git-merge-octopus.sh.  So, as you say below, "so did the
older version".

>  * We lose the diff-index .. --name-only exit code (segfault), but so
>    did the older version

Um, I don't understand this objection.  I think you are referring to
the pipe to sed, but if so...who cares?  The exit code would be lost
anyway because we aren't running under errexit, and the next line of
code ignores any and all previous exit codes when it runs "exit 2".
And if you're not referring to the pipe to sed but the fact that it
unconditionally returns an exit code of 2 on the next line, then yes
that is the expected return code.  Whatever the diff-index segfault
returns would be the wrong exit status and could fool the
builtin/merge.c into doing the wrong thing.  It expects merge
strategies to return one of three exit codes: 0, 1, or 2:

    /*
     * The backend exits with 1 when conflicts are
     * left to be resolved, with 2 when it does not
     * handle the given merge at all.
     */

So, ignoring the return code from diff-index is correct behavior here.
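
To make the two points above concrete, here is a small sketch in plain
shell. Both function names are hypothetical, and the "unexpected" label for
other exit codes is an assumption added for illustration; the quoted
comment only defines the meaning of 1 and 2.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and not part of git.

# Map a strategy's exit code to the meaning given in the comment quoted
# above.  Anything other than 0, 1 or 2 is labelled "unexpected" here,
# which is an assumption made for this illustration.
describe_strategy_exit () {
	case "$1" in
	0) echo "clean" ;;
	1) echo "conflicts" ;;
	2) echo "unhandled" ;;
	*) echo "unexpected" ;;
	esac
}

# Mimic the shape of the check in git-merge-resolve.sh: a failing command
# piped into sed, followed by an unconditional "exit 2".  The failure on
# the left side of the pipe never reaches the caller; only the 2 does.
exit_code_after_pipe () {
	(
		false | sed -e 's/^/    /'
		exit 2
	)
}
```

Running "exit_code_after_pipe; describe_strategy_exit $?" prints
"unhandled", regardless of how the command on the left side of the pipe
exited.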

Were you thinking this was a test script or something?

>  * I wonder if bending over backwards to emit the exact message we
>    emitted before is worth it
>
> If you just make this something like (untested):
>
>         {
>                 gettext "error: " &&
>                 gettextln "Your local..."
>         }
>
> You could re-use the translation from the *.c one (and the "error: " one
> we'll get from usage.c).
>
> That leaves "\n %s" as the difference, but we could just remove that
> from the _() and emit it unconditionally, no?

??

Copying a few lines from git-merge-octopus.sh to get the same fix it
has is "bending over backwards"?  That's what I call "doing the
easiest thing possible" (and which _also_ has the benefit of being
battle tested code), and then you describe a bunch of gymnastics as an
alternative?  I see your suggestion as running afoul of the objection
you are raising, and the code I'm adding as being a solution to that
particular objection.  So this particular flag you are raising is
confusing to me.

> >  # The first parameters up to -- are merge bases; the rest are heads.
> >  bases= head= remotes= sep_seen=
> >  for arg
> > diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
> > index b6e424a427b..f35d3182b86 100755
> > --- a/t/t6424-merge-unrelated-index-changes.sh
> > +++ b/t/t6424-merge-unrelated-index-changes.sh
> > @@ -114,6 +114,32 @@ test_expect_success 'resolve, non-trivial' '
> >       test_path_is_missing .git/MERGE_HEAD
> >  '
> >
> > +test_expect_success 'resolve, trivial, related file removed' '
> > +     git reset --hard &&
> > +     git checkout B^0 &&
> > +
> > +     git rm a &&
> > +     test_path_is_missing a &&
> > +
> > +     test_must_fail git merge -s resolve C^0 &&
> > +
> > +     test_path_is_missing a &&
> > +     test_path_is_missing .git/MERGE_HEAD
> > +'
> > +
> > +test_expect_success 'resolve, non-trivial, related file removed' '
> > +     git reset --hard &&
> > +     git checkout B^0 &&
> > +
> > +     git rm a &&
> > +     test_path_is_missing a &&
> > +
> > +     test_must_fail git merge -s resolve D^0 &&
> > +
> > +     test_path_is_missing a &&
> > +     test_path_is_missing .git/MERGE_HEAD
> > +'
> > +
> >  test_expect_success 'recursive' '
> >       git reset --hard &&
> >       git checkout B^0 &&
>
> ...I tried with this change on top, it seems to me like you'd want this
> in any case, it passes the tests both with & without the C code change,
> so can't we just use error() here?
>
>         diff --git a/builtin/merge.c b/builtin/merge.c
>         index 7fb4414ebb7..64def49734a 100644
>         --- a/builtin/merge.c
>         +++ b/builtin/merge.c
>         @@ -1621,13 +1621,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>
>                                 if (repo_index_has_changes(the_repository, head_tree,
>                                                            &sb)) {
>         -                               struct strbuf err = STRBUF_INIT;
>         -                               strbuf_addstr(&err, "error: ");
>         -                               strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
>         -                                           sb.buf);
>         -                               strbuf_addch(&err, '\n');
>         -                               fputs(err.buf, stderr);
>         -                               strbuf_release(&err);
>         +                               error(_("Your local changes to the following files would be overwritten by merge:\n  %s"),
>         +                                     sb.buf);
>                                         strbuf_release(&sb);
>                                         return -1;

Yes, this is the same change suggested by Junio for patch 1 which I
should have also applied here.  Thanks for catching it!

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v4 3/7] merge: do not abort early if one strategy fails to handle the merge
  2022-07-22 10:47         ` Ævar Arnfjörð Bjarmason
@ 2022-07-23  0:36           ` Elijah Newren
  0 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-23  0:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Junio C Hamano

On Fri, Jul 22, 2022 at 3:49 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Fri, Jul 22 2022, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > builtin/merge is set up to allow multiple strategies to be specified,
> > and it will find the "best" result and use it.  This is defeated if
> > some of the merge strategies abort early when they cannot handle the
> > merge.  Fix the logic that calls recursive and ort to not do such an
> > early abort, but instead return "2" or "unhandled" so that the next
> > strategy can try to handle the merge.
> >
> > Coming up with a testcase for this is somewhat difficult, since
> > recursive and ort both handle nearly any two-headed merge (there is
> > a separate code path that checks for non-two-headed merges and
> > already returns "2" for them).  So use a somewhat synthetic testcase
> > of having the index not match HEAD before the merge starts, since all
> > merge strategies will abort for that.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  builtin/merge.c                          |  6 ++++--
> >  t/t6402-merge-rename.sh                  |  2 +-
> >  t/t6424-merge-unrelated-index-changes.sh | 16 ++++++++++++++++
> >  t/t6439-merge-co-error-msgs.sh           |  1 +
> >  4 files changed, 22 insertions(+), 3 deletions(-)
> >
> > diff --git a/builtin/merge.c b/builtin/merge.c
> > index 13884b8e836..dec7375bf2a 100644
> > --- a/builtin/merge.c
> > +++ b/builtin/merge.c
> > @@ -754,8 +754,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
> >               else
> >                       clean = merge_recursive(&o, head, remoteheads->item,
> >                                               reversed, &result);
> > -             if (clean < 0)
> > -                     exit(128);
> > +             if (clean < 0) {
> > +                     rollback_lock_file(&lock);
> > +                     return 2;
> > +             }
> >               if (write_locked_index(&the_index, &lock,
> >                                      COMMIT_LOCK | SKIP_IF_UNCHANGED))
> >                       die(_("unable to write %s"), get_index_file());
> > diff --git a/t/t6402-merge-rename.sh b/t/t6402-merge-rename.sh
> > index 3a32b1a45cf..772238e582c 100755
> > --- a/t/t6402-merge-rename.sh
> > +++ b/t/t6402-merge-rename.sh
> > @@ -210,7 +210,7 @@ test_expect_success 'updated working tree file should prevent the merge' '
> >       echo >>M one line addition &&
> >       cat M >M.saved &&
> >       git update-index M &&
> > -     test_expect_code 128 git pull --no-rebase . yellow &&
> > +     test_expect_code 2 git pull --no-rebase . yellow &&
> >       test_cmp M M.saved &&
> >       rm -f M.saved
> >  '
> > diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
> > index f35d3182b86..8b749e19083 100755
> > --- a/t/t6424-merge-unrelated-index-changes.sh
> > +++ b/t/t6424-merge-unrelated-index-changes.sh
> > @@ -268,4 +268,20 @@ test_expect_success 'subtree' '
> >       test_path_is_missing .git/MERGE_HEAD
> >  '
> >
> > +test_expect_success 'resolve && recursive && ort' '
> > +     git reset --hard &&
> > +     git checkout B^0 &&
> > +
> > +     test_seq 0 10 >a &&
> > +     git add a &&
> > +
> > +     sane_unset GIT_TEST_MERGE_ALGORITHM &&
> > +     test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
> > +
> > +     grep "Trying merge strategy resolve..." output &&
> > +     grep "Trying merge strategy recursive..." output &&
> > +     grep "Trying merge strategy ort..." output &&
> > +     grep "No merge strategy handled the merge." output
> > +'

Oops, 'resolve' should really be at the end of the list rather than at
the beginning.  And the test description should be better.

> Ah, re my feedback on 2/7 I hadn't read ahead. This is the test I
> mentioned as failing with the code added in 2/7 if it's tweaked to be
> s/exit 2/exit 0/.
>
> So it's a bit odd to have code added in 2/7 that's tested in 3/7. I
> think this would be much easier to understand if these tests came before
> all these code changes, so then as the changes are made we can see how
> the behavior changes.

This testcase belongs in this patch.  The use of "resolve" here was
totally incidental to the testcase in question; I could have used
"octopus" or "ours" or created a new strategy and used it.

(Actually, using 'ours' here runs into the problem we fix in the final
patch.  So maybe just like 'resolve', using 'ours' might be confusing
to readers of the series as they think issues from other patches are
involved.)

> But short of that at least having the relevant part of this for 2/7 in
> that commit would be better, i.e. the thing that tests that new
> "diff-index" check in some way...

I'll switch this test to using 'octopus' instead of 'resolve' just so
it doesn't get confused in this way.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH v5 0/8] Fix merge restore state
  2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
                         ` (6 preceding siblings ...)
  2022-07-22  5:15       ` [PATCH v4 7/7] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
@ 2022-07-23  1:53       ` Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 1/8] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
                           ` (9 more replies)
  7 siblings, 10 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-23  1:53 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren

This started as a simple series to fix restore_state() in builtin/merge.c,
fixing an issue reported by ZheNing Hu[1]. It now fixes several bugs and has
grown so much it's hard to call it simple. Anyway...

Changes since v4:

 * Made use of the error() function in another place to simplify code
   (should have caught this in v3)
 * Split the fixes for 'resolve' and the trivial merge into separate
   patches, and made sure neither masks the other, so that both codepaths
   are exercised in the testsuite
 * Improved the test descriptions
 * Used strvec to simplify some code

[1]
https://lore.kernel.org/git/CAOLTT8R7QmpvaFPTRs3xTpxr7eiuxF-ZWtvUUSC0-JOo9Y+SqA@mail.gmail.com/

Elijah Newren (8):
  merge-ort-wrappers: make printed message match the one from recursive
  merge-resolve: abort if index does not match HEAD
  merge: abort if index does not match HEAD for trivial merges
  merge: do not abort early if one strategy fails to handle the merge
  merge: fix save_state() to work when there are stat-dirty files
  merge: make restore_state() restore staged state too
  merge: ensure we can actually restore pre-merge state
  merge: do not exit restore_state() prematurely

 builtin/merge.c                          | 57 ++++++++++++++++-----
 git-merge-resolve.sh                     | 10 ++++
 merge-ort-wrappers.c                     |  4 +-
 t/t6402-merge-rename.sh                  |  2 +-
 t/t6424-merge-unrelated-index-changes.sh | 65 ++++++++++++++++++++++++
 t/t6439-merge-co-error-msgs.sh           |  1 +
 t/t7607-merge-state.sh                   | 32 ++++++++++++
 7 files changed, 154 insertions(+), 17 deletions(-)
 create mode 100755 t/t7607-merge-state.sh


base-commit: e72d93e88cb20b06e88e6e7d81bd1dc4effe453f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1231%2Fnewren%2Ffix-merge-restore-state-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1231/newren/fix-merge-restore-state-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1231

Range-diff vs v4:

 1:  bd36d16c8d9 = 1:  bd36d16c8d9 merge-ort-wrappers: make printed message match the one from recursive
 2:  b79f44e54b9 ! 2:  b656756fd37 merge-resolve: abort if index does not match HEAD
     @@ Commit message
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     - ## builtin/merge.c ##
     -@@ builtin/merge.c: int cmd_merge(int argc, const char **argv, const char *prefix)
     - 		 */
     - 		refresh_cache(REFRESH_QUIET);
     - 		if (allow_trivial && fast_forward != FF_ONLY) {
     -+			/*
     -+			 * Must first ensure that index matches HEAD before
     -+			 * attempting a trivial merge.
     -+			 */
     -+			struct tree *head_tree = get_commit_tree(head_commit);
     -+			struct strbuf sb = STRBUF_INIT;
     -+
     -+			if (repo_index_has_changes(the_repository, head_tree,
     -+						   &sb)) {
     -+				struct strbuf err = STRBUF_INIT;
     -+				strbuf_addstr(&err, "error: ");
     -+				strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
     -+					    sb.buf);
     -+				strbuf_addch(&err, '\n');
     -+				fputs(err.buf, stderr);
     -+				strbuf_release(&err);
     -+				strbuf_release(&sb);
     -+				return -1;
     -+			}
     -+
     - 			/* See if it is really trivial. */
     - 			git_committer_info(IDENT_STRICT);
     - 			printf(_("Trying really trivial in-index merge...\n"));
     -
       ## git-merge-resolve.sh ##
      @@
       #
     @@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'resolve, non-triv
       	test_path_is_missing .git/MERGE_HEAD
       '
       
     -+test_expect_success 'resolve, trivial, related file removed' '
     -+	git reset --hard &&
     -+	git checkout B^0 &&
     -+
     -+	git rm a &&
     -+	test_path_is_missing a &&
     -+
     -+	test_must_fail git merge -s resolve C^0 &&
     -+
     -+	test_path_is_missing a &&
     -+	test_path_is_missing .git/MERGE_HEAD
     -+'
     -+
      +test_expect_success 'resolve, non-trivial, related file removed' '
      +	git reset --hard &&
      +	git checkout B^0 &&
 -:  ----------- > 3:  3adfd921995 merge: abort if index does not match HEAD for trivial merges
 3:  02930448ea1 ! 4:  c5755271cf1 merge: do not abort early if one strategy fails to handle the merge
     @@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'subtree' '
       	test_path_is_missing .git/MERGE_HEAD
       '
       
     -+test_expect_success 'resolve && recursive && ort' '
     ++test_expect_success 'with multiple strategies, recursive or ort failure do not early abort' '
      +	git reset --hard &&
      +	git checkout B^0 &&
      +
     @@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'subtree' '
      +	git add a &&
      +
      +	sane_unset GIT_TEST_MERGE_ALGORITHM &&
     -+	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
     ++	test_must_fail git merge -s recursive -s ort -s octopus C^0 >output 2>&1 &&
      +
     -+	grep "Trying merge strategy resolve..." output &&
      +	grep "Trying merge strategy recursive..." output &&
      +	grep "Trying merge strategy ort..." output &&
     ++	grep "Trying merge strategy octopus..." output &&
      +	grep "No merge strategy handled the merge." output
      +'
      +
 4:  daf8d224160 ! 5:  e7c6de9e0c1 merge: fix save_state() to work when there are stat-dirty files
     @@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'subtree' '
      +	git merge -s resolve -s recursive D^0
      +'
      +
     - test_expect_success 'resolve && recursive && ort' '
     + test_expect_success 'with multiple strategies, recursive or ort failure do not early abort' '
       	git reset --hard &&
       	git checkout B^0 &&
 5:  f401bd5ad0d ! 6:  d39d6472455 merge: make restore_state() restore staged state too
     @@ Commit message
          changes.  Fix this by adding the "--index" option to "git stash apply".
          While at it, also squelch the stash apply output; we already report
          "Rewinding the tree to pristine..." and don't need a detailed `git
     -    status` report afterwards.
     +    status` report afterwards.  Also while at it, switch to using strvec
     +    so folks don't have to count the arguments to ensure we avoided an
     +    off-by-one error, and so it's easier to add additional arguments to
     +    the command.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ builtin/merge.c: static void reset_hard(const struct object_id *oid, int verbose
       			  const struct object_id *stash)
       {
      -	const char *args[] = { "stash", "apply", NULL, NULL };
     -+	const char *args[] = { "stash", "apply", "--index", "--quiet",
     -+			       NULL, NULL };
     ++	struct strvec args = STRVEC_INIT;
       
       	if (is_null_oid(stash))
       		return;
     @@ builtin/merge.c: static void reset_hard(const struct object_id *oid, int verbose
       	reset_hard(head, 1);
       
      -	args[2] = oid_to_hex(stash);
     -+	args[4] = oid_to_hex(stash);
     ++	strvec_pushl(&args, "stash", "apply", "--index", "--quiet", NULL);
     ++	strvec_push(&args, oid_to_hex(stash));
       
       	/*
       	 * It is OK to ignore error here, for example when there was
     + 	 * nothing to restore.
     + 	 */
     +-	run_command_v_opt(args, RUN_GIT_CMD);
     ++	run_command_v_opt(args.v, RUN_GIT_CMD);
     ++	strvec_clear(&args);
     + 
     + 	refresh_cache(REFRESH_QUIET);
     + }
      
       ## t/t6424-merge-unrelated-index-changes.sh ##
     -@@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'resolve && recursive && ort' '
     +@@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'with multiple strategies, recursive or ort failure do not e
       
       	test_seq 0 10 >a &&
       	git add a &&
      +	git rev-parse :a >expect &&
       
       	sane_unset GIT_TEST_MERGE_ALGORITHM &&
     - 	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
     -@@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'resolve && recursive && ort' '
     - 	grep "Trying merge strategy resolve..." output &&
     + 	test_must_fail git merge -s recursive -s ort -s octopus C^0 >output 2>&1 &&
     +@@ t/t6424-merge-unrelated-index-changes.sh: test_expect_success 'with multiple strategies, recursive or ort failure do not e
       	grep "Trying merge strategy recursive..." output &&
       	grep "Trying merge strategy ort..." output &&
     + 	grep "Trying merge strategy octopus..." output &&
      -	grep "No merge strategy handled the merge." output
      +	grep "No merge strategy handled the merge." output &&
      +
 6:  ad5354c219c = 7:  7f5c6884d68 merge: ensure we can actually restore pre-merge state
 7:  6212d572604 ! 8:  954dec526a2 merge: do not exit restore_state() prematurely
     @@ Commit message
      
       ## builtin/merge.c ##
      @@ builtin/merge.c: static void restore_state(const struct object_id *head,
     - 	const char *args[] = { "stash", "apply", "--index", "--quiet",
     - 			       NULL, NULL };
     + {
     + 	struct strvec args = STRVEC_INIT;
       
      -	if (is_null_oid(stash))
      -		return;
     @@ builtin/merge.c: static void restore_state(const struct object_id *head,
      +	if (is_null_oid(stash))
      +		goto refresh_cache;
      +
     - 	args[4] = oid_to_hex(stash);
     + 	strvec_pushl(&args, "stash", "apply", "--index", "--quiet", NULL);
     + 	strvec_push(&args, oid_to_hex(stash));
       
     - 	/*
      @@ builtin/merge.c: static void restore_state(const struct object_id *head,
     - 	 */
     - 	run_command_v_opt(args, RUN_GIT_CMD);
     + 	run_command_v_opt(args.v, RUN_GIT_CMD);
     + 	strvec_clear(&args);
       
      -	refresh_cache(REFRESH_QUIET);
      +refresh_cache:
     @@ t/t7607-merge-state.sh (new)
      +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      +. ./test-lib.sh
      +
     -+test_expect_success 'set up custom strategy' '
     ++test_expect_success 'Ensure we restore original state if no merge strategy handles it' '
      +	test_commit --no-tag "Initial" base base &&
      +
      +	for b in branch1 branch2 branch3

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 87+ messages in thread

* [PATCH v5 1/8] merge-ort-wrappers: make printed message match the one from recursive
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
@ 2022-07-23  1:53         ` Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 2/8] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
                           ` (8 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-23  1:53 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

When the index does not match HEAD, the merge strategies are responsible
for detecting that condition and aborting.  The merge-ort-wrappers had code to
implement this and meant to copy the error message from merge-recursive
but deviated in two ways, both due to the message in merge-recursive
being processed by another function that made additional changes:
  * It added an implicit "error: " prefix
  * It added an implicit trailing newline
We can get these things by making use of the error() function.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort-wrappers.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/merge-ort-wrappers.c b/merge-ort-wrappers.c
index ad041061695..748924a69ba 100644
--- a/merge-ort-wrappers.c
+++ b/merge-ort-wrappers.c
@@ -10,8 +10,8 @@ static int unclean(struct merge_options *opt, struct tree *head)
 	struct strbuf sb = STRBUF_INIT;
 
 	if (head && repo_index_has_changes(opt->repo, head, &sb)) {
-		fprintf(stderr, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
-		    sb.buf);
+		error(_("Your local changes to the following files would be overwritten by merge:\n  %s"),
+		      sb.buf);
 		strbuf_release(&sb);
 		return -1;
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH v5 2/8] merge-resolve: abort if index does not match HEAD
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 1/8] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
@ 2022-07-23  1:53         ` Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 3/8] merge: abort if index does not match HEAD for trivial merges Elijah Newren via GitGitGadget
                           ` (7 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-23  1:53 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

As noted in commit 9822175d2b ("Ensure index matches head before
invoking merge machinery, round N", 2019-08-17), we have had a very
long history of problems with failing to enforce the requirement that
index matches HEAD when starting a merge.  One of the commits
referenced in the long tale of issues arising from lax enforcement of
this requirement was commit 55f39cf755 ("merge: fix misleading
pre-merge check documentation", 2018-06-30), which tried to document
the requirement and noted there were some exceptions.  As mentioned in
that commit message, the `resolve` strategy was the one strategy that
did not have an explicit index matching HEAD check, and the reason it
didn't was that I wasn't able to discover any cases where the
implementation would fail to catch the problem and abort, and didn't
want to introduce unnecessary performance overhead of adding another
check.

Well, today I discovered a testcase where the implementation does not
catch the problem and so an explicit check is needed.  Add a testcase
that previously would have failed, and update git-merge-resolve.sh to
have an explicit check.  Note that the code is copied from 3ec62ad9ff
("merge-octopus: abort if index does not match HEAD", 2016-04-09), so
that we reuse the same message and avoid making translators need to
translate some new message.
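
For readers who want to see the check in isolation, here is a minimal
sketch that runs the same diff-index test in a scratch repository.  The
temporary directory, the file name "a", the identity passed via -c, and
the helper name index_matches_head are all made up for illustration;
only the diff-index invocation is taken from the patch.

```sh
#!/bin/sh
# Sketch: exercise the "index does not match HEAD" check in a scratch repo.
# The path, file name, identity, and helper name are illustrative only.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
echo one >a
git add a
git -c user.name=Example -c user.email=example@example.com commit -q -m initial

index_matches_head () {
	# Same test the patch adds to git-merge-resolve.sh: exit status 0
	# means the index is identical to HEAD.
	git diff-index --quiet --cached HEAD --
}

index_matches_head && clean_before=yes || clean_before=no

# Stage a deletion, as the new test case does.
git rm -q a
index_matches_head && clean_after=yes || clean_after=no

echo "index matches HEAD before: $clean_before, after: $clean_after"
```

Because --cached compares the index against the HEAD tree and never
looks at the working tree, the check only reacts to staged changes.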

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 git-merge-resolve.sh                     | 10 ++++++++++
 t/t6424-merge-unrelated-index-changes.sh | 13 +++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
index 343fe7bccd0..77e93121bf8 100755
--- a/git-merge-resolve.sh
+++ b/git-merge-resolve.sh
@@ -5,6 +5,16 @@
 #
 # Resolve two trees, using enhanced multi-base read-tree.
 
+. git-sh-setup
+
+# Abort if index does not match HEAD
+if ! git diff-index --quiet --cached HEAD --
+then
+    gettextln "Error: Your local changes to the following files would be overwritten by merge"
+    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
+    exit 2
+fi
+
 # The first parameters up to -- are merge bases; the rest are heads.
 bases= head= remotes= sep_seen=
 for arg
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index b6e424a427b..eabe6bda832 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -114,6 +114,19 @@ test_expect_success 'resolve, non-trivial' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'resolve, non-trivial, related file removed' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	git rm a &&
+	test_path_is_missing a &&
+
+	test_must_fail git merge -s resolve D^0 &&
+
+	test_path_is_missing a &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
 test_expect_success 'recursive' '
 	git reset --hard &&
 	git checkout B^0 &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH v5 3/8] merge: abort if index does not match HEAD for trivial merges
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 1/8] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 2/8] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
@ 2022-07-23  1:53         ` Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 4/8] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
                           ` (6 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-23  1:53 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

As noted in the last commit and the links therein (especially commit
9822175d2b ("Ensure index matches head before invoking merge machinery,
round N", 2019-08-17)), we have had a very long history of problems with
failing to enforce the requirement that index matches HEAD when starting
a merge.

The "trivial merge" logic in builtin/merge.c is yet another such case
we previously missed.  Add a check for it to ensure it aborts if the
index does not match HEAD, and add a testcase where this fix is needed.

Note that the fix here would also incidentally be an alternative fix
for the testcase added in the last patch, but the fix in the last patch
is still needed when multiple merge strategies are in use, so tweak the
testcase from the previous commit so that it continues to exercise the
codepath added in the last commit.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          | 15 +++++++++++++++
 t/t6424-merge-unrelated-index-changes.sh | 22 +++++++++++++++++++++-
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 23170f2d2a6..b43876f68e4 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1599,6 +1599,21 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 		 */
 		refresh_cache(REFRESH_QUIET);
 		if (allow_trivial && fast_forward != FF_ONLY) {
+			/*
+			 * Must first ensure that index matches HEAD before
+			 * attempting a trivial merge.
+			 */
+			struct tree *head_tree = get_commit_tree(head_commit);
+			struct strbuf sb = STRBUF_INIT;
+
+			if (repo_index_has_changes(the_repository, head_tree,
+						   &sb)) {
+				error(_("Your local changes to the following files would be overwritten by merge:\n  %s"),
+				      sb.buf);
+				strbuf_release(&sb);
+				return 2;
+			}
+
 			/* See if it is really trivial. */
 			git_committer_info(IDENT_STRICT);
 			printf(_("Trying really trivial in-index merge...\n"));
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index eabe6bda832..187c761ad84 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -114,6 +114,19 @@ test_expect_success 'resolve, non-trivial' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'resolve, trivial, related file removed' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	git rm a &&
+	test_path_is_missing a &&
+
+	test_must_fail git merge -s resolve C^0 &&
+
+	test_path_is_missing a &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
 test_expect_success 'resolve, non-trivial, related file removed' '
 	git reset --hard &&
 	git checkout B^0 &&
@@ -121,7 +134,14 @@ test_expect_success 'resolve, non-trivial, related file removed' '
 	git rm a &&
 	test_path_is_missing a &&
 
-	test_must_fail git merge -s resolve D^0 &&
+	# We also ask for recursive in order to turn off the "allow_trivial"
+	# setting in builtin/merge.c, and ensure that resolve really does
+	# correctly fail the merge (I guess this also tests that recursive
+	# correctly fails the merge, but the main thing we are attempting
+	# to test here is resolve and are just using the side effect of
+	# adding recursive to ensure that resolve is actually tested rather
+	# than the trivial merge codepath)
+	test_must_fail git merge -s resolve -s recursive D^0 &&
 
 	test_path_is_missing a &&
 	test_path_is_missing .git/MERGE_HEAD
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH v5 4/8] merge: do not abort early if one strategy fails to handle the merge
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
                           ` (2 preceding siblings ...)
  2022-07-23  1:53         ` [PATCH v5 3/8] merge: abort if index does not match HEAD for trivial merges Elijah Newren via GitGitGadget
@ 2022-07-23  1:53         ` Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 5/8] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
                           ` (5 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-23  1:53 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

builtin/merge is set up to allow multiple strategies to be specified,
and it will find the "best" result and use it.  This is defeated if
some of the merge strategies abort early when they cannot handle the
merge.  Fix the logic that calls recursive and ort to not do such an
early abort, but instead return "2" or "unhandled" so that the next
strategy can try to handle the merge.

Coming up with a testcase for this is somewhat difficult, since
recursive and ort both handle nearly any two-headed merge (there is
a separate code path that checks for non-two-headed merges and
already returns "2" for them).  So use a somewhat synthetic testcase
of having the index not match HEAD before the merge starts, since all
merge strategies will abort for that.
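
The intended control flow can be modeled roughly as below.  This is a
toy sketch, not the code in builtin/merge.c: the strategy_* functions
are hypothetical stand-ins that always report "unhandled", and the
sketch assumes the convention that status 0 is a clean merge, 1 is a
merge with conflicts, and 2 means the strategy cannot handle the merge.
It also stops at the first strategy that handles the merge, whereas the
real code goes on to compare conflicted results to find the "best" one.

```sh
#!/bin/sh
# Toy model of trying several merge strategies in order.
# The strategy_* functions are hypothetical stand-ins, not Git commands.
strategy_recursive () { return 2; }
strategy_ort () { return 2; }
strategy_octopus () { return 2; }

try_strategies () {
	for s in "$@"
	do
		echo "Trying merge strategy $s..."
		"strategy_$s" && rc=0 || rc=$?
		if test "$rc" -ne 2
		then
			# Handled (cleanly or with conflicts): stop here.
			return "$rc"
		fi
		# Unhandled: move on to the next strategy instead of exiting.
	done
	echo "No merge strategy handled the merge."
	return 2
}

output=$(try_strategies recursive ort octopus) && status=0 || status=$?
printf '%s\n' "$output"
echo "exit status: $status"
```

Had the first stand-in called "exit 128" instead of returning 2, the
later strategies would never have been tried, which is the behavior
this patch removes.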

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          |  6 ++++--
 t/t6402-merge-rename.sh                  |  2 +-
 t/t6424-merge-unrelated-index-changes.sh | 16 ++++++++++++++++
 t/t6439-merge-co-error-msgs.sh           |  1 +
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index b43876f68e4..c120ad619c4 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -754,8 +754,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 		else
 			clean = merge_recursive(&o, head, remoteheads->item,
 						reversed, &result);
-		if (clean < 0)
-			exit(128);
+		if (clean < 0) {
+			rollback_lock_file(&lock);
+			return 2;
+		}
 		if (write_locked_index(&the_index, &lock,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
diff --git a/t/t6402-merge-rename.sh b/t/t6402-merge-rename.sh
index 3a32b1a45cf..772238e582c 100755
--- a/t/t6402-merge-rename.sh
+++ b/t/t6402-merge-rename.sh
@@ -210,7 +210,7 @@ test_expect_success 'updated working tree file should prevent the merge' '
 	echo >>M one line addition &&
 	cat M >M.saved &&
 	git update-index M &&
-	test_expect_code 128 git pull --no-rebase . yellow &&
+	test_expect_code 2 git pull --no-rebase . yellow &&
 	test_cmp M M.saved &&
 	rm -f M.saved
 '
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index 187c761ad84..615061c7af4 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -275,4 +275,20 @@ test_expect_success 'subtree' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'with multiple strategies, recursive or ort failure do not early abort' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	test_seq 0 10 >a &&
+	git add a &&
+
+	sane_unset GIT_TEST_MERGE_ALGORITHM &&
+	test_must_fail git merge -s recursive -s ort -s octopus C^0 >output 2>&1 &&
+
+	grep "Trying merge strategy recursive..." output &&
+	grep "Trying merge strategy ort..." output &&
+	grep "Trying merge strategy octopus..." output &&
+	grep "No merge strategy handled the merge." output
+'
+
 test_done
diff --git a/t/t6439-merge-co-error-msgs.sh b/t/t6439-merge-co-error-msgs.sh
index 5bfb027099a..52cf0c87690 100755
--- a/t/t6439-merge-co-error-msgs.sh
+++ b/t/t6439-merge-co-error-msgs.sh
@@ -47,6 +47,7 @@ test_expect_success 'untracked files overwritten by merge (fast and non-fast for
 		export GIT_MERGE_VERBOSITY &&
 		test_must_fail git merge branch 2>out2
 	) &&
+	echo "Merge with strategy ${GIT_TEST_MERGE_ALGORITHM:-ort} failed." >>expect &&
 	test_cmp out2 expect &&
 	git reset --hard HEAD^
 '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH v5 5/8] merge: fix save_state() to work when there are stat-dirty files
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
                           ` (3 preceding siblings ...)
  2022-07-23  1:53         ` [PATCH v5 4/8] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
@ 2022-07-23  1:53         ` Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 6/8] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
                           ` (4 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-23  1:53 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

When there are stat-dirty files, but no files are modified,
`git stash create` exits with unsuccessful status.  This causes merge
to fail.  Copy some code from sequencer.c's create_autostash to refresh
the index first to avoid this problem.
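
As a rough command-line analogue of that refresh (a sketch only; the
patch does this in C via refresh_cache()), the following makes a file
stat-dirty and then refreshes the index with "git update-index
--refresh".  The scratch repository, file name, identity, and the
worktree_state helper are assumptions made for the example, and it does
not try to reproduce the `git stash create` failure itself.

```sh
#!/bin/sh
# Sketch: a stat-dirty file looks modified until the index is refreshed.
# The path, file name, identity, and helper name are illustrative only.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
echo one >a
git add a
git -c user.name=Example -c user.email=example@example.com commit -q -m initial

worktree_state () {
	# diff-files trusts the cached stat data and does not re-read contents.
	if git diff-files --quiet
	then
		echo clean
	else
		echo dirty
	fi
}

# Change only the timestamp; the contents still match the index.
touch -t 200001010000 a
before=$(worktree_state)

# Re-check contents and update the cached stat data.
git update-index -q --refresh
after=$(worktree_state)

echo "before refresh: $before, after refresh: $after"
```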

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          |  8 ++++++++
 t/t6424-merge-unrelated-index-changes.sh | 11 +++++++++++
 2 files changed, 19 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index c120ad619c4..780b4b9100a 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -313,8 +313,16 @@ static int save_state(struct object_id *stash)
 	int len;
 	struct child_process cp = CHILD_PROCESS_INIT;
 	struct strbuf buffer = STRBUF_INIT;
+	struct lock_file lock_file = LOCK_INIT;
+	int fd;
 	int rc = -1;
 
+	fd = repo_hold_locked_index(the_repository, &lock_file, 0);
+	refresh_cache(REFRESH_QUIET);
+	if (0 <= fd)
+		repo_update_index_if_able(the_repository, &lock_file);
+	rollback_lock_file(&lock_file);
+
 	strvec_pushl(&cp.args, "stash", "create", NULL);
 	cp.out = -1;
 	cp.git_cmd = 1;
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index 615061c7af4..2c83210f9fd 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -275,6 +275,17 @@ test_expect_success 'subtree' '
 	test_path_is_missing .git/MERGE_HEAD
 '
 
+test_expect_success 'avoid failure due to stat-dirty files' '
+	git reset --hard &&
+	git checkout B^0 &&
+
+	# Make "a" be stat-dirty
+	test-tool chmtime =+1 a &&
+
+	# stat-dirty file should not prevent stash creation in builtin/merge.c
+	git merge -s resolve -s recursive D^0
+'
+
 test_expect_success 'with multiple strategies, recursive or ort failure do not early abort' '
 	git reset --hard &&
 	git checkout B^0 &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH v5 6/8] merge: make restore_state() restore staged state too
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
                           ` (4 preceding siblings ...)
  2022-07-23  1:53         ` [PATCH v5 5/8] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
@ 2022-07-23  1:53         ` Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 7/8] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
                           ` (3 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-23  1:53 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

There are multiple issues at play here:

  1) If `git merge` is invoked with staged changes, it should abort
     without doing any merging, and the user's working tree and index
     should be the same as before merge was invoked.
  2) Merge strategies are responsible for enforcing the index == HEAD
     requirement. (See 9822175d2b ("Ensure index matches head before
     invoking merge machinery, round N", 2019-08-17) for some history
     around this.)
  3) Merge strategies can bail saying they are not an appropriate
     handler for the merge in question (possibly allowing other
     strategies to be used instead).
  4) Merge strategies can make changes to the index and working tree,
     and have no expectation to clean up after themselves, *even* if
     they bail out and say they are not an appropriate handler for
     the merge in question.  (The `octopus` merge strategy does this,
     for example.)
  5) Because of (3) and (4), builtin/merge.c stashes state before
     trying merge strategies and restores it afterward.

Unfortunately, if users had staged changes before calling `git merge`,
builtin/merge.c could do the following:

   * stash the changes, in order to clean up after the strategies
   * try all the merge strategies in turn, each of which reports that it
     cannot function due to the index not matching HEAD
   * restore the changes via "git stash apply"

But that last step would have the net effect of unstaging the user's
changes.  Fix this by adding the "--index" option to "git stash apply".
While at it, also squelch the stash apply output; we already report
"Rewinding the tree to pristine..." and don't need a detailed `git
status` report afterwards.  Also while at it, switch to using strvec
so folks don't have to count the arguments to ensure we avoided an
off-by-one error, and so it's easier to add additional arguments to
the command.
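
The difference that "--index" makes can be seen with a small sketch in
a scratch repository.  The temporary directory, the file name "a", the
identity passed via -c, and the staged_files helper are assumptions for
illustration; the stash commands themselves are the ones builtin/merge.c
runs ("stash create" to save, "stash apply" to restore).

```sh
#!/bin/sh
# Sketch: compare "git stash apply" with and without --index.
# The path, file name, identity, and helper name are illustrative only.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
echo one >a
git add a
git -c user.name=Example -c user.email=example@example.com commit -q -m initial

# The user's staged change, saved the way save_state() does it.
echo two >a
git add a
stash=$(git -c user.name=Example -c user.email=example@example.com stash create)

staged_files () {
	git diff --cached --name-only
}

# Old behavior: plain apply brings the content back but leaves it unstaged.
git reset -q --hard
git stash apply --quiet "$stash"
without_index=$(staged_files)

# New behavior: --index also restores what was staged.
git reset -q --hard
git stash apply --index --quiet "$stash"
with_index=$(staged_files)

echo "staged without --index: '$without_index'"
echo "staged with --index:    '$with_index'"
```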

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c                          | 8 +++++---
 t/t6424-merge-unrelated-index-changes.sh | 7 ++++++-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 780b4b9100a..e0a3299e92e 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -383,20 +383,22 @@ static void reset_hard(const struct object_id *oid, int verbose)
 static void restore_state(const struct object_id *head,
 			  const struct object_id *stash)
 {
-	const char *args[] = { "stash", "apply", NULL, NULL };
+	struct strvec args = STRVEC_INIT;
 
 	if (is_null_oid(stash))
 		return;
 
 	reset_hard(head, 1);
 
-	args[2] = oid_to_hex(stash);
+	strvec_pushl(&args, "stash", "apply", "--index", "--quiet", NULL);
+	strvec_push(&args, oid_to_hex(stash));
 
 	/*
 	 * It is OK to ignore error here, for example when there was
 	 * nothing to restore.
 	 */
-	run_command_v_opt(args, RUN_GIT_CMD);
+	run_command_v_opt(args.v, RUN_GIT_CMD);
+	strvec_clear(&args);
 
 	refresh_cache(REFRESH_QUIET);
 }
diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
index 2c83210f9fd..a61f20c22fe 100755
--- a/t/t6424-merge-unrelated-index-changes.sh
+++ b/t/t6424-merge-unrelated-index-changes.sh
@@ -292,6 +292,7 @@ test_expect_success 'with multiple strategies, recursive or ort failure do not e
 
 	test_seq 0 10 >a &&
 	git add a &&
+	git rev-parse :a >expect &&
 
 	sane_unset GIT_TEST_MERGE_ALGORITHM &&
 	test_must_fail git merge -s recursive -s ort -s octopus C^0 >output 2>&1 &&
@@ -299,7 +300,11 @@ test_expect_success 'with multiple strategies, recursive or ort failure do not e
 	grep "Trying merge strategy recursive..." output &&
 	grep "Trying merge strategy ort..." output &&
 	grep "Trying merge strategy octopus..." output &&
-	grep "No merge strategy handled the merge." output
+	grep "No merge strategy handled the merge." output &&
+
+	# Changes to "a" should remain staged
+	git rev-parse :a >actual &&
+	test_cmp expect actual
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH v5 7/8] merge: ensure we can actually restore pre-merge state
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
                           ` (5 preceding siblings ...)
  2022-07-23  1:53         ` [PATCH v5 6/8] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
@ 2022-07-23  1:53         ` Elijah Newren via GitGitGadget
  2022-07-23  1:53         ` [PATCH v5 8/8] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
                           ` (2 subsequent siblings)
  9 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-23  1:53 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

Merge strategies can:
  * succeed with a clean merge
  * succeed with a conflicted merge
  * fail to handle the given type of merge

If one is thinking in terms of automatic mergeability, they would use
the word "fail" instead of "succeed" for the second bullet, but I am
focusing here on the ability of the merge strategy to handle the given
inputs, not on whether the given inputs are mergeable.  The third
category is about the merge strategy failing to know how to handle the
given data; examples include:

  * Passing more than 2 branches to 'recursive' or 'ort'
  * Passing 2 or fewer branches to 'octopus'
  * Trying to do more complicated merges with 'resolve' (I believe
    directory/file conflicts will cause it to bail.)
  * Octopus running into a merge conflict for any branch OTHER than
    the final one (see the "exit 2" codepath of commit 98efc8f3d8
    ("octopus: allow manual resolve on the last round.", 2006-01-13))

That final one is particularly interesting, because it shows that the
merge strategy can muck with the index and working tree, and THEN bail
and say "sorry, this strategy cannot handle this type of merge; use
something else".

Further, we do not currently expect the individual strategies to clean
up after themselves, but instead expect builtin/merge.c to do so.  For
it to be able to, it needs to save the state before trying the merge
strategy so it can have something to restore to.  Therefore, remove the
shortcut bypassing the save_state() call.

There is another bug on the restore_state() side of things, so no
testcase will be added until the next commit when we have addressed that
issue as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index e0a3299e92e..3c4f415d87e 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1682,12 +1682,12 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	 * tree in the index -- this means that the index must be in
 	 * sync with the head commit.  The strategies are responsible
 	 * to ensure this.
+	 *
+	 * Stash away the local changes so that we can try more than one
+	 * and/or recover from merge strategies bailing while leaving the
+	 * index and working tree polluted.
 	 */
-	if (use_strategies_nr == 1 ||
-	    /*
-	     * Stash away the local changes so that we can try more than one.
-	     */
-	    save_state(&stash))
+	if (save_state(&stash))
 		oidclr(&stash);
 
 	for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [PATCH v5 8/8] merge: do not exit restore_state() prematurely
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
                           ` (6 preceding siblings ...)
  2022-07-23  1:53         ` [PATCH v5 7/8] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
@ 2022-07-23  1:53         ` Elijah Newren via GitGitGadget
  2022-07-25 19:03         ` [PATCH v5 0/8] Fix merge restore state Junio C Hamano
  2022-07-26  4:03         ` ZheNing Hu
  9 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-07-23  1:53 UTC (permalink / raw)
  To: git
  Cc: ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

Previously, if:

* The user had no local changes before starting the merge
* A merge strategy made changes to the working tree/index but returned
  with exit status 2

then we'd call restore_state() to clean up the changes and either let
the next merge strategy run (if there is one), or exit telling the user
that no merge strategy could handle the merge.  Unfortunately,
restore_state() did not clean up the changes as expected; that function
was a no-op if the stash was null, and the stash would be null if
there were no local changes before starting the merge.  So, instead of
"Rewinding the tree to pristine..." as the code claimed, restore_state()
would leave garbage around in the index and working tree (possibly
including conflicts) for either the next merge strategy or for the user
after aborting the merge.  And in the case of aborting the merge, the
user would be unable to run "git merge --abort" to get rid of the
unintended leftover conflicts, because the merge control files were not
written as it was presumed that we had restored to a clean state
already.

Fix the main problem by making sure that restore_state() only skips the
stash application if the stash is null rather than skipping the whole
function.
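
A shell approximation of the corrected flow is sketched below.  It is
not the C implementation: the restore_state shell function, the scratch
repository, the file name, and the identity are made up for the
example, and an empty string stands in for the null stash.  The point
it tries to show is that the reset always happens, and only the stash
application is conditional.

```sh
#!/bin/sh
# Sketch: rewind even when there was nothing to stash.
# The path, file name, identity, and function name are illustrative only.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
echo one >a
git add a
git -c user.name=Example -c user.email=example@example.com commit -q -m initial
head=$(git rev-parse HEAD)

# No local changes, so "git stash create" prints nothing and $stash is empty.
stash=$(git -c user.name=Example -c user.email=example@example.com stash create)

restore_state () {
	# Always rewind first ...
	git reset -q --hard "$1"
	# ... and only skip the stash application when there is no stash.
	if test -n "$2"
	then
		git stash apply --index --quiet "$2"
	fi
}

# Simulate a strategy that pollutes the index and working tree, then bails.
echo garbage >a
git add a

restore_state "$head" "$stash"

if git diff --quiet HEAD --
then
	state=pristine
else
	state=polluted
fi
echo "after restore_state: $state"
```

With the old early return, the function would have done nothing here
and the "garbage" change would have been left staged.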

However, there is a secondary problem -- since merge.c forks
subprocesses to do the cleanup, the in-memory index is left out-of-sync.
While there was a refresh_cache(REFRESH_QUIET) call that attempted to
correct that, that function would not handle cases where the previous
merge strategy added conflicted entries.  We need to drop the index and
re-read it to handle such cases.

(Alternatively, we could stop forking subprocesses and instead call some
appropriate function to do the work which would update the in-memory
index automatically.  For now, just do the simple fix.)

Also, add a testcase checking this, one for which the octopus strategy
fails on the first commit it attempts to merge, and thus which it
cannot handle at all and must completely bail on (as per the "exit 2"
code path of commit 98efc8f3d8 ("octopus: allow manual resolve on the
last round.", 2006-01-13)).

Reported-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c        | 10 ++++++----
 t/t7607-merge-state.sh | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100755 t/t7607-merge-state.sh

diff --git a/builtin/merge.c b/builtin/merge.c
index 3c4f415d87e..f7c92c0e64f 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -385,11 +385,11 @@ static void restore_state(const struct object_id *head,
 {
 	struct strvec args = STRVEC_INIT;
 
-	if (is_null_oid(stash))
-		return;
-
 	reset_hard(head, 1);
 
+	if (is_null_oid(stash))
+		goto refresh_cache;
+
 	strvec_pushl(&args, "stash", "apply", "--index", "--quiet", NULL);
 	strvec_push(&args, oid_to_hex(stash));
 
@@ -400,7 +400,9 @@ static void restore_state(const struct object_id *head,
 	run_command_v_opt(args.v, RUN_GIT_CMD);
 	strvec_clear(&args);
 
-	refresh_cache(REFRESH_QUIET);
+refresh_cache:
+	if (discard_cache() < 0 || read_cache() < 0)
+		die(_("could not read index"));
 }
 
 /* This is called when no merge was necessary. */
diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
new file mode 100755
index 00000000000..89a62ac53b3
--- /dev/null
+++ b/t/t7607-merge-state.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+
+test_description="Test that merge state is as expected after failed merge"
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+. ./test-lib.sh
+
+test_expect_success 'Ensure we restore original state if no merge strategy handles it' '
+	test_commit --no-tag "Initial" base base &&
+
+	for b in branch1 branch2 branch3
+	do
+		git checkout -b $b main &&
+		test_commit --no-tag "Change on $b" base $b || return 1
+	done &&
+
+	git checkout branch1 &&
+	# This is a merge that octopus cannot handle.  Note, that it does not
+	# just hit conflicts, it completely fails and says that it cannot
+	# handle this type of merge.
+	test_expect_code 2 git merge branch2 branch3 >output 2>&1 &&
+	grep "fatal: merge program failed" output &&
+	grep "Should not be doing an octopus" output &&
+
+	# Make sure we did not leave stray changes around when no appropriate
+	# merge strategy was found
+	git diff --exit-code --name-status &&
+	test_path_is_missing .git/MERGE_HEAD
+'
+
+test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 87+ messages in thread
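
As an illustration of the control-flow change in the restore_state() patch
above, here is a toy model in plain C. It calls no git internals: the step
names and the helpers (note, restore_state_old, restore_state_new) are
hypothetical stand-ins that only record which steps would run, under the
assumption that the only input that matters is whether the stash is null.

```c
#include <stdbool.h>
#include <string.h>

/* Records the steps taken, e.g. "reset,stash-apply,reread-index". */
struct trace {
	char log[64];
};

/* Append one step name to the trace, comma-separated. */
static void note(struct trace *t, const char *step)
{
	if (t->log[0])
		strncat(t->log, ",", sizeof(t->log) - strlen(t->log) - 1);
	strncat(t->log, step, sizeof(t->log) - strlen(t->log) - 1);
}

/* Pre-patch shape: a null stash skips the whole function. */
static void restore_state_old(struct trace *t, bool stash_is_null)
{
	if (stash_is_null)
		return;
	note(t, "reset");
	note(t, "stash-apply");
	note(t, "refresh");
}

/* Post-patch shape: only the stash application is skipped. */
static void restore_state_new(struct trace *t, bool stash_is_null)
{
	note(t, "reset");
	if (!stash_is_null)
		note(t, "stash-apply");
	note(t, "reread-index");
}
```

With a null stash the old shape records nothing at all, while the new shape
still records the reset and the index re-read, which is the behavior the
commit message asks for.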

* Re: [PATCH v4 5/7] merge: make restore_state() restore staged state too
  2022-07-22 10:53         ` Ævar Arnfjörð Bjarmason
@ 2022-07-23  1:56           ` Elijah Newren
  0 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-23  1:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Junio C Hamano

On Fri, Jul 22, 2022 at 3:55 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Fri, Jul 22 2022, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > There are multiple issues at play here:
> >
> >   1) If `git merge` is invoked with staged changes, it should abort
> >      without doing any merging, and the user's working tree and index
> >      should be the same as before merge was invoked.
> >   2) Merge strategies are responsible for enforcing the index == HEAD
> >      requirement. (See 9822175d2b ("Ensure index matches head before
> >      invoking merge machinery, round N", 2019-08-17) for some history
> >      around this.)
> >   3) Merge strategies can bail saying they are not an appropriate
> >      handler for the merge in question (possibly allowing other
> >      strategies to be used instead).
> >   4) Merge strategies can make changes to the index and working tree,
> >      and have no expectation to clean up after themselves, *even* if
> >      they bail out and say they are not an appropriate handler for
> >      the merge in question.  (The `octopus` merge strategy does this,
> >      for example.)
> >   5) Because of (3) and (4), builtin/merge.c stashes state before
> >      trying merge strategies and restores it afterward.
> >
> > Unfortunately, if users had staged changes before calling `git merge`,
> > builtin/merge.c could do the following:
> >
> >    * stash the changes, in order to clean up after the strategies
> >    * try all the merge strategies in turn, each of which reports they
> >      cannot function due to the index not matching HEAD
> >    * restore the changes via "git stash apply"
> >
> > But that last step would have the net effect of unstaging the user's
> > changes.  Fix this by adding the "--index" option to "git stash apply".
> > While at it, also squelch the stash apply output; we already report
> > "Rewinding the tree to pristine..." and don't need a detailed `git
> > status` report afterwards.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  builtin/merge.c                          | 5 +++--
> >  t/t6424-merge-unrelated-index-changes.sh | 7 ++++++-
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/builtin/merge.c b/builtin/merge.c
> > index 4170c30317e..f807bf335bd 100644
> > --- a/builtin/merge.c
> > +++ b/builtin/merge.c
> > @@ -383,14 +383,15 @@ static void reset_hard(const struct object_id *oid, int verbose)
> >  static void restore_state(const struct object_id *head,
> >                         const struct object_id *stash)
> >  {
> > -     const char *args[] = { "stash", "apply", NULL, NULL };
> > +     const char *args[] = { "stash", "apply", "--index", "--quiet",
> > +                            NULL, NULL };
> >
> >       if (is_null_oid(stash))
> >               return;
> >
> >       reset_hard(head, 1);
> >
> > -     args[2] = oid_to_hex(stash);
> > +     args[4] = oid_to_hex(stash);
> >
> >       /*
> >        * It is OK to ignore error here, for example when there was
>
> Just a nit/side comment: This is one of these older-style arg
> constructions that we've replaced with strvec in most other places.
>
> Let's leave this alone for now (especially in a v4), but FWIW I wouldn't
> mind if these sort of changes were strvec converted while at it:
>
>         diff --git a/builtin/merge.c b/builtin/merge.c
>         index 64def49734a..c3a3a1fde50 100644
>         --- a/builtin/merge.c
>         +++ b/builtin/merge.c
>         @@ -383,21 +383,23 @@ static void reset_hard(const struct object_id *oid, int verbose)
>          static void restore_state(const struct object_id *head,
>                                   const struct object_id *stash)
>          {
>         -       const char *args[] = { "stash", "apply", "--index", "--quiet",
>         -                              NULL, NULL };
>         +       struct strvec args = STRVEC_INIT;
>         +
>         +       strvec_pushl(&args, "stash", "apply", "--index", "--quiet", NULL);
>
>                 reset_hard(head, 1);
>
>                 if (is_null_oid(stash))
>                         goto refresh_cache;
>
>         -       args[4] = oid_to_hex(stash);
>         +       strvec_push(&args, oid_to_hex(stash));
>
>                 /*
>                  * It is OK to ignore error here, for example when there was
>                  * nothing to restore.
>                  */
>         -       run_command_v_opt(args, RUN_GIT_CMD);
>         +       run_command_v_opt(args.v, RUN_GIT_CMD);
>         +       strvec_clear(&args);
>
>          refresh_cache:
>                 if (discard_cache() < 0 || read_cache() < 0)
>
> I.e. it takes about as much mental energy to review that as counting the
> args elements and seeing that 2 to 4 is correct :)

I like this change and included it in my reroll.  Thanks!

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD
  2022-07-23  0:28           ` Elijah Newren
@ 2022-07-23  5:44             ` Ævar Arnfjörð Bjarmason
  2022-07-26  1:58               ` Elijah Newren
  0 siblings, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-23  5:44 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Junio C Hamano


On Fri, Jul 22 2022, Elijah Newren wrote:

> On Fri, Jul 22, 2022 at 3:46 AM Ævar Arnfjörð Bjarmason
> [...]
>> I don't know enough about the context here, but given our *.sh->C
>> migration elsewhere it's a bit unfortunate to see more *.sh code added
>> back.
>
> This seems like a curious objection.  "We are trying to get rid of
> shell scripts, so don't even fix bugs in any of the existing ones." ?
>
>> We have "git merge" driving this, isn't it OK to have it make this
>> check before invoking "resolve" (may be a stupid question).
>
> Ah, I can kind of see where you're coming from now, but that seems to
> me to be bending over backwards in attempting to fix a component
> written in shell without actually modifying the shell.
> builtin/merge.c is some glue code that can call multiple different
> strategies, but isn't the place for the implementation of the
> strategies themselves, and I'd hate to see us put half the
> implementation in one place and half in another.  In addition, besides
> the separation of concerns issue:
>
>    * We document that users can add their own merge strategies (a
> shell or executable named git-merge-$USERNAME and "git merge -s
> $USERNAME" will call them)
>    * git-merge-resolve and git-merge-octopus serve as examples
>    * Our examples should demonstrate correct behavior and perform
> documented, required steps.  This particular check is important:
>
>     /*
>      * At this point, we need a real merge.  No matter what strategy
>      * we use, it would operate on the index, possibly affecting the
>      * working tree, and when resolved cleanly, have the desired
>      * tree in the index -- this means that the index must be in
>      * sync with the head commit.  The strategies are responsible
>      * to ensure this.
>      */
>
> So, even if someone were to reimplement git-merge-resolve.sh in C, and
> start the deprecation process with some merge.useBuiltinResolve config
> setting (similar to rebase.useBuiltin), I'd still want this shell fix
> added to git-merge-resolve.sh in the meantime, both as an important
> bugfix, and so that people looking for merge strategy examples who
> find this script hopefully find a new enough version with this
> important check included.
>
> In general, if merge strategies do not perform this check, we have
> observed that they often will either (a) discard users' staged changes
> (best case) or (b) smash staged changes into the created commit and
> thus create some kind of evil merge (making it look like they created
> a merge normally, and then amended the merge with additional changes).
>
> We're lucky that the way resolve was implemented, other git calls
> would usually incidentally catch such issues for us without an
> explicit check.  We were also lucky that the observed behavior was
> '(a)' rather than '(b)' for resolve.  But the issue should still be
> fixed.

Makes sense I guess, yeah I was wondering if we could just assume that
"git merge" would always invoke those, and therefore could offload more
of the pre-flight checks over there.

>> For this code in particular it:
>>
>>  * Uses spaces, not tabs
>
> Yes, that's fair, but as I mentioned in the commit message, it was
> copied from git-merge-octopus.sh.  So, as you say below, "so did the
> older version".

*nod*

>>  * We lose the diff-index .. --name-only exit code (segfault), but so
>>    did the older version
>
> Um, I don't understand this objection.  I think you are referring to
> the pipe to sed, but if so...who cares?  The exit code would be lost
> anyway because we aren't running under errexit, and the next line of
> code ignores any and all previous exit codes when it runs "exit 2".
> And if you're not referring to the pipe to sed but the fact that it
> unconditionally returns an exit code of 2 on the next line, then yes
> that is the expected return code.  Whatever the diff-index segfault
> returns would be the wrong exit status and could fool the
> builtin/merge.c into doing the wrong thing.  It expects merge
> strategies to return one of three exit codes: 0, 1, or 2:
>
>     /*
>      * The backend exits with 1 when conflicts are
>      * left to be resolved, with 2 when it does not
>      * handle the given merge at all.
>      */
>
> So, ignoring the return code from diff-index is correct behavior here.
>
> Were you thinking this was a test script or something?

We can leave this for now.

But no. Whatever the merge driver is documenting as its normal return
values we really should be ferrying up abort() and segfault, per the
"why do we miss..." in:
https://lore.kernel.org/git/patch-v2-11.14-8cc6ab390db-20220720T211221Z-avarab@gmail.com/

I.e. this is one of the cases in the test suite where we haven't closed
that gap, and could hide segfaults as a "normal" exit 2.

So I think your v5 is fine as-is, but in general I'd be really
interested if you want to double-down on this view for the merge drivers
for some reason, because my current plan for addressing these blindspots
outlined in the above wouldn't work then...
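
A minimal sketch of the distinction at stake, assuming a caller that only
sees the strategy's exit code as an integer. The enum and the function
classify_strategy_exit are hypothetical names for illustration, not git's
API. The values 134 and 139 are what a shell on Linux typically reports for
a child killed by SIGABRT or SIGSEGV (128 plus the signal number).

```c
/* Hypothetical classification of a merge strategy's exit code. */
enum strategy_result {
	STRATEGY_CLEAN,      /* exit 0: merged cleanly */
	STRATEGY_CONFLICTS,  /* exit 1: conflicts left to resolve */
	STRATEGY_UNHANDLED,  /* exit 2: strategy does not handle this merge */
	STRATEGY_UNEXPECTED  /* anything else, e.g. 134 or 139 */
};

enum strategy_result classify_strategy_exit(int code)
{
	switch (code) {
	case 0:
		return STRATEGY_CLEAN;
	case 1:
		return STRATEGY_CONFLICTS;
	case 2:
		return STRATEGY_UNHANDLED;
	default:
		/* Kept separate so a crash is not mistaken for "unhandled". */
		return STRATEGY_UNEXPECTED;
	}
}
```

A script that unconditionally runs "exit 2" after a failed helper collapses
the last case into the third one, which is the gap described above.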

>>  * I wonder if bending over backwards to emit the exact message we
>>    emitted before is worth it
>>
>> If you just make this something like (untested):
>>
>>         {
>>                 gettext "error: " &&
>>                 gettextln "Your local..."
>>         }
>>
>> You could re-use the translation from the *.c one (and the "error: " one
>> we'll get from usage.c).
>>
>> That leaves "\n %s" as the difference, but we could just remove that
>> from the _() and emit it unconditionally, no?
>
> ??
>
> Copying a few lines from git-merge-octopus.sh to get the same fix it
> has is "bending over backwards"?  That's what I call "doing the
> easiest thing possible" (and which _also_ has the benefit of being
> battle tested code), and then you describe a bunch of gymnastics as an
> alternative?  I see your suggestion as running afoul of the objection
> you are raising, and the code I'm adding as being a solution to that
> particular objection.  So this particular flag you are raising is
> confusing to me.

I wasn't aware of some greater context vis-à-vis octopus, but it just
seemed to me that you were trying to maintain the "Error" v.s. "error"
distinction, i.e. the C code you're adding uses lower case, the *.sh
upper-case.

Which I see is for consistency with some existing message we have a
translation for, so if that was the main goal (and not bug-for-bug
message compatibility) the above gettext/gettextln use would allow you
to re-use the C i18n.


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge
  2022-07-21  8:16     ` [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
  2022-07-21 16:09       ` Junio C Hamano
@ 2022-07-25 10:38       ` Ævar Arnfjörð Bjarmason
  2022-07-26  1:31         ` Elijah Newren
  1 sibling, 1 reply; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-25 10:38 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine, Junio C Hamano, Elijah Newren


On Thu, Jul 21 2022, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
>
> builtin/merge is set up to allow multiple strategies to be specified,
> and it will find the "best" result and use it.  This is defeated if
> some of the merge strategies abort early when they cannot handle the
> merge.  Fix the logic that calls recursive and ort to not do such an
> early abort, but instead return "2" or "unhandled" so that the next
> strategy can try to handle the merge.
>
> Coming up with a testcase for this is somewhat difficult, since
> recursive and ort both handle nearly any two-headed merge (there is
> a separate code path that checks for non-two-headed merges and
> already returns "2" for them).  So use a somewhat synthetic testcase
> of having the index not match HEAD before the merge starts, since all
> merge strategies will abort for that.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/merge.c                          |  6 ++++--
>  t/t6402-merge-rename.sh                  |  2 +-
>  t/t6424-merge-unrelated-index-changes.sh | 16 ++++++++++++++++
>  t/t6439-merge-co-error-msgs.sh           |  1 +
>  4 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 13884b8e836..dec7375bf2a 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -754,8 +754,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
>  		else
>  			clean = merge_recursive(&o, head, remoteheads->item,
>  						reversed, &result);
> -		if (clean < 0)
> -			exit(128);
> +		if (clean < 0) {
> +			rollback_lock_file(&lock);
> +			return 2;
> +		}
>  		if (write_locked_index(&the_index, &lock,
>  				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
>  			die(_("unable to write %s"), get_index_file());
> diff --git a/t/t6402-merge-rename.sh b/t/t6402-merge-rename.sh
> index 3a32b1a45cf..772238e582c 100755
> --- a/t/t6402-merge-rename.sh
> +++ b/t/t6402-merge-rename.sh
> @@ -210,7 +210,7 @@ test_expect_success 'updated working tree file should prevent the merge' '
>  	echo >>M one line addition &&
>  	cat M >M.saved &&
>  	git update-index M &&
> -	test_expect_code 128 git pull --no-rebase . yellow &&
> +	test_expect_code 2 git pull --no-rebase . yellow &&
>  	test_cmp M M.saved &&
>  	rm -f M.saved
>  '
> diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh
> index f35d3182b86..8b749e19083 100755
> --- a/t/t6424-merge-unrelated-index-changes.sh
> +++ b/t/t6424-merge-unrelated-index-changes.sh
> @@ -268,4 +268,20 @@ test_expect_success 'subtree' '
>  	test_path_is_missing .git/MERGE_HEAD
>  '
>  
> +test_expect_success 'resolve && recursive && ort' '
> +	git reset --hard &&
> +	git checkout B^0 &&
> +
> +	test_seq 0 10 >a &&
> +	git add a &&
> +
> +	sane_unset GIT_TEST_MERGE_ALGORITHM &&
> +	test_must_fail git merge -s resolve -s recursive -s ort C^0 >output 2>&1 &&
> +
> +	grep "Trying merge strategy resolve..." output &&
> +	grep "Trying merge strategy recursive..." output &&
> +	grep "Trying merge strategy ort..." output &&
> +	grep "No merge strategy handled the merge." output
> +'
> +
>  test_done
> diff --git a/t/t6439-merge-co-error-msgs.sh b/t/t6439-merge-co-error-msgs.sh
> index 5bfb027099a..52cf0c87690 100755
> --- a/t/t6439-merge-co-error-msgs.sh
> +++ b/t/t6439-merge-co-error-msgs.sh
> @@ -47,6 +47,7 @@ test_expect_success 'untracked files overwritten by merge (fast and non-fast for
>  		export GIT_MERGE_VERBOSITY &&
>  		test_must_fail git merge branch 2>out2
>  	) &&
> +	echo "Merge with strategy ${GIT_TEST_MERGE_ALGORITHM:-ort} failed." >>expect &&
>  	test_cmp out2 expect &&
>  	git reset --hard HEAD^
>  '

I'm re-rolling ab/leak-check, and came up with the below (at the very
end) to "fix" a report in builtin/merge.c, reading your commit message
your fix seems obviously better.

Mine's early WIP, and I e.g. didn't notice that I forgot to unlock the
&lock file, which is correct.

I *could* say "that's not my problem", i.e. we didn't unlock it before
(we rely on atexit). The truth is I just missed it, but having said that
it *is* true that we could do without it, or do it as a separate change.

I'm just posting my version below to help move yours forward, i.e. to
show that someone else has carefully at least this part.

But it is worth noting from staring at the two that your version is
mixing several different behavior changes into one, which *could* be
split up (but whether you think that's worth it I leave to you).

Maybe I'm the only one initially confused by it, and that's probably
just from being mentally biased towards my own "solution". Those are (at
least):

 1. Before we didn't explicitly unlock() before exit(), but had atexit()
    do it, that could be a one-line first commit. This change is
    obviously good.

 2. A commit like mine could come next, i.e. we bug-for-bug do what we
    do now, but just run the "post-builtin" logic when we return from
    cmd_merge().

    Doing it as an in-between would be some churn, as we'll need to get
    rid of "early_exit" again, but would allow us to incrementally move
    forward to...

 3. ...then we'd say "but it actually makes sense not to early abort",
     i.e. you want to change this so that we'll run the logic between
     try_merge_strategy() exiting with 128 now and the return from
     cmd_merge().

     This bit is my main sticking point in reviewing your change,
     i.e. your "a testcase for this is somewhat difficult" somewhat
     addresses this, but (and maybe I'm wrong) it seems to me that 

     Editing that code the post-image looks like this, with my
     commentary & most of the code removed, i.e. just focusing on the
     branches we do and don't potentially have tests for:

     		/* Before this we fall through from ret == 128 (or ret == 2...) */
		if (automerge_was_ok) { // not tested?
		if (!best_strategy) {
			// we test this...
			if (use_strategies_nr > 1)
				// And this: _("No merge strategy handled the merge.\n"));
			else
				// And this: _("Merge with strategy %s failed.\n"),
		} else if (best_strategy == wt_strategy)
			// but not this?
		else
			// Or this, where we e.g. say "Rewinding the tree to pristine..."?
	
		if (squash) {
			// this?
		} else
			// this? (probably, yes)
			write_merge_state(remoteheads);
	
		if (merge_was_ok)
			// this? (probably, yes, we just don't grep it?)
		else
			// this? maybe yes because it's covered by the
			// "failed" above too?
			ret = suggest_conflicts();
	
	done:
		if (!automerge_was_ok) {
			// this? ditto the first "not tested?"
		}

   I.e. are you confident that we want to continue now in these various
   cases, where we have squash, !automerge_was_ok etc. I think it would
   be really useful to comment on (perhaps by amending the above
   pseudocode) what test cases we're not testing / test already etc.

 4. Having done all that (or maybe this can't be split up / needs to
    come earlier) you say that we'd like to not generically call this
    exit state 128, but have it under the "exit(2)" umbrella.

Again, all just food for thought, and a way to step-by-step go through
how I came about reviewing this in detail, I hope it and the below
version I came up with before seeing yours helps.

P.s.: The last paragraph in my commit message does not point to some
hidden edge case in the code behavior here, it's just that clang/gcc are
funny about exit() and die() control flow when combined with
-fsanitize=address and higher optimization levels.

-- >8 --
Subject: [PATCH] merge: return, don't use exit()

Change some of the builtin/merge.c code added in f241ff0d0a9 (prepare
the builtins for a libified merge_recursive(), 2016-07-26) to ferry up
an "early return" state, rather than having try_merge_strategy() call
exit() itself.

This is a follow-up to dda31145d79 (Merge branch
'ab/usage-die-message' into gc/branch-recurse-submodules-fix,
2022-03-31).

The only behavior change here is that we'll now properly catch other
issues on our way out, see e.g. [1] and the interaction with /dev/full
for an example.

The immediate reason to do this change is because it's one of the
cases where clang and gcc's SANITIZE=leak behavior differs. Under
clang we don't detect that "t/t6415-merge-dir-to-symlink.sh" triggers
a leak, but gcc spots it.

1. https://lore.kernel.org/git/87im2n3gje.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 23170f2d2a6..a8d5d04f622 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -709,10 +709,12 @@ static void write_tree_trivial(struct object_id *oid)
 
 static int try_merge_strategy(const char *strategy, struct commit_list *common,
 			      struct commit_list *remoteheads,
-			      struct commit *head)
+			      struct commit *head, int *early_exit)
 {
 	const char *head_arg = "HEAD";
 
+	*early_exit = 0;
+
 	if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED, 0) < 0)
 		return error(_("Unable to write index."));
 
@@ -754,8 +756,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 		else
 			clean = merge_recursive(&o, head, remoteheads->item,
 						reversed, &result);
-		if (clean < 0)
-			exit(128);
+		if (clean < 0) {
+			*early_exit = 1;
+			return 128;
+		}
 		if (write_locked_index(&the_index, &lock,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
@@ -1665,6 +1669,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 
 	for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
 		int ret, cnt;
+		int early_exit;
+
 		if (i) {
 			printf(_("Rewinding the tree to pristine...\n"));
 			restore_state(&head_commit->object.oid, &stash);
@@ -1680,7 +1686,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 
 		ret = try_merge_strategy(use_strategies[i]->name,
 					 common, remoteheads,
-					 head_commit);
+					 head_commit, &early_exit);
+		if (early_exit)
+			goto done;
+
 		/*
 		 * The backend exits with 1 when conflicts are
 		 * left to be resolved, with 2 when it does not
@@ -1732,12 +1741,18 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	} else if (best_strategy == wt_strategy)
 		; /* We already have its result in the working tree. */
 	else {
+		int new_ret, early_exit;
+
 		printf(_("Rewinding the tree to pristine...\n"));
 		restore_state(&head_commit->object.oid, &stash);
 		printf(_("Using the %s strategy to prepare resolving by hand.\n"),
 			best_strategy);
-		try_merge_strategy(best_strategy, common, remoteheads,
-				   head_commit);
+		new_ret = try_merge_strategy(best_strategy, common, remoteheads,
+					     head_commit, &early_exit);
+		if (early_exit) {
+			ret = new_ret;
+			goto done;
+		}
 	}
 
 	if (squash) {
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 87+ messages in thread
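
The behavior debated in this subthread, where a strategy that returns 2 lets
the next strategy have a turn instead of ending the whole command, can be
sketched as a small loop. This is a simplified model, not the code in
builtin/merge.c: the names strategy_fn, first_handling_strategy,
always_unhandled and leaves_conflicts are hypothetical, and the sketch
deliberately ignores how the real code compares strategies that leave
conflicts in order to pick a "best" one.

```c
#include <stddef.h>

/* Exit-code convention quoted in the thread. */
enum {
	MERGE_CLEAN = 0,     /* merged cleanly */
	MERGE_CONFLICTS = 1, /* conflicts left to be resolved */
	MERGE_UNHANDLED = 2  /* strategy does not handle this merge */
};

/* Hypothetical strategy callback returning one of the codes above. */
typedef int (*strategy_fn)(void);

/* Two example strategies for demonstration only. */
static int always_unhandled(void)
{
	return MERGE_UNHANDLED;
}

static int leaves_conflicts(void)
{
	return MERGE_CONFLICTS;
}

/*
 * Return the index of the first strategy that handles the merge
 * (cleanly or with conflicts), or -1 if every strategy reports 2.
 */
int first_handling_strategy(const strategy_fn *strategies, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = strategies[i]();

		if (ret != MERGE_UNHANDLED)
			return (int)i;
		/* ret == 2: fall through and try the next strategy. */
	}
	return -1;
}
```

If the first callback called exit(128) rather than returning 2, the loop
would never reach the second strategy, which is the situation the patch
under review sets out to avoid.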

* Re: [PATCH v5 0/8] Fix merge restore state
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
                           ` (7 preceding siblings ...)
  2022-07-23  1:53         ` [PATCH v5 8/8] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
@ 2022-07-25 19:03         ` Junio C Hamano
  2022-07-26  1:59           ` Elijah Newren
  2022-07-26  4:03         ` ZheNing Hu
  9 siblings, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2022-07-25 19:03 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, ZheNing Hu, Eric Sunshine, Elijah Newren,
	Ævar Arnfjörð Bjarmason

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This started as a simple series to fix restore_state() in builtin/merge.c,
> fixing an issue reported by ZheNing Hu[3]. It now fixes several bugs and has
> grown so much it's hard to call it simple. Anyway...

Thanks.  Unless we hear any more comments, let's start merging it
down to 'next' soonish.


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge
  2022-07-25 10:38       ` Ævar Arnfjörð Bjarmason
@ 2022-07-26  1:31         ` Elijah Newren
  2022-07-26  6:54           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren @ 2022-07-26  1:31 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Junio C Hamano

On Mon, Jul 25, 2022 at 4:06 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
[...]
>
> I'm re-rolling ab/leak-check, and came up with the below (at the very
> end) to "fix" a report in builtin/merge.c, reading your commit message
> your fix seems obviously better.
>
> Mine's early WIP, and I e.g. didn't notice that I forgot to unlock the
> &lock file, which is correct.
>
> I *could* say "that's not my problem", i.e. we didn't unlock it before
> (we rely on atexit). The truth is I just missed it, but having said that
> it *is* true that we could do without it, or do it as a separate change.
>
> I'm just posting my version below to help move yours forward, i.e. to
> show that someone else has carefully at least this part.

"has carefully ... at least this part" ?

I think you have a missing verb there.

> But it is worth noting from staring at the two that your version is
> mixing several different behavior changes into one, which *could* be
> split up (but whether you think that's worth it I leave to you).
>
> Maybe I'm the only one initially confused by it, and that's probably
> just from being mentally biased towards my own "solution". Those are (at
> least):
>
>  1. Before we didn't explicitly unlock() before exit(), but had atexit()
>     do it, that could be a one-line first commit. This change is
>     obviously good.

That'd be fine.  (Though at this point, I'd rather not mess with the
series more.)

>  2. A commit like mine could come next, i.e. we bug-for-bug do what we
>     do now, but just run the "post-builtin" logic when we return from
>     cmd_merge().
>
>     Doing it as an in-between would be some churn, as we'll need to get
>     rid of "early_exit" again, but would allow us to incrementally move
>     forward to...

So, add a step that makes it glaringly obvious that the code is not
only buggy but totally at odds with itself?

builtin/merge.c was designed to allow pluggable backends and to
automatically pick the "best" one if more than one is specified.  We
had a bug in one line of code that defeated the design, by making it
not bother consulting beyond the first failed backend in some cases.
That's the bug I'm trying to address.  Your patch would make the
inconsistency with the design both bigger and more obvious; I don't
see how it's a useful step to take.

Now, the existing design might be questionable.  In fact, I'm not sure
I like it.  But I think we should either change the design, or fix
things in a way that improves towards the existing design.

>  3. ...then we'd say "but it actually makes sense not to early abort",
>      i.e. you want to change this so that we'll run the logic between
>      try_merge_strategy() exiting with 128 now and the return from
>      cmd_merge().
>
>      This bit is my main sticking point in reviewing your change,
>      i.e. your "a testcase for this is somewhat difficult" somewhat
>      addresses this, but (and maybe I'm wrong) it seems to me that
>
>      Editing that code the post-image looks like this, with my
>      commentary & most of the code removed, i.e. just focusing on the
>      branches we do and don't potentially have tests for:
>
>                 /* Before this we fall through from ret == 128 (or ret == 2...) */
>                 if (automerge_was_ok) { // not tested?
>                 if (!best_strategy) {
>                         // we test this...
>                         if (use_strategies_nr > 1)
>                                 // And this: _("No merge strategy handled the merge.\n"));
>                         else
>                                 // And this: _("Merge with strategy %s failed.\n"),
>                 } else if (best_strategy == wt_strategy)
>                         // but not this?
>                 else
>                         // Or this, where we e.g. say "Rewinding the tree to pristine..."?
>
>                 if (squash) {
>                         // this?
>                 } else
>                         // this? (probably, yes)
>                         write_merge_state(remoteheads);
>
>                 if (merge_was_ok)
>                         // this? (probably, yes, we just don't grep it?)
>                 else
>                         // this? maybe yes because it's covered by the
>                         // "failed" above too?
>                         ret = suggest_conflicts();
>
>         done:
>                 if (!automerge_was_ok) {
>                         // this? ditto the first "not tested?"
>                 }
>
>    I.e. are you confident that we want to continue now in these various
>    cases, where we have squash, !automerge_was_ok etc. I think it would
>    be really useful to comment on (perhaps by amending the above
>    pseudocode) what test cases we're not testing / test already etc.

To be honest, I'm confused by what looks like a make-work project.
Perhaps if I understood your frame of reference better, maybe they
wouldn't bother me, but there's a few things here that seem a little
funny to me.

You've highlighted that you are worried about the case where ret is 2
(or 128) at the point of all these branches in question.  However,
three of those branches can almost trivially be deduced to never be
taken unless ret is 0.  One of the other codepaths, for freeing
memory, is correct regardless of the value of ret -- the memory is
conditionally freed earlier and the "if"-check exists only to avoid a
double free (and checking the recent commit message where those lines
were added would explain this, though I'm not sure why it'd even need
explaining separately for e.g. ret == 2 compared to any other value).
Three of the other code paths involve nothing more than print
statements.  Now, there are many codepaths you highlighted, and
perhaps there are some where it's not trivial to determine whether
they are okay in combination with a different return value.  And it
may also be easy to miss some of the "almost trivial" cases.  I'd
understand better if you asked about the tougher ones or only some of
the easier ones, but it feels like you didn't try to check any of them
and instead wanted me to just spend time commenting on every single
code branch?

I hope that doesn't come across harshly.  I'm just struggling to
understand where the detailed request is coming from.

However, perhaps I can obviate the whole set of requests by just
pointing out that I don't think any of them are relevant.  The premise
for the audit request seems to be that you are worrying that the
change from die() to "return 2" (or 128) in try_merge_strategy() will
result in the calling code getting into a state it has never
experienced before which might uncover latent bugs.  We can trivially
point out that it's not a new state, though: such return values were
already possible from try_merge_strategy() via the final line of code
in the function -- namely, the "return try_merge_command(...)" code
path.  And a return value of 2 from try_merge_command() and
try_merge_strategy() isn't merely theoretical either -- you can easily
trigger it multiple ways; the easiest is perhaps by passing `-s
octopus` when doing a non-octopus merge.  (You can also make the
testcase more complex by combining that with as many other additional
merge strategies as you want, e.g. "git merge -s octopus -s resolve -s
recursive -s mySpecialStrategy $BRANCH".  You can also move the
octopus to the end if you want to test what happens if it is tried
last.  Lots of possibilities exist).

>  4. Having done all that (or maybe this can't be split up / needs to
>     come earlier) you say that we'd like to not generically call this
>     exit state 128, but have it under the "exit(2)" umbrella.

I don't see how reading your set of steps 1-4 logically restores the
design of builtin/merge.c, though.  An example: what if the user ran
"git merge -s resolve -s recursive $BRANCH", and `resolve` handled the
merge but had some conflicts, while `recursive` just failed and
returned a value of clean < 0?  In such a case, builtin/merge.c is
supposed to restore the tree to pristine after attempting the
recursive backend, and then redo using the best_strategy (which would
be `resolve` in this example case).  Your steps 1-4 never address such
a thing.  Your steps might incidentally address such a case as a side
effect of their implementation, but that's not at all clear.  Since
the whole point is fixing the code to match the existing design, it
seems really odd to split things into a set of steps that obscures
that fix.
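
To make that resolve/recursive scenario concrete, here is a toy model in
shell. It is only a sketch under stated assumptions, not the real
builtin/merge.c logic: each "strategy" is a stand-in shell function, the
return codes follow the 0/1/2 convention discussed in this thread, and the
real code compares conflict counts where this sketch just keeps the first
strategy that produced a result.

```shell
#!/bin/sh
# Toy model, not the real builtin/merge.c logic. Each "strategy" is a
# stand-in shell function returning 0 (clean), 1 (conflicts left to
# resolve) or 2 (did not handle the merge at all).
resolve ()   { return 1; }
recursive () { return 2; }

best_strategy=
wt_strategy=	# strategy whose result is currently in the working tree

for strategy in resolve recursive
do
	if test -n "$wt_strategy"
	then
		echo "Rewinding the tree to pristine..."
	fi
	wt_strategy=$strategy
	ret=0
	"$strategy" || ret=$?
	case $ret in
	0)
		best_strategy=$strategy
		break
		;;
	1)
		# The real code compares conflict counts; this sketch
		# simply keeps the first strategy that got this far.
		test -n "$best_strategy" || best_strategy=$strategy
		;;
	*)
		: # not handled, move on to the next strategy
		;;
	esac
done

if test -z "$best_strategy"
then
	action="no strategy handled the merge"
elif test "$best_strategy" = "$wt_strategy"
then
	action="keep the result of $best_strategy"
else
	echo "Rewinding the tree to pristine..."
	echo "Using the $best_strategy strategy to prepare resolving by hand."
	action="rerun $best_strategy"
fi
echo "$action"
```

With these stand-ins the last strategy tried is not the best one, so the
model ends up in the final branch: rewind, then rerun `resolve`.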

> Again, all just food for thought, and a way to step-by-step go through
> how I came about reviewing this in detail, I hope it and the below
> version I came up with before seeing yours helps.
>
> P.s.: The last paragraph in my commit message does not point to some
> hidden edge case in the code behavior here, it's just that clang/gcc are
> funny about exit() and die() control flow when combined with
> -fsanitize=address and higher optimization levels.
>
> -- >8 --
> Subject: [PATCH] merge: return, don't use exit()
>
> Change some of the builtin/merge.c code added in f241ff0d0a9 (prepare
> the builtins for a libified merge_recursive(), 2016-07-26) to ferry up
> an "early return" state, rather than having try_merge_strategy() call
> exit() itself.
>
> This is a follow-up to dda31145d79 (Merge branch
> 'ab/usage-die-message' into gc/branch-recurse-submodules-fix,
> 2022-03-31).
>
> The only behavior change here is that we'll now properly catch other
> issues on our way out, see e.g. [1] and the interaction with /dev/full
> for an example.
>
> The immediate reason to do this change is because it's one of the
> cases where clang and gcc's SANITIZE=leak behavior differs. Under
> clang we don't detect that "t/t6415-merge-dir-to-symlink.sh" triggers
> a leak, but gcc spots it.
>
> 1. https://lore.kernel.org/git/87im2n3gje.fsf@evledraar.gmail.com/
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/merge.c | 27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 23170f2d2a6..a8d5d04f622 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -709,10 +709,12 @@ static void write_tree_trivial(struct object_id *oid)
>
>  static int try_merge_strategy(const char *strategy, struct commit_list *common,
>                               struct commit_list *remoteheads,
> -                             struct commit *head)
> +                             struct commit *head, int *early_exit)
>  {
>         const char *head_arg = "HEAD";
>
> +       *early_exit = 0;
> +
>         if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED, 0) < 0)
>                 return error(_("Unable to write index."));
>
> @@ -754,8 +756,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
>                 else
>                         clean = merge_recursive(&o, head, remoteheads->item,
>                                                 reversed, &result);
> -               if (clean < 0)
> -                       exit(128);
> +               if (clean < 0) {
> +                       *early_exit = 1;
> +                       return 128;
> +               }
>                 if (write_locked_index(&the_index, &lock,
>                                        COMMIT_LOCK | SKIP_IF_UNCHANGED))
>                         die(_("unable to write %s"), get_index_file());
> @@ -1665,6 +1669,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>
>         for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
>                 int ret, cnt;
> +               int early_exit;
> +
>                 if (i) {
>                         printf(_("Rewinding the tree to pristine...\n"));
>                         restore_state(&head_commit->object.oid, &stash);
> @@ -1680,7 +1686,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>
>                 ret = try_merge_strategy(use_strategies[i]->name,
>                                          common, remoteheads,
> -                                        head_commit);
> +                                        head_commit, &early_exit);
> +               if (early_exit)
> +                       goto done;
> +
>                 /*
>                  * The backend exits with 1 when conflicts are
>                  * left to be resolved, with 2 when it does not
> @@ -1732,12 +1741,18 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>         } else if (best_strategy == wt_strategy)
>                 ; /* We already have its result in the working tree. */
>         else {
> +               int new_ret, early_exit;
> +
>                 printf(_("Rewinding the tree to pristine...\n"));
>                 restore_state(&head_commit->object.oid, &stash);
>                 printf(_("Using the %s strategy to prepare resolving by hand.\n"),
>                         best_strategy);
> -               try_merge_strategy(best_strategy, common, remoteheads,
> -                                  head_commit);
> +               new_ret = try_merge_strategy(best_strategy, common, remoteheads,
> +                                            head_commit, &early_exit);
> +               if (early_exit) {
> +                       ret = new_ret;
> +                       goto done;
> +               }

Incidentally, this is essentially dead code being added in this last
hunk.  This final `else` block can only be triggered when
best_strategy has been set, and best_strategy will only be set after a
merge strategy works (possibly with conflicts, but was at least
appropriate for the problem).  Anyway, by the point of this `else`
block, we've run multiple strategies already, at least one worked, and
the "best" one was not the last one tried.  So, at this point, we
simply rewind the tree and rerun the known-working merge strategy.  (I
guess you could say that maybe the merge strategy might not return the
same result despite being given the same inputs, so theoretically
early_exit could come back true and this hunk isn't quite dead.  Just
mostly dead, I guess.)

>         }
>
>         if (squash) {
> --
> 2.36.1

* Re: [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD
  2022-07-23  5:44             ` Ævar Arnfjörð Bjarmason
@ 2022-07-26  1:58               ` Elijah Newren
  2022-07-26  6:35                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 87+ messages in thread
From: Elijah Newren @ 2022-07-26  1:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Junio C Hamano

On Fri, Jul 22, 2022 at 10:53 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Fri, Jul 22 2022, Elijah Newren wrote:
>
[...]
> > So, ignoring the return code from diff-index is correct behavior here.
> >
> > Were you thinking this was a test script or something?
>
> We can leave this for now.
>
> But no. Whatever the merge driver is documenting as its normal return
> values we really should be ferrying up abort() and segfault, per the
> "why do we miss..." in:
> https://lore.kernel.org/git/patch-v2-11.14-8cc6ab390db-20220720T211221Z-avarab@gmail.com/
>
> I.e. this is one of the cases in the test suite where we haven't closed
> that gap, and could hide segfaults as a "normal" exit 2.
>
> So I think your v5 is fine as-is, but in general I'd be really
> interested if you want to double-down on this view for the merge drivers
> for some reason, because my current plan for addressing these blindspots
> outlined in the above wouldn't work then...

Quoting from there:

> * We have in-tree shellscripts like "git-merge-one-file.sh" invoking
>   git commands, they'll usually return their own exit codes on "git"
>   failure, rather than ferrying up segfault or abort() exit code.
>
>   E.g. these invocations in git-merge-one-file.sh leak, but aren't
>   reflected in the "git merge" exit code:
>
>       src1=$(git unpack-file $2)
>       src2=$(git unpack-file $3)
>
>   That case would be easily "fixed" by adding a line like this after
>   each assignment:
>
>       test $? -ne 0 && exit $?
>
>   But we'd then in e.g. "t6407-merge-binary.sh" run into
>   write_tree_trivial() in "builtin/merge.c" calling die() instead of
>   ferrying up the relevant exit code.

Sidenote, but I don't think t6407-merge-binary.sh calls into
write_tree_trivial().  Doesn't in my testing, anyway.

Are you really planning on auditing every line of git-bisect.sh,
git-merge*.sh, git-sh-setup.sh, git-submodule.sh, git-web--browse.sh,
and others, and munging every area that invokes git to check the exit
status?  Yuck.  A few points:

  * Will this really buy you anything?  Don't we have other regression
tests of all these commands (e.g. "git unpack-file") which ought to
show the same memory leaks?  This seems like high lift, low value to
me, and just fixing direct invocations in the regression tests is
where the value comes.  (If direct coverage is lacking in the
regression tests, shouldn't the solution be to add coverage?)
  * Won't this be a huge review and support burden to maintain the
extra checking?
  * Some of these scripts, such as git-merge-resolve.sh and
git-merge-octopus.sh, are used as examples of e.g. merge drivers, and
invasive checks whose sole purpose is memory leak checking seem to
run counter to the purpose of being a simple example for users.
  * Wouldn't using errexit and pipefail be an easier way to accomplish
checking the exit status (avoiding the problems from the last few
bullets)?  You'd still have to audit the code and write e.g.
shutupgrep wrappers (since grep reports whether it found certain
patterns in the input, rather than whether it was able to perform the
search on the input, and we often only care about the latter), but it
at least would automatically check future git invocations.
  * Are we running the risk of overloading special return codes (e.g.
125 in git-bisect)?
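
To illustrate the kind of `shutupgrep` wrapper I have in mind: the sketch
below is only an assumption about how it could look (no such helper exists
in the tree). It relies on grep's documented exit statuses: 0 for a match,
1 for no match, and greater than 1 when the search itself failed.

```shell
#!/bin/sh
# Hypothetical wrapper: succeed whether or not a match was found, and
# fail only when grep could not perform the search at all.
shutupgrep () {
	status=0
	grep "$@" || status=$?
	case $status in
	0|1)
		return 0
		;;
	*)
		return "$status"
		;;
	esac
}

match=0;   printf 'alpha\nbeta\n' | shutupgrep -q alpha || match=$?
nomatch=0; printf 'alpha\nbeta\n' | shutupgrep -q gamma || nomatch=$?
broken=0;  shutupgrep -q alpha /nonexistent/file 2>/dev/null || broken=$?
echo "match: $match, no match: $nomatch, unreadable input: $broken"
```

Under errexit, only the third call would stop the script, which is the
behavior we would want when the pattern's presence is not the question.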

I do still think that "2" is the correct return code for the
shell-script merge strategies here, though I think it's feasible in
their cases to change the documentation to relax the return code
requirements in such a way to allow those scripts to utilize errexit
and pipefail.

> >>  * I wonder if bending over backwards to emit the exact message we
> >>    emitted before is worth it
> >>
> >> If you just make this something like (untested):
> >>
> >>         {
> >>                 gettext "error: " &&
> >>                 gettextln "Your local..."
> >>         }
> >>
> >> You could re-use the translation from the *.c one (and the "error: " one
> >> we'll get from usage.c).
> >>
> >> That leaves "\n %s" as the difference, but we could just remove that
> >> from the _() and emit it unconditionally, no?
> >
> > ??
> >
> > Copying a few lines from git-merge-octopus.sh to get the same fix it
> > has is "bending over backwards"?  That's what I call "doing the
> > easiest thing possible" (and which _also_ has the benefit of being
> > battle tested code), and then you describe a bunch of gymnastics as an
> > alternative?  I see your suggestion as running afoul of the objection
> > you are raising, and the code I'm adding as being a solution to that
> > particular objection.  So this particular flag you are raising is
> > confusing to me.
>
> I wasn't aware of some greater context vis-à-vis octopus,

I didn't expect everyone to be, but that's why I put it in the commit
message.  ;-)

* Re: [PATCH v5 0/8] Fix merge restore state
  2022-07-25 19:03         ` [PATCH v5 0/8] Fix merge restore state Junio C Hamano
@ 2022-07-26  1:59           ` Elijah Newren
  0 siblings, 0 replies; 87+ messages in thread
From: Elijah Newren @ 2022-07-26  1:59 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Ævar Arnfjörð Bjarmason

On Mon, Jul 25, 2022 at 12:03 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > This started as a simple series to fix restore_state() in builtin/merge.c,
> > fixing an issue reported by ZheNing Hu[3]. It now fixes several bugs and has
> > grown so much it's hard to call it simple. Anyway...
>
> Thanks.  Unless we hear any more comments, let's start merging it
> down to 'next' soonish.

Sounds good; let's plan on that.

* Re: [PATCH v5 0/8] Fix merge restore state
  2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
                           ` (8 preceding siblings ...)
  2022-07-25 19:03         ` [PATCH v5 0/8] Fix merge restore state Junio C Hamano
@ 2022-07-26  4:03         ` ZheNing Hu
  9 siblings, 0 replies; 87+ messages in thread
From: ZheNing Hu @ 2022-07-26  4:03 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: Git List, Eric Sunshine, Junio C Hamano, Elijah Newren,
	Ævar Arnfjörð Bjarmason

Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote on Sat, Jul 23, 2022 at 09:53:
>
> This started as a simple series to fix restore_state() in builtin/merge.c,
> fixing an issue reported by ZheNing Hu[3]. It now fixes several bugs and has
> grown so much it's hard to call it simple. Anyway...
>

Thanks. This patch is good enough for me :)

* Re: [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD
  2022-07-26  1:58               ` Elijah Newren
@ 2022-07-26  6:35                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-26  6:35 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Junio C Hamano


On Mon, Jul 25 2022, Elijah Newren wrote:

> On Fri, Jul 22, 2022 at 10:53 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> On Fri, Jul 22 2022, Elijah Newren wrote:
>>
> [...]
>> > So, ignoring the return code from diff-index is correct behavior here.
>> >
>> > Were you thinking this was a test script or something?
>>
>> We can leave this for now.
>>
>> But no. Whatever the merge driver is documenting as its normal return
>> values we really should be ferrying up abort() and segfault, per the
>> "why do we miss..." in:
>> https://lore.kernel.org/git/patch-v2-11.14-8cc6ab390db-20220720T211221Z-avarab@gmail.com/
>>
>> I.e. this is one of the cases in the test suite where we haven't closed
>> that gap, and could hide segfaults as a "normal" exit 2.
>>
>> So I think your v5 is fine as-is, but in general I'd be really
>> interested if you want to double-down on this view for the merge drivers
>> for some reason, because my current plan for addressing these blindspots
>> outlined in the above wouldn't work then...
>
> Quoting from there:
>
>> * We have in-tree shellscripts like "git-merge-one-file.sh" invoking
>>   git commands, they'll usually return their own exit codes on "git"
>>   failure, rather than ferrying up segfault or abort() exit code.
>>
>>   E.g. these invocations in git-merge-one-file.sh leak, but aren't
>>   reflected in the "git merge" exit code:
>>
>>       src1=$(git unpack-file $2)
>>       src2=$(git unpack-file $3)
>>
>>   That case would be easily "fixed" by adding a line like this after
>>   each assignment:
>>
>>       test $? -ne 0 && exit $?
>>
>>   But we'd then in e.g. "t6407-merge-binary.sh" run into
>>   write_tree_trivial() in "builtin/merge.c" calling die() instead of
>>   ferrying up the relevant exit code.
>
> Sidenote, but I don't think t6407-merge-binary.sh calls into
> write_tree_trivial().  Doesn't in my testing, anyway.

I haven't gone back & looked at that, but I vaguely recall that if you
"get past" that one it was returning 128 somewhere, either directly or
indirectly.

In any case, for the purposes of that summary it's a common pattern
elsewhere...

> Are you really planning on auditing every line of git-bisect.sh,
> git-merge*.sh, git-sh-setup.sh, git-submodule.sh, git-web--browse.sh,
> and others, and munging every area that invokes git to check the exit
> status?

Not really; in that change I was just explaining why it's required to
have this log-on-the-side to munge the exit code.

Although I think you might have not kept up with just how close we are
to "git rm"-ing most of our non-trivial amounts of
shellscript. git-{submodule,bisect}.sh is going away, hopefully in this
release.

*If* we go for that approach I'd think it would be relatively easy to
add a helper to git-sh-setup.sh to wrap the various "git" calls.

>   Yuck.  A few points:
>   * Will this really buy you anything?  Don't we have other regression
> tests of all these commands (e.g. "git unpack-file") which ought to
> show the same memory leaks?  This seems like high lift, low value to
> me, and just fixing direct invocations in the regression tests is
> where the value comes.

It's a long-tail problem, and we don't need to fix it all now. I'm just
commenting on it here because there's an *addition* of a hidden exit
code, and more importantly I wanted to clear up if you thought ferrying
up abort() or segfaults wouldn't be desired in those cases.

>  (If direct coverage is lacking in the
> regression tests, shouldn't the solution be to add coverage?)


But how do we find out what we're covering? Yes "make coverage", but
that's just going to give you all C lines we "visit", but you don't know
if those lines were visited by parts of our test suite where we're
checking the exit code.

>   * Won't this be a huge review and support burden to maintain the
> extra checking?

I think the end result mostly makes things easier to deal with &
maintain, e.g. consistently using &&-chaining, or test_cmp instead of
'test "$(...)"' etc.

>   * Some of these scripts, such as git-merge-resolve.sh and
> git-merge-octopus.sh, are used as examples of e.g. merge drivers, and
> invasive checks whose sole purpose is memory leak checking seem to
> run counter to the purpose of being a simple example for users.

I'm not doing any of this now, but I'd think we could do something like
this (i.e. new helpers in git-sh-setup.sh):
	
	diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
	index 343fe7bccd0..f23344dfec8 100755
	--- a/git-merge-resolve.sh
	+++ b/git-merge-resolve.sh
	@@ -37,15 +37,16 @@ then
	 	exit 2
	 fi
	 
	-git update-index -q --refresh
	-git read-tree -u -m --aggressive $bases $head $remotes || exit 2
	+run_git git update-index -q --refresh
	+run_git --exit-on-failure 2 git read-tree -u -m --aggressive $bases $head $remotes
	 echo "Trying simple merge."
	-if result_tree=$(git write-tree 2>/dev/null)
	+result_tree=$(git write-tree 2>/dev/null)
	+if check_last_git_cmd $?
	 then
	 	exit 0
	 else
	 	echo "Simple merge failed, trying Automatic merge."
	-	if git merge-index -o git-merge-one-file -a
	+	if run_git git merge-index -o git-merge-one-file -a
	 	then
	 		exit 0
	 	else
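
A rough sketch of what such helpers could look like follows. To be clear,
`run_git` and `check_last_git_cmd` are hypothetical names from the diff
above and do not exist in git-sh-setup.sh; the implementation, the
`died_by_signal` helper and the stand-in commands `crash` and `conflict`
are assumptions for illustration. Statuses above 129 are treated as signal
deaths, since git itself uses 128 for die() and 129 for usage errors.

```shell
#!/bin/sh
# Hypothetical helpers; none of these exist in git-sh-setup.sh.

# Statuses above 129 are taken to mean "killed by a signal"
# (128 + signal number), e.g. abort() or a segfault.
died_by_signal () {
	test "$1" -gt 129
}

# check_last_git_cmd <status>: ferry up signal deaths, otherwise
# report plain success or failure to the caller.
check_last_git_cmd () {
	if died_by_signal "$1"
	then
		exit "$1"
	fi
	test "$1" -eq 0
}

# run_git [--exit-on-failure <code>] <command>...
run_git () {
	on_failure=
	if test "$1" = "--exit-on-failure"
	then
		on_failure=$2
		shift 2
	fi
	status=0
	"$@" || status=$?
	if died_by_signal "$status"
	then
		exit "$status"
	fi
	if test "$status" -ne 0 && test -n "$on_failure"
	then
		exit "$on_failure"
	fi
	return "$status"
}

# Stand-ins for a crashing and an ordinarily failing git command.
crash ()    { return 134; }
conflict () { return 1; }

# Subshells keep the "exit" calls from ending this demo script.
a=0; ( run_git crash ) || a=$?
b=0; ( run_git --exit-on-failure 2 conflict ) || b=$?
c=0; ( run_git conflict ) || c=$?
d=0; ( run_git --exit-on-failure 2 crash ) || d=$?
echo "crash: $a, mapped failure: $b, plain failure: $c, mapped crash: $d"
```

The point of the design is that a crash always wins over the
"--exit-on-failure" mapping, so a segfault can never be reported as an
ordinary exit 2.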

>   * Wouldn't using errexit and pipefail be an easier way to accomplish
> checking the exit status (avoiding the problems from the last few
> bullets)?  You'd still have to audit the code and write e.g.
> shutupgrep wrappers (since grep reports whether it found certain
> patterns in the input, rather than whether it was able to perform the
> search on the input, and we often only care about the latter), but it
> at least would automatically check future git invocations.

Somewhat, but unless we're going to depend on "bash" I think we can at
most use those during development to locate various issues, e.g. as I
did in this series:
https://lore.kernel.org/git/20210123130046.21975-1-avarab@gmail.com/

>   * Are we running the risk of overloading special return codes (e.g.
> 125 in git-bisect)?

No, we already deal with that as a potential problem in test_must_fail
in the test suite, i.e. we just catch abort(), segfaults & the like. In
Bourne shells an abort() shows up as 134.
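
For reference, shells such as dash and bash report a child killed by
signal N as exit status 128 + N, so SIGABRT (6) gives 134 and SIGSEGV (11)
gives 139, while bisect's 125 stays well below that range. A small sketch
to observe this, where `status_after_signal` is a made-up helper name:

```shell
#!/bin/sh
# Run a child shell that kills itself with the given signal and
# print the exit status the parent shell sees.
status_after_signal () {
	status=0
	sh -c "kill -$1 \$\$" 2>/dev/null || status=$?
	echo "$status"
}

abrt=$(status_after_signal ABRT)
segv=$(status_after_signal SEGV)
echo "SIGABRT: $abrt, SIGSEGV: $segv"
```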

> I do still think that "2" is the correct return code for the
> shell-script merge strategies here, though I think it's feasible in
> their cases to change the documentation to relax the return code
> requirements in such a way to allow those scripts to utilize errexit
> and pipefail.

I think for any such interface it makes sense to exit with 0, 1 and 2 or
whatever during normal circumstances.

But if the program you just called segfaulted I think it makes sense to
treat that as an exception. I.e. it's not just that the merge failed,
but we should really abort() in the calling program too.

Anyway, this is all quite academic at this point, but since you asked...

* Re: [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge
  2022-07-26  1:31         ` Elijah Newren
@ 2022-07-26  6:54           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 87+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-26  6:54 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, ZheNing Hu,
	Eric Sunshine, Junio C Hamano


On Mon, Jul 25 2022, Elijah Newren wrote:

> On Mon, Jul 25, 2022 at 4:06 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
> [...]
>>
>> I'm re-rolling ab/leak-check, and came up with the below (at the very
>> end) to "fix" a report in builtin/merge.c, reading your commit message
>> your fix seems obviously better.
>>
>> Mine's early WIP, and I e.g. didn't notice that I forgot to unlock the
>> &lock file, which is correct.
>>
>> I *could* say "that's not my problem", i.e. we didn't unlock it before
>> (we rely on atexit). The truth is I just missed it, but having said that
>> it *is* true that we could do without it, or do it as a separate change.
>>
>> I'm just posting my version below to help move yours forward, i.e. to
>> show that someone else has carefully at least this part.
>
> "has carefully ... at least this part" ?
>
> I think you have a missing verb there.

"reviewed", sorry.

>> But it is worth noting from staring at the two that your version is
>> mixing several different behavior changes into one, which *could* be
>> split up (but whether you think that's worth it I leave to you).
>>
>> Maybe I'm the only one initially confused by it, and that's probably
>> just from being mentally biased towards my own "solution". Those are (at
>> least):
>>
>>  1. Before we didn't explicitly unlock() before exit(), but had atexit()
>>     do it, that could be a one-line first commit. This change is
>>     obviously good.
>
> That'd be fine.  (Though at this point, I'd rather not mess with the
> series more.)

That's completely fine, to be clear(er) what I'm walking through here is
my review process & potential edge cases I discovered, but I think your
patch as-is does it all correctly.

>>  2. A commit like mine could come next, i.e. we bug-for-bug do what we
>>     do now, but just run the "post-builtin" logic when we return from
>>     cmd_merge().
>>
>>     Doing it as an in-between would be some churn, as we'll need to get
>>     rid of "early_exit" again, but would allow us to incrementally move
>>     forward to...
>
> So, add a step that makes it glaringly obvious that the code is not
> only buggy but totally at odds with itself?
>
> builtin/merge.c was designed to allow pluggable backends and to
> automatically pick the "best" one if more than one is specified.  We
> had a bug in one line of code that defeated the design, by making it
> not bother consulting beyond the first failed backend in some cases.
> That's the bug I'm trying to address.  Your patch would make the
> inconsistency with the design both bigger and more obvious; I don't
> see how it's a useful step to take.
>
> Now, the existing design might be questionable.  In fact, I'm not sure
> I like it.  But I think we should either change the design, or fix
> things in a way that improves towards the existing design.

Yes, I don't think it's useful, sorry about the confusion. I was walking
through what the observable changes from the outside are here, and how
they *might* be split up.

Not as a practical suggestion, but as a way to understand your commit...

>>  3. ...then we'd say "but it actually makes sense not to early abort",
>>      i.e. you want to change this so that we'll run the logic between
>>      try_merge_strategy() exiting with 128 now and the return from
>>      cmd_merge().
>>
>>      This bit is my main sticking point in reviewing your change,
>>      i.e. your "a testcase for this is somewhat difficult" somewhat
>>      addresses this, but (and maybe I'm wrong) it seems to me that
>>
>>      Editing that code the post-image looks like this, with my
>>      commentary & most of the code removed, i.e. just focusing on the
>>      branches we do and don't potentially have tests for:
>>
>>                 /* Before this we fall through from ret == 128 (or ret == 2...) */
>>                 if (automerge_was_ok) { // not tested?
>>                 if (!best_strategy) {
>>                         // we test this...
>>                         if (use_strategies_nr > 1)
>>                                 // And this: _("No merge strategy handled the merge.\n"));
>>                         else
>>                                 // And this: _("Merge with strategy %s failed.\n"),
>>                 } else if (best_strategy == wt_strategy)
>>                         // but not this?
>>                 else
>>                         // Or this, where we e.g. say "Rewinding the tree to pristine..."?
>>
>>                 if (squash) {
>>                         // this?
>>                 } else
>>                         // this? (probably, yes)
>>                         write_merge_state(remoteheads);
>>
>>                 if (merge_was_ok)
>>                         // this? (probably, yes, we just don't grep it?)
>>                 else
>>                         // this? maybe yes because it's covered by the
>>                         // "failed" above too?
>>                         ret = suggest_conflicts();
>>
>>         done:
>>                 if (!automerge_was_ok) {
>>                         // this? ditto the first "not tested?"
>>                 }
>>
>>    I.e. are you confident that we want to continue now in these various
>>    cases, where we have squash, !automerge_was_ok etc. I think it would
>>    be really useful to comment on (perhaps by amending the above
>>    pseudocode) what test cases we're not testing / test already etc.
>
> To be honest, I'm confused by what looks like a make-work project.
> Perhaps if I understood your frame of reference better, maybe they
> wouldn't bother me, but there's a few things here that seem a little
> funny to me.
>
> You've highlighted that you are worried about the case where ret is 2
> (or 128) at the point of all these branches in question.  However,
> three of those branches can almost trivially be deduced to never be
> taken unless ret is 0.  One of the other codepaths, for freeing
> memory, is correct regardless of the value of ret -- the memory is
> conditionally freed earlier and the "if"-check exists only to avoid a
> double free (and checking the recent commit message where those lines
> were added would explain this, though I'm not sure why it'd even need
> explaining separately for e.g. ret == 2 compared to any other value).
> Three of the other code paths involve nothing more than print
> statements.  Now, there are many codepaths you highlighted, and
> perhaps there are some where it's not trivial to determine whether
> they are okay in combination with a different return value.  And it
> may also be easy to miss some of the "almost trivial" cases.  I'd
> understand better if you asked about the tougher ones or only some of
> the easier ones, but it feels like you didn't try to check any of them
> and instead wanted me to just spend time commenting on every single
> code branch?
>
> I hope that doesn't come across harshly.  I'm just struggling to
> understand where the detailed request is coming from.
>
> However, perhaps I can obviate the whole set of requests by just
> pointing out that I don't think any of them are relevant.  The premise
> for the audit request seems to be that you are worrying that the
> change from die() to "return 2" (or 128) in try_merge_strategy() will
> result in the calling code getting into a state it has never
> experienced before which might uncover latent bugs.  We can trivially
> point out that it's not a new state, though: such return values were
> already possible from try_merge_strategy() via the final line of code
> in the function -- namely, the "return try_merge_command(...)" code
> path.  And a return value of 2 from try_merge_command() and
> try_merge_strategy() isn't merely theoretical either -- you can easily
> trigger it multiple ways; the easiest is perhaps by passing `-s
> octopus` when doing a non-octopus merge.  (You can also make the
> testcase more complex by combining that with as many other additional
> merge strategies as you want, e.g. "git merge -s octopus -s resolve -s
> recursive -s mySpecialStrategy $BRANCH".  You can also move the
> octopus to the end if you want to test what happens if it is tried
> last.  Lots of possibilities exist).

Thanks, that all makes sense & addresses any questions I had, which were
just whether we were sure that this behavior was expected.

>>  4. Having done all that (or maybe this can't be split up / needs to
>>     come earlier) you say that we'd like to not generically call this
>>     exit state 128, but have it under the "exit(2)" umbrella.
>
> I don't see how reading your set of steps 1-4 logically restores the
> design of builtin/merge.c, though.  An example: what if the user ran
> "git merge -s resolve -s recursive $BRANCH", and `resolve` handled the
> merge but had some conflicts, while `recursive` just failed and
> returned a value of clean < 0?  In such a case, builtin/merge.c is
> supposed to restore the tree to pristine after attempting the
> recursive backend, and then redo using the best_strategy (which would
> be `resolve` in this example case).  Your steps 1-4 never address such
> a thing.  Your steps might incidentally address such a case as a side
> effect of their implementation, but that's not at all clear.  Since
> the whole point is fixing the code to match the existing design, it
> seems really odd to split things into a set of steps that obscures
> that fix.

You know more about the wider context here; I was just poking at the
narrow code change & seeing what the side-effects were of the changes
being made.

>> Again, all just food for thought, and a way to step-by-step go through
>> how I came about reviewing this in detail, I hope it and the below
>> version I came up with before seeing yours helps.
>>
>> P.s.: The last paragraph in my commit message does not point to some
>> hidden edge case in the code behavior here, it's just that clang/gcc are
>> funny about exit() and die() control flow when combined with
>> -fsanitize=address and higher optimization levels.
>>
>> -- >8 --
>> Subject: [PATCH] merge: return, don't use exit()
>>
>> Change some of the builtin/merge.c code added in f241ff0d0a9 (prepare
>> the builtins for a libified merge_recursive(), 2016-07-26) to ferry up
>> an "early return" state, rather than having try_merge_strategy() call
>> exit() itself.
>>
>> This is a follow-up to dda31145d79 (Merge branch
>> 'ab/usage-die-message' into gc/branch-recurse-submodules-fix,
>> 2022-03-31).
>>
>> The only behavior change here is that we'll now properly catch other
>> issues on our way out, see e.g. [1] and the interaction with /dev/full
>> for an example.
>>
>> The immediate reason to do this change is because it's one of the
>> cases where clang and gcc's SANITIZE=leak behavior differs. Under
>> clang we don't detect that "t/t6415-merge-dir-to-symlink.sh" triggers
>> a leak, but gcc spots it.
>>
>> 1. https://lore.kernel.org/git/87im2n3gje.fsf@evledraar.gmail.com/
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  builtin/merge.c | 27 +++++++++++++++++++++------
>>  1 file changed, 21 insertions(+), 6 deletions(-)
>>
>> diff --git a/builtin/merge.c b/builtin/merge.c
>> index 23170f2d2a6..a8d5d04f622 100644
>> --- a/builtin/merge.c
>> +++ b/builtin/merge.c
>> @@ -709,10 +709,12 @@ static void write_tree_trivial(struct object_id *oid)
>>
>>  static int try_merge_strategy(const char *strategy, struct commit_list *common,
>>                               struct commit_list *remoteheads,
>> -                             struct commit *head)
>> +                             struct commit *head, int *early_exit)
>>  {
>>         const char *head_arg = "HEAD";
>>
>> +       *early_exit = 0;
>> +
>>         if (refresh_and_write_cache(REFRESH_QUIET, SKIP_IF_UNCHANGED, 0) < 0)
>>                 return error(_("Unable to write index."));
>>
>> @@ -754,8 +756,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
>>                 else
>>                         clean = merge_recursive(&o, head, remoteheads->item,
>>                                                 reversed, &result);
>> -               if (clean < 0)
>> -                       exit(128);
>> +               if (clean < 0) {
>> +                       *early_exit = 1;
>> +                       return 128;
>> +               }
>>                 if (write_locked_index(&the_index, &lock,
>>                                        COMMIT_LOCK | SKIP_IF_UNCHANGED))
>>                         die(_("unable to write %s"), get_index_file());
>> @@ -1665,6 +1669,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>>
>>         for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
>>                 int ret, cnt;
>> +               int early_exit;
>> +
>>                 if (i) {
>>                         printf(_("Rewinding the tree to pristine...\n"));
>>                         restore_state(&head_commit->object.oid, &stash);
>> @@ -1680,7 +1686,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>>
>>                 ret = try_merge_strategy(use_strategies[i]->name,
>>                                          common, remoteheads,
>> -                                        head_commit);
>> +                                        head_commit, &early_exit);
>> +               if (early_exit)
>> +                       goto done;
>> +
>>                 /*
>>                  * The backend exits with 1 when conflicts are
>>                  * left to be resolved, with 2 when it does not
>> @@ -1732,12 +1741,18 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>>         } else if (best_strategy == wt_strategy)
>>                 ; /* We already have its result in the working tree. */
>>         else {
>> +               int new_ret, early_exit;
>> +
>>                 printf(_("Rewinding the tree to pristine...\n"));
>>                 restore_state(&head_commit->object.oid, &stash);
>>                 printf(_("Using the %s strategy to prepare resolving by hand.\n"),
>>                         best_strategy);
>> -               try_merge_strategy(best_strategy, common, remoteheads,
>> -                                  head_commit);
>> +               new_ret = try_merge_strategy(best_strategy, common, remoteheads,
>> +                                            head_commit, &early_exit);
>> +               if (early_exit) {
>> +                       ret = new_ret;
>> +                       goto done;
>> +               }
>
> Incidentally, this is essentially dead code being added in this last
> hunk.  This final `else` block can only be triggered when
> best_strategy has been set, and best_strategy will only be set after a
> merge strategy works (possibly with conflicts, but was at least
> appropriate for the problem).  Anyway, by the point of this `else`
> block, we've run multiple strategies already, at least one worked, and
> the "best" one was not the last one tried.  So, at this point, we
> simply rewind the tree and rerun the known-working merge strategy.  (I
> guess you could say that maybe the merge strategy might not return the
> same result despite being given the same inputs, so theoretically
> early_exit could come back true and this hunk isn't quite dead.  Just
> mostly dead, I guess.)

The idea here was to leave the "is it really dead?" question aside, and
just move the code towards running the post-builtin code we run when we
don't call an early exit().


* Re: [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state
  2022-07-21 16:31       ` Junio C Hamano
@ 2023-03-02  7:17         ` Ben Humphreys
  2023-03-02 15:35           ` Elijah Newren
  0 siblings, 1 reply; 87+ messages in thread
From: Ben Humphreys @ 2023-03-02  7:17 UTC (permalink / raw)
  To: Junio C Hamano, newren; +Cc: git

Hi Junio / Elijah,

On Thu, Jul 21, 2022 at 09:31:44AM -0700, Junio C Hamano wrote:
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > diff --git a/builtin/merge.c b/builtin/merge.c
> > index f807bf335bd..11bb4bab0a1 100644
> > --- a/builtin/merge.c
> > +++ b/builtin/merge.c
> > @@ -1686,12 +1686,12 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
> >  	 * tree in the index -- this means that the index must be in
> >  	 * sync with the head commit.  The strategies are responsible
> >  	 * to ensure this.
> > +	 *
> > +	 * Stash away the local changes so that we can try more than one
> > +	 * and/or recover from merge strategies bailing while leaving the
> > +	 * index and working tree polluted.
> >  	 */
> 
> Makes sense.  We may want to special-case strategies that are known
> not to have the buggy "leave contaminated tree when bailing out"
> behaviour to avoid waste.  I expect that more than 99.99% of the
> time people are feeding a single other commit to ort or recursive,
> and if these are known to be safe, a lot will be saved by not saving
> "just in case".  But that can be left for later, after the series
> solidifies.

This may stretch your memory a bit since the above was many months ago,
but I'm wondering if you know of any effort since to build the above
described optimisations?

We've seen that when Git 2.38.0 (which introduced this change) is used
with Bitbucket Server, it results in a severe performance regression due
to a sharp increase in disk and CPU load. Our code that tests the
mergeability of a pull request is one such affected codepath.

If there aren't any existing efforts to build the optimisations you
mention above, I will have a shot at it.

> > -	if (use_strategies_nr == 1 ||
> > -	    /*
> > -	     * Stash away the local changes so that we can try more than one.
> > -	     */
> > -	    save_state(&stash))
> > +	if (save_state(&stash))
> >  		oidclr(&stash);
> >  
> >  	for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {

Best Regards,
Ben Humphreys

* Re: [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state
  2023-03-02  7:17         ` Ben Humphreys
@ 2023-03-02 15:35           ` Elijah Newren
  2023-03-02 16:19             ` Junio C Hamano
  2023-03-06 22:19             ` Ben Humphreys
  0 siblings, 2 replies; 87+ messages in thread
From: Elijah Newren @ 2023-03-02 15:35 UTC (permalink / raw)
  To: Ben Humphreys; +Cc: Junio C Hamano, git

Hi Ben,

On Wed, Mar 1, 2023 at 11:17 PM Ben Humphreys <behumphreys@atlassian.com> wrote:
>
> Hi Junio / Elijah,
>
> On Thu, Jul 21, 2022 at 09:31:44AM -0700, Junio C Hamano wrote:
> > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> > > diff --git a/builtin/merge.c b/builtin/merge.c
> > > index f807bf335bd..11bb4bab0a1 100644
> > > --- a/builtin/merge.c
> > > +++ b/builtin/merge.c
> > > @@ -1686,12 +1686,12 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
> > >      * tree in the index -- this means that the index must be in
> > >      * sync with the head commit.  The strategies are responsible
> > >      * to ensure this.
> > > +    *
> > > +    * Stash away the local changes so that we can try more than one
> > > +    * and/or recover from merge strategies bailing while leaving the
> > > +    * index and working tree polluted.
> > >      */
> >
> > Makes sense.  We may want to special-case strategies that are known
> > not to have the buggy "leave contaminated tree when bailing out"
> > behaviour to avoid waste.  I expect that more than 99.99% of the
> > time people are feeding a single other commit to ort or recursive,
> > and if these are known to be safe, a lot will be saved by not saving
> > "just in case".  But that can be left for later, after the series
> > solidifies.
>
> This may stretch your memory a bit since the above was many months ago,
> but I'm wondering if you know of any effort since to build the above
> described optimisations?
>
> We've seen that when Git 2.38.0 (which introduced this change) is used
> with Bitbucket Server, it results in a severe performance regression due
> to a sharp increase in disk and CPU load. Our code that tests the
> mergeability of a pull request is one such affected codepath.
>
> If there aren't any existing efforts to build the optimisations you
> mention above, I will have a shot at it.

I've got bad news for you and great news for you.

The bad news: there have not yet been any efforts to build these
optimizations mentioned above.

The great news: the fact that this affects you means you are using
non-bare clones in your mergeability checks.  With every merge you are
forced to first check out the appropriate branch and pay the penalty of
updating both the index and the working tree, both in that checkout and
during the merge (and perhaps in a hard reset afterwards), despite the
fact that a mergeability check really only needs a boolean: "does it
merge cleanly?".  Doing a full worktree-tied merge like this is really
expensive, and while the above Git changes may have made it even more
expensive for you, the real savings come from switching to a bare
clone and not writing any working tree files or the index.  That's
available via running `git merge-tree`; see the documentation for the
--write-tree option in particular.  GitHub switched over to it last
year and GitLab should be switching soon (or may have already
completed it; I haven't checked in a bit).

You are, of course, more than welcome to build the optimizations Junio
alludes to.  It'd help out various end users.  But for improving
server side operations, I think switching to `git merge-tree` would
provide you _much_ bigger benefits.


Hope that helps,
Elijah

* Re: [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state
  2023-03-02 15:35           ` Elijah Newren
@ 2023-03-02 16:19             ` Junio C Hamano
  2023-03-04 16:18               ` Rudy Rigot
  2023-03-06 22:19             ` Ben Humphreys
  1 sibling, 1 reply; 87+ messages in thread
From: Junio C Hamano @ 2023-03-02 16:19 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Ben Humphreys, git

Elijah Newren <newren@gmail.com> writes:

> The great news: the fact that this affects you means you are using
> non-bare clones in your mergeability checks, and being forced with
> every merge to first checkout the appropriate branch, and pay for the
> penalty of updating both the index and the working tree both in that
> checkout and during the merge (and perhaps in doing a hard reset
> afterwards) in your mergeability check, despite the fact that a
> mergeability check really only needs a boolean: "does it merge
> cleanly?".  Doing a full worktree-tied merge like this is really
> expensive, and while the above Git changes may have made it even more
> expensive for you, the real savings comes from switching to a bare
> clone and not writing any working tree files or the index.  That's
> available via running `git merge-tree`; see the documentation for the
> --write-tree option in particular.  GitHub switched over to it last
> year and GitLab should be switching soon (or may have already
> completed it; I haven't checked in a bit).

Nice.

* Re: [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state
  2023-03-02 16:19             ` Junio C Hamano
@ 2023-03-04 16:18               ` Rudy Rigot
  0 siblings, 0 replies; 87+ messages in thread
From: Rudy Rigot @ 2023-03-04 16:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Elijah Newren, Ben Humphreys, git

> I think switching to `git merge-tree` would provide you _much_ bigger benefits.

Supporting evidence: we have an automerge pipeline running
on our gigantic monolith at Salesforce to keep some shared
branches automatically up-to-date with their upstream. We first
built it with a version of Git older than 2.38, and we were doing
checkouts and running good old 'git merge' and 'git push'
commands. We didn't measure since it wasn't in production yet,
but from memory, each merge could take maybe 30s to a minute;
the checkouts could be really heavy sometimes. And of course,
after our initial fetch (which we still do), there was also an
unavoidable multi-minute initial checkout.

We updated our pipeline to run Git 2.39, and we switched to doing
no checkout and running 'git merge-tree', followed by 'git commit-tree',
and then 'git update-ref'. Now, each merge takes about 5 seconds,
including re-fetching the branches before starting in case they very
recently changed. It made a gigantic difference to us. Thanks a lot
to everyone who worked on this.

* Re: [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state
  2023-03-02 15:35           ` Elijah Newren
  2023-03-02 16:19             ` Junio C Hamano
@ 2023-03-06 22:19             ` Ben Humphreys
  1 sibling, 0 replies; 87+ messages in thread
From: Ben Humphreys @ 2023-03-06 22:19 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Thu, Mar 02, 2023 at 07:35:30AM -0800, Elijah Newren wrote:
> I've got bad news for you and great news for you.
> 
> The bad news: there have not yet been any efforts to build these
> optimizations mentioned above.
> 
> The great news: the fact that this affects you means you are using
> non-bare clones in your mergeability checks, and being forced with
> every merge to first checkout the appropriate branch, and pay for the
> penalty of updating both the index and the working tree both in that
> checkout and during the merge (and perhaps in doing a hard reset
> afterwards) in your mergeability check, despite the fact that a
> mergeability check really only needs a boolean: "does it merge
> cleanly?".  Doing a full worktree-tied merge like this is really
> expensive, and while the above Git changes may have made it even more
> expensive for you, the real savings comes from switching to a bare
> clone and not writing any working tree files or the index.  That's
> available via running `git merge-tree`; see the documentation for the
> --write-tree option in particular.  GitHub switched over to it last
> year and GitLab should be switching soon (or may have already
> completed it; I haven't checked in a bit).
> 
> You are, of course, more than welcome to build the optimizations Junio
> alludes to.  It'd help out various end users.  But for improving
> server side operations, I think switching to `git merge-tree` would
> provide you _much_ bigger benefits.

Many thanks for the detailed reply Elijah; indeed the good news
outweighs the bad news! I've started migrating to merge-tree and it
looks great. Once complete I might take a look at the other
optimizations anyway, as a fun project.

Thanks again!

Best Regards,
Ben Humphreys

end of thread, other threads:[~2023-03-06 22:19 UTC | newest]

Thread overview: 87+ messages
2022-05-19 16:26 [PATCH 0/2] Fix merge restore state Elijah Newren via GitGitGadget
2022-05-19 16:26 ` [PATCH 1/2] merge: remove unused variable Elijah Newren via GitGitGadget
2022-05-19 17:45   ` Junio C Hamano
2022-05-19 16:26 ` [PATCH 2/2] merge: make restore_state() do as its name says Elijah Newren via GitGitGadget
2022-05-19 17:44   ` Junio C Hamano
2022-05-19 18:32     ` Junio C Hamano
2022-06-12  6:58 ` [PATCH 0/2] Fix merge restore state Elijah Newren
2022-06-12  8:54   ` ZheNing Hu
2022-06-19  6:50 ` [PATCH v2 0/6] " Elijah Newren via GitGitGadget
2022-06-19  6:50   ` [PATCH v2 1/6] t6424: make sure a failed merge preserves local changes Junio C Hamano via GitGitGadget
2022-06-19  6:50   ` [PATCH v2 2/6] merge: remove unused variable Elijah Newren via GitGitGadget
2022-07-19 23:14     ` Junio C Hamano
2022-06-19  6:50   ` [PATCH v2 3/6] merge: fix save_state() to work when there are racy-dirty files Elijah Newren via GitGitGadget
2022-07-17 16:28     ` ZheNing Hu
2022-07-19 22:49       ` Junio C Hamano
2022-07-21  1:09         ` Elijah Newren
2022-07-19 22:43     ` Junio C Hamano
2022-06-19  6:50   ` [PATCH v2 4/6] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
2022-07-17 16:37     ` ZheNing Hu
2022-07-19 23:14     ` Junio C Hamano
2022-07-19 23:28       ` Junio C Hamano
2022-07-21  1:37         ` Elijah Newren
2022-06-19  6:50   ` [PATCH v2 5/6] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
2022-07-17 16:41     ` ZheNing Hu
2022-07-19 22:57     ` Junio C Hamano
2022-07-21  2:03       ` Elijah Newren
2022-06-19  6:50   ` [PATCH v2 6/6] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
2022-07-17 16:44     ` ZheNing Hu
2022-07-19 23:13     ` Junio C Hamano
2022-07-20  0:09       ` Eric Sunshine
2022-07-21  2:03         ` Elijah Newren
2022-07-21  3:27       ` Elijah Newren
2022-07-21  8:16   ` [PATCH v3 0/7] Fix merge restore state Elijah Newren via GitGitGadget
2022-07-21  8:16     ` [PATCH v3 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
2022-07-21 15:47       ` Junio C Hamano
2022-07-21 19:51         ` Elijah Newren
2022-07-21 20:05           ` Junio C Hamano
2022-07-21 21:14             ` Elijah Newren
2022-07-21  8:16     ` [PATCH v3 2/7] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
2022-07-21  8:16     ` [PATCH v3 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
2022-07-21 16:09       ` Junio C Hamano
2022-07-25 10:38       ` Ævar Arnfjörð Bjarmason
2022-07-26  1:31         ` Elijah Newren
2022-07-26  6:54           ` Ævar Arnfjörð Bjarmason
2022-07-21  8:16     ` [PATCH v3 4/7] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
2022-07-21  8:16     ` [PATCH v3 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
2022-07-21 16:16       ` Junio C Hamano
2022-07-21 16:24       ` Junio C Hamano
2022-07-21 19:52         ` Elijah Newren
2022-07-21  8:16     ` [PATCH v3 6/7] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
2022-07-21 16:31       ` Junio C Hamano
2023-03-02  7:17         ` Ben Humphreys
2023-03-02 15:35           ` Elijah Newren
2023-03-02 16:19             ` Junio C Hamano
2023-03-04 16:18               ` Rudy Rigot
2023-03-06 22:19             ` Ben Humphreys
2022-07-21  8:16     ` [PATCH v3 7/7] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
2022-07-21 16:34       ` Junio C Hamano
2022-07-22  5:15     ` [PATCH v4 0/7] Fix merge restore state Elijah Newren via GitGitGadget
2022-07-22  5:15       ` [PATCH v4 1/7] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
2022-07-22  5:15       ` [PATCH v4 2/7] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
2022-07-22 10:27         ` Ævar Arnfjörð Bjarmason
2022-07-23  0:28           ` Elijah Newren
2022-07-23  5:44             ` Ævar Arnfjörð Bjarmason
2022-07-26  1:58               ` Elijah Newren
2022-07-26  6:35                 ` Ævar Arnfjörð Bjarmason
2022-07-22  5:15       ` [PATCH v4 3/7] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
2022-07-22 10:47         ` Ævar Arnfjörð Bjarmason
2022-07-23  0:36           ` Elijah Newren
2022-07-22  5:15       ` [PATCH v4 4/7] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
2022-07-22  5:15       ` [PATCH v4 5/7] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
2022-07-22 10:53         ` Ævar Arnfjörð Bjarmason
2022-07-23  1:56           ` Elijah Newren
2022-07-22  5:15       ` [PATCH v4 6/7] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
2022-07-22  5:15       ` [PATCH v4 7/7] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
2022-07-23  1:53       ` [PATCH v5 0/8] Fix merge restore state Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 1/8] merge-ort-wrappers: make printed message match the one from recursive Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 2/8] merge-resolve: abort if index does not match HEAD Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 3/8] merge: abort if index does not match HEAD for trivial merges Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 4/8] merge: do not abort early if one strategy fails to handle the merge Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 5/8] merge: fix save_state() to work when there are stat-dirty files Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 6/8] merge: make restore_state() restore staged state too Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 7/8] merge: ensure we can actually restore pre-merge state Elijah Newren via GitGitGadget
2022-07-23  1:53         ` [PATCH v5 8/8] merge: do not exit restore_state() prematurely Elijah Newren via GitGitGadget
2022-07-25 19:03         ` [PATCH v5 0/8] Fix merge restore state Junio C Hamano
2022-07-26  1:59           ` Elijah Newren
2022-07-26  4:03         ` ZheNing Hu
