git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [Git] recursive merge on 'master' severely broken?
@ 2018-04-11  7:19 Junio C Hamano
  2018-04-11  9:06 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Junio C Hamano @ 2018-04-11  7:19 UTC (permalink / raw)
  To: git, Elijah Newren

It appears that a topic recently graduated to 'master' introduces a
severe regression to "git merge" when it merges a side branch that
renames paths while the trunk has further updates.

The symptom can easily be seen by trying to recreate the merge I
made at the tip of 'pu' 29dea678 ("Merge branch
'sb/filenames-with-dashes' into pu", 2018-04-11) that I'll be.
pushing out shortly.  The side branch renames a file exec_cmd.h to
exec-cmd.h (an underscore changed to a dash) without changing any
contents, while the branch being merged to has made some changes to
the contents while keeping the original pathname.

A clean automerged result should leave the identical contents from
HEAD:exec_cmd.h in :exec-cmd.h in the index, which is what happens
when using Git v2.17.0 proper, but with today's master', there are
content changes that cannot be explained--the merge is simply broken
and worse yet, the command pretends that everything went well and
merged cleanly in that path.  Overly clever tool that behaves in a
buggy and unexplainable way is bad enough, doing so silently is
unexcusable.

I suspect that the culprit is the merge e4bb62fa ("Merge branch
'en/rename-directory-detection'", 2018-04-10), and I am planning to
revert it out of 'master' as soon as possible, but it is already
buried deep in other topics, so I may not be able to get to it until
later this week.  Sorry about that.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Git] recursive merge on 'master' severely broken?
  2018-04-11  7:19 [Git] recursive merge on 'master' severely broken? Junio C Hamano
@ 2018-04-11  9:06 ` Junio C Hamano
  2018-04-11  9:57   ` Junio C Hamano
  2018-04-11 15:51 ` Elijah Newren
  2018-04-13 19:56 ` [RFC PATCH v9 0/30] Add directory rename detection to git Elijah Newren
  2 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2018-04-11  9:06 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Junio C Hamano <gitster@pobox.com> writes:

> It appears that a topic recently graduated to 'master' introduces a
> severe regression to "git merge" when it merges a side branch that
> renames paths while the trunk has further updates.
> ...
> I suspect that the culprit is the merge e4bb62fa ("Merge branch
> 'en/rename-directory-detection'", 2018-04-10), and I am planning to
> revert it out of 'master' as soon as possible, but it is already
> buried deep in other topics, so I may not be able to get to it until
> later this week.  Sorry about that.

An interim report.

It indeed is that merge with the topic.  I rebuilt an equivalent of
'master' without that topic on top of v2.17.0 and tried the same
merge of sb/filenames-with-dashes fd055186 ("replace_object.c:
rename to use dash in file name", 2018-04-10) topic into pu^
35f7512e ("Merge branch 'fg/completion-external' into pu",
2018-04-11), and that version of Git does not exhibit the problem.

I do not yet know which one of the ~30 individual commmits is the
real culprit, but in the meantime, I'll revert the merge out of
'master' and push the result out, so that real Git-using projects
won't get silent mismerges.  Those who run in-house version of Git
that is based on anything newer than the released version may want
to do the same.

Sorry for the crappy 'master' X-<.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Git] recursive merge on 'master' severely broken?
  2018-04-11  9:06 ` Junio C Hamano
@ 2018-04-11  9:57   ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2018-04-11  9:57 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Junio C Hamano <gitster@pobox.com> writes:

> Another interim report.
>
> I do not yet know which one of the ~30 individual commmits is the
> real culprit, ...

It bisects down to c5b761fb ("merge-recursive: ensure we write
updates for directory-renamed file", 2018-02-14); given that a part
of the symptom I saw was a few messages like these:

        ...
        CONFLICT (content): Merge conflict in upload-pack.c
        error: addinfo_cache failed for path 'sha1-name.c'
        error: addinfo_cache failed for path 'sha1-file.c'
        Auto-merging sequencer.c
        ...

and the patch does change code around the callsites to
add_cacheinfo(), it does look like a plausible culprit.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Git] recursive merge on 'master' severely broken?
  2018-04-11  7:19 [Git] recursive merge on 'master' severely broken? Junio C Hamano
  2018-04-11  9:06 ` Junio C Hamano
@ 2018-04-11 15:51 ` Elijah Newren
  2018-04-12  1:37   ` Junio C Hamano
  2018-04-13 19:56 ` [RFC PATCH v9 0/30] Add directory rename detection to git Elijah Newren
  2 siblings, 1 reply; 13+ messages in thread
From: Elijah Newren @ 2018-04-11 15:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Wed, Apr 11, 2018 at 12:19 AM, Junio C Hamano <gitster@pobox.com> wrote:
> It appears that a topic recently graduated to 'master' introduces a
> severe regression to "git merge" when it merges a side branch that
> renames paths while the trunk has further updates.
>
> The symptom can easily be seen by trying to recreate the merge I
> made at the tip of 'pu' 29dea678 ("Merge branch
> 'sb/filenames-with-dashes' into pu", 2018-04-11) that I'll be.
> pushing out shortly.  The side branch renames a file exec_cmd.h to
> exec-cmd.h (an underscore changed to a dash) without changing any
> contents, while the branch being merged to has made some changes to
> the contents while keeping the original pathname.
>
> A clean automerged result should leave the identical contents from
> HEAD:exec_cmd.h in :exec-cmd.h in the index, which is what happens
> when using Git v2.17.0 proper, but with today's master', there are
> content changes that cannot be explained--the merge is simply broken
> and worse yet, the command pretends that everything went well and
> merged cleanly in that path.  Overly clever tool that behaves in a
> buggy and unexplainable way is bad enough, doing so silently is
> unexcusable.

I agree, that is _really_ bad.  My sincerest apologies.  I'll dig into it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Git] recursive merge on 'master' severely broken?
  2018-04-11 15:51 ` Elijah Newren
@ 2018-04-12  1:37   ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2018-04-12  1:37 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

Elijah Newren <newren@gmail.com> writes:

> On Wed, Apr 11, 2018 at 12:19 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> It appears that a topic recently graduated to 'master' introduces a
>> severe regression to "git merge" when it merges a side branch that
>> renames paths while the trunk has further updates.
>>
>> The symptom can easily be seen by trying to recreate the merge I
>> made at the tip of 'pu' 29dea678 ("Merge branch
>> 'sb/filenames-with-dashes' into pu", 2018-04-11) that I'll be.
>> pushing out shortly.  The side branch renames a file exec_cmd.h to
>> exec-cmd.h (an underscore changed to a dash) without changing any
>> contents, while the branch being merged to has made some changes to
>> the contents while keeping the original pathname.
> ...
> I agree, that is _really_ bad.  My sincerest apologies.  I'll dig into it.

Thanks.  It is not unusual for a moderately large set of changes to
have glitches that need time to be discovered.  I was hoping that
placing it in 'next' and keeping it there for several weeks would
have been sufficient for people who do use stuff from there in real
life but apparently that wasn't the case.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v9 0/30] Add directory rename detection to git
  2018-04-11  7:19 [Git] recursive merge on 'master' severely broken? Junio C Hamano
  2018-04-11  9:06 ` Junio C Hamano
  2018-04-11 15:51 ` Elijah Newren
@ 2018-04-13 19:56 ` Elijah Newren
  2018-04-13 19:56   ` [PATCH v9 29.25/30] merge-recursive: improve output precision around skipping updates Elijah Newren
                     ` (3 more replies)
  2 siblings, 4 replies; 13+ messages in thread
From: Elijah Newren @ 2018-04-13 19:56 UTC (permalink / raw)
  To: git; +Cc: torvalds, gitster, sbeller, Elijah Newren

This series builds on commit febb3a86098f ("merge-recursive: avoid
spurious rename/rename conflict from dir renames", 2018-02-14), also known
as patch 29/30 of en/rename-directory-detection.  That patch series has
been reverted from master[1], due to a bug with patch 30/30, so does not
apply to current master.  But I didn't want to resend the whole series for
an RFC.

These four patches replace that patch and:
  - fixes Linus' rewriting-of-unchanged-files bug[1]
  - fixes the problems that broke Junio's merges after my series[2]
  - fixes the problem the original patch 30/30 was intended to solve
  - adds lots of testcases to make sure this doesn't regress.

Linus' alternative of stupid-brute-force[3], would also work here,
though I feel the first three patches are useful even if we take some
form of his patch.  Long term, the most correct solution would involve
a rewrite to merge-recursive that would simplify this code[4], though
I think the changes in this series brings this part of the code closer
to that end state.

The big questions here are:
  1) The last time my rename-directory-detection series was merged into
     master it bit Junio badly.  I'm planning to redo all merges of
     git.git and linux.git and comparing v2.17.0 to what I get after
     my changes.  What else should I test?
  2) What do folks thing about stupid-brute-force vs. the explanation in
     my final patch?


[1] https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fXJRkvU1m_RHBG54NOoaZPA@mail.gmail.com/
[2] https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
[3] https://public-inbox.org/git/CA+55aFwi9pTAJT_qtv=vHLgu=B1fdXBoD96i8Y5xnbS=zrfSzg@mail.gmail.com/
[4] https://public-inbox.org/git/xmqqd147kpdm.fsf@gitster.mtv.corp.google.com/


Elijah Newren (4):
  merge-recursive: improve output precision around skipping updates
  t6046: testcases checking whether updates can be skipped in a merge
  merge-recursive: Fix was_tracked() to quit lying with some renamed
    paths
  merge-recursive: fix check for skipability of working tree updates

 merge-recursive.c                      | 109 +++++---
 merge-recursive.h                      |   1 +
 t/t6022-merge-rename.sh                |   2 +-
 t/t6043-merge-rename-directories.sh    |   2 +-
 t/t6046-merge-skip-unneeded-updates.sh | 497 +++++++++++++++++++++++++++++++++
 5 files changed, 575 insertions(+), 36 deletions(-)
 create mode 100755 t/t6046-merge-skip-unneeded-updates.sh

-- 
2.16.0.35.g6dd7ede834


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v9 29.25/30] merge-recursive: improve output precision around skipping updates
  2018-04-13 19:56 ` [RFC PATCH v9 0/30] Add directory rename detection to git Elijah Newren
@ 2018-04-13 19:56   ` Elijah Newren
  2018-04-13 19:56   ` [PATCH v9 29.50/30] t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2018-04-13 19:56 UTC (permalink / raw)
  To: git; +Cc: torvalds, gitster, sbeller, Elijah Newren

When a merge results in contents that already existed in HEAD (because the
changes on a side branch were a subset of what was changed by HEAD), and
those contents were already at the right path, the working directory
updates can be skipped.  However, the relevant code would sometimes claim
the working directory update could be skipped despite the fact that the
contents were at the wrong path.  Make the output reflect the situation
more precisely, including an additional message that tests can check for
to make sure we are getting correct behavior.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c       | 11 ++++++-----
 t/t6022-merge-rename.sh |  2 +-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 4a1ecdea03..05250939c7 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2761,21 +2761,22 @@ static int merge_content(struct merge_options *o,
 				       o->branch2, path2, &mfi))
 		return -1;
 
-	if (mfi.clean && !df_conflict_remains &&
-	    oid_eq(&mfi.oid, a_oid) && mfi.mode == a_mode) {
+	if (mfi.clean && oid_eq(&mfi.oid, a_oid) && mfi.mode == a_mode) {
 		int path_renamed_outside_HEAD;
-		output(o, 3, _("Skipped %s (merged same as existing)"), path);
 		/*
 		 * The content merge resulted in the same file contents we
 		 * already had.  We can return early if those file contents
 		 * are recorded at the correct path (which may not be true
-		 * if the merge involves a rename).
+		 * if the merge involves a rename or there's a D/F conflict).
 		 */
 		path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
-		if (!path_renamed_outside_HEAD) {
+		if (!df_conflict_remains && !path_renamed_outside_HEAD) {
+			output(o, 3, _("Skipped %s (merged same as existing)"), path);
 			add_cacheinfo(o, mfi.mode, &mfi.oid, path,
 				      0, (!o->call_depth), 0);
 			return mfi.clean;
+		} else {
+			output(o, 3, _("Had correct contents for %s, but not at right path"), path);
 		}
 	} else
 		output(o, 2, _("Auto-merging %s"), path);
diff --git a/t/t6022-merge-rename.sh b/t/t6022-merge-rename.sh
index 05ebba7afa..574088bfaf 100755
--- a/t/t6022-merge-rename.sh
+++ b/t/t6022-merge-rename.sh
@@ -245,7 +245,7 @@ test_expect_success 'merge of identical changes in a renamed file' '
 	GIT_MERGE_VERBOSITY=3 git merge change | test_i18ngrep "^Skipped B" &&
 	git reset --hard HEAD^ &&
 	git checkout change &&
-	GIT_MERGE_VERBOSITY=3 git merge change+rename | test_i18ngrep "^Skipped B"
+	GIT_MERGE_VERBOSITY=3 git merge change+rename | test_i18ngrep "^Had correct contents for B, but not at right path"
 '
 
 test_expect_success 'setup for rename + d/f conflicts' '
-- 
2.16.0.35.g6dd7ede834


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v9 29.50/30] t6046: testcases checking whether updates can be skipped in a merge
  2018-04-13 19:56 ` [RFC PATCH v9 0/30] Add directory rename detection to git Elijah Newren
  2018-04-13 19:56   ` [PATCH v9 29.25/30] merge-recursive: improve output precision around skipping updates Elijah Newren
@ 2018-04-13 19:56   ` Elijah Newren
  2018-04-13 21:03     ` Stefan Beller
  2018-04-13 19:56   ` [PATCH v9 29.75/30] merge-recursive: Fix was_tracked() to quit lying with some renamed paths Elijah Newren
  2018-04-13 19:56   ` [PATCH v9 30/30] merge-recursive: fix check for skipability of working tree updates Elijah Newren
  3 siblings, 1 reply; 13+ messages in thread
From: Elijah Newren @ 2018-04-13 19:56 UTC (permalink / raw)
  To: git; +Cc: torvalds, gitster, sbeller, Elijah Newren

Add several tests checking whether updates can be skipped in a merge.
Also add several similar testcases for where updates cannot be skipped in
a merge to make sure that we skip if and only if we should.

In particular:

  * Testcase 1a (particularly 1a-check-L) would have pointed out the
    problem Linus has been dealing with for year with his merges[1].

  * Testcase 2a (particularly 2a-check-L) would have pointed out the
    problem with my directory-rename-series before it broke master[2].

  * Testcases 3[ab] (particularly 3a-check-L) provide a simpler testcase
    than 12b of t6043 making that one easier to understand.

  * There are several complementary testcases to make sure we're not just
    fixing those particular issues while regressing in the opposite
    direction.

[1] https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fXJRkvU1m_RHBG54NOoaZPA@mail.gmail.com/
[2] https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6046-merge-skip-unneeded-updates.sh | 497 +++++++++++++++++++++++++++++++++
 1 file changed, 497 insertions(+)
 create mode 100755 t/t6046-merge-skip-unneeded-updates.sh

diff --git a/t/t6046-merge-skip-unneeded-updates.sh b/t/t6046-merge-skip-unneeded-updates.sh
new file mode 100755
index 0000000000..89c3a953ae
--- /dev/null
+++ b/t/t6046-merge-skip-unneeded-updates.sh
@@ -0,0 +1,497 @@
+#!/bin/sh
+
+test_description="merge cases"
+
+# The setup for all of them, pictorially, is:
+#
+#      A
+#      o
+#     / \
+#  O o   ?
+#     \ /
+#      o
+#      B
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits O, A, and B contain.
+#
+# Notation:
+#    z/{b,c}   means  files z/b and z/c both exist
+#    x/d_1     means  file x/d exists with content d1.  (Purpose of the
+#                     underscore notation is to differentiate different
+#                     files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###########################################################################
+# SECTION 1: Cases involving no renames (one side has subset of changes of
+#            the other side)
+###########################################################################
+
+# Testcase 1a, Changes on A, subset of changes on B
+#   Commit O: b_1
+#   Commit A: b_2
+#   Commit B: b_3
+#   Expected: b_2
+
+test_expect_success '1a-setup: Modify(A)/Modify(B), change on B subset of A' '
+	test_create_repo 1a &&
+	(
+		cd 1a &&
+
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b
+		git add b &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 10.5 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1a-check-L: Modify(A)/Modify(B), change on B subset of A' '
+	test_when_finished "git -C 1a reset --hard" &&
+	(
+		cd 1a &&
+
+		git checkout A^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "Skipped" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:b &&
+		git rev-parse >expect A:b &&
+		test_cmp expect actual &&
+
+		git hash-object b   >actual &&
+		git rev-parse   A:b >expect &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success '1a-check-R: Modify(A)/Modify(B), change on B subset of A' '
+	test_when_finished "git -C 1a reset --hard" &&
+	(
+		cd 1a &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep "Auto-merging" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:b &&
+		git rev-parse >expect A:b &&
+		test_cmp expect actual &&
+
+		git hash-object b   >actual &&
+		git rev-parse   A:b >expect &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# SECTION 2: Cases involving basic renames
+###########################################################################
+
+# Testcase 2a, Changes on A, rename on B
+#   Commit O: b_1
+#   Commit A: b_2
+#   Commit B: c_1
+#   Expected: c_2
+
+test_expect_success '2a-setup: Modify(A)/rename(B)' '
+	test_create_repo 2a &&
+	(
+		cd 2a &&
+
+		test_seq 1 10 >b
+		git add b &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_seq 1 11 > b &&
+		git add b &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv b c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '2a-check-L: Modify/rename, merge into modify side' '
+	test_when_finished "git -C 2a reset --hard" &&
+	(
+		cd 2a &&
+
+		git checkout A^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "Had correct contents" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:c &&
+		git rev-parse >expect A:b &&
+		test_cmp expect actual &&
+
+		git hash-object c   >actual &&
+		git rev-parse   A:b >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:b &&
+		test_path_is_missing b
+	)
+'
+
+test_expect_success '2a-check-R: Modify/rename, merge into rename side' '
+	test_when_finished "git -C 2a reset --hard" &&
+	(
+		cd 2a &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep "Auto-merging" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:c &&
+		git rev-parse >expect A:b &&
+		test_cmp expect actual &&
+
+		git hash-object c   >actual &&
+		git rev-parse   A:b >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:b &&
+		test_path_is_missing b
+	)
+'
+
+# Testcase 2b, Changed and renamed on A, subset of changes on B
+#   Commit O: b_1
+#   Commit A: c_2
+#   Commit B: b_3
+#   Expected: c_2
+
+test_expect_success '2b-setup: Modify(A)/Modify(B), change on B subset of A' '
+	test_create_repo 2b &&
+	(
+		cd 2b &&
+
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b
+		git add b &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 10.5 >b &&
+		git add b &&
+		git mv b c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '2b-check-L: Modify(A)/Modify(B), change on B subset of A' '
+	test_when_finished "git -C 2b reset --hard" &&
+	(
+		cd 2b &&
+
+		git checkout A^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "Skipped" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:c &&
+		git rev-parse >expect A:c &&
+		test_cmp expect actual &&
+
+		git hash-object c   >actual &&
+		git rev-parse   A:c >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:b &&
+		test_path_is_missing b
+	)
+'
+
+test_expect_success '2b-check-R: Modify(A)/Modify(B), change on B subset of A' '
+	test_when_finished "git -C 2b reset --hard" &&
+	(
+		cd 2b &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep "Auto-merging" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:c &&
+		git rev-parse >expect A:c &&
+		test_cmp expect actual &&
+
+		git hash-object c   >actual &&
+		git rev-parse   A:c >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:b &&
+		test_path_is_missing b
+	)
+'
+
+###########################################################################
+# SECTION 3: Cases involving directory renames
+#
+# NOTE:
+#   Directory renames only apply when one side renames a directory, and the
+#   other side adds or renames a path into that directory.  Applying the
+#   directory rename to that new path creates a new pathname that didn't
+#   exist on either side of history.  Thus, it is impossible for the
+#   merge contents to already be at the right path, so all of these checks
+#   exist just to make sure that updates are not skipped.
+###########################################################################
+
+# Testcase 3a, Change + rename into dir foo on A, dir rename foo->bar on B
+#   Commit O: bq_1, foo/whatever
+#   Commit A: foo/{bq_2, whatever}
+#   Commit B: bq_1, bar/whatever
+#   Expected: bar/{bq_2, whatever}
+
+test_expect_success '3a-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_create_repo 3a &&
+	(
+		cd 3a &&
+
+		mkdir foo &&
+		test_seq 1 10 >bq &&
+		test_write_lines a b c d e f g h i j k >foo/whatever &&
+		git add bq foo/whatever &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_seq 1 11 > bq &&
+		git add bq &&
+		git mv bq foo/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv foo/ bar/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '3a-check-L: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_when_finished "git -C 3a reset --hard" &&
+	(
+		cd 3a &&
+
+		git checkout A^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "Had correct contents for bar/bq" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 2 index_files &&
+
+		git rev-parse >actual HEAD:bar/bq HEAD:bar/whatever &&
+		git rev-parse >expect A:foo/bq    A:foo/whatever &&
+		test_cmp expect actual &&
+
+		git hash-object bar/bq   bar/whatever   >actual &&
+		git rev-parse   A:foo/bq A:foo/whatever >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:bq HEAD:foo/bq &&
+		test_path_is_missing bq foo/bq foo/whatever
+	)
+'
+
+test_expect_success '3a-check-R: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_when_finished "git -C 3a reset --hard" &&
+	(
+		cd 3a &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep "Auto-merging bar/bq" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 2 index_files &&
+
+		git rev-parse >actual HEAD:bar/bq HEAD:bar/whatever &&
+		git rev-parse >expect A:foo/bq    A:foo/whatever &&
+		test_cmp expect actual &&
+
+		git hash-object bar/bq   bar/whatever   >actual &&
+		git rev-parse   A:foo/bq A:foo/whatever >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:bq HEAD:foo/bq &&
+		test_path_is_missing bq foo/bq foo/whatever
+	)
+'
+
+# Testcase 3b, rename into dir foo on A, dir rename foo->bar + change on B
+#   Commit O: bq_1, foo/whatever
+#   Commit A: foo/{bq_1, whatever}
+#   Commit B: bq_2, bar/whatever
+#   Expected: bar/{bq_2, whatever}
+
+test_expect_success '3b-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_create_repo 3b &&
+	(
+		cd 3b &&
+
+		mkdir foo &&
+		test_seq 1 10 >bq &&
+		test_write_lines a b c d e f g h i j k >foo/whatever &&
+		git add bq foo/whatever &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv bq foo/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_seq 1 11 > bq &&
+		git add bq &&
+		git mv foo/ bar/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '3b-check-L: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_when_finished "git -C 3b reset --hard" &&
+	(
+		cd 3b &&
+
+		git checkout A^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "Auto-merging bar/bq" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 2 index_files &&
+
+		git rev-parse >actual HEAD:bar/bq HEAD:bar/whatever &&
+		git rev-parse >expect B:bq        A:foo/whatever &&
+		test_cmp expect actual &&
+
+		git hash-object bar/bq bar/whatever   >actual &&
+		git rev-parse   B:bq   A:foo/whatever >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:bq HEAD:foo/bq &&
+		test_path_is_missing bq foo/bq foo/whatever
+	)
+'
+
+test_expect_success '3b-check-R: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_when_finished "git -C 3b reset --hard" &&
+	(
+		cd 3b &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep "Had correct contents for bar/bq" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 2 index_files &&
+
+		git rev-parse >actual HEAD:bar/bq HEAD:bar/whatever &&
+		git rev-parse >expect B:bq        A:foo/whatever &&
+		test_cmp expect actual &&
+
+		git hash-object bar/bq bar/whatever   >actual &&
+		git rev-parse   B:bq   A:foo/whatever >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:bq HEAD:foo/bq &&
+		test_path_is_missing bq foo/bq foo/whatever
+	)
+'
+
+test_done
-- 
2.16.0.35.g6dd7ede834


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v9 29.75/30] merge-recursive: Fix was_tracked() to quit lying with some renamed paths
  2018-04-13 19:56 ` [RFC PATCH v9 0/30] Add directory rename detection to git Elijah Newren
  2018-04-13 19:56   ` [PATCH v9 29.25/30] merge-recursive: improve output precision around skipping updates Elijah Newren
  2018-04-13 19:56   ` [PATCH v9 29.50/30] t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
@ 2018-04-13 19:56   ` Elijah Newren
  2018-04-16  0:37     ` Junio C Hamano
  2018-04-13 19:56   ` [PATCH v9 30/30] merge-recursive: fix check for skipability of working tree updates Elijah Newren
  3 siblings, 1 reply; 13+ messages in thread
From: Elijah Newren @ 2018-04-13 19:56 UTC (permalink / raw)
  To: git; +Cc: torvalds, gitster, sbeller, Elijah Newren

In commit aacb82de3ff8 ("merge-recursive: Split was_tracked() out of
would_lose_untracked()", 2011-08-11), was_tracked() was split out of
would_lose_untracked() with the intent to provide a function that could
answer whether a path was tracked in the index before the merge.  Sadly,
it instead returned whether the path was in the working tree due to having
been tracked in the index before the merge OR having been written there by
unpack_trees().  The distinction is important when renames are involved,
e.g. for a merge where:

   HEAD:  modifies path b
   other: renames b->c

In this case, c was not tracked in the index before the merge, but would
have been added to the index at stage 0 and written to the working tree by
unpack_trees().  would_lose_untracked() is more interested in the
in-working-copy-for-either-reason behavior, while all other uses of
was_tracked() want just was-it-tracked-in-index-before-merge behavior.

Unsplit would_lose_untracked() and write a new was_tracked() function
which answers whether a path was tracked in the index before the merge
started.

This will also make was_dirty() return better results, as it will be based
off the original index rather than an index that possibly only copied over
some of the stat information.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 82 +++++++++++++++++++++++++++++++++++++++----------------
 merge-recursive.h |  1 +
 2 files changed, 59 insertions(+), 24 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 05250939c7..adc800f188 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -344,6 +344,7 @@ static int git_merge_trees(struct merge_options *o,
 {
 	int rc;
 	struct tree_desc t[3];
+	struct index_state tmp_index = { NULL };
 
 	memset(&o->unpack_opts, 0, sizeof(o->unpack_opts));
 	if (o->call_depth)
@@ -354,7 +355,7 @@ static int git_merge_trees(struct merge_options *o,
 	o->unpack_opts.head_idx = 2;
 	o->unpack_opts.fn = threeway_merge;
 	o->unpack_opts.src_index = &the_index;
-	o->unpack_opts.dst_index = &the_index;
+	o->unpack_opts.dst_index = &tmp_index;
 	setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
 
 	init_tree_desc_from_tree(t+0, common);
@@ -362,13 +363,17 @@ static int git_merge_trees(struct merge_options *o,
 	init_tree_desc_from_tree(t+2, merge);
 
 	rc = unpack_trees(3, t, &o->unpack_opts);
+	cache_tree_free(&active_cache_tree);
+
+	o->orig_index = the_index;
+	the_index = tmp_index;
+
 	/*
-	 * unpack_trees NULLifies src_index, but it's used in verify_uptodate,
-	 * so set to the new index which will usually have modification
-	 * timestamp info copied over.
+	 * src_index is used in verify_uptodate, but was NULLified in
+	 * unpack_trees, so we need to set it back to the original index.
 	 */
-	o->unpack_opts.src_index = &the_index;
-	cache_tree_free(&active_cache_tree);
+	o->unpack_opts.src_index = &o->orig_index;
+
 	return rc;
 }
 
@@ -773,31 +778,51 @@ static int dir_in_way(const char *path, int check_working_copy, int empty_ok)
 		!(empty_ok && is_empty_dir(path));
 }
 
-static int was_tracked(const char *path)
+/*
+ * Returns whether path was tracked in the index before the merge started
+ */
+static int was_tracked(struct merge_options *o, const char *path)
 {
-	int pos = cache_name_pos(path, strlen(path));
+	int pos = index_name_pos(&o->orig_index, path, strlen(path));
 
 	if (0 <= pos)
-		/* we have been tracking this path */
+		/* we were tracking this path before the merge */
 		return 1;
 
-	/*
-	 * Look for an unmerged entry for the path,
-	 * specifically stage #2, which would indicate
-	 * that "our" side before the merge started
-	 * had the path tracked (and resulted in a conflict).
-	 */
-	for (pos = -1 - pos;
-	     pos < active_nr && !strcmp(path, active_cache[pos]->name);
-	     pos++)
-		if (ce_stage(active_cache[pos]) == 2)
-			return 1;
 	return 0;
 }
 
 static int would_lose_untracked(const char *path)
 {
-	return !was_tracked(path) && file_exists(path);
+	/*
+	 * This may look like it can be simplified to:
+	 *   return !was_tracked(o, path) && file_exists(path)
+	 * but it can't.  This function needs to know whether path was
+	 * in the working tree due to EITHER having been tracked in the
+	 * index before the merge OR having been put into the working copy
+	 * and index by unpack_trees().  Due to that either-or requirement,
+	 * we check the current index instead of the original one.
+	 */
+	int pos = cache_name_pos(path, strlen(path));
+
+	if (pos < 0)
+		pos = -1 - pos;
+	while (pos < active_nr &&
+	       !strcmp(path, active_cache[pos]->name)) {
+		/*
+		 * If stage #0, it is definitely tracked.
+		 * If it has stage #2 then it was tracked
+		 * before this merge started.  All other
+		 * cases the path was not tracked.
+		 */
+		switch (ce_stage(active_cache[pos])) {
+		case 0:
+		case 2:
+			return 0;
+		}
+		pos++;
+	}
+	return file_exists(path);
 }
 
 static int was_dirty(struct merge_options *o, const char *path)
@@ -805,7 +830,7 @@ static int was_dirty(struct merge_options *o, const char *path)
 	struct cache_entry *ce;
 	int dirty = 1;
 
-	if (o->call_depth || !was_tracked(path))
+	if (o->call_depth || !was_tracked(o, path))
 		return !dirty;
 
 	ce = cache_file_exists(path, strlen(path), ignore_case);
@@ -2407,7 +2432,7 @@ static int process_renames(struct merge_options *o,
 			 * add-source case).
 			 */
 			remove_file(o, 1, ren1_src,
-				    renamed_stage == 2 || !was_tracked(ren1_src));
+				    renamed_stage == 2 || !was_tracked(o, ren1_src));
 
 			oidcpy(&src_other.oid,
 			       &ren1->src_entry->stages[other_stage].oid);
@@ -2800,7 +2825,7 @@ static int merge_content(struct merge_options *o,
 				if (update_stages(o, path, &one, &a, &b))
 					return -1;
 			} else {
-				int file_from_stage2 = was_tracked(path);
+				int file_from_stage2 = was_tracked(o, path);
 				struct diff_filespec merged;
 				oidcpy(&merged.oid, &mfi.oid);
 				merged.mode = mfi.mode;
@@ -3075,6 +3100,15 @@ int merge_trees(struct merge_options *o,
 	else
 		clean = 1;
 
+	/* Free the extra index left from git_merge_trees() */
+	/*
+	 * FIXME: Need to also data allocated by setup_unpack_trees_porcelain()
+	 * tucked away in o->unpack_opts.msgs, but the problem is that only
+	 * half of it refers to dynamically allocated data, while the other
+	 * half points at static strings.
+	 */
+	discard_index(&o->orig_index);
+
 	if (o->call_depth && !(*result = write_tree_from_memory(o)))
 		return -1;
 
diff --git a/merge-recursive.h b/merge-recursive.h
index d863cf8867..248093e407 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -29,6 +29,7 @@ struct merge_options {
 	struct hashmap current_file_dir_set;
 	struct string_list df_conflict_file_set;
 	struct unpack_trees_options unpack_opts;
+	struct index_state orig_index;
 };
 
 /*
-- 
2.16.0.35.g6dd7ede834


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v9 30/30] merge-recursive: fix check for skipability of working tree updates
  2018-04-13 19:56 ` [RFC PATCH v9 0/30] Add directory rename detection to git Elijah Newren
                     ` (2 preceding siblings ...)
  2018-04-13 19:56   ` [PATCH v9 29.75/30] merge-recursive: Fix was_tracked() to quit lying with some renamed paths Elijah Newren
@ 2018-04-13 19:56   ` Elijah Newren
  3 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2018-04-13 19:56 UTC (permalink / raw)
  To: git; +Cc: torvalds, gitster, sbeller, Elijah Newren

The can-working-tree-updates-be-skipped check has had a long and blemished
history.  The update can be skipped iff:
  a) The merged contents match what was in HEAD
  b) The merged mode matches what was in HEAD
  c) The target path is usable and matches what was in HEAD

Steps a & b are easy to check; we have always gotten those right.  Step c
is just:
  c1) Nothing else is in the way of putting the content at the target path
      (i.e. it isn't involved in any D/F conflicts)
  c2) target path was tracked in the index before the merge

While it is easy to overlook c1, this was fixed seven years ago with
commit 4ab9a157d069 ("merge_content(): Check whether D/F conflicts are
still present", 2010-09-20).  merge-recursive didn't have a readily
available way to directly check c2, so various approximations were
used:

  * In commit b2c8c0a76274 ("merge-recursive: When we detect we can skip
    an update, actually skip it", 2011-02-28), it was noted that although
    the code claimed it was skipping the update, it did not actually skip
    the update.  The code was made to skip it, but used lstat(path, ...)
    as an approximation to path-was-tracked-in-index-before-merge.

  * In commit 5b448b853030 ("merge-recursive: When we detect we can skip
    an update, actually skip it", 2011-08-11), the problem with using
    lstat was noted.  It was changed to the approximation
       path2 && strcmp(path, path2)
    which is also wrong.  !path2 || strcmp(path, path2) would have been
    better, but would have fallen short with directory renames.

  * In c5b761fb2711 ("merge-recursive: ensure we write updates for
    directory-renamed file", 2018-02-14), the problem with the previous
    approximation was noted and changed to
       was_tracked(path)
    That looks like exactly what we were trying to answer, and is what
    was_tracked() was *intended* for, but not what was_tracked()
    actually returned.

Since the previous commit made was_tracked(path) actually mean "path was
tracked in the index before the merge", we can now use it instead of
other approximations to answer the question "was path tracked in the
index before the merge?"  So, although the code change in this commit is
the same one made in c5b761fb2711, it now safe and correct due to the
prior fix to was_tracked().

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                      | 20 +++++++++++++-------
 t/t6043-merge-rename-directories.sh    |  2 +-
 t/t6046-merge-skip-unneeded-updates.sh |  4 ++--
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index adc800f188..1b71d00fdb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2786,16 +2786,22 @@ static int merge_content(struct merge_options *o,
 				       o->branch2, path2, &mfi))
 		return -1;
 
+	/*
+	 * We can skip updating the working tree file iff:
+	 *   a) The merged contents match what was in HEAD
+	 *   b) The merged mode matches what was in HEAD
+	 *   c) The target path is usable and matches what was in HEAD
+	 * We test (a) & (b) here.
+	 */
 	if (mfi.clean && oid_eq(&mfi.oid, a_oid) && mfi.mode == a_mode) {
-		int path_renamed_outside_HEAD;
 		/*
-		 * The content merge resulted in the same file contents we
-		 * already had.  We can return early if those file contents
-		 * are recorded at the correct path (which may not be true
-		 * if the merge involves a rename or there's a D/F conflict).
+		 * Case c has two pieces:
+		 *   c1) Nothing else is in the way of writing the merged
+		 *       results to path (i.e. it isn't involved in any
+		 *       D/F conflict)
+		 *   c2) path was tracked in the index before the merge
 		 */
-		path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
-		if (!df_conflict_remains && !path_renamed_outside_HEAD) {
+		if (!df_conflict_remains && was_tracked(o, path)) {
 			output(o, 3, _("Skipped %s (merged same as existing)"), path);
 			add_cacheinfo(o, mfi.mode, &mfi.oid, path,
 				      0, (!o->call_depth), 0);
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 45f620633f..2e28f2908d 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3884,7 +3884,7 @@ test_expect_success '12b-setup: Moving one directory hierarchy into another' '
 	)
 '
 
-test_expect_failure '12b-check: Moving one directory hierarchy into another' '
+test_expect_success '12b-check: Moving one directory hierarchy into another' '
 	(
 		cd 12b &&
 
diff --git a/t/t6046-merge-skip-unneeded-updates.sh b/t/t6046-merge-skip-unneeded-updates.sh
index 89c3a953ae..4dcec7226c 100755
--- a/t/t6046-merge-skip-unneeded-updates.sh
+++ b/t/t6046-merge-skip-unneeded-updates.sh
@@ -64,7 +64,7 @@ test_expect_success '1a-setup: Modify(A)/Modify(B), change on B subset of A' '
 	)
 '
 
-test_expect_failure '1a-check-L: Modify(A)/Modify(B), change on B subset of A' '
+test_expect_success '1a-check-L: Modify(A)/Modify(B), change on B subset of A' '
 	test_when_finished "git -C 1a reset --hard" &&
 	(
 		cd 1a &&
@@ -346,7 +346,7 @@ test_expect_success '3a-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
 	)
 '
 
-test_expect_failure '3a-check-L: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+test_expect_success '3a-check-L: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
 	test_when_finished "git -C 3a reset --hard" &&
 	(
 		cd 3a &&
-- 
2.16.0.35.g6dd7ede834


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 29.50/30] t6046: testcases checking whether updates can be skipped in a merge
  2018-04-13 19:56   ` [PATCH v9 29.50/30] t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
@ 2018-04-13 21:03     ` Stefan Beller
  0 siblings, 0 replies; 13+ messages in thread
From: Stefan Beller @ 2018-04-13 21:03 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Linus Torvalds, Junio C Hamano

On Fri, Apr 13, 2018 at 12:56 PM, Elijah Newren <newren@gmail.com> wrote:
> Add several tests checking whether updates can be skipped in a merge.
> Also add several similar testcases for where updates cannot be skipped in
> a merge to make sure that we skip if and only if we should.
>
> In particular:
>
>   * Testcase 1a (particularly 1a-check-L) would have pointed out the
>     problem Linus has been dealing with for year with his merges[1].
>
>   * Testcase 2a (particularly 2a-check-L) would have pointed out the
>     problem with my directory-rename-series before it broke master[2].
>
>   * Testcases 3[ab] (particularly 3a-check-L) provide a simpler testcase
>     than 12b of t6043 making that one easier to understand.
>
>   * There are several complementary testcases to make sure we're not just
>     fixing those particular issues while regressing in the opposite
>     direction.
>
> [1] https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fXJRkvU1m_RHBG54NOoaZPA@mail.gmail.com/
> [2] https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
>
> Signed-off-by: Elijah Newren <newren@gmail.com>

This is
Reviewed-by: Stefan Beller <sbeller@google.com>


> +               test_tick &&
> +               git commit -m "O" &&

minor nit: I think you can omit the test_tick before creating O
for each setup.

Stefan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 29.75/30] merge-recursive: Fix was_tracked() to quit lying with some renamed paths
  2018-04-13 19:56   ` [PATCH v9 29.75/30] merge-recursive: Fix was_tracked() to quit lying with some renamed paths Elijah Newren
@ 2018-04-16  0:37     ` Junio C Hamano
  2018-04-16 21:21       ` Elijah Newren
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2018-04-16  0:37 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, torvalds, sbeller

Elijah Newren <newren@gmail.com> writes:

> @@ -362,13 +363,17 @@ static int git_merge_trees(struct merge_options *o,
>  	init_tree_desc_from_tree(t+2, merge);
>  
>  	rc = unpack_trees(3, t, &o->unpack_opts);
> +	cache_tree_free(&active_cache_tree);
> +
> +	o->orig_index = the_index;
> +	the_index = tmp_index;
> +
>  	/*
> -	 * unpack_trees NULLifies src_index, but it's used in verify_uptodate,
> -	 * so set to the new index which will usually have modification
> -	 * timestamp info copied over.
> +	 * src_index is used in verify_uptodate, but was NULLified in
> +	 * unpack_trees, so we need to set it back to the original index.
>  	 */

Was NULLified?  I thought that the point of src/dst distinction
Linus introduced long time ago at 34110cd4 ("Make 'unpack_trees()'
have a separate source and destination index", 2008-03-06) was that
we can then keep the source side of the traversal unmodified.

> -	o->unpack_opts.src_index = &the_index;
> -	cache_tree_free(&active_cache_tree);
> +	o->unpack_opts.src_index = &o->orig_index;

> -static int was_tracked(const char *path)
> +/*
> + * Returns whether path was tracked in the index before the merge started
> + */
> +static int was_tracked(struct merge_options *o, const char *path)
>  {
> -	int pos = cache_name_pos(path, strlen(path));
> +	int pos = index_name_pos(&o->orig_index, path, strlen(path));
>  
>  	if (0 <= pos)
> -		/* we have been tracking this path */
> +		/* we were tracking this path before the merge */
>  		return 1;
>  
> -	/*
> -	 * Look for an unmerged entry for the path,
> -	 * specifically stage #2, which would indicate
> -	 * that "our" side before the merge started
> -	 * had the path tracked (and resulted in a conflict).
> -	 */
> -	for (pos = -1 - pos;
> -	     pos < active_nr && !strcmp(path, active_cache[pos]->name);
> -	     pos++)
> -		if (ce_stage(active_cache[pos]) == 2)
> -			return 1;
>  	return 0;
>  }

I do agree with the simplicity of the new code that directly asks
exactly what we want to ask.  However, there is one thing that is
puzzling below...

>  static int would_lose_untracked(const char *path)
>  {
> -	return !was_tracked(path) && file_exists(path);
> +	/*
> +	 * This may look like it can be simplified to:
> +	 *   return !was_tracked(o, path) && file_exists(path)
> +	 * but it can't.  This function needs to know whether path was
> +	 * in the working tree due to EITHER having been tracked in the
> +	 * index before the merge OR having been put into the working copy
> +	 * and index by unpack_trees().  Due to that either-or requirement,
> +	 * we check the current index instead of the original one.
> +	 */

If this path was created by merge-recursive, not by unpack_trees(),
what does this function want to say?  Say, we are looking at path P,
the other branch we are merging moved some other path Q to P (while
our side modified contents at path Q).  Then path P we are looking
at has contents of Q at the merge base at stage #1, the contents of
Q from our HEAD at stage #2 and the contents of P from the other
branch at stage #3.  The code below says "path P is OK, we won't
lose it" in such a case, but it is unclear if the above comment
wants to also cover that case.

> +	int pos = cache_name_pos(path, strlen(path));
> +
> +	if (pos < 0)
> +		pos = -1 - pos;
> +	while (pos < active_nr &&
> +	       !strcmp(path, active_cache[pos]->name)) {
> +		/*
> +		 * If stage #0, it is definitely tracked.
> +		 * If it has stage #2 then it was tracked
> +		 * before this merge started.  All other
> +		 * cases the path was not tracked.
> +		 */
> +		switch (ce_stage(active_cache[pos])) {
> +		case 0:
> +		case 2:
> +			return 0;
> +		}
> +		pos++;
> +	}
> +	return file_exists(path);
>  }
>  
>  static int was_dirty(struct merge_options *o, const char *path)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v9 29.75/30] merge-recursive: Fix was_tracked() to quit lying with some renamed paths
  2018-04-16  0:37     ` Junio C Hamano
@ 2018-04-16 21:21       ` Elijah Newren
  0 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2018-04-16 21:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Linus Torvalds, Stefan Beller

On Sun, Apr 15, 2018 at 5:37 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Elijah Newren <newren@gmail.com> writes:
>
>> @@ -362,13 +363,17 @@ static int git_merge_trees(struct merge_options *o,
>>       init_tree_desc_from_tree(t+2, merge);
>>
>>       rc = unpack_trees(3, t, &o->unpack_opts);
>> +     cache_tree_free(&active_cache_tree);
>> +
>> +     o->orig_index = the_index;
>> +     the_index = tmp_index;
>> +
>>       /*
>> -      * unpack_trees NULLifies src_index, but it's used in verify_uptodate,
>> -      * so set to the new index which will usually have modification
>> -      * timestamp info copied over.
>> +      * src_index is used in verify_uptodate, but was NULLified in
>> +      * unpack_trees, so we need to set it back to the original index.
>>        */
>
> Was NULLified?  I thought that the point of src/dst distinction
> Linus introduced long time ago at 34110cd4 ("Make 'unpack_trees()'
> have a separate source and destination index", 2008-03-06) was that
> we can then keep the source side of the traversal unmodified.

That comment is messed up; maybe I edited and re-edited the comment
multiple times and then didn't notice the big problems when
re-reading?

Anyway, I should move the comment a few lines up, and make the code
instead read:

    /*
     * Update the_index to match the new results, AFTER saving a copy
     * in o->orig_index.  Update src_index to point to the saved copy.
     * (verify_uptodate() checks src_index, and the original index is
     * the one that had the necessary modification timestamps.)
     */
    o->orig_index = the_index;
    the_index = tmp_index;
    o->unpack_opts.src_index = &o->orig_index;

>>  static int would_lose_untracked(const char *path)
>>  {
>> -     return !was_tracked(path) && file_exists(path);
>> +     /*
>> +      * This may look like it can be simplified to:
>> +      *   return !was_tracked(o, path) && file_exists(path)
>> +      * but it can't.  This function needs to know whether path was
>> +      * in the working tree due to EITHER having been tracked in the
>> +      * index before the merge OR having been put into the working copy
>> +      * and index by unpack_trees().  Due to that either-or requirement,
>> +      * we check the current index instead of the original one.
>> +      */
>
> If this path was created by merge-recursive, not by unpack_trees(),
> what does this function want to say?  Say, we are looking at path P,
> the other branch we are merging moved some other path Q to P (while
> our side modified contents at path Q).  Then path P we are looking
> at has contents of Q at the merge base at stage #1, the contents of
> Q from our HEAD at stage #2 and the contents of P from the other
> branch at stage #3.  The code below says "path P is OK, we won't
> lose it" in such a case, but it is unclear if the above comment
> wants to also cover that case.

Oh, boy, here be dragons...

The comment as-is actually does cover your example case with Q and P:
unpack_trees(), which is unaware of renames, will see that P only
exists on one side of history and thus load it into the index at stage
0 rather than stage 3.

But your general comment about whether something else in
merge-recursive could create a path in the current index after
unpack_trees() is interesting...it touches on a pitfall that has bit
me multiple times.  There is a required ordering in merge-recursive.c
that for any given path, the working directory must be updated before
the index is -- otherwise, would_lose_untracked() will return faulty
information.  update_file_flags() has this ordering builtin,
update_stages() has a big obnoxious comment at the beginning about how
it should not be called until after update_file() is,
apply_directory_rename_modifications() has a big comment about this
~80% of the way through the function (look for
"would_lose_untracked"), and conflict_rename_rename_2to1() has a big
obnoxious comment near the end painstakingly pointing out that some
code that feels like it would make more sense being combined with a
previous function cannot be due to the
update-working-directory-before-index requirement.

I should probably add to this comment something about this annoying
(and error-prone) ordering restriction, since this function is the
source of those particular pains.  Your suggested ideal-world rewrite
(run unpack_trees() with unpack_opts.index_only=1, do merge in memory,
then update working tree), would make this whole problem go away.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-04-16 21:21 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-11  7:19 [Git] recursive merge on 'master' severely broken? Junio C Hamano
2018-04-11  9:06 ` Junio C Hamano
2018-04-11  9:57   ` Junio C Hamano
2018-04-11 15:51 ` Elijah Newren
2018-04-12  1:37   ` Junio C Hamano
2018-04-13 19:56 ` [RFC PATCH v9 0/30] Add directory rename detection to git Elijah Newren
2018-04-13 19:56   ` [PATCH v9 29.25/30] merge-recursive: improve output precision around skipping updates Elijah Newren
2018-04-13 19:56   ` [PATCH v9 29.50/30] t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
2018-04-13 21:03     ` Stefan Beller
2018-04-13 19:56   ` [PATCH v9 29.75/30] merge-recursive: Fix was_tracked() to quit lying with some renamed paths Elijah Newren
2018-04-16  0:37     ` Junio C Hamano
2018-04-16 21:21       ` Elijah Newren
2018-04-13 19:56   ` [PATCH v9 30/30] merge-recursive: fix check for skipability of working tree updates Elijah Newren

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).