git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH 0/4] Directory rename detection testcases and rules
@ 2020-10-15 20:46 Elijah Newren via GitGitGadget
  2020-10-15 20:46 ` [PATCH 1/4] directory-rename-detection.txt: update references to regression tests Elijah Newren via GitGitGadget
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-10-15 20:46 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

While testing merge-ort on some real world repositories a little while back,
I found some issues with directory rename detection...some of which were
issues in the current merge-recursive as well. Also, I found that there was
a nice optimization I could use if a new obvious-looking rule was added,
though it has one slight side effect on one corner case. Fixing the bugs and
implementing the new rules is a bit more involved, so for now this series
just updates the rule descriptions and adds or modifies tests to document
the various cases.

Elijah Newren (4):
  directory-rename-detection.txt: update references to regression tests
  t6423: more involved directory rename test
  t6423: update directory rename detection tests with new rule
  t6423: more involved rules for renaming directories into each other

 .../technical/directory-rename-detection.txt  |  15 +-
 t/t6423-merge-rename-directories.sh           | 592 ++++++++++++++++--
 2 files changed, 553 insertions(+), 54 deletions(-)


base-commit: d4a392452e292ff924e79ec8458611c0f679d6d4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-879%2Fnewren%2Fdrd-testcases-and-rules-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-879/newren/drd-testcases-and-rules-v1
Pull-Request: https://github.com/git/git/pull/879
-- 
gitgitgadget


* [PATCH 1/4] directory-rename-detection.txt: update references to regression tests
  2020-10-15 20:46 [PATCH 0/4] Directory rename detection testcases and rules Elijah Newren via GitGitGadget
@ 2020-10-15 20:46 ` Elijah Newren via GitGitGadget
  2020-10-15 20:46 ` [PATCH 2/4] t6423: more involved directory rename test Elijah Newren via GitGitGadget
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-10-15 20:46 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The regression tests for directory rename detection were renamed from
t6043 to t6423 in commit 919df31955 ("Collect merge-related tests to
t64xx", 2020-08-10); update this file to match.  Also, add a small
clarification to nearby text while we're at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/technical/directory-rename-detection.txt | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/directory-rename-detection.txt b/Documentation/technical/directory-rename-detection.txt
index 844629c8c4..ce042cfcae 100644
--- a/Documentation/technical/directory-rename-detection.txt
+++ b/Documentation/technical/directory-rename-detection.txt
@@ -18,7 +18,8 @@ It is perhaps easiest to start with an example:
 More interesting possibilities exist, though, such as:
 
   * one side of history renames x -> z, and the other renames some file to
-    x/e, causing the need for the merge to do a transitive rename.
+    x/e, causing the need for the merge to do a transitive rename so that
+    the rename ends up at z/e.
 
   * one side of history renames x -> z, but also renames all files within x.
     For example, x/a -> z/alpha, x/b -> z/bravo, etc.
@@ -35,7 +36,7 @@ More interesting possibilities exist, though, such as:
     directory itself contained inner directories that were renamed to yet
     other locations).
 
-  * combinations of the above; see t/t6043-merge-rename-directories.sh for
+  * combinations of the above; see t/t6423-merge-rename-directories.sh for
     various interesting cases.
 
 Limitations -- applicability of directory renames
@@ -62,7 +63,7 @@ directory rename detection applies:
 Limitations -- detailed rules and testcases
 -------------------------------------------
 
-t/t6043-merge-rename-directories.sh contains extensive tests and commentary
+t/t6423-merge-rename-directories.sh contains extensive tests and commentary
 which generate and explore the rules listed above.  It also lists a few
 additional rules:
 
-- 
gitgitgadget



* [PATCH 2/4] t6423: more involved directory rename test
  2020-10-15 20:46 [PATCH 0/4] Directory rename detection testcases and rules Elijah Newren via GitGitGadget
  2020-10-15 20:46 ` [PATCH 1/4] directory-rename-detection.txt: update references to regression tests Elijah Newren via GitGitGadget
@ 2020-10-15 20:46 ` Elijah Newren via GitGitGadget
  2020-10-15 20:57   ` Eric Sunshine
  2020-10-15 20:46 ` [PATCH 3/4] t6423: update directory rename detection tests with new rule Elijah Newren via GitGitGadget
  2020-10-15 20:46 ` [PATCH 4/4] t6423: more involved rules for renaming directories into each other Elijah Newren via GitGitGadget
  3 siblings, 1 reply; 6+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-10-15 20:46 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Add a new testcase modelled on a real world repository example that
served multiple purposes:
  * it uncovered a bug in the current directory rename detection
    implementation.
  * it is a good test of needing to do directory rename detection for
    a series of commits instead of just one (and uses rebase instead
    of just merge like all the other tests in this testfile).
  * it is an excellent stress test for some of the optimizations in
    my new merge-ort engine.

I can expand on the final item later when I have submitted more of
merge-ort, but the bug is the main immediate concern.  It arises as
follows:

  * dir/subdir/ has several files
  * almost all files in dir/subdir/ are renamed to folder/subdir/
  * one of the files in dir/subdir/ is renamed to folder/subdir/newsubdir/
  * If the other side of history (that doesn't do the renames) adds a
    new file to dir/subdir/, where should it be placed after the merge?

The most obvious two choices are: (1) leave the new file in dir/subdir/,
don't make it follow the rename, and (2) move the new file to
folder/subdir/, following the rename of most of the files.  However,
there's a possible third choice here: (3) move the new file to
folder/subdir/newsubdir/.  These choices reinforce the fact that
merge.directoryRenames=conflict is a good default: the merge machinery
needs to stick the file somewhere, and it notifies the user of the
possibility that they might want to place it elsewhere.  Surprisingly,
the current code would always choose (3), while the real world
repository was clearly expecting (2) -- move the file along with where
the herd of files was going, not with the special exception.

The problem here is that for the majority of the file renames,
   dir/subdir/ -> folder/subdir/
is actually represented as
   dir/ -> folder/
This directory rename would have a big weight associated with it since
most of the files followed that rename.  However, we always consult the
most immediate directory first, and there is only one rename rule for
it:
   dir/subdir/ -> folder/subdir/newsubdir/
Since this rule is the only one for mapping from dir/subdir/, it
automatically wins and that directory rename was followed instead of the
desired dir/subdir/ -> folder/subdir/.
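
The mapping from individual file renames to directory renames described
above can be sketched in a few lines of shell.  This is a simplified
model for illustration, not the merge-recursive code: the helper name
implied_dir_rename is hypothetical, and it assumes both paths contain at
least one directory component.  It drops the file name, then strips
trailing directory components for as long as they match.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not git's implementation):
# print the directory rename implied by renaming file $1 to $2.
implied_dir_rename () {
	old=${1%/*}	# directory of the old path
	new=${2%/*}	# directory of the new path
	# Strip matching trailing components, e.g. the shared "subdir".
	while test "${old##*/}" = "${new##*/}"
	do
		case "$old" in */*) ;; *) break ;; esac
		case "$new" in */*) ;; *) break ;; esac
		old=${old%/*}
		new=${new%/*}
	done
	printf '%s/ -> %s/\n' "$old" "$new"
}

# Tally the implied directory renames for the five renames on the
# renaming side of the example.
{
	for f in a b c d
	do
		implied_dir_rename dir/subdir/$f folder/subdir/$f
	done
	implied_dir_rename dir/subdir/e folder/subdir/newsubdir/e
} | sort | uniq -c
```

Under this model four of the five renames vote for dir/ -> folder/, and
dir/subdir/ gets a single rule of its own, which is the one consulted
first for a new file in dir/subdir/.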

Unfortunately, the fix is a bit involved so for now just add the
testcase documenting the issue.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6423-merge-rename-directories.sh | 195 ++++++++++++++++++++++++++++
 1 file changed, 195 insertions(+)

diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh
index f7ecbb886d..31aea47522 100755
--- a/t/t6423-merge-rename-directories.sh
+++ b/t/t6423-merge-rename-directories.sh
@@ -4227,6 +4227,201 @@ test_expect_success '12e: Rename/merge subdir into the root, variant 2' '
 	)
 '
 
+# Testcase 12f, Rebase of patches with big directory rename
+#   Commit O:
+#              dir/subdir/{a,b,c,d,e_O,Makefile_TOP_O}
+#              dir/subdir/tweaked/{f,g,h,Makefile_SUB_O}
+#              dir/unchanged/<LOTS OF FILES>
+#   Commit A:
+#     (Remove f & g, move e into newsubdir, rename dir/->folder/, modify files)
+#              folder/subdir/{a,b,c,d,Makefile_TOP_A}
+#              folder/subdir/newsubdir/e_A
+#              folder/subdir/tweaked/{h,Makefile_SUB_A}
+#              folder/unchanged/<LOTS OF FILES>
+#   Commit B1:
+#     (add newfile.{c,py}, modify underscored files)
+#              dir/subdir/{a,b,c,d,e_B1,Makefile_TOP_B1,newfile.c}
+#              dir/subdir/tweaked/{f,g,h,Makefile_SUB_B1,newfile.py}
+#              dir/unchanged/<LOTS OF FILES>
+#   Commit B2:
+#     (Modify e further, add newfile.rs)
+#              dir/subdir/{a,b,c,d,e_B2,Makefile_TOP_B1,newfile.c,newfile.rs}
+#              dir/subdir/tweaked/{f,g,h,Makefile_SUB_B1,newfile.py}
+#              dir/unchanged/<LOTS OF FILES>
+#   Expected:
+#          B1-picked:
+#              folder/subdir/{a,b,c,d,Makefile_TOP_Merge1,newfile.c}
+#              folder/subdir/newsubdir/e_Merge1
+#              folder/subdir/tweaked/{h,Makefile_SUB_Merge1,newfile.py}
+#              folder/unchanged/<LOTS OF FILES>
+#          B2-picked:
+#              folder/subdir/{a,b,c,d,Makefile_TOP_Merge1,newfile.c,newfile.rs}
+#              folder/subdir/newsubdir/e_Merge2
+#              folder/subdir/tweaked/{h,Makefile_SUB_Merge1,newfile.py}
+#              folder/unchanged/<LOTS OF FILES>
+#
+# Notes: This testcase happens to exercise lots of the
+#        optimization-specific codepaths in merge-ort, and also
+#        demonstrated a failing of the directory rename detection algorithm
+#        in merge-recursive; newfile.c should not get pushed into
+#        folder/subdir/newsubdir/, yet merge-recursive put it there because
+#        the rename of dir/subdir/{a,b,c,d} -> folder/subdir/{a,b,c,d}
+#        looks like
+#            dir/ -> folder/,
+#        whereas the rename of dir/subdir/e -> folder/subdir/newsubdir/e
+#        looks like
+#            dir/subdir/ -> folder/subdir/newsubdir/
+#        and if we note that newfile.c is found in dir/subdir/, we might
+#        overlook the dir/ -> folder/ rule that has more weight.
+
+test_setup_12f () {
+	test_create_repo 12f &&
+	(
+		cd 12f &&
+
+		mkdir -p dir/unchanged &&
+		mkdir -p dir/subdir/tweaked &&
+		echo a >dir/subdir/a &&
+		echo b >dir/subdir/b &&
+		echo c >dir/subdir/c &&
+		echo d >dir/subdir/d &&
+		test_seq 1 10 >dir/subdir/e &&
+		test_seq 10 20 >dir/subdir/Makefile &&
+		echo f >dir/subdir/tweaked/f &&
+		echo g >dir/subdir/tweaked/g &&
+		echo h >dir/subdir/tweaked/h &&
+		test_seq 20 30 >dir/subdir/tweaked/Makefile &&
+		for i in `test_seq 1 88`; do
+			echo content $i >dir/unchanged/file_$i
+		done &&
+		git add . &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git switch A &&
+		git rm dir/subdir/tweaked/f dir/subdir/tweaked/g &&
+		test_seq 2 10 >dir/subdir/e &&
+		test_seq 11 20 >dir/subdir/Makefile &&
+		test_seq 21 30 >dir/subdir/tweaked/Makefile &&
+		mkdir dir/subdir/newsubdir &&
+		git mv dir/subdir/e dir/subdir/newsubdir/ &&
+		git mv dir folder &&
+		git add . &&
+		git commit -m "A" &&
+
+		git switch B &&
+		mkdir dir/subdir/newsubdir/ &&
+		echo c code >dir/subdir/newfile.c &&
+		echo python code >dir/subdir/newsubdir/newfile.py &&
+		test_seq 1 11 >dir/subdir/e &&
+		test_seq 10 21 >dir/subdir/Makefile &&
+		test_seq 20 31 >dir/subdir/tweaked/Makefile &&
+		git add . &&
+		git commit -m "B1" &&
+
+		echo rust code >dir/subdir/newfile.rs &&
+		test_seq 1 12 >dir/subdir/e &&
+		git add . &&
+		git commit -m "B2"
+	)
+}
+
+test_expect_failure '12f: Trivial directory resolve, caching, all kinds of fun' '
+	test_setup_12f &&
+	(
+		cd 12f &&
+
+		git checkout A^0 &&
+		git branch Bmod B &&
+
+		git -c merge.directoryRenames=true rebase A Bmod &&
+
+		echo Checking the pick of B1... &&
+
+		test_must_fail git rev-parse Bmod~1:dir &&
+
+		git ls-tree -r Bmod~1 >out &&
+		test_line_count = 98 out &&
+
+		git diff --name-status A Bmod~1 >actual &&
+		q_to_tab >expect <<-\EOF &&
+		MQfolder/subdir/Makefile
+		AQfolder/subdir/newfile.c
+		MQfolder/subdir/newsubdir/e
+		AQfolder/subdir/newsubdir/newfile.py
+		MQfolder/subdir/tweaked/Makefile
+		EOF
+		test_cmp expect actual &&
+
+		# Three-way merged files
+		test_seq  2 11 >e_Merge1 &&
+		test_seq 11 21 >Makefile_TOP &&
+		test_seq 21 31 >Makefile_SUB &&
+		git hash-object >expect      \
+			e_Merge1             \
+			Makefile_TOP         \
+			Makefile_SUB         &&
+		git rev-parse >actual              \
+			Bmod~1:folder/subdir/newsubdir/e     \
+			Bmod~1:folder/subdir/Makefile        \
+			Bmod~1:folder/subdir/tweaked/Makefile &&
+		test_cmp expect actual &&
+
+		# New files showed up at the right location with right contents
+		git rev-parse >expect                \
+			B~1:dir/subdir/newfile.c            \
+			B~1:dir/subdir/newsubdir/newfile.py &&
+		git rev-parse >actual                      \
+			Bmod~1:folder/subdir/newfile.c            \
+			Bmod~1:folder/subdir/newsubdir/newfile.py &&
+		test_cmp expect actual &&
+
+		# Removed files
+		test_path_is_missing folder/subdir/tweaked/f &&
+		test_path_is_missing folder/subdir/tweaked/g &&
+
+		# Unchanged files or directories
+		git rev-parse >actual        \
+			Bmod~1:folder/subdir/a          \
+			Bmod~1:folder/subdir/b          \
+			Bmod~1:folder/subdir/c          \
+			Bmod~1:folder/subdir/d          \
+			Bmod~1:folder/unchanged         \
+			Bmod~1:folder/subdir/tweaked/h &&
+		git rev-parse >expect          \
+			O:dir/subdir/a         \
+			O:dir/subdir/b         \
+			O:dir/subdir/c         \
+			O:dir/subdir/d         \
+			O:dir/unchanged        \
+			O:dir/subdir/tweaked/h &&
+		test_cmp expect actual &&
+
+		echo Checking the pick of B2... &&
+
+		test_must_fail git rev-parse Bmod:dir &&
+
+		git ls-tree -r Bmod >out &&
+		test_line_count = 99 out &&
+
+		git diff --name-status Bmod~1 Bmod >actual &&
+		q_to_tab >expect <<-\EOF &&
+		AQfolder/subdir/newfile.rs
+		MQfolder/subdir/newsubdir/e
+		EOF
+		test_cmp expect actual &&
+
+		# Three-way merged file
+		test_seq  2 12 >e_Merge2 &&
+		git hash-object e_Merge2 >expect &&
+		git rev-parse Bmod:folder/subdir/newsubdir/e >actual &&
+		test_cmp expect actual
+	)
+'
+
 ###########################################################################
 # SECTION 13: Checking informational and conflict messages
 #
-- 
gitgitgadget



* [PATCH 3/4] t6423: update directory rename detection tests with new rule
  2020-10-15 20:46 [PATCH 0/4] Directory rename detection testcases and rules Elijah Newren via GitGitGadget
  2020-10-15 20:46 ` [PATCH 1/4] directory-rename-detection.txt: update references to regression tests Elijah Newren via GitGitGadget
  2020-10-15 20:46 ` [PATCH 2/4] t6423: more involved directory rename test Elijah Newren via GitGitGadget
@ 2020-10-15 20:46 ` Elijah Newren via GitGitGadget
  2020-10-15 20:46 ` [PATCH 4/4] t6423: more involved rules for renaming directories into each other Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 6+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-10-15 20:46 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

While investigating the issues highlighted by the testcase in the
previous patch, I also found a shortcoming in the directory rename
detection rules.  Split testcase 6b into two to explain this issue
and update directory-rename-detection.txt to remove one of the previous
rules that I now believe to be detrimental.  Also, update the wording
around testcase 8e; while we are not modifying the results of that
testcase, we were previously unsure of the appropriate resolution of
that test and the new rule makes the previously chosen resolution for
that testcase a bit more solid.
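
The difference between the old and new counting can be sketched with
the renames from testcase 6b1 in this patch.  This is an illustrative
model only, not the merge machinery's code: the helper name
winning_target and the "old new" input format are assumptions made for
the example.

```shell
#!/bin/sh
# Hypothetical helper: read "old new" rename pairs on stdin and print
# the target directory that receives the most renames.
winning_target () {
	while read -r old new
	do
		printf '%s/\n' "${new%/*}"
	done | sort | uniq -c | sort -rn | awk 'NR == 1 { print $2 }'
}

# Renames out of z/ on side A in testcase 6b1.
side_A_renames='z/b y/b
z/c y/c
z/d y/d
z/e x/e'

# New rule: every rename on side A counts, so z/ -> y/ wins.
printf '%s\n' "$side_A_renames" | winning_target

# Old rule: renames that side B also made (b, c, d) are ignored,
# leaving only z/e -> x/e, so z/ -> x/ wins.
printf '%s\n' "$side_A_renames" | grep -v '^z/[bcd] ' | winning_target
```

With the old rule the new file z/f would follow the lone rename to x/;
with the new rule it follows the majority to y/.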

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 .../technical/directory-rename-detection.txt  |   5 +-
 t/t6423-merge-rename-directories.sh           | 144 +++++++++++++++---
 2 files changed, 124 insertions(+), 25 deletions(-)

diff --git a/Documentation/technical/directory-rename-detection.txt b/Documentation/technical/directory-rename-detection.txt
index ce042cfcae..5d03539412 100644
--- a/Documentation/technical/directory-rename-detection.txt
+++ b/Documentation/technical/directory-rename-detection.txt
@@ -70,10 +70,7 @@ additional rules:
   a) If renames split a directory into two or more others, the directory
      with the most renames, "wins".
 
-  b) Avoid directory-rename-detection for a path, if that path is the
-     source of a rename on either side of a merge.
-
-  c) Only apply implicit directory renames to directories if the other side
+  b) Only apply implicit directory renames to directories if the other side
      of history is the one doing the renaming.
 
 Limitations -- support in different commands
diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh
index 31aea47522..00eac6e9a2 100755
--- a/t/t6423-merge-rename-directories.sh
+++ b/t/t6423-merge-rename-directories.sh
@@ -1277,20 +1277,114 @@ test_expect_success '6a: Tricky rename/delete' '
 	)
 '
 
-# Testcase 6b, Same rename done on both sides
+# Testcase 6b1, Same rename done on both sides
+#   (Related to testcase 6b2 and 8e)
+#   Commit O: z/{b,c,d,e}
+#   Commit A: y/{b,c,d}, x/e
+#   Commit B: y/{b,c,d}, z/{e,f}
+#   Expected: y/{b,c,d,f}, x/e
+#   Note: Directory rename detection says A renamed z/ -> y/ (3 paths renamed
+#         to y/ and only 1 renamed to x/), therefore the new file 'z/f' in B
+#         should be moved to 'y/f'.
+#
+#         This is a bit of an edge case where any behavior might surprise users,
+#         whether that is treating A as renaming z/ -> y/, treating A as renaming
+#         z/ -> x/, or treating A as not doing any directory rename.  However, I
+#         think this answer is the least confusing and most consistent with the
+#         rules elsewhere.
+#
+#         A note about z/ -> x/, since it may not be clear how that could come
+#         about: If we were to ignore files renamed by both sides
+#         (i.e. z/{b,c,d}), as directory rename detection did in git-2.18 thru
+#         at least git-2.28, then we would note there are no renames from z/ to
+#         y/ and one rename from z/ to x/ and thus come to the conclusion that
+#         A renamed z/ -> x/.  This seems more confusing for end users than a
+#         rename of z/ to y/; it makes directory rename detection behavior
+#         harder for them to predict.  As such, we modified the rule, changed
+#         the behavior on testcases 6b2 and 8e, and introduced this 6b1 testcase.
+
+test_setup_6b1 () {
+	test_create_repo 6b1 &&
+	(
+		cd 6b1 &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		echo e >z/e &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		mkdir x &&
+		git mv y/e x/e &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		mkdir z &&
+		git mv y/e z/e &&
+		echo f >z/f &&
+		git add z/f &&
+		test_tick &&
+		git commit -m "B"
+	)
+}
+
+test_expect_failure '6b1: Same renames done on both sides, plus another rename' '
+	test_setup_6b1 &&
+	(
+		cd 6b1 &&
+
+		git checkout A^0 &&
+
+		git -c merge.directoryRenames=true merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:x/e HEAD:y/f &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:z/d    O:z/e    B:z/f &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6b2, Same rename done on both sides
 #   (Related to testcases 6c and 8e)
 #   Commit O: z/{b,c}
 #   Commit A: y/{b,c}
 #   Commit B: y/{b,c}, z/d
-#   Expected: y/{b,c}, z/d
-#   Note: If we did directory rename detection here, we'd move z/d into y/,
-#         but B did that rename and still decided to put the file into z/,
-#         so we probably shouldn't apply directory rename detection for it.
-
-test_setup_6b () {
-	test_create_repo 6b &&
+#   Expected: y/{b,c,d}
+#   Alternate: y/{b,c}, z/d
+#   Note: Directory rename detection says A renamed z/ -> y/, therefore the new
+#         file 'z/d' in B should be moved to 'y/d'.
+#
+#         We could potentially ignore the renames of z/{b,c} on side A since
+#         those were renamed on both sides.  However, it's a bit of a corner
+#         case because what if there was also a z/e that side A moved to x/e
+#         and side B left alone?  If we used the "ignore renames done on both
+#         sides" logic, then we'd compute that A renamed z/ -> x/, and move
+#         z/d to x/d.  That seems more surprising and uglier than allowing
+#         the z/ -> y/ rename.
+
+test_setup_6b2 () {
+	test_create_repo 6b2 &&
 	(
-		cd 6b &&
+		cd 6b2 &&
 
 		mkdir z &&
 		echo b >z/b &&
@@ -1318,10 +1412,10 @@ test_setup_6b () {
 	)
 }
 
-test_expect_success '6b: Same rename done on both sides' '
-	test_setup_6b &&
+test_expect_failure '6b2: Same rename done on both sides' '
+	test_setup_6b2 &&
 	(
-		cd 6b &&
+		cd 6b2 &&
 
 		git checkout A^0 &&
 
@@ -1335,7 +1429,7 @@ test_expect_success '6b: Same rename done on both sides' '
 		test_line_count = 1 out &&
 
 		git rev-parse >actual \
-			HEAD:y/b HEAD:y/c HEAD:z/d &&
+			HEAD:y/b HEAD:y/c HEAD:y/d &&
 		git rev-parse >expect \
 			O:z/b    O:z/c    B:z/d &&
 		test_cmp expect actual
@@ -1343,7 +1437,7 @@ test_expect_success '6b: Same rename done on both sides' '
 '
 
 # Testcase 6c, Rename only done on same side
-#   (Related to testcases 6b and 8e)
+#   (Related to testcases 6b1, 6b2, and 8e)
 #   Commit O: z/{b,c}
 #   Commit A: z/{b,c} (no change)
 #   Commit B: y/{b,c}, z/d
@@ -2269,14 +2363,22 @@ test_expect_success '8d: rename/delete...or not?' '
 # Notes: In commit A, directory z got renamed to y.  In commit B, directory z
 #        did NOT get renamed; the directory is still present; instead it is
 #        considered to have just renamed a subset of paths in directory z
-#        elsewhere.  However, this is much like testcase 6b (where commit B
-#        moves all the original paths out of z/ but opted to keep d
-#        within z/).  This makes it hard to judge where d should end up.
+#        elsewhere.  This is much like testcase 6b2 (where commit B moves all
+#        the original paths out of z/ but opted to keep d within z/).
+#
+#        It was not clear in the past what should be done with this testcase;
+#        in fact, I noted that I "just picked one" previously.  However,
+#        following the new logic for testcase 6b2, we should take the rename
+#        and move z/d to y/d.
 #
-#        It's possible that users would get confused about this, but what
-#        should we do instead?  It's not at all clear to me whether z/d or
-#        y/d or something else is a better resolution here, and other cases
-#        start getting really tricky, so I just picked one.
+#        6b1, 6b2, and this case are definitely somewhat fuzzy in terms of
+#        whether they are optimal for end users, but (a) the default for
+#        directory rename detection is to mark these all as conflicts
+#        anyway, (b) it feels like this is less prone to higher order corner
+#        case confusion, and (c) the current algorithm requires less global
+#        knowledge (i.e. less coupling in the algorithm between renames done
+#        on both sides) which thus means users are better able to predict
+#        the behavior, and predict it without computing as many details.
 
 test_setup_8e () {
 	test_create_repo 8e &&
-- 
gitgitgadget



* [PATCH 4/4] t6423: more involved rules for renaming directories into each other
  2020-10-15 20:46 [PATCH 0/4] Directory rename detection testcases and rules Elijah Newren via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-10-15 20:46 ` [PATCH 3/4] t6423: update directory rename detection tests with new rule Elijah Newren via GitGitGadget
@ 2020-10-15 20:46 ` Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 6+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-10-15 20:46 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Testcases 12b and 12c were both slightly weird; they were marked as
having a weird resolution, but with the note that even straightforward
simple rules can give weird results when the input is bizarre.

However, during optimization work for merge-ort, I discovered a
significant speedup that is possible if we add one more fairly
straightforward rule: we don't bother doing directory rename detection
if there are no new files added to the directory on the other side of
the history to be affected by the directory rename.  This seems like an
obvious and straightforward rule, but there was one funny corner case
where directory rename detection could affect only existing files: the
case where two directories are renamed into each other on opposite
sides of history.  In other words, the new rule only results in a
different output for testcases 12b and 12c.

Since we already thought testcases 12b and 12c were weird anyway, and
because the optimization often has a significant effect on common cases
(but is entirely prevented if we can't change how 12b and 12c function),
let's add the additional rule and tweak how 12b and 12c work.  Split
both testcases into two (one where we add no new files, and one where
the side that doesn't rename a given directory will add files to it),
and mark them with the new expectation.
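
The new rule amounts to a cheap pre-check before any directory rename
detection is attempted.  A minimal sketch, assuming the caller already
has the list of paths that the non-renaming side added (the helper name
needs_dir_rename_detection is hypothetical and this is not how merge-ort
implements the check):

```shell
#!/bin/sh
# Hypothetical helper: succeed if any of the added paths given after
# the directory argument lies somewhere under that directory.
needs_dir_rename_detection () {
	dir=${1%/}
	shift
	for path
	do
		case "$path" in
		"$dir"/*) return 0 ;;
		esac
	done
	return 1
}

# 12b1-style input: side B adds nothing under node2/.
if needs_dir_rename_detection node2/
then
	echo "12b1 node2/: detect directory renames"
else
	echo "12b1 node2/: skip detection"
fi

# 12b2-style input: side B adds node2/leaf6.
if needs_dir_rename_detection node2/ node2/leaf6
then
	echo "12b2 node2/: detect directory renames"
else
	echo "12b2 node2/: skip detection"
fi
```

Only paths that are genuinely new count here; files that were merely
renamed into the directory are not passed to the helper, which is why
the 12b1-style input takes the skip branch.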

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 .../technical/directory-rename-detection.txt  |   3 +
 t/t6423-merge-rename-directories.sh           | 253 ++++++++++++++++--
 2 files changed, 230 insertions(+), 26 deletions(-)

diff --git a/Documentation/technical/directory-rename-detection.txt b/Documentation/technical/directory-rename-detection.txt
index 5d03539412..49b83ef3cc 100644
--- a/Documentation/technical/directory-rename-detection.txt
+++ b/Documentation/technical/directory-rename-detection.txt
@@ -73,6 +73,9 @@ additional rules:
   b) Only apply implicit directory renames to directories if the other side
      of history is the one doing the renaming.
 
+  c) Do not perform directory rename detection for directories which had no
+     new paths added to them.
+
 Limitations -- support in different commands
 --------------------------------------------
 
diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh
index 00eac6e9a2..06b46af765 100755
--- a/t/t6423-merge-rename-directories.sh
+++ b/t/t6423-merge-rename-directories.sh
@@ -4049,31 +4049,124 @@ test_expect_success '12a: Moving one directory hierarchy into another' '
 	)
 '
 
-# Testcase 12b, Moving two directory hierarchies into each other
+# Testcase 12b1, Moving two directory hierarchies into each other
 #   (Related to testcases 1c and 12c)
 #   Commit O: node1/{leaf1, leaf2}, node2/{leaf3, leaf4}
 #   Commit A: node1/{leaf1, leaf2, node2/{leaf3, leaf4}}
 #   Commit B: node2/{leaf3, leaf4, node1/{leaf1, leaf2}}
-#   Expected: node1/node2/node1/{leaf1, leaf2},
+#   Expected: node1/node2/{leaf3, leaf4}
+#             node2/node1/{leaf1, leaf2}
+#   NOTE: If there were new files added to the old node1/ or node2/ directories,
+#         then we would need to detect renames for those directories and would
+#         find that:
+#             commit A renames node2/ -> node1/node2/
+#             commit B renames node1/ -> node2/node1/
+#         Applying those directory renames to the initial result (making all
+#         four paths experience a transitive renaming), yields
+#             node1/node2/node1/{leaf1, leaf2}
 #             node2/node1/node2/{leaf3, leaf4}
+#         as the result.  It may be really weird to have two directories
+#         rename each other, but simple rules give weird results when given
+#         weird inputs.  HOWEVER, the "If" at the beginning of this NOTE was
+#         false; there were no new files added and thus there is no directory
+#         rename detection to perform.  As such, we just have simple renames
+#         and the expected answer is:
+#             node1/node2/{leaf3, leaf4}
+#             node2/node1/{leaf1, leaf2}
+
+test_setup_12b1 () {
+	test_create_repo 12b1 &&
+	(
+		cd 12b1 &&
+
+		mkdir -p node1 node2 &&
+		echo leaf1 >node1/leaf1 &&
+		echo leaf2 >node1/leaf2 &&
+		echo leaf3 >node2/leaf3 &&
+		echo leaf4 >node2/leaf4 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv node2/ node1/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv node1/ node2/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+}
+
+test_expect_failure '12b1: Moving two directory hierarchies into each other' '
+	test_setup_12b1 &&
+	(
+		cd 12b1 &&
+
+		git checkout A^0 &&
+
+		git -c merge.directoryRenames=true merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:node2/node1/leaf1 \
+			HEAD:node2/node1/leaf2 \
+			HEAD:node1/node2/leaf3 \
+			HEAD:node1/node2/leaf4 &&
+		git rev-parse >expect \
+			O:node1/leaf1 \
+			O:node1/leaf2 \
+			O:node2/leaf3 \
+			O:node2/leaf4 &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 12b2, Moving two directory hierarchies into each other
+#   (Related to testcases 1c and 12c)
+#   Commit O: node1/{leaf1, leaf2}, node2/{leaf3, leaf4}
+#   Commit A: node1/{leaf1, leaf2, leaf5, node2/{leaf3, leaf4}}
+#   Commit B: node2/{leaf3, leaf4, leaf6, node1/{leaf1, leaf2}}
+#   Expected: node1/node2/{node1/{leaf1, leaf2}, leaf6}
+#             node2/node1/{node2/{leaf3, leaf4}, leaf5}
 #   NOTE: Without directory renames, we would expect
-#                   node2/node1/{leaf1, leaf2},
-#                   node1/node2/{leaf3, leaf4}
+#             A: node2/leaf3 -> node1/node2/leaf3
+#             A: node2/leaf4 -> node1/node2/leaf4
+#             A: Adds           node1/leaf5
+#             B: node1/leaf1 -> node2/node1/leaf1
+#             B: node1/leaf2 -> node2/node1/leaf2
+#             B: Adds           node2/leaf6
 #         with directory rename detection, we note that
 #             commit A renames node2/ -> node1/node2/
 #             commit B renames node1/ -> node2/node1/
-#         therefore, applying those directory renames to the initial result
-#         (making all four paths experience a transitive renaming), yields
-#         the expected result.
+#         therefore, applying A's directory rename to the paths added in B gives:
+#             B: node1/leaf1 -> node1/node2/node1/leaf1
+#             B: node1/leaf2 -> node1/node2/node1/leaf2
+#             B: Adds           node1/node2/leaf6
+#         and applying B's directory rename to the paths added in A gives:
+#             A: node2/leaf3 -> node2/node1/node2/leaf3
+#             A: node2/leaf4 -> node2/node1/node2/leaf4
+#             A: Adds           node2/node1/leaf5
+#         resulting in the expected
+#             node1/node2/{node1/{leaf1, leaf2}, leaf6}
+#             node2/node1/{node2/{leaf3, leaf4}, leaf5}
 #
 #         You may ask, is it weird to have two directories rename each other?
 #         To which, I can do no more than shrug my shoulders and say that
 #         even simple rules give weird results when given weird inputs.
 
-test_setup_12b () {
-	test_create_repo 12b &&
+test_setup_12b2 () {
+	test_create_repo 12b2 &&
 	(
-		cd 12b &&
+		cd 12b2 &&
 
 		mkdir -p node1 node2 &&
 		echo leaf1 >node1/leaf1 &&
@@ -4090,43 +4183,51 @@ test_setup_12b () {
 
 		git checkout A &&
 		git mv node2/ node1/ &&
+		echo leaf5 >node1/leaf5 &&
+		git add node1/leaf5 &&
 		test_tick &&
 		git commit -m "A" &&
 
 		git checkout B &&
 		git mv node1/ node2/ &&
+		echo leaf6 >node2/leaf6 &&
+		git add node2/leaf6 &&
 		test_tick &&
 		git commit -m "B"
 	)
 }
 
-test_expect_success '12b: Moving two directory hierarchies into each other' '
-	test_setup_12b &&
+test_expect_success '12b2: Moving two directory hierarchies into each other' '
+	test_setup_12b2 &&
 	(
-		cd 12b &&
+		cd 12b2 &&
 
 		git checkout A^0 &&
 
 		git -c merge.directoryRenames=true merge -s recursive B^0 &&
 
 		git ls-files -s >out &&
-		test_line_count = 4 out &&
+		test_line_count = 6 out &&
 
 		git rev-parse >actual \
 			HEAD:node1/node2/node1/leaf1 \
 			HEAD:node1/node2/node1/leaf2 \
 			HEAD:node2/node1/node2/leaf3 \
-			HEAD:node2/node1/node2/leaf4 &&
+			HEAD:node2/node1/node2/leaf4 \
+			HEAD:node2/node1/leaf5       \
+			HEAD:node1/node2/leaf6       &&
 		git rev-parse >expect \
 			O:node1/leaf1 \
 			O:node1/leaf2 \
 			O:node2/leaf3 \
-			O:node2/leaf4 &&
+			O:node2/leaf4 \
+			A:node1/leaf5 \
+			B:node2/leaf6 &&
 		test_cmp expect actual
 	)
 '
 
-# Testcase 12c, Moving two directory hierarchies into each other w/ content merge
+# Testcase 12c1, Moving two directory hierarchies into each other w/ content merge
 #   (Related to testcase 12b)
 #   Commit O: node1/{       leaf1_1, leaf2_1}, node2/{leaf3_1, leaf4_1}
 #   Commit A: node1/{       leaf1_2, leaf2_2,  node2/{leaf3_2, leaf4_2}}
@@ -4134,13 +4235,13 @@ test_expect_success '12b: Moving two directory hierarchies into each other' '
 #   Expected: Content merge conflicts for each of:
 #               node1/node2/node1/{leaf1, leaf2},
 #               node2/node1/node2/{leaf3, leaf4}
-#   NOTE: This is *exactly* like 12c, except that every path is modified on
+#   NOTE: This is *exactly* like 12b1, except that every path is modified on
 #         each side of the merge.
 
-test_setup_12c () {
-	test_create_repo 12c &&
+test_setup_12c1 () {
+	test_create_repo 12c1 &&
 	(
-		cd 12c &&
+		cd 12c1 &&
 
 		mkdir -p node1 node2 &&
 		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf1\n" >node1/leaf1 &&
@@ -4171,10 +4272,10 @@ test_setup_12c () {
 	)
 }
 
-test_expect_success '12c: Moving one directory hierarchy into another w/ content merge' '
-	test_setup_12c &&
+test_expect_failure '12c1: Moving one directory hierarchy into another w/ content merge' '
+	test_setup_12c1 &&
 	(
-		cd 12c &&
+		cd 12c1 &&
 
 		git checkout A^0 &&
 
@@ -4183,6 +4284,102 @@ test_expect_success '12c: Moving one directory hierarchy into another w/ content
 		git ls-files -u >out &&
 		test_line_count = 12 out &&
 
+		git rev-parse >actual \
+			:1:node2/node1/leaf1 \
+			:1:node2/node1/leaf2 \
+			:1:node1/node2/leaf3 \
+			:1:node1/node2/leaf4 \
+			:2:node2/node1/leaf1 \
+			:2:node2/node1/leaf2 \
+			:2:node1/node2/leaf3 \
+			:2:node1/node2/leaf4 \
+			:3:node2/node1/leaf1 \
+			:3:node2/node1/leaf2 \
+			:3:node1/node2/leaf3 \
+			:3:node1/node2/leaf4 &&
+		git rev-parse >expect \
+			O:node1/leaf1 \
+			O:node1/leaf2 \
+			O:node2/leaf3 \
+			O:node2/leaf4 \
+			A:node1/leaf1 \
+			A:node1/leaf2 \
+			A:node1/node2/leaf3 \
+			A:node1/node2/leaf4 \
+			B:node2/node1/leaf1 \
+			B:node2/node1/leaf2 \
+			B:node2/leaf3 \
+			B:node2/leaf4 &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 12c2, Moving two directory hierarchies into each other w/ content merge
+#   (Related to testcase 12b)
+#   Commit O: node1/{       leaf1_1, leaf2_1}, node2/{leaf3_1, leaf4_1}
+#   Commit A: node1/{       leaf1_2, leaf2_2,  node2/{leaf3_2, leaf4_2}, leaf5}
+#   Commit B: node2/{node1/{leaf1_3, leaf2_3},        leaf3_3, leaf4_3,  leaf6}
+#   Expected: Content merge conflicts for each of:
+#               node1/node2/node1/{leaf1, leaf2}
+#               node2/node1/node2/{leaf3, leaf4}
+#             plus
+#               node2/node1/leaf5
+#               node1/node2/leaf6
+#   NOTE: This is *exactly* like 12b2, except that every path from O is modified
+#         on each side of the merge.
+
+test_setup_12c2 () {
+	test_create_repo 12c2 &&
+	(
+		cd 12c2 &&
+
+		mkdir -p node1 node2 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf1\n" >node1/leaf1 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf2\n" >node1/leaf2 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf3\n" >node2/leaf3 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf4\n" >node2/leaf4 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv node2/ node1/ &&
+		for i in `git ls-files`; do echo side A >>$i; done &&
+		git add -u &&
+		echo leaf5 >node1/leaf5 &&
+		git add node1/leaf5 &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv node1/ node2/ &&
+		for i in `git ls-files`; do echo side B >>$i; done &&
+		git add -u &&
+		echo leaf6 >node2/leaf6 &&
+		git add node2/leaf6 &&
+		test_tick &&
+		git commit -m "B"
+	)
+}
+
+test_expect_success '12c2: Moving one directory hierarchy into another w/ content merge' '
+	test_setup_12c2 &&
+	(
+		cd 12c2 &&
+
+		git checkout A^0 &&
+
+		test_must_fail git -c merge.directoryRenames=true merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 14 out &&
+		git ls-files -u >out &&
+		test_line_count = 12 out &&
+
 		git rev-parse >actual \
 			:1:node1/node2/node1/leaf1 \
 			:1:node1/node2/node1/leaf2 \
@@ -4195,7 +4392,9 @@ test_expect_success '12c: Moving one directory hierarchy into another w/ content
 			:3:node1/node2/node1/leaf1 \
 			:3:node1/node2/node1/leaf2 \
 			:3:node2/node1/node2/leaf3 \
-			:3:node2/node1/node2/leaf4 &&
+			:3:node2/node1/node2/leaf4 \
+			:0:node2/node1/leaf5       \
+			:0:node1/node2/leaf6       &&
 		git rev-parse >expect \
 			O:node1/leaf1 \
 			O:node1/leaf2 \
@@ -4208,7 +4407,9 @@ test_expect_success '12c: Moving one directory hierarchy into another w/ content
 			B:node2/node1/leaf1 \
 			B:node2/node1/leaf2 \
 			B:node2/leaf3 \
-			B:node2/leaf4 &&
+			B:node2/leaf4 \
+			A:node1/leaf5 \
+			B:node2/leaf6 &&
 		test_cmp expect actual
 	)
 '
-- 
gitgitgadget
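
For readers tracing the 12b2 comment block in this patch, here is a minimal sketch of the path arithmetic it describes. `apply_dir_rename` is a hypothetical helper written for illustration, not a git function. It only models the final prefix rewrite, assuming the directory renames (node2/ -> node1/node2/ on side A, node1/ -> node2/node1/ on side B) have already been detected. It says nothing about how merge-recursive or merge-ort detect them.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): if PATH lives under OLD, move it
# under NEW; otherwise print it unchanged.
# usage: apply_dir_rename OLD/ NEW/ PATH
apply_dir_rename () {
	old=$1
	new=$2
	path=$3
	case "$path" in
	"$old"*)
		# Strip the old directory prefix and prepend the new one.
		rest=${path#"$old"}
		printf '%s\n' "$new$rest"
		;;
	*)
		# Path is outside the renamed directory; leave it alone.
		printf '%s\n' "$path"
		;;
	esac
}

# Side A renamed node2/ -> node1/node2/, so the paths B ended up with
# under node2/ are moved along with it.
for p in node2/node1/leaf1 node2/node1/leaf2 node2/leaf6
do
	apply_dir_rename node2/ node1/node2/ "$p"
done

# Side B renamed node1/ -> node2/node1/, so the paths A ended up with
# under node1/ are moved along with it.
for p in node1/node2/leaf3 node1/node2/leaf4 node1/leaf5
do
	apply_dir_rename node1/ node2/node1/ "$p"
done
```

The six paths printed are the ones the 12b2 test looks up with `git rev-parse HEAD:<path>` after the merge.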


* Re: [PATCH 2/4] t6423: more involved directory rename test
  2020-10-15 20:46 ` [PATCH 2/4] t6423: more involved directory rename test Elijah Newren via GitGitGadget
@ 2020-10-15 20:57   ` Eric Sunshine
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Sunshine @ 2020-10-15 20:57 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Git List, Elijah Newren

On Thu, Oct 15, 2020 at 4:46 PM Elijah Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> Add a new testcase modelled on a real world repository example that
> served multiple purposes:

A couple minor style nits I noticed while quickly running my eye over
the patch...

> +               for i in `test_seq 1 88`; do
> +                       echo content $i >dir/unchanged/file_$i
> +               done &&

'do' on its own line, and prefer $(...) over `...`:

    for i in $(test_seq 1 88)
    do
        echo content $i >dir/unchanged/file_$i
    done &&

(Not necessarily worth a re-roll, though.)
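
To try the suggested loop style outside the test suite, a standalone sketch follows. It assumes `seq` as a stand-in for `test_seq` (which only exists inside git's test harness) and a scratch directory from `mktemp -d` in place of the test repository. Both substitutions are assumptions made for illustration.

```shell
#!/bin/sh
# Scratch directory standing in for the test repository (assumption).
dir=$(mktemp -d) &&
mkdir -p "$dir/unchanged" &&

# 'do' on its own line, and $(...) rather than backticks.
# seq is used here only because test_seq is unavailable outside t/.
for i in $(seq 1 88)
do
	echo "content $i" >"$dir/unchanged/file_$i"
done &&

# Report where the files were written.
echo "wrote files under $dir/unchanged"
```

The same two nits would apply to the one-line `for i in ...; do ...; done` loops that use backticks around `git ls-files` in the 12c2 setup of patch 4/4.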


