git@vger.kernel.org list mirror (unofficial, one of many)
* [PATCH v10 00/36] Add directory rename detection to git
@ 2018-04-19 17:57 Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 01/36] directory rename detection: basic testcases Elijah Newren
                   ` (37 more replies)
  0 siblings, 38 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

This series is a reboot of the directory rename detection series that was
merged to master and then reverted due to the final patch having a buggy
can-skip-update check, as noted at
  https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
This series is based on top of master.

This updated series fixes the problem found with the previous series, and
also fixes Linus' issue with unnecessary rebuilds noted at
  https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fXJRkvU1m_RHBG54NOoaZPA@mail.gmail.com/

For the original details about design considerations surrounding
directory rename detection, see
  https://public-inbox.org/git/20171110190550.27059-1-newren@gmail.com/

Patches 1--28 are identical to what was previously merged to master,
modulo trivial compilation fixes due to the fact that I've rebased on
master which now includes commit 916bc35b29af ("tree-walk: convert tree
entry functions to object_id", 2018-03-12).  As such, I've retained the
Reviewed-by and Signed-off-by tags for these first 28 patches.  (The
final patch of the original series, patch 29, has been rewritten and
replaced in this series.)

The remaining eight patches are new; a brief summary:

  merge-recursive: improve add_cacheinfo error handling
  merge-recursive: move more is_dirty handling to merge_content
  merge-recursive: avoid triggering add_cacheinfo error with dirty mod

    When Junio was bitten by the previous series, the code reached a
    detected error state that should never be hit in production.  That
    was bad enough, but the problem was compounded because the code
    simply printed a vague, not-very-scary-sounding error and returned
    an error code that the caller ignored.  The caller then went on to
    handle other paths, which could print messages that scrolled the
    error off the screen, and the merge could even be reported as
    "clean".  Fix the error handling, and then deal with the breakage
    of one particular test that was triggering this codepath.

  t6046: testcases checking whether updates can be skipped in a merge

    Add a fairly comprehensive set of tests for the skipability of
    working tree updates.

  merge-recursive: fix was_tracked() to quit lying with some renamed
    paths
  merge-recursive: fix remainder of was_dirty() to use original index

    Instead of using the current index as a (rather imperfect) proxy for
    the state of the index just before the merge, keep a copy of the
    original index around so we can get correct answers to whether
    certain paths were tracked or dirty before the merge.

  merge-recursive: make "Auto-merging" comment show for other merges
  merge-recursive: fix check for skipability of working tree updates

    Fix and simplify the skipability check.  Due to some tests being
    picky about output, the first of these two patches exists to avoid
    triggering the "Auto-merging $FILE" message too often with the
    simplified logic; in the process, it fixes a pair of existing issues
    with when those messages are shown, making it more accurate in
    general.

Additional testing:

  * I've re-merged all ~13k merge commits in git.git with both
    git-2.17.0 and this version of git, comparing the results to each
    other in detail, including stdout & stderr as well as the output
    of subsequent commands like `git status`, `git ls-files -s`, `git
    diff -M`, and `git diff -M --staged`.  The only differences were in
    23 merges of either git-gui or gitk which involved directory renames
    (e.g. git-2.17.0's merge would result in files like 'lib/tools.tcl'
    or 'po/ru.po' instead of the expected 'git-gui/lib/tools.tcl' or
    'gitk-git/po/ru.po').

  * I'm trying to do the same with linux.git, but it looks like that will
    take nearly a week to complete...
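
As a rough illustration of the re-merge testing described above (this is
not the script used for this series, and it is much less thorough), a
minimal sketch might look like the following.  It assumes a clean work
tree, and `remerge_check` is a hypothetical helper name.  Instead of
comparing two git versions, it only compares each re-done merge against
the tree recorded in history.

```shell
#!/bin/sh
# Sketch: redo every two-parent merge reachable from a tip and report
# whether the automatic result matches the tree recorded in history.
# remerge_check is a hypothetical helper name, not part of git.
remerge_check () {
	tip=${1:-HEAD}
	git rev-list --merges --max-parents=2 "$tip" |
	while read -r merge
	do
		# Start from the first parent, on a detached HEAD.
		git checkout -q --detach "$merge^1" || break
		if git merge -q --no-commit --no-ff "$merge^2" \
			</dev/null >/dev/null 2>&1
		then
			# Clean merge: compare the index tree to the recorded one.
			if test "$(git write-tree)" = "$(git rev-parse "$merge^{tree}")"
			then
				echo "same $merge"
			else
				echo "differs $merge"
			fi
		else
			# Conflicts (possibly resolved by hand in the original merge).
			echo "conflict $merge"
		fi
		# Discard the in-progress merge before the next iteration.
		git merge --abort 2>/dev/null || git reset -q --hard
	done
}
```

Running `remerge_check HEAD` in a scratch clone prints one line per merge.
Note that it leaves HEAD detached and discards uncommitted changes, so it
should not be run in a work tree you care about.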

My biggest question:

  * Is there any other testing others would like to see, in order to avoid
    a repeat of the pain from my previous series and allow us to safely
    merge this newer one?

Elijah Newren (36):
  directory rename detection: basic testcases
  directory rename detection: directory splitting testcases
  directory rename detection: testcases to avoid taking detection too
    far
  directory rename detection: partially renamed directory
    testcase/discussion
  directory rename detection: files/directories in the way of some
    renames
  directory rename detection: testcases checking which side did the
    rename
  directory rename detection: more involved edge/corner testcases
  directory rename detection: testcases exploring possibly suboptimal
    merges
  directory rename detection: miscellaneous testcases to complete
    coverage
  directory rename detection: tests for handling overwriting untracked
    files
  directory rename detection: tests for handling overwriting dirty files
  merge-recursive: move the get_renames() function
  merge-recursive: introduce new functions to handle rename logic
  merge-recursive: fix leaks of allocated renames and diff_filepairs
  merge-recursive: make !o->detect_rename codepath more obvious
  merge-recursive: split out code for determining diff_filepairs
  merge-recursive: make a helper function for cleanup for handle_renames
  merge-recursive: add get_directory_renames()
  merge-recursive: check for directory level conflicts
  merge-recursive: add computation of collisions due to dir rename &
    merging
  merge-recursive: check for file level conflicts then get new name
  merge-recursive: when comparing files, don't include trees
  merge-recursive: apply necessary modifications for directory renames
  merge-recursive: avoid clobbering untracked files with directory
    renames
  merge-recursive: fix overwriting dirty files involved in renames
  merge-recursive: fix remaining directory rename + dirty overwrite
    cases
  directory rename detection: new testcases showcasing a pair of bugs
  merge-recursive: avoid spurious rename/rename conflict from dir
    renames
  merge-recursive: improve add_cacheinfo error handling
  merge-recursive: move more is_dirty handling to merge_content
  merge-recursive: avoid triggering add_cacheinfo error with dirty mod
  t6046: testcases checking whether updates can be skipped in a merge
  merge-recursive: fix was_tracked() to quit lying with some renamed
    paths
  merge-recursive: fix remainder of was_dirty() to use original index
  merge-recursive: make "Auto-merging" comment show for other merges
  merge-recursive: fix check for skipability of working tree updates

 merge-recursive.c                      | 1432 ++++++++-
 merge-recursive.h                      |   28 +
 strbuf.c                               |   16 +
 strbuf.h                               |   16 +
 t/t3501-revert-cherry-pick.sh          |    7 +-
 t/t6022-merge-rename.sh                |    2 +-
 t/t6043-merge-rename-directories.sh    | 3998 ++++++++++++++++++++++++
 t/t6046-merge-skip-unneeded-updates.sh |  761 +++++
 t/t7607-merge-overwrite.sh             |    2 +-
 unpack-trees.c                         |    4 +-
 unpack-trees.h                         |    4 +
 11 files changed, 6092 insertions(+), 178 deletions(-)
 create mode 100755 t/t6043-merge-rename-directories.sh
 create mode 100755 t/t6046-merge-skip-unneeded-updates.sh

-- 
2.17.0.290.ge988e9ce2a


^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH v10 01/36] directory rename detection: basic testcases
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 02/36] directory rename detection: directory splitting testcases Elijah Newren
                   ` (36 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 442 ++++++++++++++++++++++++++++
 1 file changed, 442 insertions(+)
 create mode 100755 t/t6043-merge-rename-directories.sh

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
new file mode 100755
index 0000000000..d045f0e31e
--- /dev/null
+++ b/t/t6043-merge-rename-directories.sh
@@ -0,0 +1,442 @@
+#!/bin/sh
+
+test_description="recursive merge with directory renames"
+# includes checking of many corner cases, with a similar methodology to:
+#   t6042: corner cases with renames but not criss-cross merges
+#   t6036: corner cases with both renames and criss-cross merges
+#
+# The setup for all of them, pictorially, is:
+#
+#      A
+#      o
+#     / \
+#  O o   ?
+#     \ /
+#      o
+#      B
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits O, A, and B contain.
+#
+# Notation:
+#    z/{b,c}   means  files z/b and z/c both exist
+#    x/d_1     means  file x/d exists with content d1.  (Purpose of the
+#                     underscore notation is to differentiate different
+#                     files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###########################################################################
+# SECTION 1: Basic cases we should be able to handle
+###########################################################################
+
+# Testcase 1a, Basic directory rename.
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d,e/f}
+#   Expected: y/{b,c,d,e/f}
+
+test_expect_success '1a-setup: Simple directory rename detection' '
+	test_create_repo 1a &&
+	(
+		cd 1a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo d >z/d &&
+		mkdir z/e &&
+		echo f >z/e/f &&
+		git add z/d z/e/f &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1a-check: Simple directory rename detection' '
+	(
+		cd 1a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e/f &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/d    B:z/e/f &&
+		test_cmp expect actual &&
+
+		git hash-object y/d >actual &&
+		git rev-parse B:z/d >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:z/d &&
+		test_must_fail git rev-parse HEAD:z/e/f &&
+		test_path_is_missing z/d &&
+		test_path_is_missing z/e/f
+	)
+'
+
+# Testcase 1b, Merge a directory with another
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e}, y/d
+#   Commit B: y/{b,c,d}
+#   Expected: y/{b,c,d,e}
+
+test_expect_success '1b-setup: Merge a directory with another' '
+	test_create_repo 1b &&
+	(
+		cd 1b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir y &&
+		echo d >y/d &&
+		git add z y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo e >z/e &&
+		git add z/e &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z/b y &&
+		git mv z/c y &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1b-check: Merge a directory with another' '
+	(
+		cd 1b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:y/d    A:z/e &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:z/e
+	)
+'
+
+# Testcase 1c, Transitive renaming
+#   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
+#   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   Commit O: z/{b,c},   x/d
+#   Commit A: y/{b,c},   x/d
+#   Commit B: z/{b,c,d}
+#   Expected: y/{b,c,d}  (because x/d -> z/d -> y/d)
+
+test_expect_success '1c-setup: Transitive renaming' '
+	test_create_repo 1c &&
+	(
+		cd 1c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1c-check: Transitive renaming' '
+	(
+		cd 1c &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:x/d &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:x/d &&
+		test_must_fail git rev-parse HEAD:z/d &&
+		test_path_is_missing z/d
+	)
+'
+
+# Testcase 1d, Directory renames (merging two directories into one new one)
+#              cause a rename/rename(2to1) conflict
+#   (Related to testcases 1c and 7b)
+#   Commit O: z/{b,c},        y/{d,e}
+#   Commit A: x/{b,c},        y/{d,e,m,wham_1}
+#   Commit B: z/{b,c,n,wham_2}, x/{d,e}
+#   Expected: x/{b,c,d,e,m,n}, CONFLICT:(y/wham_1 & z/wham_2 -> x/wham)
+#   Note: y/m & z/n should definitely move into x.  By the same token, both
+#         y/wham_1 & z/wham_2 should too...giving us a conflict.
+
+test_expect_success '1d-setup: Directory renames cause a rename/rename(2to1) conflict' '
+	test_create_repo 1d &&
+	(
+		cd 1d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir y &&
+		echo d >y/d &&
+		echo e >y/e &&
+		git add z y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z x &&
+		echo m >y/m &&
+		echo wham1 >y/wham &&
+		git add y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv y x &&
+		echo n >z/n &&
+		echo wham2 >z/wham &&
+		git add z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1d-check: Directory renames cause a rename/rename(2to1) conflict' '
+	(
+		cd 1d &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 8 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:x/b :0:x/c :0:x/d :0:x/e :0:x/m :0:x/n &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:y/d  O:y/e  A:y/m  B:z/n &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :0:x/wham &&
+		git rev-parse >actual \
+			:2:x/wham :3:x/wham &&
+		git rev-parse >expect \
+			 A:y/wham  B:z/wham &&
+		test_cmp expect actual &&
+
+		test_path_is_missing x/wham &&
+		test_path_is_file x/wham~HEAD &&
+		test_path_is_file x/wham~B^0 &&
+
+		git hash-object >actual \
+			x/wham~HEAD x/wham~B^0 &&
+		git rev-parse >expect \
+			A:y/wham    B:z/wham &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 1e, Renamed directory, with all filenames being renamed too
+#   Commit O: z/{oldb,oldc}
+#   Commit A: y/{newb,newc}
+#   Commit B: z/{oldb,oldc,d}
+#   Expected: y/{newb,newc,d}
+
+test_expect_success '1e-setup: Renamed directory, with all files being renamed too' '
+	test_create_repo 1e &&
+	(
+		cd 1e &&
+
+		mkdir z &&
+		echo b >z/oldb &&
+		echo c >z/oldc &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		git mv z/oldb y/newb &&
+		git mv z/oldc y/newc &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1e-check: Renamed directory, with all files being renamed too' '
+	(
+		cd 1e &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/newb HEAD:y/newc HEAD:y/d &&
+		git rev-parse >expect \
+			O:z/oldb    O:z/oldc    B:z/d &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:z/d
+	)
+'
+
+# Testcase 1f, Split a directory into two other directories
+#   (Related to testcases 3a, all of section 2, and all of section 4)
+#   Commit O: z/{b,c,d,e,f}
+#   Commit A: z/{b,c,d,e,f,g}
+#   Commit B: y/{b,c}, x/{d,e,f}
+#   Expected: y/{b,c}, x/{d,e,f,g}
+
+test_expect_success '1f-setup: Split a directory into two other directories' '
+	test_create_repo 1f &&
+	(
+		cd 1f &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		echo e >z/e &&
+		echo f >z/f &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo g >z/g &&
+		git add z/g &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		mkdir x &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d x/ &&
+		git mv z/e x/ &&
+		git mv z/f x/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1f-check: Split a directory into two other directories' '
+	(
+		cd 1f &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:x/d HEAD:x/e HEAD:x/f HEAD:x/g &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:z/d    O:z/e    O:z/f    A:z/g &&
+		test_cmp expect actual &&
+		test_path_is_missing z/g &&
+		test_must_fail git rev-parse HEAD:z/g
+	)
+'
+
+###########################################################################
+# Rules suggested by testcases in section 1:
+#
+#   We should still detect the directory rename even if it wasn't just
+#   the directory that was renamed, but also the files within it (see 1e).
+#
+#   If renames split a directory into two or more others, the directory
+#   with the most renames "wins" (see 1f).  However, see the testcases
+#   in section 2, plus testcases 3a and 4a.
+###########################################################################
+
+test_done
-- 
2.17.0.290.ge988e9ce2a
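
For readers who want to try testcase 1a outside the test harness, the
setup above can be approximated in plain shell.  This is only a sketch:
`demo_1a` is a hypothetical helper, the identity settings are
placeholders, and `merge.directoryRenames` is a configuration key from
later git releases (a git that predates it simply ignores the setting).
The result depends on whether the installed git implements directory
rename detection.

```shell
#!/bin/sh
# Stand-alone sketch of testcase 1a without test-lib.sh.
# demo_1a is a hypothetical helper; it prints the merged index paths.
demo_1a () {
	dir=$(mktemp -d) || return 1
	(
		cd "$dir" &&
		git init -q . &&
		git config user.name "Demo" &&
		git config user.email "demo@example.com" &&

		# Commit O: z/{b,c}
		mkdir z &&
		echo b >z/b &&
		echo c >z/c &&
		git add z &&
		git commit -q -m O &&
		git branch A &&
		git branch B &&

		# Commit A: rename the whole directory z/ to y/
		git checkout -q A &&
		git mv z y &&
		git commit -q -m A &&

		# Commit B: add z/d and z/e/f
		git checkout -q B &&
		echo d >z/d &&
		mkdir z/e &&
		echo f >z/e/f &&
		git add z/d z/e/f &&
		git commit -q -m B &&

		git checkout -q A^0 || exit 1

		# Merge B into A, asking for directory renames to be applied.
		git -c merge.directoryRenames=true \
			merge -q -s recursive -m merge B^0 >/dev/null 2>&1 ||
			echo "merge did not complete cleanly" >&2

		# List the paths in the resulting index.
		git ls-files
	)
}
```

With directory rename detection in effect, the new files from B should be
listed under y/ rather than z/, matching the "Expected" line of the
testcase.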



* [PATCH v10 02/36] directory rename detection: directory splitting testcases
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 01/36] directory rename detection: basic testcases Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 03/36] directory rename detection: testcases to avoid taking detection too far Elijah Newren
                   ` (35 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 143 ++++++++++++++++++++++++++++
 1 file changed, 143 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index d045f0e31e..b22a9052b3 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -439,4 +439,147 @@ test_expect_failure '1f-check: Split a directory into two other directories' '
 #   in section 2, plus testcases 3a and 4a.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 2: Split into multiple directories, with equal number of paths
+#
+# Explore the splitting-a-directory rules a bit; what happens in the
+# edge cases?
+#
+# Note that there is a closely related case of a directory not being
+# split on either side of history, but being renamed differently on
+# each side.  See testcase 8e for that.
+###########################################################################
+
+# Testcase 2a, Directory split into two on one side, with equal numbers of paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c,d}
+#   Expected: y/b, w/c, z/d, with warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2a-setup: Directory split into two on one side, with equal numbers of paths' '
+	test_create_repo 2a &&
+	(
+		cd 2a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		mkdir w &&
+		git mv z/b y/ &&
+		git mv z/c w/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '2a-check: Directory split into two on one side, with equal numbers of paths' '
+	(
+		cd 2a &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT.*directory rename split" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:w/c :0:z/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 2b, Directory split into two on one side, with equal numbers of paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c}, x/d
+#   Expected: y/b, w/c, x/d; No warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2b-setup: Directory split into two on one side, with equal numbers of paths' '
+	test_create_repo 2b &&
+	(
+		cd 2b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		mkdir w &&
+		git mv z/b y/ &&
+		git mv z/c w/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir x &&
+		echo d >x/d &&
+		git add x/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '2b-check: Directory split into two on one side, with equal numbers of paths' '
+	(
+		cd 2b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 >out &&
+		test_i18ngrep ! "CONFLICT.*directory rename split" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:w/c :0:x/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:x/d &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# Rules suggested by section 2:
+#
+#   None; the rule was already covered in section 1.  These testcases are
+#   here just to make sure the conflict resolution and necessary warning
+#   messages are handled correctly.
+###########################################################################
+
 test_done
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 03/36] directory rename detection: testcases to avoid taking detection too far
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 01/36] directory rename detection: basic testcases Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 02/36] directory rename detection: directory splitting testcases Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 04/36] directory rename detection: partially renamed directory testcase/discussion Elijah Newren
                   ` (34 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 153 ++++++++++++++++++++++++++++
 1 file changed, 153 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index b22a9052b3..8049ed5fc9 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -582,4 +582,157 @@ test_expect_success '2b-check: Directory split into two on one side, with equal
 #   messages are handled correctly.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 3: Path in question is the source path for some rename already
+#
+# Combining cases from Section 1 and trying to handle them could lead to
+# directory renaming detection being over-applied.  So, this section
+# provides some good testcases to check that the implementation doesn't go
+# too far.
+###########################################################################
+
+# Testcase 3a, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 1c and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: z/{b,c,d} (no change)
+#   Commit B: y/{b,c}, x/d
+#   Expected: y/{b,c}, x/d
+test_expect_success '3a-setup: Avoid implicit rename if involved as source on other side' '
+	test_create_repo 3a &&
+	(
+		cd 3a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		git commit --allow-empty -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		mkdir x &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d x/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '3a-check: Avoid implicit rename if involved as source on other side' '
+	(
+		cd 3a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:x/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:z/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 3b, Avoid implicit rename if involved as source on current side
+#   (Related to testcases 5c and 7c, also kind of 1e and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}, x/d
+#   Commit B: z/{b,c}, w/d
+#   Expected: y/{b,c}, CONFLICT:(z/d -> x/d vs. w/d)
+#   NOTE: We're particularly checking that, since z/d is already involved as
+#         a source in a file rename on the same side of history, we don't
+#         get it involved in directory rename detection.  If it were, we might
+#         end up with CONFLICT:(z/d -> y/d vs. x/d vs. w/d), i.e. a
+#         rename/rename/rename(1to3) conflict, which is just weird.
+test_expect_success '3b-setup: Avoid implicit rename if involved as source on current side' '
+	test_create_repo 3b &&
+	(
+		cd 3b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		mkdir x &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d x/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir w &&
+		git mv z/d w/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '3b-check: Avoid implicit rename if involved as source on current side' '
+	(
+		cd 3b &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep CONFLICT.*rename/rename.*z/d.*x/d.*w/d out &&
+		test_i18ngrep ! CONFLICT.*rename/rename.*y/d out &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 3 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :1:z/d :2:x/d :3:w/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:z/d  O:z/d  O:z/d &&
+		test_cmp expect actual &&
+
+		test_path_is_missing z/d &&
+		git hash-object >actual \
+			x/d   w/d &&
+		git rev-parse >expect \
+			O:z/d O:z/d &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# Rules suggested by section 3:
+#
+#   Avoid directory-rename-detection for a path, if that path is the source
+#   of a rename on either side of a merge.
+###########################################################################
+
 test_done
-- 
2.17.0.290.ge988e9ce2a
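
The 3b check above inspects the conflict through index stages (`:1:`,
`:2:`, `:3:`).  As a hedged illustration of what those stages look like,
the following plain-shell sketch rebuilds the 3b history and prints the
stage number and path of every unmerged entry.  `demo_3b` is a
hypothetical helper and the identity settings are placeholders; it is not
part of the test suite.

```shell
#!/bin/sh
# Stand-alone sketch of testcase 3b; demo_3b is a hypothetical helper.
# It prints "<stage> <path>" for every unmerged index entry.
demo_3b () {
	dir=$(mktemp -d) || return 1
	(
		cd "$dir" &&
		git init -q . &&
		git config user.name "Demo" &&
		git config user.email "demo@example.com" &&

		# Commit O: z/{b,c,d}
		mkdir z &&
		echo b >z/b &&
		echo c >z/c &&
		echo d >z/d &&
		git add z &&
		git commit -q -m O &&
		git branch A &&
		git branch B &&

		# Commit A: y/{b,c}, x/d
		git checkout -q A &&
		mkdir y x &&
		git mv z/b z/c y/ &&
		git mv z/d x/ &&
		git commit -q -m A &&

		# Commit B: z/{b,c}, w/d
		git checkout -q B &&
		mkdir w &&
		git mv z/d w/ &&
		git commit -q -m B &&

		git checkout -q A^0 || exit 1

		# The merge is expected to stop with a rename/rename conflict,
		# so its exit status is deliberately not checked.
		git merge -s recursive -m merge B^0 >/dev/null 2>&1

		# ls-files -u prints "<mode> <sha> <stage>\t<path>".
		git ls-files -u | awk '{ print $3, $4 }'
	)
}
```

Stage 1 holds the merge base's version, stage 2 the version from the
current side (A), and stage 3 the version from the side being merged (B),
which is why the check compares `:1:z/d`, `:2:x/d` and `:3:w/d` against
`O:z/d`.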



* [PATCH v10 04/36] directory rename detection: partially renamed directory testcase/discussion
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (2 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 03/36] directory rename detection: testcases to avoid taking detection too far Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 05/36] directory rename detection: files/directories in the way of some renames Elijah Newren
                   ` (33 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Add a long note about why we are not considering "partial directory
renames" for the current directory rename detection implementation.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 115 ++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 8049ed5fc9..713ad2b75e 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -735,4 +735,119 @@ test_expect_success '3b-check: Avoid implicit rename if involved as source on cu
 #   of a rename on either side of a merge.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 4: Partially renamed directory; still exists on both sides of merge
+#
+# What if we were to attempt to do directory rename detection when someone
+# "mostly" moved a directory but still left some files around, or,
+# equivalently, fully renamed a directory in one commit and then recreated
+# that directory in a later commit adding some new files and then tried to
+# merge?
+#
+# It's hard to divine user intent in these cases, because you can make an
+# argument that, depending on the intermediate history of the side being
+# merged, some users will want files in that directory to
+# automatically be detected and renamed, while users with a different
+# intermediate history wouldn't want that rename to happen.
+#
+# I think that it is best to simply not have directory rename detection
+# apply to such cases.  My reasoning for this is four-fold: (1) it's
+# easiest for users in general to figure out what happened if we don't
+# apply directory rename detection in any such case, (2) it's an easy rule
+# to explain ["We don't do directory rename detection if the directory
+# still exists on both sides of the merge"], (3) we can get some hairy
+# edge/corner cases that would be really confusing and possibly not even
+# representable in the index if we were to even try, and [related to 3] (4)
+# attempting to resolve this issue of divining user intent by examining
+# intermediate history goes against the spirit of three-way merges and is a
+# path towards crazy corner cases that are far more complex than what we're
+# already dealing with.
+#
+# Note that the wording of the rule ("We don't do directory rename
+# detection if the directory still exists on both sides of the merge.")
+# also excludes "renaming" of a directory into a subdirectory of itself
+# (e.g. /some/dir/* -> /some/dir/subdir/*).  It may be possible to carve
+# out an exception for "renaming"-beneath-itself cases without opening
+# weird edge/corner cases for other partial directory renames, but for now
+# we are keeping the rule simple.
+#
+# This section contains a test for a partially-renamed-directory case.
+###########################################################################
+
+# Testcase 4a, Directory split, with original directory still present
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d,e}
+#   Commit A: y/{b,c,d}, z/e
+#   Commit B: z/{b,c,d,e,f}
+#   Expected: y/{b,c,d}, z/{e,f}
+#   NOTE: Even though most files from z moved to y, we don't want f to follow.
+
+test_expect_success '4a-setup: Directory split, with original directory still present' '
+	test_create_repo 4a &&
+	(
+		cd 4a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		echo e >z/e &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo f >z/f &&
+		git add z/f &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '4a-check: Directory split, with original directory still present' '
+	(
+		cd 4a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:z/e HEAD:z/f &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:z/d    O:z/e    B:z/f &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# Rules suggested by section 4:
+#
+#   Directory-rename-detection should be turned off for any directories (as
+#   a source for renames) that exist on both sides of the merge.  (The "as
+#   a source for renames" clarification is due to cases like 1c where
+#   the target directory exists on both sides and we do want the rename
+#   detection.)  But, sadly, see testcase 8b.
+###########################################################################
+
 test_done
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 05/36] directory rename detection: files/directories in the way of some renames
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (3 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 04/36] directory rename detection: partially renamed directory testcase/discussion Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 06/36] directory rename detection: testcases checking which side did the rename Elijah Newren
                   ` (32 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 330 ++++++++++++++++++++++++++++
 1 file changed, 330 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 713ad2b75e..b469c807c2 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -850,4 +850,334 @@ test_expect_success '4a-check: Directory split, with original directory still pr
 #   detection.)  But, sadly, see testcase 8b.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 5: Files/directories in the way of subset of to-be-renamed paths
+#
+# Implicitly renaming files due to a detected directory rename could run
+# into problems if there are files or directories in the way of the paths
+# we want to rename.  Explore such cases in this section.
+###########################################################################
+
+# Testcase 5a, Merge directories, other side adds files to original and target
+#   Commit O: z/{b,c},       y/d
+#   Commit A: z/{b,c,e_1,f}, y/{d,e_2}
+#   Commit B: y/{b,c,d}
+#   Expected: z/e_1, y/{b,c,d,e_2,f} + CONFLICT warning
+#   NOTE: While directory rename detection is active here causing z/f to
+#         become y/f, we did not apply this for z/e_1 because that would
+#         give us an add/add conflict for y/e_1 vs y/e_2.  The problem with
+#         this add/add is that both versions of y/e are from the same side
+#         of history, giving us no way to represent this conflict in the
+#         index.
+
+test_expect_success '5a-setup: Merge directories, other side adds files to original and target' '
+	test_create_repo 5a &&
+	(
+		cd 5a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir y &&
+		echo d >y/d &&
+		git add z y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo e1 >z/e &&
+		echo f >z/f &&
+		echo e2 >y/e &&
+		git add z/e z/f y/e &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '5a-check: Merge directories, other side adds files to original and target' '
+	(
+		cd 5a &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT.*implicit dir rename" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:y/d :0:y/e :0:z/e :0:y/f &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:y/d  A:y/e  A:z/e  A:z/f &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 5b, Rename/delete in order to get add/add/add conflict
+#   (Related to testcase 8d; these may appear slightly inconsistent to users;
+#    Also related to testcases 7d and 7e)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: y/{b,c,d_2}
+#   Commit B: z/{b,c,d_1,e}, y/d_3
+#   Expected: y/{b,c,e}, CONFLICT(add/add: y/d_2 vs. y/d_3)
+#   NOTE: If z/d_1 in commit B were to be involved in dir rename detection, as
+#         we normally would since z/ is being renamed to y/, then this would be
+#         a rename/delete (z/d_1 -> y/d_1 vs. deleted) AND an add/add/add
+#         conflict of y/d_1 vs. y/d_2 vs. y/d_3.  Add/add/add is not
+#         representable in the index, so the existence of y/d_3 needs to
+#         cause us to bail on directory rename detection for that path, falling
+#         back to git behavior without the directory rename detection.
+
+test_expect_success '5b-setup: Rename/delete in order to get add/add/add conflict' '
+	test_create_repo 5b &&
+	(
+		cd 5b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d1 >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/d &&
+		git mv z y &&
+		echo d2 >y/d &&
+		git add y/d &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		echo d3 >y/d &&
+		echo e >z/e &&
+		git add y/d z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '5b-check: Rename/delete in order to get add/add/add conflict' '
+	(
+		cd 5b &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (add/add).* y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:y/e :2:y/d :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/e  A:y/d  B:y/d &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :1:y/d &&
+		test_path_is_file y/d
+	)
+'
+
+# Testcase 5c, Transitive rename would cause rename/rename/rename/add/add/add
+#   (Directory rename detection would result in transitive rename vs.
+#    rename/rename(1to2) and turn it into a rename/rename(1to3).  Further,
+#    rename paths conflict with separate adds on the other side)
+#   (Related to testcases 3b and 7c)
+#   Commit O: z/{b,c}, x/d_1
+#   Commit A: y/{b,c,d_2}, w/d_1
+#   Commit B: z/{b,c,d_1,e}, w/d_3, y/d_4
+#   Expected: A mess, but only a rename/rename(1to2)/add/add mess.  Use the
+#             presence of y/d_4 in B to avoid doing transitive rename of
+#             x/d_1 -> z/d_1 -> y/d_1, so that the only paths we have at
+#             y/d are y/d_2 and y/d_4.  We still do the move from z/e to y/e,
+#             though, because it doesn't have anything in the way.
+
+test_expect_success '5c-setup: Transitive rename would cause rename/rename/rename/add/add/add' '
+	test_create_repo 5c &&
+	(
+		cd 5c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d1 >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		echo d2 >y/d &&
+		git add y/d &&
+		git mv x w &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		mkdir w &&
+		mkdir y &&
+		echo d3 >w/d &&
+		echo d4 >y/d &&
+		echo e >z/e &&
+		git add w/ y/ z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '5c-check: Transitive rename would cause rename/rename/rename/add/add/add' '
+	(
+		cd 5c &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename).*x/d.*w/d.*z/d" out &&
+		test_i18ngrep "CONFLICT (add/add).* y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 9 out &&
+		git ls-files -u >out &&
+		test_line_count = 6 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:y/e &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/e &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :1:y/d &&
+		git rev-parse >actual \
+			:2:w/d :3:w/d :1:x/d :2:y/d :3:y/d :3:z/d &&
+		git rev-parse >expect \
+			 O:x/d  B:w/d  O:x/d  A:y/d  B:y/d  O:x/d &&
+		test_cmp expect actual &&
+
+		git hash-object >actual \
+			w/d~HEAD w/d~B^0 z/d &&
+		git rev-parse >expect \
+			O:x/d    B:w/d   O:x/d &&
+		test_cmp expect actual &&
+		test_path_is_missing x/d &&
+		test_path_is_file y/d &&
+		grep -q "<<<<" y/d  # conflict markers should be present
+	)
+'
+
+# Testcase 5d, Directory/file/file conflict due to directory rename
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c,d_1}
+#   Commit B: z/{b,c,d_2,f}, y/d/e
+#   Expected: y/{b,c,d/e,f}, z/d_2, CONFLICT(file/directory), y/d_1~HEAD
+#   Note: The fact that y/d/ exists in B makes us bail on directory rename
+#         detection for z/d_2, but that doesn't prevent us from applying the
+#         directory rename detection for z/f -> y/f.
+
+test_expect_success '5d-setup: Directory/file/file conflict due to directory rename' '
+	test_create_repo 5d &&
+	(
+		cd 5d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		echo d1 >y/d &&
+		git add y/d &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir -p y/d &&
+		echo e >y/d/e &&
+		echo d2 >z/d &&
+		echo f >z/f &&
+		git add y/d/e z/d z/f &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '5d-check: Directory/file/file conflict due to directory rename' '
+	(
+		cd 5d &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (file/directory).*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 2 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:z/d :0:y/f :2:y/d :0:y/d/e &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/d  B:z/f  A:y/d  B:y/d/e &&
+		test_cmp expect actual &&
+
+		git hash-object y/d~HEAD >actual &&
+		git rev-parse A:y/d >expect &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# Rules suggested by section 5:
+#
+#   If a subset of to-be-renamed files have a file or directory in the way,
+#   "turn off" the directory rename for those specific sub-paths, falling
+#   back to old handling.  But, sadly, see testcases 8a and 8b.
+###########################################################################
+
 test_done
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 06/36] directory rename detection: testcases checking which side did the rename
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (4 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 05/36] directory rename detection: files/directories in the way of some renames Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 07/36] directory rename detection: more involved edge/corner testcases Elijah Newren
                   ` (31 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
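
Reader's note (not part of the commit): the rule this section arrives
at, namely that implicit directory renames only apply when the *other*
side of history did the renaming, can be tried by hand with a sketch of
testcase 6c.  The temporary repository, the identity settings and the
`result_6c` variable are assumptions made for this example.  Per the
"Expected" line of the testcase, z/d should stay in z/ because the side
that added it is the same side that renamed z/ to y/.

```shell
#!/bin/sh
# Standalone approximation of testcase 6c (hypothetical scratch setup).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# Commit O: z/{b,c}
mkdir z
echo b >z/b
echo c >z/c
git add z
git commit -q -m O
git branch A
git branch B

# Commit A: no change to the tree.
git checkout -q A
git commit -q --allow-empty -m A

# Commit B: rename z/ to y/, then recreate z/ with a new file.
git checkout -q B
git mv z y
mkdir z
echo d >z/d
git add z/d
git commit -q -m B

# Merge B into A and list the resulting tracked paths on one line.
git checkout -q A
git merge -q -m merge B
result_6c=$(echo $(git ls-files))
echo "$result_6c"
```
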
 t/t6043-merge-rename-directories.sh | 336 ++++++++++++++++++++++++++++
 1 file changed, 336 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index b469c807c2..9ae245a522 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1180,4 +1180,340 @@ test_expect_failure '5d-check: Directory/file/file conflict due to directory ren
 #   back to old handling.  But, sadly, see testcases 8a and 8b.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 6: Same side of the merge was the one that did the rename
+#
+# It may sound obvious that you only want to apply implicit directory
+# renames to directories if the _other_ side of history did the renaming.
+# If you did make an implementation that didn't explicitly enforce this
+# rule, the majority of cases that would fall under this section would
+# also be solved by following the rules from the above sections.  But
+# there are still a few that stick out, so this section covers them just
+# to make sure we also get them right.
+###########################################################################
+
+# Testcase 6a, Tricky rename/delete
+#   Commit O: z/{b,c,d}
+#   Commit A: z/b
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/b, CONFLICT(rename/delete, z/c -> y/c vs. NULL)
+#   Note: We're just checking here that the rename of z/b and z/c to put
+#         them under y/ doesn't accidentally catch z/d and make it look like
+#         it is also involved in a rename/delete conflict.
+
+test_expect_success '6a-setup: Tricky rename/delete' '
+	test_create_repo 6a &&
+	(
+		cd 6a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/c &&
+		git rm z/d &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6a-check: Tricky rename/delete' '
+	(
+		cd 6a &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/delete).*z/c.*y/c" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 2 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :3:y/c &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6b, Same rename done on both sides
+#   (Related to testcases 6c and 8e)
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   Note: If we did directory rename detection here, we'd move z/d into y/,
+#         but B did that rename and still decided to put the file into z/,
+#         so we probably shouldn't apply directory rename detection for it.
+
+test_expect_success '6b-setup: Same rename done on both sides' '
+	test_create_repo 6b &&
+	(
+		cd 6b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		mkdir z &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6b-check: Same rename done on both sides' '
+	(
+		cd 6b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:z/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6c, Rename only done on same side
+#   (Related to testcases 6b and 8e)
+#   Commit O: z/{b,c}
+#   Commit A: z/{b,c} (no change)
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   NOTE: Seems obvious, but just checking that the implementation doesn't
+#         "accidentally detect a rename" and give us y/{b,c,d}.
+
+test_expect_success '6c-setup: Rename only done on same side' '
+	test_create_repo 6c &&
+	(
+		cd 6c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		git commit --allow-empty -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		mkdir z &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6c-check: Rename only done on same side' '
+	(
+		cd 6c &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:z/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6d, We don't always want transitive renaming
+#   (Related to testcase 1c)
+#   Commit O: z/{b,c}, x/d
+#   Commit A: z/{b,c}, x/d (no change)
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   NOTE: Again, this seems obvious but just checking that the implementation
+#         doesn't "accidentally detect a rename" and give us y/{b,c,d}.
+
+test_expect_success '6d-setup: We do not always want transitive renaming' '
+	test_create_repo 6d &&
+	(
+		cd 6d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		git commit --allow-empty -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		git mv x z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6d-check: We do not always want transitive renaming' '
+	(
+		cd 6d &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:z/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:x/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6e, Add/add from one-side
+#   Commit O: z/{b,c}
+#   Commit A: z/{b,c} (no change)
+#   Commit B: y/{b,c,d_1}, z/d_2
+#   Expected: y/{b,c,d_1}, z/d_2
+#   NOTE: Again, this seems obvious but just checking that the implementation
+#         doesn't "accidentally detect a rename" and give us y/{b,c} +
+#         add/add conflict on y/d_1 vs y/d_2.
+
+test_expect_success '6e-setup: Add/add from one side' '
+	test_create_repo 6e &&
+	(
+		cd 6e &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		git commit --allow-empty -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		echo d1 > y/d &&
+		mkdir z &&
+		echo d2 > z/d &&
+		git add y/d z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6e-check: Add/add from one side' '
+	(
+		cd 6e &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:z/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:y/d    B:z/d &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# Rules suggested by section 6:
+#
+#   Only apply implicit directory renames to directories if the other
+#   side of history is the one doing the renaming.
+###########################################################################
+
 test_done
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 07/36] directory rename detection: more involved edge/corner testcases
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (5 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 06/36] directory rename detection: testcases checking which side did the rename Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 08/36] directory rename detection: testcases exploring possibly suboptimal merges Elijah Newren
                   ` (30 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 396 ++++++++++++++++++++++++++++
 1 file changed, 396 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 9ae245a522..6db1439675 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1516,4 +1516,400 @@ test_expect_success '6e-check: Add/add from one side' '
 #   side of history is the one doing the renaming.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 7: More involved Edge/Corner cases
+#
+# The ruleset we have generated in the above sections seems to provide
+# well-defined merges.  But can we find edge/corner cases that either (a)
+# are harder for users to understand, or (b) have a resolution that is
+# non-intuitive or suboptimal?
+#
+# The testcases in this section dive into cases that I've tried to craft in
+# a way to find some that might be surprising to users or difficult for
+# them to understand (the next section will look at non-intuitive or
+# suboptimal merge results).  Some of the testcases are similar to ones
+# from past sections, but have been simplified to try to highlight error
+# messages using a "modified" path (due to the directory rename).  Are
+# users okay with these?
+#
+# In my opinion, the testcases in this section that are difficult to
+# understand are difficult because of the testcase itself rather than the
+# directory renaming (similar to how t6042 and t6036 have difficult
+# resolutions due to the problem setup itself being complex).  And I don't
+# think the error messages are a problem.
+#
+# On the other hand, the testcases in section 8 worry me slightly more...
+###########################################################################
+
+# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: w/b, x/c, z/d
+#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
+#   NOTE: There's a rename of z/ here; y/ has more renames, so z/d -> y/d.
+
+test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
+	test_create_repo 7a &&
+	(
+		cd 7a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir w &&
+		mkdir x &&
+		git mv z/b w/ &&
+		git mv z/c x/ &&
+		echo d > z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
+	(
+		cd 7a &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename).*z/b.*y/b.*w/b" out &&
+		test_i18ngrep "CONFLICT (rename/rename).*z/c.*y/c.*x/c" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -u >out &&
+		test_line_count = 6 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:x/c :0:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/b  O:z/b  O:z/c  O:z/c  O:z/c  B:z/d &&
+		test_cmp expect actual &&
+
+		git hash-object >actual \
+			y/b   w/b   y/c   x/c &&
+		git rev-parse >expect \
+			O:z/b O:z/b O:z/c O:z/c &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 7b, rename/rename(2to1), but only due to transitive rename
+#   (Related to testcase 1d)
+#   Commit O: z/{b,c},     x/d_1, w/d_2
+#   Commit A: y/{b,c,d_2}, x/d_1
+#   Commit B: z/{b,c,d_1},        w/d_2
+#   Expected: y/{b,c}, CONFLICT(rename/rename(2to1): x/d_1, w/d_2 -> y/d)
+
+test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive rename' '
+	test_create_repo 7b &&
+	(
+		cd 7b &&
+
+		mkdir z &&
+		mkdir x &&
+		mkdir w &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d1 > x/d &&
+		echo d2 > w/d &&
+		git add z x w &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git mv w/d y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		rmdir x &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7b-check: rename/rename(2to1), but only due to transitive rename' '
+	(
+		cd 7b &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :2:y/d :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:w/d  O:x/d &&
+		test_cmp expect actual &&
+
+		test_path_is_missing y/d &&
+		test_path_is_file y/d~HEAD &&
+		test_path_is_file y/d~B^0 &&
+
+		git hash-object >actual \
+			y/d~HEAD y/d~B^0 &&
+		git rev-parse >expect \
+			O:w/d    O:x/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 7c, rename/rename(1to...2or3); transitive rename may add complexity
+#   (Related to testcases 3b and 5c)
+#   Commit O: z/{b,c}, x/d
+#   Commit A: y/{b,c}, w/d
+#   Commit B: z/{b,c,d}
+#   Expected: y/{b,c}, CONFLICT(x/d -> w/d vs. y/d)
+#   NOTE: z/ was renamed to y/ so we want to report
+#         neither CONFLICT(x/d -> w/d vs. z/d)
+#         nor CONFLICT(x/d -> w/d vs. y/d vs. z/d)
+
+test_expect_success '7c-setup: rename/rename(1to...2or3); transitive rename may add complexity' '
+	test_create_repo 7c &&
+	(
+		cd 7c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git mv x w &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		rmdir x &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7c-check: rename/rename(1to...2or3); transitive rename may add complexity' '
+	(
+		cd 7c &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename).*x/d.*w/d.*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 3 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :1:x/d :2:w/d :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:x/d  O:x/d  O:x/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 7d, transitive rename involved in rename/delete; how is it reported?
+#   (Related somewhat to testcases 5b and 8d)
+#   Commit O: z/{b,c}, x/d
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d}
+#   Expected: y/{b,c}, CONFLICT(delete x/d vs rename to y/d)
+#   NOTE: z->y so NOT CONFLICT(delete x/d vs rename to z/d)
+
+test_expect_success '7d-setup: transitive rename involved in rename/delete; how is it reported?' '
+	test_create_repo 7d &&
+	(
+		cd 7d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git rm -rf x &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		rmdir x &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7d-check: transitive rename involved in rename/delete; how is it reported?' '
+	(
+		cd 7d &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/delete).*x/d.*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:x/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 7e, transitive rename in rename/delete AND dirs in the way
+#   (Very similar to 'both rename source and destination involved in D/F conflict' from t6022-merge-rename.sh)
+#   (Also related to testcases 9c and 9d)
+#   Commit O: z/{b,c},     x/d_1
+#   Commit A: y/{b,c,d/g}, x/d/f
+#   Commit B: z/{b,c,d_1}
+#   Expected: rename/delete(x/d_1->y/d_1 vs. None) + D/F conflict on y/d
+#             y/{b,c,d/g}, y/d_1~B^0, x/d/f
+
+#   NOTE: The main path of interest here is d_1 and where it ends up, but
+#         this is actually a case that has two potential directory renames
+#         involved and D/F conflict(s), so it makes sense to walk through
+#         each step.
+#
+#         Commit A renames z/ -> y/.  Thus everything that B adds to z/
+#         should be instead moved to y/.  This gives us the D/F conflict on
+#         y/d because x/d_1 -> z/d_1 -> y/d_1 conflicts with y/d/g.
+#
+#         Further, commit B renames x/ -> z/, thus everything A adds to x/
+#         should instead be moved to z/...BUT we removed z/ and renamed it
+#         to y/, so maybe everything should move not from x/ to z/, but
+#         from x/ to z/ to y/.  Doing so might make sense from the logic so
+#         far, but note that commit A had both an x/ and a y/; it did the
+#         renaming of z/ to y/ and created x/d/f and it clearly made these
+#         things separate, so it doesn't make much sense to push these
+#         together.  Doing so is what I'd call a doubly transitive rename;
+#         see testcases 9c and 9d for further discussion of this issue and
+#         how it's resolved.
+
+test_expect_success '7e-setup: transitive rename in rename/delete AND dirs in the way' '
+	test_create_repo 7e &&
+	(
+		cd 7e &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d1 >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git rm x/d &&
+		mkdir -p x/d &&
+		mkdir -p y/d &&
+		echo f >x/d/f &&
+		echo g >y/d/g &&
+		git add x/d/f y/d/g &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		rmdir x &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in the way' '
+	(
+		cd 7e &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/delete).*x/d.*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 2 out &&
+
+		git rev-parse >actual \
+			:0:x/d/f :0:y/d/g :0:y/b :0:y/c :3:y/d &&
+		git rev-parse >expect \
+			 A:x/d/f  A:y/d/g  O:z/b  O:z/c  O:x/d &&
+		test_cmp expect actual &&
+
+		git hash-object y/d~B^0 >actual &&
+		git rev-parse O:x/d >expect &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 08/36] directory rename detection: testcases exploring possibly suboptimal merges
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (6 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 07/36] directory rename detection: more involved edge/corner testcases Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 09/36] directory rename detection: miscellaneous testcases to complete coverage Elijah Newren
                   ` (29 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
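Not part of the patch: a minimal sketch of the path mapping that testcase 8e
relies on, where commit A's z/ -> y/ directory rename moves B's new z/d to
y/d.  apply_dir_rename is a hypothetical helper written only for this note;
it is not a function in merge-recursive.c or in the test library, and it
ignores every rule that can turn a directory rename off.

```shell
# Hypothetical helper, for illustration only (not part of git).
# usage: apply_dir_rename <path> <old-dir> <new-dir>
# Prints <path> with a leading <old-dir>/ replaced by <new-dir>/, or
# prints <path> unchanged if it does not live under <old-dir>.
apply_dir_rename () {
	path=$1
	old=$2
	new=$3
	case "$path" in
	"$old"/*)
		printf "%s\n" "$new/${path#"$old"/}"
		;;
	*)
		printf "%s\n" "$path"
		;;
	esac
}

# Testcase 8e: B adds z/d, A renamed z/ -> y/.
apply_dir_rename z/d z y
```

The sketch only shows where a path lands once a directory rename has been
chosen; deciding whether the rename applies at all is what the testcases
in this section probe.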
 t/t6043-merge-rename-directories.sh | 404 ++++++++++++++++++++++++++++
 1 file changed, 404 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 6db1439675..e211e8ca31 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1912,4 +1912,408 @@ test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in th
 	)
 '
 
+###########################################################################
+# SECTION 8: Suboptimal merges
+#
+# As alluded to in the last section, the ruleset we have built up for
+# detecting directory renames unfortunately has some special cases where it
+# results in slightly suboptimal or non-intuitive behavior.  This section
+# explores these cases.
+#
+# To be fair, we already had non-intuitive or suboptimal behavior for most
+# of these cases in git before introducing implicit directory rename
+# detection, but it'd be nice if there was a modified ruleset out there
+# that handled these cases a bit better.
+###########################################################################
+
+# Testcase 8a, Dual-directory rename, one into the other's way
+#   Commit O: x/{a,b},   y/{c,d}
+#   Commit A: x/{a,b,e}, y/{c,d,f}
+#   Commit B: y/{a,b},   z/{c,d}
+#
+# Possible Resolutions:
+#   w/o dir-rename detection: y/{a,b,f},   z/{c,d},   x/e
+#   Currently expected:       y/{a,b,e,f}, z/{c,d}
+#   Optimal:                  y/{a,b,e},   z/{c,d,f}
+#
+# Note: Both x and y got renamed and it'd be nice to detect both, and we do
+# better with directory rename detection than git did without, but the
+# simple rule from section 5 prevents us from handling this as optimally as
+# we potentially could.
+
+test_expect_success '8a-setup: Dual-directory rename, one into the others way' '
+	test_create_repo 8a &&
+	(
+		cd 8a &&
+
+		mkdir x &&
+		mkdir y &&
+		echo a >x/a &&
+		echo b >x/b &&
+		echo c >y/c &&
+		echo d >y/d &&
+		git add x y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo e >x/e &&
+		echo f >y/f &&
+		git add x/e y/f &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv y z &&
+		git mv x y &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '8a-check: Dual-directory rename, one into the others way' '
+	(
+		cd 8a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/a HEAD:y/b HEAD:y/e HEAD:y/f HEAD:z/c HEAD:z/d &&
+		git rev-parse >expect \
+			O:x/a    O:x/b    A:x/e    A:y/f    O:y/c    O:y/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 8b, Dual-directory rename, one into the other's way, with conflicting filenames
+#   Commit O: x/{a_1,b_1},     y/{a_2,b_2}
+#   Commit A: x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
+#   Commit B: y/{a_1,b_1},     z/{a_2,b_2}
+#
+#   w/o dir-rename detection: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
+#   Currently expected:       <same>
+#   Scary:                    y/{a_1,b_1},     z/{a_2,b_2}, CONFLICT(add/add, e_1 vs. e_2)
+#   Optimal:                  y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
+#
+# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
+# y, both are named 'e'.  Without directory rename detection, neither file
+# moves directories.  Implement directory rename detection suboptimally, and
+# you get an add/add conflict, but both files were added in commit A, so this
+# is an add/add conflict where one side of history added both files --
+# something we can't represent in the index.  Obviously, we'd prefer the last
+# resolution, but our previous rules are too coarse to allow it.  Using both
+# the rules from section 4 and section 5 saves us from the Scary resolution,
+# making us fall back to pre-directory-rename-detection behavior for both
+# e_1 and e_2.
+
+test_expect_success '8b-setup: Dual-directory rename, one into the others way, with conflicting filenames' '
+	test_create_repo 8b &&
+	(
+		cd 8b &&
+
+		mkdir x &&
+		mkdir y &&
+		echo a1 >x/a &&
+		echo b1 >x/b &&
+		echo a2 >y/a &&
+		echo b2 >y/b &&
+		git add x y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo e1 >x/e &&
+		echo e2 >y/e &&
+		git add x/e y/e &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv y z &&
+		git mv x y &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '8b-check: Dual-directory rename, one into the others way, with conflicting filenames' '
+	(
+		cd 8b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/a HEAD:y/b HEAD:z/a HEAD:z/b HEAD:x/e HEAD:y/e &&
+		git rev-parse >expect \
+			O:x/a    O:x/b    O:y/a    O:y/b    A:x/e    A:y/e &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 8c, rename+modify/delete
+#   (Related to testcases 5b and 8d)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d_modified,e}
+#   Expected: y/{b,c,e}, CONFLICT(rename+modify/delete: z/d -> y/d or deleted)
+#
+#   Note: This testcase doesn't present any concerns for me...until you
+#         compare it with testcases 5b and 8d.  See notes in 8d for more
+#         details.
+
+test_expect_success '8c-setup: rename+modify/delete' '
+	test_create_repo 8c &&
+	(
+		cd 8c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		test_seq 1 10 >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/d &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo 11 >z/d &&
+		test_chmod +x z/d &&
+		echo e >z/e &&
+		git add z/d z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '8c-check: rename+modify/delete' '
+	(
+		cd 8c &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/delete).* z/d.*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:y/e :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/e  B:z/d &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :1:y/d &&
+		test_must_fail git rev-parse :2:y/d &&
+		git ls-files -s y/d | grep ^100755 &&
+		test_path_is_file y/d
+	)
+'
+
+# Testcase 8d, rename/delete...or not?
+#   (Related to testcase 5b; these may appear slightly inconsistent to users;
+#    Also related to testcases 7d and 7e)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d,e}
+#   Expected: y/{b,c,e}
+#
+#   Note: It would also be somewhat reasonable to resolve this as
+#             y/{b,c,e}, CONFLICT(rename/delete: z/d -> y/d or deleted)
+#   The logic being that the only difference between this testcase and 8c
+#   is that there is no modification to d.  That suggests that instead of a
+#   rename/modify vs. delete conflict, we should just have a rename/delete
+#   conflict, otherwise we are being inconsistent.
+#
+#   However...as far as consistency goes, we didn't report a conflict for
+#   path d_1 in testcase 5b due to a different file being in the way.  So,
+#   we seem to be forced to have cases where users can change things
+#   slightly and get what they may perceive as inconsistent results.  It
+#   would be nice to avoid that, but I'm not sure I see how.
+#
+#   In this case, I'm leaning towards: commit A was the one that deleted z/d
+#   and it did the rename of z to y, so the two "conflicts" (rename vs.
+#   delete) are both coming from commit A, which is illogical.  Conflicts
+#   during merging are supposed to be about opposite sides doing things
+#   differently.
+
+test_expect_success '8d-setup: rename/delete...or not?' '
+	test_create_repo 8d &&
+	(
+		cd 8d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		test_seq 1 10 >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/d &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo e >z/e &&
+		git add z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '8d-check: rename/delete...or not?' '
+	(
+		cd 8d &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/e &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/e &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 8e, Both sides rename, one side adds to original directory
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: w/{b,c}, z/d
+#
+# Possible Resolutions:
+#   w/o dir-rename detection: z/d, CONFLICT(z/b -> y/b vs. w/b),
+#                                  CONFLICT(z/c -> y/c vs. w/c)
+#   Currently expected:       y/d, CONFLICT(z/b -> y/b vs. w/b),
+#                                  CONFLICT(z/c -> y/c vs. w/c)
+#   Optimal:                  ??
+#
+# Notes: In commit A, directory z got renamed to y.  In commit B, directory z
+#        did NOT get renamed; the directory is still present; instead it is
+#        considered to have just renamed a subset of paths in directory z
+#        elsewhere.  Therefore, the directory rename done in commit A to z/
+#        applies to z/d and maps it to y/d.
+#
+#        It's possible that users would get confused about this, but what
+#        should we do instead?  Silently leaving it at z/d seems just as bad
+#        or maybe even worse.  Perhaps we could print a big warning about z/d
+#        and how we're moving it to y/d in this case, but when I started
+#        thinking about the ramifications of doing that, I didn't know how to
+#        rule out opening up other weird edge and corner cases, so I punted.
+
+test_expect_success '8e-setup: Both sides rename, one side adds to original directory' '
+	test_create_repo 8e &&
+	(
+		cd 8e &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z w &&
+		mkdir z &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '8e-check: Both sides rename, one side adds to original directory' '
+	(
+		cd 8e &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT.*rename/rename.*z/c.*y/c.*w/c" out &&
+		test_i18ngrep "CONFLICT.*rename/rename.*z/b.*y/b.*w/b" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -u >out &&
+		test_line_count = 6 out &&
+		git ls-files -o >out &&
+		test_line_count = 2 out &&
+
+		git rev-parse >actual \
+			:1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:w/c :0:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/b  O:z/b  O:z/c  O:z/c  O:z/c  B:z/d &&
+		test_cmp expect actual &&
+
+		git hash-object >actual \
+			y/b   w/b   y/c   w/c &&
+		git rev-parse >expect \
+			O:z/b O:z/b O:z/c O:z/c &&
+		test_cmp expect actual &&
+
+		test_path_is_missing z/b &&
+		test_path_is_missing z/c
+	)
+'
+
 test_done
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 09/36] directory rename detection: miscellaneous testcases to complete coverage
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (7 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 08/36] directory rename detection: testcases exploring possibly suboptimal merges Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 10/36] directory rename detection: tests for handling overwriting untracked files Elijah Newren
                   ` (28 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

I came up with the testcases in the first eight sections before coding up
the implementation.  The testcases in this section were mostly ones I
thought of while coding/debugging, and which I was too lazy to insert
into the previous sections because I didn't want to re-label with all the
testcase references.  :-)

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
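Not part of the patch: a rough sketch of the rule suggested by section 9
("if the other side of history did a directory rename to a path that your
side renamed away, then ignore that particular rename"), worked through for
testcase 9c.  should_apply_rename is a hypothetical helper written for this
note only; the real decision is made inside merge-recursive and involves more
state than a list of directory names.

```shell
# Hypothetical helper, for illustration only (not part of git).
# usage: should_apply_rename <target-dir> [<dir-our-side-renamed-away>...]
# Succeeds if the other side's directory rename into <target-dir> may be
# applied implicitly to paths that our side added.
should_apply_rename () {
	target=$1
	shift
	for renamed_away in "$@"
	do
		if test "$target" = "$renamed_away"
		then
			return 1
		fi
	done
	return 0
}

# Testcase 9c, looking at x/f and x/g on side A:
# B renamed x/ -> z/, but A renamed z/ away (to y/), so they stay in x/.
should_apply_rename z z || echo "skip x/ -> z/ for x/f and x/g"
# Looking at z/d and z/e on side B:
# A renamed z/ -> y/, and B only renamed x/ away, so they move to y/.
should_apply_rename y x && echo "apply z/ -> y/ to z/d and z/e"
```

This is why 9c expects y/{b,c,d,e} and x/{f,g}: the chain stops after one
hop instead of following x/ -> z/ -> y/.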
 t/t6043-merge-rename-directories.sh | 565 +++++++++++++++++++++++++++-
 1 file changed, 564 insertions(+), 1 deletion(-)
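
Also not part of the patch: a simplified sketch of the "most paths wins"
choice that testcase 9a exercises.  best_target_dir is a hypothetical helper;
it assumes one "old new" rename pair per line on stdin, paths without
whitespace, and that only files directly inside the source directory count
toward its rename target.  Ties are not handled here.

```shell
# Hypothetical helper, for illustration only (not part of git).
# usage: best_target_dir <source-dir> <rename-pairs
# Reads "old new" rename pairs and prints the directory that received the
# most files which used to live directly in <source-dir>.
best_target_dir () {
	src=$1
	while read -r old new
	do
		if test "$(dirname "$old")" = "$src"
		then
			dirname "$new"
		fi
	done | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}'
}

# Testcase 9a, renames made by commit A:
printf "%s\n" \
	"z/b y/b" "z/c y/c" \
	"z/d/e x/w/e" "z/d/f x/w/f" "z/d/g x/w/g" |
best_target_dir z
```

Counting every path below z/ instead would see three files going to x/w/
against two going to y/ and would send z/i to the wrong place, which is the
naive behavior the 9a comment warns about.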

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index e211e8ca31..cbbb949014 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -305,6 +305,7 @@ test_expect_failure '1d-check: Directory renames cause a rename/rename(2to1) con
 '
 
 # Testcase 1e, Renamed directory, with all filenames being renamed too
+#   (Related to testcases 9f & 9g)
 #   Commit O: z/{oldb,oldc}
 #   Commit A: y/{newb,newc}
 #   Commit B: z/{oldb,oldc,d}
@@ -593,7 +594,7 @@ test_expect_success '2b-check: Directory split into two on one side, with equal
 ###########################################################################
 
 # Testcase 3a, Avoid implicit rename if involved as source on other side
-#   (Related to testcases 1c and 1f)
+#   (Related to testcases 1c, 1f, and 9h)
 #   Commit O: z/{b,c,d}
 #   Commit A: z/{b,c,d} (no change)
 #   Commit B: y/{b,c}, x/d
@@ -2316,4 +2317,566 @@ test_expect_failure '8e-check: Both sides rename, one side adds to original dire
 	)
 '
 
+###########################################################################
+# SECTION 9: Other testcases
+#
+# This section consists of miscellaneous testcases I thought of during
+# the implementation which round out the testing.
+###########################################################################
+
+# Testcase 9a, Inner renamed directory within outer renamed directory
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d/{e,f,g}}
+#   Commit A: y/{b,c}, x/w/{e,f,g}
+#   Commit B: z/{b,c,d/{e,f,g,h},i}
+#   Expected: y/{b,c,i}, x/w/{e,f,g,h}
+#   NOTE: The only reason this one is interesting is because when a directory
+#         is split into multiple other directories, we pick the destination
+#         that had the most paths going to it.  A naive implementation of
+#         that could take the new file in commit B at z/i to x/w/i or x/i.
+
+test_expect_success '9a-setup: Inner renamed directory within outer renamed directory' '
+	test_create_repo 9a &&
+	(
+		cd 9a &&
+
+		mkdir -p z/d &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo e >z/d/e &&
+		echo f >z/d/f &&
+		echo g >z/d/g &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir x &&
+		git mv z/d x/w &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo h >z/d/h &&
+		echo i >z/i &&
+		git add z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9a-check: Inner renamed directory within outer renamed directory' '
+	(
+		cd 9a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/i &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/i &&
+		test_cmp expect actual &&
+
+		git rev-parse >actual \
+			HEAD:x/w/e HEAD:x/w/f HEAD:x/w/g HEAD:x/w/h &&
+		git rev-parse >expect \
+			O:z/d/e    O:z/d/f    O:z/d/g    B:z/d/h &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 9b, Transitive rename with content merge
+#   (Related to testcase 1c)
+#   Commit O: z/{b,c},   x/d_1
+#   Commit A: y/{b,c},   x/d_2
+#   Commit B: z/{b,c,d_3}
+#   Expected: y/{b,c,d_merged}
+
+test_expect_success '9b-setup: Transitive rename with content merge' '
+	test_create_repo 9b &&
+	(
+		cd 9b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		test_seq 1 10 >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_seq 1 11 >x/d &&
+		git add x/d &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_seq 0 10 >x/d &&
+		git mv x/d z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9b-check: Transitive rename with content merge' '
+	(
+		cd 9b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		test_seq 0 11 >expected &&
+		test_cmp expected y/d &&
+		git add expected &&
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    :0:expected &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:x/d &&
+		test_must_fail git rev-parse HEAD:z/d &&
+		test_path_is_missing z/d &&
+
+		test $(git rev-parse HEAD:y/d) != $(git rev-parse O:x/d) &&
+		test $(git rev-parse HEAD:y/d) != $(git rev-parse A:x/d) &&
+		test $(git rev-parse HEAD:y/d) != $(git rev-parse B:z/d)
+	)
+'
+
+# Testcase 9c, Doubly transitive rename?
+#   (Related to testcases 1c, 7e, and 9d)
+#   Commit O: z/{b,c},     x/{d,e},    w/f
+#   Commit A: y/{b,c},     x/{d,e,f,g}
+#   Commit B: z/{b,c,d,e},             w/f
+#   Expected: y/{b,c,d,e}, x/{f,g}
+#
+#   NOTE: x/f and x/g may be slightly confusing here.  The rename from w/f to
+#         x/f is clear.  Let's look beyond that.  Here's the logic:
+#            Commit B renamed x/ -> z/
+#            Commit A renamed z/ -> y/
+#         So, we could possibly further rename x/f to z/f to y/f, a doubly
+#         transitive rename.  However, where does it end?  We can chain these
+#         indefinitely (see testcase 9d).  What if there is a D/F conflict
+#         at z/f/ or y/f/?  Or just another file conflict at one of those
+#         paths?  In the case of an N-long chain of transitive renamings,
+#         where do we "abort" the rename?  Can the user make sense of
+#         the resulting conflict and resolve it?
+#
+#         To avoid this confusion I use the simple rule that if the other side
+#         of history did a directory rename to a path that your side renamed
+#         away, then ignore that particular rename from the other side of
+#         history for any implicit directory renames.
+
+test_expect_success '9c-setup: Doubly transitive rename?' '
+	test_create_repo 9c &&
+	(
+		cd 9c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		echo e >x/e &&
+		mkdir w &&
+		echo f >w/f &&
+		git add z x w &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git mv w/f x/ &&
+		echo g >x/g &&
+		git add x/g &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/d &&
+		git mv x/e z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9c-check: Doubly transitive rename?' '
+	(
+		cd 9c &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 >out &&
+		test_i18ngrep "WARNING: Avoiding applying x -> z rename to x/f" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e HEAD:x/f HEAD:x/g &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:x/d    O:x/e    O:w/f    A:x/g &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 9d, N-fold transitive rename?
+#   (Related to testcase 9c...and 1c and 7e)
+#   Commit O: z/a, y/b, x/c, w/d, v/e, u/f
+#   Commit A:  y/{a,b},  w/{c,d},  u/{e,f}
+#   Commit B: z/{a,t}, x/{b,c}, v/{d,e}, u/f
+#   Expected: <see NOTE first>
+#
+#   NOTE: z/ -> y/ (in commit A)
+#         y/ -> x/ (in commit B)
+#         x/ -> w/ (in commit A)
+#         w/ -> v/ (in commit B)
+#         v/ -> u/ (in commit A)
+#         So, if we add a file to z, say z/t, where should it end up?  In u?
+#         What if there's another file or directory named 't' in one of the
+#         intervening directories and/or in u itself?  Also, shouldn't the
+#         same logic that places 't' in u/ also move ALL other files to u/?
+#         What if there are file or directory conflicts in any of them?  If
+#         we attempted to do N-way (N-fold? N-ary? N-uple?) transitive renames
+#         like this, would the user have any hope of understanding any
+#         conflicts or how their working tree ended up?  I think not, so I'm
+#         ruling out N-ary transitive renames for N>1.
+#
+#   Therefore our expected result is:
+#     z/t, y/a, x/b, w/c, u/d, u/e, u/f
+#   The reason that v/d DOES get transitively renamed to u/d is that u/ isn't
+#   renamed anywhere.  A slightly sub-optimal result, but it uses fairly
+#   simple rules that are consistent with what we need for all the other
+#   testcases and simplifies things for the user.
+
+test_expect_success '9d-setup: N-way transitive rename?' '
+	test_create_repo 9d &&
+	(
+		cd 9d &&
+
+		mkdir z y x w v u &&
+		echo a >z/a &&
+		echo b >y/b &&
+		echo c >x/c &&
+		echo d >w/d &&
+		echo e >v/e &&
+		echo f >u/f &&
+		git add z y x w v u &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/a y/ &&
+		git mv x/c w/ &&
+		git mv v/e u/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo t >z/t &&
+		git mv y/b x/ &&
+		git mv w/d v/ &&
+		git add z/t &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9d-check: N-way transitive rename?' '
+	(
+		cd 9d &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 >out &&
+		test_i18ngrep "WARNING: Avoiding applying z -> y rename to z/t" out &&
+		test_i18ngrep "WARNING: Avoiding applying y -> x rename to y/a" out &&
+		test_i18ngrep "WARNING: Avoiding applying x -> w rename to x/b" out &&
+		test_i18ngrep "WARNING: Avoiding applying w -> v rename to w/c" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:z/t \
+			HEAD:y/a HEAD:x/b HEAD:w/c \
+			HEAD:u/d HEAD:u/e HEAD:u/f &&
+		git rev-parse >expect \
+			B:z/t    \
+			O:z/a    O:y/b    O:x/c    \
+			O:w/d    O:v/e    A:u/f &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 9e, N-to-1 whammo
+#   (Related to testcase 9c...and 1c and 7e)
+#   Commit O: dir1/{a,b}, dir2/{d,e}, dir3/{g,h}, dirN/{j,k}
+#   Commit A: dir1/{a,b,c,yo}, dir2/{d,e,f,yo}, dir3/{g,h,i,yo}, dirN/{j,k,l,yo}
+#   Commit B: combined/{a,b,d,e,g,h,j,k}
+#   Expected: combined/{a,b,c,d,e,f,g,h,i,j,k,l}, CONFLICT(Nto1) warnings,
+#             dir1/yo, dir2/yo, dir3/yo, dirN/yo
+
+test_expect_success '9e-setup: N-to-1 whammo' '
+	test_create_repo 9e &&
+	(
+		cd 9e &&
+
+		mkdir dir1 dir2 dir3 dirN &&
+		echo a >dir1/a &&
+		echo b >dir1/b &&
+		echo d >dir2/d &&
+		echo e >dir2/e &&
+		echo g >dir3/g &&
+		echo h >dir3/h &&
+		echo j >dirN/j &&
+		echo k >dirN/k &&
+		git add dir* &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo c  >dir1/c &&
+		echo yo >dir1/yo &&
+		echo f  >dir2/f &&
+		echo yo >dir2/yo &&
+		echo i  >dir3/i &&
+		echo yo >dir3/yo &&
+		echo l  >dirN/l &&
+		echo yo >dirN/yo &&
+		git add dir* &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv dir1 combined &&
+		git mv dir2/* combined/ &&
+		git mv dir3/* combined/ &&
+		git mv dirN/* combined/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure C_LOCALE_OUTPUT '9e-check: N-to-1 whammo' '
+	(
+		cd 9e &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		grep "CONFLICT (implicit dir rename): Cannot map more than one path to combined/yo" out >error_line &&
+		grep -q dir1/yo error_line &&
+		grep -q dir2/yo error_line &&
+		grep -q dir3/yo error_line &&
+		grep -q dirN/yo error_line &&
+
+		git ls-files -s >out &&
+		test_line_count = 16 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 2 out &&
+
+		git rev-parse >actual \
+			:0:combined/a :0:combined/b :0:combined/c \
+			:0:combined/d :0:combined/e :0:combined/f \
+			:0:combined/g :0:combined/h :0:combined/i \
+			:0:combined/j :0:combined/k :0:combined/l &&
+		git rev-parse >expect \
+			 O:dir1/a      O:dir1/b      A:dir1/c \
+			 O:dir2/d      O:dir2/e      A:dir2/f \
+			 O:dir3/g      O:dir3/h      A:dir3/i \
+			 O:dirN/j      O:dirN/k      A:dirN/l &&
+		test_cmp expect actual &&
+
+		git rev-parse >actual \
+			:0:dir1/yo :0:dir2/yo :0:dir3/yo :0:dirN/yo &&
+		git rev-parse >expect \
+			 A:dir1/yo  A:dir2/yo  A:dir3/yo  A:dirN/yo &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 9f, Renamed directory that only contained immediate subdirs
+#   (Related to testcases 1e & 9g)
+#   Commit O: goal/{a,b}/$more_files
+#   Commit A: priority/{a,b}/$more_files
+#   Commit B: goal/{a,b}/$more_files, goal/c
+#   Expected: priority/{a,b}/$more_files, priority/c
+
+test_expect_success '9f-setup: Renamed directory that only contained immediate subdirs' '
+	test_create_repo 9f &&
+	(
+		cd 9f &&
+
+		mkdir -p goal/a &&
+		mkdir -p goal/b &&
+		echo foo >goal/a/foo &&
+		echo bar >goal/b/bar &&
+		echo baz >goal/b/baz &&
+		git add goal &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv goal/ priority &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo c >goal/c &&
+		git add goal/c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9f-check: Renamed directory that only contained immediate subdirs' '
+	(
+		cd 9f &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:priority/a/foo \
+			HEAD:priority/b/bar \
+			HEAD:priority/b/baz \
+			HEAD:priority/c &&
+		git rev-parse >expect \
+			O:goal/a/foo \
+			O:goal/b/bar \
+			O:goal/b/baz \
+			B:goal/c &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:goal/c
+	)
+'
+
+# Testcase 9g, Renamed directory that only contained immediate subdirs, immediate subdirs renamed
+#   (Related to testcases 1e & 9f)
+#   Commit O: goal/{a,b}/$more_files
+#   Commit A: priority/{alpha,beta}/$more_files
+#   Commit B: goal/{a,b}/$more_files, goal/c
+#   Expected: priority/{alpha,beta}/$more_files, priority/c
+
+test_expect_success '9g-setup: Renamed directory that only contained immediate subdirs, immediate subdirs renamed' '
+	test_create_repo 9g &&
+	(
+		cd 9g &&
+
+		mkdir -p goal/a &&
+		mkdir -p goal/b &&
+		echo foo >goal/a/foo &&
+		echo bar >goal/b/bar &&
+		echo baz >goal/b/baz &&
+		git add goal &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir priority &&
+		git mv goal/a/ priority/alpha &&
+		git mv goal/b/ priority/beta &&
+		rmdir goal/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo c >goal/c &&
+		git add goal/c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9g-check: Renamed directory that only contained immediate subdirs, immediate subdirs renamed' '
+	(
+		cd 9g &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:priority/alpha/foo \
+			HEAD:priority/beta/bar  \
+			HEAD:priority/beta/baz  \
+			HEAD:priority/c &&
+		git rev-parse >expect \
+			O:goal/a/foo \
+			O:goal/b/bar \
+			O:goal/b/baz \
+			B:goal/c &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:goal/c
+	)
+'
+
+###########################################################################
+# Rules suggested by section 9:
+#
+#   If the other side of history did a directory rename to a path that your
+#   side renamed away, then ignore that particular rename from the other
+#   side of history for any implicit directory renames.
+###########################################################################
+
 test_done
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 10/36] directory rename detection: tests for handling overwriting untracked files
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (8 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 09/36] directory rename detection: miscellaneous testcases to complete coverage Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 11/36] directory rename detection: tests for handling overwriting dirty files Elijah Newren
                   ` (27 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 367 ++++++++++++++++++++++++++++
 1 file changed, 367 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index cbbb949014..a6cd38336c 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2879,4 +2879,371 @@ test_expect_failure '9g-check: Renamed directory that only contained immediate s
 #   side of history for any implicit directory renames.
 ###########################################################################
 
+###########################################################################
+# SECTION 10: Handling untracked files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling, at least in the case of directory renames.
+###########################################################################
+
+# Testcase 10a, Overwrite untracked: normal rename/delete
+#   Commit O: z/{b,c_1}
+#   Commit A: z/b + untracked z/c + untracked z/d
+#   Commit B: z/{b,d_1}
+#   Expected: Aborted Merge +
+#       ERROR_MSG(untracked working tree files would be overwritten by merge)
+
+test_expect_success '10a-setup: Overwrite untracked with normal rename/delete' '
+	test_create_repo 10a &&
+	(
+		cd 10a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z/c z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '10a-check: Overwrite untracked with normal rename/delete' '
+	(
+		cd 10a &&
+
+		git checkout A^0 &&
+		echo very >z/c &&
+		echo important >z/d &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "The following untracked working tree files would be overwritten by merge" err &&
+
+		git ls-files -s >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		echo very >expect &&
+		test_cmp expect z/c &&
+
+		echo important >expect &&
+		test_cmp expect z/d &&
+
+		git rev-parse HEAD:z/b >actual &&
+		git rev-parse O:z/b >expect &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 10b, Overwrite untracked: dir rename + delete
+#   Commit O: z/{b,c_1}
+#   Commit A: y/b + untracked y/{c,d,e}
+#   Commit B: z/{b,d_1,e}
+#   Expected: Failed Merge; y/b + untracked y/{c,d,e} on disk +
+#             z/c_1 -> z/d_1 rename recorded at stage 3 for y/d +
+#       ERROR_MSG(refusing to lose untracked file at 'y/e')
+
+test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
+	test_create_repo 10b &&
+	(
+		cd 10b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/c &&
+		git mv z/ y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z/c z/d &&
+		echo e >z/e &&
+		git add z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+	(
+		cd 10b &&
+
+		git checkout A^0 &&
+		echo very >y/c &&
+		echo important >y/d &&
+		echo contents >y/e &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/delete).*Version B\^0 of y/d left in tree at y/d~B\^0" out &&
+		test_i18ngrep "Error: Refusing to lose untracked file at y/e; writing to y/e~B\^0 instead" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 5 out &&
+
+		git rev-parse >actual \
+			:0:y/b :3:y/d :3:y/e &&
+		git rev-parse >expect \
+			O:z/b  O:z/c  B:z/e &&
+		test_cmp expect actual &&
+
+		echo very >expect &&
+		test_cmp expect y/c &&
+
+		echo important >expect &&
+		test_cmp expect y/d &&
+
+		echo contents >expect &&
+		test_cmp expect y/e
+	)
+'
+
+# Testcase 10c, Overwrite untracked: dir rename/rename(1to2)
+#   Commit O: z/{a,b}, x/{c,d}
+#   Commit A: y/{a,b}, w/c, x/d + different untracked y/c
+#   Commit B: z/{a,b,c}, x/d
+#   Expected: Failed Merge; y/{a,b} + x/d + untracked y/c +
+#             CONFLICT(rename/rename) x/c -> w/c vs y/c +
+#             y/c~B^0 +
+#             ERROR_MSG(Refusing to lose untracked file at y/c)
+
+test_expect_success '10c-setup: Overwrite untracked with dir rename/rename(1to2)' '
+	test_create_repo 10c &&
+	(
+		cd 10c &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >z/b &&
+		echo c >x/c &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir w &&
+		git mv x/c w/c &&
+		git mv z/ y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/c z/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '10c-check: Overwrite untracked with dir rename/rename(1to2)' '
+	(
+		cd 10c &&
+
+		git checkout A^0 &&
+		echo important >y/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+		test_i18ngrep "Refusing to lose untracked file at y/c; adding as y/c~B\^0 instead" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 3 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :0:x/d :1:x/c :2:w/c :3:y/c &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  O:x/d  O:x/c  O:x/c  O:x/c &&
+		test_cmp expect actual &&
+
+		git hash-object y/c~B^0 >actual &&
+		git rev-parse O:x/c >expect &&
+		test_cmp expect actual &&
+
+		echo important >expect &&
+		test_cmp expect y/c
+	)
+'
+
+# Testcase 10d, Delete untracked w/ dir rename/rename(2to1)
+#   Commit O: z/{a,b,c_1},        x/{d,e,f_2}
+#   Commit A: y/{a,b},            x/{d,e,f_2,wham_1} + untracked y/wham
+#   Commit B: z/{a,b,c_1,wham_2}, y/{d,e}
+#   Expected: Failed Merge; y/{a,b,d,e} + untracked y/{wham,wham~B^0,wham~HEAD} +
+#             CONFLICT(rename/rename) z/c_1 vs x/f_2 -> y/wham
+#             ERROR_MSG(Refusing to lose untracked file at y/wham)
+
+test_expect_success '10d-setup: Delete untracked with dir rename/rename(2to1)' '
+	test_create_repo 10d &&
+	(
+		cd 10d &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >x/d &&
+		echo e >x/e &&
+		echo f >x/f &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/c x/wham &&
+		git mv z/ y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/f z/wham &&
+		git mv x/ y/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '10d-check: Delete untracked with dir rename/rename(2to1)' '
+	(
+		cd 10d &&
+
+		git checkout A^0 &&
+		echo important >y/wham &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+		test_i18ngrep "Refusing to lose untracked file at y/wham" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :0:y/d :0:y/e :2:y/wham :3:y/wham &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  O:x/d  O:x/e  O:z/c     O:x/f &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :1:y/wham &&
+
+		echo important >expect &&
+		test_cmp expect y/wham &&
+
+		git hash-object >actual \
+			y/wham~B^0 y/wham~HEAD &&
+		git rev-parse >expect \
+			O:x/f      O:z/c &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 10e, Does git complain about untracked file that's not in the way?
+#   Commit O: z/{a,b}
+#   Commit A: y/{a,b} + untracked z/c
+#   Commit B: z/{a,b,c}
+#   Expected: y/{a,b,c} + untracked z/c
+
+test_expect_success '10e-setup: Does git complain about untracked file that is not really in the way?' '
+	test_create_repo 10e &&
+	(
+		cd 10e &&
+
+		mkdir z &&
+		echo a >z/a &&
+		echo b >z/b &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/ y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo c >z/c &&
+		git add z/c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '10e-check: Does git complain about untracked file that is not really in the way?' '
+	(
+		cd 10e &&
+
+		git checkout A^0 &&
+		mkdir z &&
+		echo random >z/c &&
+
+		git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep ! "following untracked working tree files would be overwritten by merge" err &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :0:y/c &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  B:z/c &&
+		test_cmp expect actual &&
+
+		echo random >expect &&
+		test_cmp expect z/c
+	)
+'
+
 test_done
-- 
2.17.0.290.ge988e9ce2a
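
The section 10 testcases expect that, when an untracked file is in the
way, the merge keeps it and writes the incoming version to a name of the
form "<path>~<branch>" (for example y/e~B^0).  A minimal sketch of that
naming scheme follows.  It is only an illustration: alt_path_for() is a
hypothetical name, not a function from this series, and flattening '/'
in the branch name is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<path>~<branch>" into buf.
 * Returns 0 on success, -1 if buf is too small.
 * Assumption of this sketch: any '/' in the branch name is flattened
 * to '_' so the result stays in the same directory as <path>.
 */
int alt_path_for(char *buf, size_t len, const char *path, const char *branch)
{
	size_t i, start;
	int n;

	assert(buf && path && branch);
	n = snprintf(buf, len, "%s~%s", path, branch);
	if (n < 0 || (size_t)n >= len)
		return -1;

	start = strlen(path) + 1;	/* first byte after the '~' */
	for (i = start; buf[i]; i++)
		if (buf[i] == '/')
			buf[i] = '_';
	return 0;
}
```

Calling alt_path_for(buf, sizeof(buf), "y/e", "B^0") fills buf with
"y/e~B^0", the name testcase 10b greps for in the merge output.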


* [PATCH v10 11/36] directory rename detection: tests for handling overwriting dirty files
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (9 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 10/36] directory rename detection: tests for handling overwriting untracked files Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:57 ` [PATCH v10 12/36] merge-recursive: move the get_renames() function Elijah Newren
                   ` (26 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t6043-merge-rename-directories.sh | 458 ++++++++++++++++++++++++++++
 1 file changed, 458 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index a6cd38336c..8ea9ec49bc 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3246,4 +3246,462 @@ test_expect_failure '10e-check: Does git complain about untracked file that is n
 	)
 '
 
+###########################################################################
+# SECTION 11: Handling dirty (not up-to-date) files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling.  This was true even of normal renames, but there are additional
+# codepaths that need special handling with directory renames.  Add
+# testcases for both renamed-by-directory-rename-detection and standard
+# rename cases.
+###########################################################################
+
+# Testcase 11a, Avoid losing dirty contents with simple rename
+#   Commit O: z/{a,b_v1}
+#   Commit A: z/{a,c_v1}, and z/c_v1 has uncommitted mods
+#   Commit B: z/{a,b_v2}
+#   Expected: ERROR_MSG(Refusing to lose dirty file at z/c) +
+#             z/a, staged version of z/c has sha1sum matching B:z/b_v2,
+#             z/c~HEAD with contents of B:z/b_v2,
+#             z/c with uncommitted mods on top of A:z/c_v1
+
+test_expect_success '11a-setup: Avoid losing dirty contents with simple rename' '
+	test_create_repo 11a &&
+	(
+		cd 11a &&
+
+		mkdir z &&
+		echo a >z/a &&
+		test_seq 1 10 >z/b &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/b z/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo 11 >>z/b &&
+		git add z/b &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11a-check: Avoid losing dirty contents with simple rename' '
+	(
+		cd 11a &&
+
+		git checkout A^0 &&
+		echo stuff >>z/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+		test_seq 1 10 >expected &&
+		echo stuff >>expected &&
+		test_cmp expected z/c &&
+
+		git ls-files -s >out &&
+		test_line_count = 2 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			:0:z/a :2:z/c &&
+		git rev-parse >expect \
+			 O:z/a  B:z/b &&
+		test_cmp expect actual &&
+
+		git hash-object z/c~HEAD >actual &&
+		git rev-parse B:z/b >expect &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 11b, Avoid losing dirty file involved in directory rename
+#   Commit O: z/a,         x/{b,c_v1}
+#   Commit A: z/{a,c_v1},  x/b,       and z/c_v1 has uncommitted mods
+#   Commit B: y/a,         x/{b,c_v2}
+#   Expected: y/{a,c_v2}, x/b, z/c_v1 with uncommitted mods untracked,
+#             ERROR_MSG(Refusing to lose dirty file at z/c)
+
+
+test_expect_success '11b-setup: Avoid losing dirty file involved in directory rename' '
+	test_create_repo 11b &&
+	(
+		cd 11b &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >x/b &&
+		test_seq 1 10 >x/c &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv x/c z/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		echo 11 >>x/c &&
+		git add x/c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11b-check: Avoid losing dirty file involved in directory rename' '
+	(
+		cd 11b &&
+
+		git checkout A^0 &&
+		echo stuff >>z/c &&
+
+		git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+		grep -q stuff z/c &&
+		test_seq 1 10 >expected &&
+		echo stuff >>expected &&
+		test_cmp expected z/c &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -m >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			:0:x/b :0:y/a :0:y/c &&
+		git rev-parse >expect \
+			 O:x/b  O:z/a  B:x/c &&
+		test_cmp expect actual &&
+
+		git hash-object y/c >actual &&
+		git rev-parse B:x/c >expect &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 11c, Avoid losing not-up-to-date with rename + D/F conflict
+#   Commit O: y/a,         x/{b,c_v1}
+#   Commit A: y/{a,c_v1},  x/b,       and y/c_v1 has uncommitted mods
+#   Commit B: y/{a,c/d},   x/{b,c_v2}
+#   Expected: Abort_msg("following files would be overwritten by merge") +
+#             y/c left untouched (still has uncommitted mods)
+
+test_expect_success '11c-setup: Avoid losing not-uptodate with rename + D/F conflict' '
+	test_create_repo 11c &&
+	(
+		cd 11c &&
+
+		mkdir y x &&
+		echo a >y/a &&
+		echo b >x/b &&
+		test_seq 1 10 >x/c &&
+		git add y x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv x/c y/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y/c &&
+		echo d >y/c/d &&
+		echo 11 >>x/c &&
+		git add x/c y/c/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '11c-check: Avoid losing not-uptodate with rename + D/F conflict' '
+	(
+		cd 11c &&
+
+		git checkout A^0 &&
+		echo stuff >>y/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "following files would be overwritten by merge" err &&
+
+		grep -q stuff y/c &&
+		test_seq 1 10 >expected &&
+		echo stuff >>expected &&
+		test_cmp expected y/c &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -m >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out
+	)
+'
+
+# Testcase 11d, Avoid losing not-up-to-date with rename + D/F conflict
+#   Commit O: z/a,         x/{b,c_v1}
+#   Commit A: z/{a,c_v1},  x/b,       and z/c_v1 has uncommitted mods
+#   Commit B: y/{a,c/d},   x/{b,c_v2}
+#   Expected: D/F conflict (y/c_v2 vs y/c/d) +
+#             Warning_Msg("Refusing to lose dirty file at z/c") +
+#             y/{a,c~HEAD,c/d}, x/b, now-untracked z/c_v1 with uncommitted mods
+
+test_expect_success '11d-setup: Avoid losing not-uptodate with rename + D/F conflict' '
+	test_create_repo 11d &&
+	(
+		cd 11d &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >x/b &&
+		test_seq 1 10 >x/c &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv x/c z/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		mkdir y/c &&
+		echo d >y/c/d &&
+		echo 11 >>x/c &&
+		git add x/c y/c/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11d-check: Avoid losing not-uptodate with rename + D/F conflict' '
+	(
+		cd 11d &&
+
+		git checkout A^0 &&
+		echo stuff >>z/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+		grep -q stuff z/c &&
+		test_seq 1 10 >expected &&
+		echo stuff >>expected &&
+		test_cmp expected z/c &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 5 out &&
+
+		git rev-parse >actual \
+			:0:x/b :0:y/a :0:y/c/d :3:y/c &&
+		git rev-parse >expect \
+			 O:x/b  O:z/a  B:y/c/d  B:x/c &&
+		test_cmp expect actual &&
+
+		git hash-object y/c~HEAD >actual &&
+		git rev-parse B:x/c >expect &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 11e, Avoid deleting not-up-to-date with dir rename/rename(1to2)/add
+#   Commit O: z/{a,b},      x/{c_1,d}
+#   Commit A: y/{a,b,c_2},  x/d, w/c_1, and y/c_2 has uncommitted mods
+#   Commit B: z/{a,b,c_1},  x/d
+#   Expected: Failed Merge; y/{a,b} + x/d +
+#             CONFLICT(rename/rename) x/c_1 -> w/c_1 vs y/c_1 +
+#             ERROR_MSG(Refusing to lose dirty file at y/c)
+#             y/c~B^0 has O:x/c_1 contents
+#             y/c~HEAD has A:y/c_2 contents
+#             y/c has dirty file from before merge
+
+test_expect_success '11e-setup: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
+	test_create_repo 11e &&
+	(
+		cd 11e &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >z/b &&
+		echo c >x/c &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/ y/ &&
+		echo different >y/c &&
+		mkdir w &&
+		git mv x/c w/ &&
+		git add y/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/c z/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11e-check: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
+	(
+		cd 11e &&
+
+		git checkout A^0 &&
+		echo mods >>y/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+		test_i18ngrep "Refusing to lose dirty file at y/c" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -u >out &&
+		test_line_count = 4 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		echo different >expected &&
+		echo mods >>expected &&
+		test_cmp expected y/c &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :0:x/d :1:x/c :2:w/c :2:y/c :3:y/c &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  O:x/d  O:x/c  O:x/c  A:y/c  O:x/c &&
+		test_cmp expect actual &&
+
+		git hash-object >actual \
+			y/c~B^0 y/c~HEAD &&
+		git rev-parse >expect \
+			O:x/c   A:y/c &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 11f, Avoid deleting not-up-to-date w/ dir rename/rename(2to1)
+#   Commit O: z/{a,b},        x/{c_1,d_2}
+#   Commit A: y/{a,b,wham_1}, x/d_2, except y/wham has uncommitted mods
+#   Commit B: z/{a,b,wham_2}, x/c_1
+#   Expected: Failed Merge; y/{a,b} + untracked y/{wham~B^0,wham~HEAD} +
+#             y/wham with dirty changes from before merge +
+#             CONFLICT(rename/rename) x/c vs x/d -> y/wham
+#             ERROR_MSG(Refusing to lose dirty file at y/wham)
+
+test_expect_success '11f-setup: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
+	test_create_repo 11f &&
+	(
+		cd 11f &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >z/b &&
+		test_seq 1 10 >x/c &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/ y/ &&
+		git mv x/c y/wham &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/wham &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11f-check: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
+	(
+		cd 11f &&
+
+		git checkout A^0 &&
+		echo important >>y/wham &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+		test_i18ngrep "Refusing to lose dirty file at y/wham" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		test_seq 1 10 >expected &&
+		echo important >>expected &&
+		test_cmp expected y/wham &&
+
+		test_must_fail git rev-parse :1:y/wham &&
+		git hash-object >actual \
+			y/wham~B^0 y/wham~HEAD &&
+		git rev-parse >expect \
+			O:x/d      O:x/c &&
+		test_cmp expect actual &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :2:y/wham :3:y/wham &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  O:x/c     O:x/d &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.17.0.290.ge988e9ce2a
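
Several section 10 and 11 testcases (10b-10d, 11a, 11e, 11f) share one
expectation: if the path the merge wants to write is occupied by an
untracked or dirty file, that file is kept and a "Refusing to lose ..."
message is printed.  The sketch below models only that decision.  It is
a simplified illustration under stated assumptions, not the code from
merge-recursive.c: the enum and both function names are hypothetical,
and cases where unpack_trees() aborts up front (10a, 11c) are not
modeled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of whatever currently sits at the target path. */
enum path_state {
	PATH_ABSENT,	/* nothing on disk */
	PATH_CLEAN,	/* tracked and matches the index */
	PATH_DIRTY,	/* tracked, with uncommitted modifications */
	PATH_UNTRACKED	/* on disk but not in the index */
};

/* Nonzero if writing merged content at the path would destroy user data. */
int would_lose_file(enum path_state state)
{
	return state == PATH_DIRTY || state == PATH_UNTRACKED;
}

/*
 * Format the message prefix the testcases grep for, e.g.
 * "Refusing to lose dirty file at z/c".
 * Returns 0 if a message was written, 1 if the path is safe to write
 * (buf is left untouched), -1 if buf is too small.
 */
int describe_refusal(char *buf, size_t len, enum path_state state,
		     const char *path)
{
	int n;

	assert(buf && path && strlen(path) > 0);
	if (!would_lose_file(state))
		return 1;
	n = snprintf(buf, len, "Refusing to lose %s file at %s",
		     state == PATH_DIRTY ? "dirty" : "untracked", path);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would check describe_refusal() first and, when it returns 0,
write the merged content to an alternate name instead of the original
path.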


* [PATCH v10 12/36] merge-recursive: move the get_renames() function
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (10 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 11/36] directory rename detection: tests for handling overwriting dirty files Elijah Newren
@ 2018-04-19 17:57 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 13/36] merge-recursive: introduce new functions to handle rename logic Elijah Newren
                   ` (25 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:57 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Move get_renames() later in the file so that it can call functions
defined after its old location, without either moving all of those
functions or adding forward declarations for them.  The move is
verbatim apart from one added blank line, which improves readability
and silences checkpatch.pl.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-recursive.c | 139 +++++++++++++++++++++++-----------------------
 1 file changed, 70 insertions(+), 69 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 0c0d48624d..973b6e2985 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -537,75 +537,6 @@ struct rename {
 	unsigned processed:1;
 };
 
-/*
- * Get information of all renames which occurred between 'o_tree' and
- * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
- * 'b_tree') to be able to associate the correct cache entries with
- * the rename information. 'tree' is always equal to either a_tree or b_tree.
- */
-static struct string_list *get_renames(struct merge_options *o,
-				       struct tree *tree,
-				       struct tree *o_tree,
-				       struct tree *a_tree,
-				       struct tree *b_tree,
-				       struct string_list *entries)
-{
-	int i;
-	struct string_list *renames;
-	struct diff_options opts;
-
-	renames = xcalloc(1, sizeof(struct string_list));
-	if (!o->detect_rename)
-		return renames;
-
-	diff_setup(&opts);
-	opts.flags.recursive = 1;
-	opts.flags.rename_empty = 0;
-	opts.detect_rename = DIFF_DETECT_RENAME;
-	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
-			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
-			    1000;
-	opts.rename_score = o->rename_score;
-	opts.show_rename_progress = o->show_rename_progress;
-	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_setup_done(&opts);
-	diff_tree_oid(&o_tree->object.oid, &tree->object.oid, "", &opts);
-	diffcore_std(&opts);
-	if (opts.needed_rename_limit > o->needed_rename_limit)
-		o->needed_rename_limit = opts.needed_rename_limit;
-	for (i = 0; i < diff_queued_diff.nr; ++i) {
-		struct string_list_item *item;
-		struct rename *re;
-		struct diff_filepair *pair = diff_queued_diff.queue[i];
-		if (pair->status != 'R') {
-			diff_free_filepair(pair);
-			continue;
-		}
-		re = xmalloc(sizeof(*re));
-		re->processed = 0;
-		re->pair = pair;
-		item = string_list_lookup(entries, re->pair->one->path);
-		if (!item)
-			re->src_entry = insert_stage_data(re->pair->one->path,
-					o_tree, a_tree, b_tree, entries);
-		else
-			re->src_entry = item->util;
-
-		item = string_list_lookup(entries, re->pair->two->path);
-		if (!item)
-			re->dst_entry = insert_stage_data(re->pair->two->path,
-					o_tree, a_tree, b_tree, entries);
-		else
-			re->dst_entry = item->util;
-		item = string_list_insert(renames, pair->one->path);
-		item->util = re;
-	}
-	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_queued_diff.nr = 0;
-	diff_flush(&opts);
-	return renames;
-}
-
 static int update_stages(struct merge_options *opt, const char *path,
 			 const struct diff_filespec *o,
 			 const struct diff_filespec *a,
@@ -1390,6 +1321,76 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 	return ret;
 }
 
+/*
+ * Get information of all renames which occurred between 'o_tree' and
+ * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
+ * 'b_tree') to be able to associate the correct cache entries with
+ * the rename information. 'tree' is always equal to either a_tree or b_tree.
+ */
+static struct string_list *get_renames(struct merge_options *o,
+				       struct tree *tree,
+				       struct tree *o_tree,
+				       struct tree *a_tree,
+				       struct tree *b_tree,
+				       struct string_list *entries)
+{
+	int i;
+	struct string_list *renames;
+	struct diff_options opts;
+
+	renames = xcalloc(1, sizeof(struct string_list));
+	if (!o->detect_rename)
+		return renames;
+
+	diff_setup(&opts);
+	opts.flags.recursive = 1;
+	opts.flags.rename_empty = 0;
+	opts.detect_rename = DIFF_DETECT_RENAME;
+	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
+			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
+			    1000;
+	opts.rename_score = o->rename_score;
+	opts.show_rename_progress = o->show_rename_progress;
+	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_setup_done(&opts);
+	diff_tree_oid(&o_tree->object.oid, &tree->object.oid, "", &opts);
+	diffcore_std(&opts);
+	if (opts.needed_rename_limit > o->needed_rename_limit)
+		o->needed_rename_limit = opts.needed_rename_limit;
+	for (i = 0; i < diff_queued_diff.nr; ++i) {
+		struct string_list_item *item;
+		struct rename *re;
+		struct diff_filepair *pair = diff_queued_diff.queue[i];
+
+		if (pair->status != 'R') {
+			diff_free_filepair(pair);
+			continue;
+		}
+		re = xmalloc(sizeof(*re));
+		re->processed = 0;
+		re->pair = pair;
+		item = string_list_lookup(entries, re->pair->one->path);
+		if (!item)
+			re->src_entry = insert_stage_data(re->pair->one->path,
+					o_tree, a_tree, b_tree, entries);
+		else
+			re->src_entry = item->util;
+
+		item = string_list_lookup(entries, re->pair->two->path);
+		if (!item)
+			re->dst_entry = insert_stage_data(re->pair->two->path,
+					o_tree, a_tree, b_tree, entries);
+		else
+			re->dst_entry = item->util;
+		item = string_list_insert(renames, pair->one->path);
+		item->util = re;
+	}
+	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_queued_diff.nr = 0;
+	diff_flush(&opts);
+	return renames;
+}
+
 static int process_renames(struct merge_options *o,
 			   struct string_list *a_renames,
 			   struct string_list *b_renames)
-- 
2.17.0.290.ge988e9ce2a
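
The moved get_renames() picks its rename limit with a nested
conditional: the merge-specific limit if it is non-negative, else the
diff limit if that is non-negative, else 1000.  The same fallback chain
can be sketched as a standalone function.  The struct and function
names below are hypothetical stand-ins for illustration; only the two
field names and the 1000 default come from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the two merge_options fields used here. */
struct rename_limits {
	int merge_rename_limit;	/* negative is treated as "unset" */
	int diff_rename_limit;	/* negative is treated as "unset" */
};

/* Mirror the fallback order used when setting opts.rename_limit. */
int effective_rename_limit(const struct rename_limits *l)
{
	assert(l);
	if (l->merge_rename_limit >= 0)
		return l->merge_rename_limit;
	if (l->diff_rename_limit >= 0)
		return l->diff_rename_limit;
	return 1000;
}
```

Note that a limit of 0 counts as set, because the checks are ">= 0"
rather than "> 0".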


* [PATCH v10 13/36] merge-recursive: introduce new functions to handle rename logic
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (11 preceding siblings ...)
  2018-04-19 17:57 ` [PATCH v10 12/36] merge-recursive: move the get_renames() function Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 14/36] merge-recursive: fix leaks of allocated renames and diff_filepairs Elijah Newren
                   ` (24 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

The amount of logic in merge_trees() related to renames was just a few
lines, but split it out into new handle_renames() and cleanup_renames()
functions to prepare for additional logic to be added to each.  No code or
logic changes, just a new place to put stuff for when the rename detection
gains additional checks.

Note that process_renames() records pointers to various information (such
as diff_filepairs) into rename_conflict_info structs.  Even though the
rename string_lists are not directly used once handle_renames() completes,
we should not immediately free the lists at the end of that function
because they store the information referenced in the rename_conflict_info,
which is used later in process_entry().  Hence the separate
cleanup_renames() function.
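
The ownership constraint described above can be shown with a small
standalone sketch.  This is not git code: struct pair, struct rename_list,
struct conflict_info and the three *_phase() functions are hypothetical
stand-ins for diff_filepair, the rename string_lists,
rename_conflict_info, and the handle/process/cleanup steps.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a diff_filepair owned by a rename list. */
struct pair {
	char *path;
};

/* Hypothetical stand-in for a rename string_list: it owns its pairs. */
struct rename_list {
	struct pair **items;
	size_t nr;
};

/* Hypothetical stand-in for rename_conflict_info: it only borrows. */
struct conflict_info {
	const struct pair *borrowed;
};

static void *xalloc(size_t n)
{
	void *p = malloc(n);

	if (!p)
		abort();
	return p;
}

/* Phase 1: build the list and record a borrowed pointer into it. */
static struct rename_list *handle_phase(const char *path,
					struct conflict_info *ci)
{
	struct rename_list *list = xalloc(sizeof(*list));
	struct pair *p = xalloc(sizeof(*p));
	size_t len;

	assert(path != NULL && ci != NULL);
	len = strlen(path) + 1;
	p->path = xalloc(len);
	memcpy(p->path, path, len);

	list->items = xalloc(sizeof(*list->items));
	list->items[0] = p;
	list->nr = 1;

	ci->borrowed = p;	/* points at memory the list owns */
	return list;
}

/* Phase 2: later processing reads through the borrowed pointer. */
static size_t process_phase(const struct conflict_info *ci)
{
	return strlen(ci->borrowed->path);
}

/* Phase 3: only after phase 2 is it safe to free what the list owns. */
static void cleanup_phase(struct rename_list *list)
{
	size_t i;

	for (i = 0; i < list->nr; i++) {
		free(list->items[i]->path);
		free(list->items[i]);
	}
	free(list->items);
	free(list);
}
```

Calling cleanup_phase() before process_phase() in this sketch would leave
ci->borrowed dangling, which is the situation the separate cleanup
function avoids.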

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-recursive.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 973b6e2985..40e142efdb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1646,6 +1646,32 @@ static int process_renames(struct merge_options *o,
 	return clean_merge;
 }
 
+struct rename_info {
+	struct string_list *head_renames;
+	struct string_list *merge_renames;
+};
+
+static int handle_renames(struct merge_options *o,
+			  struct tree *common,
+			  struct tree *head,
+			  struct tree *merge,
+			  struct string_list *entries,
+			  struct rename_info *ri)
+{
+	ri->head_renames  = get_renames(o, head, common, head, merge, entries);
+	ri->merge_renames = get_renames(o, merge, common, head, merge, entries);
+	return process_renames(o, ri->head_renames, ri->merge_renames);
+}
+
+static void cleanup_renames(struct rename_info *re_info)
+{
+	string_list_clear(re_info->head_renames, 0);
+	string_list_clear(re_info->merge_renames, 0);
+
+	free(re_info->head_renames);
+	free(re_info->merge_renames);
+}
+
 static struct object_id *stage_oid(const struct object_id *oid, unsigned mode)
 {
 	return (is_null_oid(oid) || mode == 0) ? NULL: (struct object_id *)oid;
@@ -2005,7 +2031,8 @@ int merge_trees(struct merge_options *o,
 	}
 
 	if (unmerged_cache()) {
-		struct string_list *entries, *re_head, *re_merge;
+		struct string_list *entries;
+		struct rename_info re_info;
 		int i;
 		/*
 		 * Only need the hashmap while processing entries, so
@@ -2019,9 +2046,8 @@ int merge_trees(struct merge_options *o,
 		get_files_dirs(o, merge);
 
 		entries = get_unmerged();
-		re_head  = get_renames(o, head, common, head, merge, entries);
-		re_merge = get_renames(o, merge, common, head, merge, entries);
-		clean = process_renames(o, re_head, re_merge);
+		clean = handle_renames(o, common, head, merge, entries,
+				       &re_info);
 		record_df_conflict_files(o, entries);
 		if (clean < 0)
 			goto cleanup;
@@ -2046,16 +2072,13 @@ int merge_trees(struct merge_options *o,
 		}
 
 cleanup:
-		string_list_clear(re_merge, 0);
-		string_list_clear(re_head, 0);
+		cleanup_renames(&re_info);
+
 		string_list_clear(entries, 1);
+		free(entries);
 
 		hashmap_free(&o->current_file_dir_set, 1);
 
-		free(re_merge);
-		free(re_head);
-		free(entries);
-
 		if (clean < 0)
 			return clean;
 	}
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 14/36] merge-recursive: fix leaks of allocated renames and diff_filepairs
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (12 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 13/36] merge-recursive: introduce new functions to handle rename logic Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 15/36] merge-recursive: make !o->detect_rename codepath more obvious Elijah Newren
                   ` (23 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

get_renames() has always zero'ed out diff_queued_diff.nr while only
manually free'ing diff_filepairs that did not correspond to renames.
Further, it allocated struct renames that were tucked away in the
returned string_list.  Make sure all of these are deallocated when we
are done with them.
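
As a rough illustration of the nested ownership involved, here is a
standalone sketch (not git code).  The counting allocator and the
struct filepair / struct rename_rec / struct rename_list names are
assumptions made up for the example; they only mimic a list whose util
payloads each own a further allocation.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocations;	/* hypothetical leak counter for the demo */

static void *counted_alloc(size_t n)
{
	void *p = malloc(n);

	if (!p)
		abort();
	live_allocations++;
	return p;
}

static void counted_free(void *p)
{
	if (p) {
		live_allocations--;
		free(p);
	}
}

struct filepair {		/* stand-in for diff_filepair */
	int status;
};

struct rename_rec {		/* stand-in for struct rename */
	struct filepair *pair;
};

struct rename_list {		/* stand-in for the returned string_list */
	struct rename_rec **util;
	size_t nr;
};

static struct rename_list *make_list(size_t nr)
{
	struct rename_list *list = counted_alloc(sizeof(*list));
	size_t i;

	assert(nr > 0);
	list->util = counted_alloc(nr * sizeof(*list->util));
	list->nr = nr;
	for (i = 0; i < nr; i++) {
		list->util[i] = counted_alloc(sizeof(struct rename_rec));
		list->util[i]->pair = counted_alloc(sizeof(struct filepair));
		list->util[i]->pair->status = 'R';
	}
	return list;
}

/* Free innermost allocations first, then the payloads, then the list. */
static void deep_cleanup(struct rename_list *list)
{
	size_t i;

	if (!list)
		return;
	for (i = 0; i < list->nr; i++) {
		counted_free(list->util[i]->pair);
		counted_free(list->util[i]);
	}
	counted_free(list->util);
	counted_free(list);
}
```

Freeing only list->util and list would leave two allocations per entry
outstanding in this sketch, which is the kind of leak being fixed.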

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-recursive.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 40e142efdb..fc96653f63 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1663,13 +1663,23 @@ static int handle_renames(struct merge_options *o,
 	return process_renames(o, ri->head_renames, ri->merge_renames);
 }
 
-static void cleanup_renames(struct rename_info *re_info)
+static void cleanup_rename(struct string_list *rename)
 {
-	string_list_clear(re_info->head_renames, 0);
-	string_list_clear(re_info->merge_renames, 0);
+	const struct rename *re;
+	int i;
 
-	free(re_info->head_renames);
-	free(re_info->merge_renames);
+	for (i = 0; i < rename->nr; i++) {
+		re = rename->items[i].util;
+		diff_free_filepair(re->pair);
+	}
+	string_list_clear(rename, 1);
+	free(rename);
+}
+
+static void cleanup_renames(struct rename_info *re_info)
+{
+	cleanup_rename(re_info->head_renames);
+	cleanup_rename(re_info->merge_renames);
 }
 
 static struct object_id *stage_oid(const struct object_id *oid, unsigned mode)
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 15/36] merge-recursive: make !o->detect_rename codepath more obvious
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (13 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 14/36] merge-recursive: fix leaks of allocated renames and diff_filepairs Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 16/36] merge-recursive: split out code for determining diff_filepairs Elijah Newren
                   ` (22 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Previously, if !o->detect_rename then get_renames() would return an
empty string_list, and then process_renames() would have nothing to
iterate over.  It seems more straightforward to simply avoid calling
either function in that case.
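
A minimal sketch of the resulting control flow, assuming simplified
stand-in types (struct list, struct rename_lists) and hypothetical
function names rather than the real merge-recursive API:

```c
#include <assert.h>
#include <stdlib.h>

struct list {			/* stand-in for a rename string_list */
	size_t nr;
	int *items;
};

struct rename_lists {		/* stand-in for struct rename_info */
	struct list *head_renames;
	struct list *merge_renames;
};

static struct list *new_list(size_t nr)
{
	struct list *l = malloc(sizeof(*l));

	assert(nr > 0);
	if (!l)
		abort();
	l->items = calloc(nr, sizeof(*l->items));
	if (!l->items)
		abort();
	l->nr = nr;
	return l;
}

/* Returns 1 ("clean") without building any lists when detection is off. */
static int handle_renames_sketch(int detect_rename, struct rename_lists *ri)
{
	assert(ri != NULL);
	ri->head_renames = NULL;
	ri->merge_renames = NULL;

	if (!detect_rename)
		return 1;

	ri->head_renames = new_list(2);
	ri->merge_renames = new_list(3);
	return 1;
}

/* Must tolerate NULL, since the early return above leaves lists unset. */
static size_t cleanup_list(struct list *l)
{
	size_t released;

	if (l == NULL)
		return 0;
	released = l->nr;
	free(l->items);
	free(l);
	return released;
}
```

The NULL check in the cleanup path is what makes the early return safe:
without it, the cleanup would dereference a list that was never built.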

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-recursive.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index fc96653f63..5da60b9516 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1339,8 +1339,6 @@ static struct string_list *get_renames(struct merge_options *o,
 	struct diff_options opts;
 
 	renames = xcalloc(1, sizeof(struct string_list));
-	if (!o->detect_rename)
-		return renames;
 
 	diff_setup(&opts);
 	opts.flags.recursive = 1;
@@ -1658,6 +1656,12 @@ static int handle_renames(struct merge_options *o,
 			  struct string_list *entries,
 			  struct rename_info *ri)
 {
+	ri->head_renames = NULL;
+	ri->merge_renames = NULL;
+
+	if (!o->detect_rename)
+		return 1;
+
 	ri->head_renames  = get_renames(o, head, common, head, merge, entries);
 	ri->merge_renames = get_renames(o, merge, common, head, merge, entries);
 	return process_renames(o, ri->head_renames, ri->merge_renames);
@@ -1668,6 +1672,9 @@ static void cleanup_rename(struct string_list *rename)
 	const struct rename *re;
 	int i;
 
+	if (rename == NULL)
+		return;
+
 	for (i = 0; i < rename->nr; i++) {
 		re = rename->items[i].util;
 		diff_free_filepair(re->pair);
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 16/36] merge-recursive: split out code for determining diff_filepairs
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (14 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 15/36] merge-recursive: make !o->detect_rename codepath more obvious Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 17/36] merge-recursive: make a helper function for cleanup for handle_renames Elijah Newren
                   ` (21 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Create a new function, get_diffpairs(), to compute the diff_filepairs
between two trees.  While these are currently only used in
get_renames(), I want them to be available to some new functions.  No
actual logic changes yet.
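
The key move in such a split is handing the contents of a shared global
queue to the caller and resetting the global, so that later use of the
global cannot touch what was handed out.  A standalone sketch of that
pattern, where struct queue, global_queue and the helper names are
hypothetical stand-ins and not the diff machinery itself:

```c
#include <assert.h>
#include <stdlib.h>

struct queue {
	int *items;
	size_t nr;
};

static struct queue global_queue;	/* stand-in for a shared global queue */

static void global_push(int value)
{
	int *grown = realloc(global_queue.items,
			     (global_queue.nr + 1) * sizeof(*grown));

	if (!grown)
		abort();
	grown[global_queue.nr++] = value;
	global_queue.items = grown;
}

/*
 * Copy the global queue into a heap-allocated snapshot, then reset the
 * global so the snapshot becomes the only owner of the items.
 */
static struct queue *take_queue(void)
{
	struct queue *ret = malloc(sizeof(*ret));

	if (!ret)
		abort();
	*ret = global_queue;
	global_queue.items = NULL;
	global_queue.nr = 0;
	return ret;
}

static void free_queue(struct queue *q)
{
	assert(q != NULL);
	free(q->items);
	free(q);
}
```

Because each call to take_queue() leaves the global empty, two
snapshots taken one after the other stay independent, which is what
lets one caller hold the results for both sides of a merge at once.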

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-recursive.c | 84 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 62 insertions(+), 22 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 5da60b9516..55a8ace948 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1322,24 +1322,15 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 }
 
 /*
- * Get information of all renames which occurred between 'o_tree' and
- * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
- * 'b_tree') to be able to associate the correct cache entries with
- * the rename information. 'tree' is always equal to either a_tree or b_tree.
+ * Get the diff_filepairs changed between o_tree and tree.
  */
-static struct string_list *get_renames(struct merge_options *o,
-				       struct tree *tree,
-				       struct tree *o_tree,
-				       struct tree *a_tree,
-				       struct tree *b_tree,
-				       struct string_list *entries)
+static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
+					       struct tree *o_tree,
+					       struct tree *tree)
 {
-	int i;
-	struct string_list *renames;
+	struct diff_queue_struct *ret;
 	struct diff_options opts;
 
-	renames = xcalloc(1, sizeof(struct string_list));
-
 	diff_setup(&opts);
 	opts.flags.recursive = 1;
 	opts.flags.rename_empty = 0;
@@ -1355,10 +1346,41 @@ static struct string_list *get_renames(struct merge_options *o,
 	diffcore_std(&opts);
 	if (opts.needed_rename_limit > o->needed_rename_limit)
 		o->needed_rename_limit = opts.needed_rename_limit;
-	for (i = 0; i < diff_queued_diff.nr; ++i) {
+
+	ret = xmalloc(sizeof(*ret));
+	*ret = diff_queued_diff;
+
+	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_queued_diff.nr = 0;
+	diff_queued_diff.queue = NULL;
+	diff_flush(&opts);
+	return ret;
+}
+
+/*
+ * Get information of all renames which occurred in 'pairs', making use of
+ * any implicit directory renames inferred from the other side of history.
+ * We need the three trees in the merge ('o_tree', 'a_tree' and 'b_tree')
+ * to be able to associate the correct cache entries with the rename
+ * information; tree is always equal to either a_tree or b_tree.
+ */
+static struct string_list *get_renames(struct merge_options *o,
+				       struct diff_queue_struct *pairs,
+				       struct tree *tree,
+				       struct tree *o_tree,
+				       struct tree *a_tree,
+				       struct tree *b_tree,
+				       struct string_list *entries)
+{
+	int i;
+	struct string_list *renames;
+
+	renames = xcalloc(1, sizeof(struct string_list));
+
+	for (i = 0; i < pairs->nr; ++i) {
 		struct string_list_item *item;
 		struct rename *re;
-		struct diff_filepair *pair = diff_queued_diff.queue[i];
+		struct diff_filepair *pair = pairs->queue[i];
 
 		if (pair->status != 'R') {
 			diff_free_filepair(pair);
@@ -1383,9 +1405,6 @@ static struct string_list *get_renames(struct merge_options *o,
 		item = string_list_insert(renames, pair->one->path);
 		item->util = re;
 	}
-	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_queued_diff.nr = 0;
-	diff_flush(&opts);
 	return renames;
 }
 
@@ -1656,15 +1675,36 @@ static int handle_renames(struct merge_options *o,
 			  struct string_list *entries,
 			  struct rename_info *ri)
 {
+	struct diff_queue_struct *head_pairs, *merge_pairs;
+	int clean;
+
 	ri->head_renames = NULL;
 	ri->merge_renames = NULL;
 
 	if (!o->detect_rename)
 		return 1;
 
-	ri->head_renames  = get_renames(o, head, common, head, merge, entries);
-	ri->merge_renames = get_renames(o, merge, common, head, merge, entries);
-	return process_renames(o, ri->head_renames, ri->merge_renames);
+	head_pairs = get_diffpairs(o, common, head);
+	merge_pairs = get_diffpairs(o, common, merge);
+
+	ri->head_renames  = get_renames(o, head_pairs, head,
+					 common, head, merge, entries);
+	ri->merge_renames = get_renames(o, merge_pairs, merge,
+					 common, head, merge, entries);
+	clean = process_renames(o, ri->head_renames, ri->merge_renames);
+
+	/*
+	 * Some cleanup is deferred until cleanup_renames() because the
+	 * data structures are still needed and referenced in
+	 * process_entry().  But there are a few things we can free now.
+	 */
+
+	free(head_pairs->queue);
+	free(head_pairs);
+	free(merge_pairs->queue);
+	free(merge_pairs);
+
+	return clean;
 }
 
 static void cleanup_rename(struct string_list *rename)
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 17/36] merge-recursive: make a helper function for cleanup for handle_renames
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (15 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 16/36] merge-recursive: split out code for determining diff_filepairs Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 18/36] merge-recursive: add get_directory_renames() Elijah Newren
                   ` (20 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

In anticipation of more involved cleanup to come, make a helper function
for doing the cleanup at the end of handle_renames.  Rename the already
existing cleanup_rename[s]() to final_cleanup_rename[s](), name the new
helper initial_cleanup_rename(), and leave the big comment in the code
about why we can't do all the cleanup at once.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-recursive.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 55a8ace948..30894c1cc7 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1668,6 +1668,12 @@ struct rename_info {
 	struct string_list *merge_renames;
 };
 
+static void initial_cleanup_rename(struct diff_queue_struct *pairs)
+{
+	free(pairs->queue);
+	free(pairs);
+}
+
 static int handle_renames(struct merge_options *o,
 			  struct tree *common,
 			  struct tree *head,
@@ -1698,16 +1704,13 @@ static int handle_renames(struct merge_options *o,
 	 * data structures are still needed and referenced in
 	 * process_entry().  But there are a few things we can free now.
 	 */
-
-	free(head_pairs->queue);
-	free(head_pairs);
-	free(merge_pairs->queue);
-	free(merge_pairs);
+	initial_cleanup_rename(head_pairs);
+	initial_cleanup_rename(merge_pairs);
 
 	return clean;
 }
 
-static void cleanup_rename(struct string_list *rename)
+static void final_cleanup_rename(struct string_list *rename)
 {
 	const struct rename *re;
 	int i;
@@ -1723,10 +1726,10 @@ static void cleanup_rename(struct string_list *rename)
 	free(rename);
 }
 
-static void cleanup_renames(struct rename_info *re_info)
+static void final_cleanup_renames(struct rename_info *re_info)
 {
-	cleanup_rename(re_info->head_renames);
-	cleanup_rename(re_info->merge_renames);
+	final_cleanup_rename(re_info->head_renames);
+	final_cleanup_rename(re_info->merge_renames);
 }
 
 static struct object_id *stage_oid(const struct object_id *oid, unsigned mode)
@@ -2129,7 +2132,7 @@ int merge_trees(struct merge_options *o,
 		}
 
 cleanup:
-		cleanup_renames(&re_info);
+		final_cleanup_renames(&re_info);
 
 		string_list_clear(entries, 1);
 		free(entries);
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 18/36] merge-recursive: add get_directory_renames()
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (16 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 17/36] merge-recursive: make a helper function for cleanup for handle_renames Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-05-06 23:41   ` SZEDER Gábor
  2019-10-09 20:38   ` [PATCH v10 18/36] " Johannes Schindelin
  2018-04-19 17:58 ` [PATCH v10 19/36] merge-recursive: check for directory level conflicts Elijah Newren
                   ` (19 subsequent siblings)
  37 siblings, 2 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

This populates a set of directory renames for us.  The set of directory
renames is not yet used, but will be in subsequent commits.
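
Each file rename contributes one candidate directory rename: the part of
the old and new directory paths that remains after dropping the trailing
path components they share.  The following standalone sketch restates
that idea in a simplified form.  It is not the implementation in this
patch; renamed_dir_portion() and last_component_start() are hypothetical
names, and it reports lengths into the input strings instead of
allocating copies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Index just past the '/' preceding the last component, or 0 if none. */
static size_t last_component_start(const char *path, size_t len)
{
	while (len > 0 && path[len - 1] != '/')
		len--;
	return len;
}

/*
 * Returns 1 and sets *old_len / *new_len to the lengths of the renamed
 * directory prefixes, or returns 0 if no directory rename is implied.
 */
static int renamed_dir_portion(const char *old_path, const char *new_path,
			       size_t *old_len, size_t *new_len)
{
	const char *old_slash, *new_slash;
	size_t ol, nl;

	assert(old_path != NULL && new_path != NULL);
	old_slash = strrchr(old_path, '/');
	new_slash = strrchr(new_path, '/');
	if (!old_slash || !new_slash)
		return 0;	/* a toplevel file on either side */

	/* Start from the full directory part of each path. */
	ol = (size_t)(old_slash - old_path);
	nl = (size_t)(new_slash - new_path);

	for (;;) {
		size_t os = last_component_start(old_path, ol);
		size_t ns = last_component_start(new_path, nl);

		if (os == 0 || ns == 0)
			break;	/* one side has a single component left */
		if (ol - os != nl - ns ||
		    memcmp(old_path + os, new_path + ns, ol - os))
			break;	/* trailing components differ */
		ol = os - 1;	/* drop the shared "/component" */
		nl = ns - 1;
	}

	/* Same directory on both sides: only the basename changed. */
	if (ol == nl && !memcmp(old_path, new_path, ol))
		return 0;

	*old_len = ol;
	*new_len = nl;
	return 1;
}
```

For "a/b/c/d/e/foo.c" -> "a/b/some/thing/else/e/foo.c" the shared
trailing "e" is dropped, leaving "a/b/c/d" and "a/b/some/thing/else".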

Note that the use of a string_list for possible_new_dirs in the new
dir_rename_entry struct implies an O(n^2) algorithm; however, in practice
I expect the number of distinct directories that files were renamed into
from a single original directory to be O(1).  My guess is that n has a
mode of 1 and a mean of less than 2, so, for now, string_list seems good
enough for possible_new_dirs.
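
To make the tallying concrete, here is a simplified standalone sketch of
counting how many files went to each candidate target directory and then
picking a unique winner.  It is an illustration only: struct
target_votes, add_vote(), pick_winner() and the fixed MAX_TARGETS cap are
hypothetical, and the patch itself uses a string_list with a slightly
different tie-tracking scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TARGETS 8	/* hypothetical cap; the patch uses a growable list */

struct target_votes {
	const char *dirs[MAX_TARGETS];
	int counts[MAX_TARGETS];
	size_t nr;
};

/*
 * Record one renamed file that landed in new_dir.  The lookup here is a
 * linear scan, so n distinct targets cost O(n^2) overall in this sketch;
 * that is harmless when n is almost always 1 or 2.
 */
static void add_vote(struct target_votes *tv, const char *new_dir)
{
	size_t i;

	for (i = 0; i < tv->nr; i++) {
		if (!strcmp(tv->dirs[i], new_dir)) {
			tv->counts[i]++;
			return;
		}
	}
	assert(tv->nr < MAX_TARGETS);
	tv->dirs[tv->nr] = new_dir;
	tv->counts[tv->nr] = 1;
	tv->nr++;
}

/* Returns the target with strictly the most votes, or NULL on a tie. */
static const char *pick_winner(const struct target_votes *tv)
{
	const char *best = NULL;
	int max = 0;
	int tied = 0;
	size_t i;

	for (i = 0; i < tv->nr; i++) {
		if (tv->counts[i] == max) {
			tied = 1;
		} else if (tv->counts[i] > max) {
			max = tv->counts[i];
			best = tv->dirs[i];
			tied = 0;
		}
	}
	return tied ? NULL : best;
}
```

A NULL result corresponds to the case where no single target received
the bulk of the files, which the patch records as non_unique_new_dir.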

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-recursive.c | 224 +++++++++++++++++++++++++++++++++++++++++++++-
 merge-recursive.h |  18 ++++
 2 files changed, 239 insertions(+), 3 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 30894c1cc7..22c5e8e5c9 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -49,6 +49,44 @@ static unsigned int path_hash(const char *path)
 	return ignore_case ? strihash(path) : strhash(path);
 }
 
+static struct dir_rename_entry *dir_rename_find_entry(struct hashmap *hashmap,
+						      char *dir)
+{
+	struct dir_rename_entry key;
+
+	if (dir == NULL)
+		return NULL;
+	hashmap_entry_init(&key, strhash(dir));
+	key.dir = dir;
+	return hashmap_get(hashmap, &key, NULL);
+}
+
+static int dir_rename_cmp(const void *unused_cmp_data,
+			  const void *entry,
+			  const void *entry_or_key,
+			  const void *unused_keydata)
+{
+	const struct dir_rename_entry *e1 = entry;
+	const struct dir_rename_entry *e2 = entry_or_key;
+
+	return strcmp(e1->dir, e2->dir);
+}
+
+static void dir_rename_init(struct hashmap *map)
+{
+	hashmap_init(map, dir_rename_cmp, NULL, 0);
+}
+
+static void dir_rename_entry_init(struct dir_rename_entry *entry,
+				  char *directory)
+{
+	hashmap_entry_init(entry, strhash(directory));
+	entry->dir = directory;
+	entry->non_unique_new_dir = 0;
+	strbuf_init(&entry->new_dir, 0);
+	string_list_init(&entry->possible_new_dirs, 0);
+}
+
 static void flush_output(struct merge_options *o)
 {
 	if (o->buffer_output < 2 && o->obuf.len) {
@@ -1357,6 +1395,169 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
 	return ret;
 }
 
+static void get_renamed_dir_portion(const char *old_path, const char *new_path,
+				    char **old_dir, char **new_dir)
+{
+	char *end_of_old, *end_of_new;
+	int old_len, new_len;
+
+	*old_dir = NULL;
+	*new_dir = NULL;
+
+	/*
+	 * For
+	 *    "a/b/c/d/e/foo.c" -> "a/b/some/thing/else/e/foo.c"
+	 * the "e/foo.c" part is the same, we just want to know that
+	 *    "a/b/c/d" was renamed to "a/b/some/thing/else"
+	 * so, for this example, this function returns "a/b/c/d" in
+	 * *old_dir and "a/b/some/thing/else" in *new_dir.
+	 *
+	 * Also, if the basename of the file changed, we don't care.  We
+	 * want to know which portion of the directory, if any, changed.
+	 */
+	end_of_old = strrchr(old_path, '/');
+	end_of_new = strrchr(new_path, '/');
+
+	if (end_of_old == NULL || end_of_new == NULL)
+		return;
+	while (*--end_of_new == *--end_of_old &&
+	       end_of_old != old_path &&
+	       end_of_new != new_path)
+		; /* Do nothing; all in the while loop */
+	/*
+	 * We've found the first non-matching character in the directory
+	 * paths.  That means the current directory we were comparing
+	 * represents the rename.  Move end_of_old and end_of_new back
+	 * to the full directory name.
+	 */
+	if (*end_of_old == '/')
+		end_of_old++;
+	if (*end_of_old != '/')
+		end_of_new++;
+	end_of_old = strchr(end_of_old, '/');
+	end_of_new = strchr(end_of_new, '/');
+
+	/*
+	 * It may have been the case that old_path and new_path were the same
+	 * directory all along.  Don't claim a rename if they're the same.
+	 */
+	old_len = end_of_old - old_path;
+	new_len = end_of_new - new_path;
+
+	if (old_len != new_len || strncmp(old_path, new_path, old_len)) {
+		*old_dir = xstrndup(old_path, old_len);
+		*new_dir = xstrndup(new_path, new_len);
+	}
+}
+
+static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
+					     struct tree *tree)
+{
+	struct hashmap *dir_renames;
+	struct hashmap_iter iter;
+	struct dir_rename_entry *entry;
+	int i;
+
+	/*
+	 * Typically, we think of a directory rename as all files from a
+	 * certain directory being moved to a target directory.  However,
+	 * what if someone first moved two files from the original
+	 * directory in one commit, and then renamed the directory
+	 * somewhere else in a later commit?  At merge time, we just know
+	 * that files from the original directory went to two different
+	 * places, and that the bulk of them ended up in the same place.
+	 * We want each directory rename to represent where the bulk of the
+	 * files from that directory end up; this function exists to find
+	 * where the bulk of the files went.
+	 *
+	 * The first loop below simply iterates through the list of file
+	 * renames, finding out how often each directory rename pair
+	 * possibility occurs.
+	 */
+	dir_renames = xmalloc(sizeof(struct hashmap));
+	dir_rename_init(dir_renames);
+	for (i = 0; i < pairs->nr; ++i) {
+		struct string_list_item *item;
+		int *count;
+		struct diff_filepair *pair = pairs->queue[i];
+		char *old_dir, *new_dir;
+
+		/* File not part of directory rename if it wasn't renamed */
+		if (pair->status != 'R')
+			continue;
+
+		get_renamed_dir_portion(pair->one->path, pair->two->path,
+					&old_dir,        &new_dir);
+		if (!old_dir)
+			/* Directory didn't change at all; ignore this one. */
+			continue;
+
+		entry = dir_rename_find_entry(dir_renames, old_dir);
+		if (!entry) {
+			entry = xmalloc(sizeof(struct dir_rename_entry));
+			dir_rename_entry_init(entry, old_dir);
+			hashmap_put(dir_renames, entry);
+		} else {
+			free(old_dir);
+		}
+		item = string_list_lookup(&entry->possible_new_dirs, new_dir);
+		if (!item) {
+			item = string_list_insert(&entry->possible_new_dirs,
+						  new_dir);
+			item->util = xcalloc(1, sizeof(int));
+		} else {
+			free(new_dir);
+		}
+		count = item->util;
+		*count += 1;
+	}
+
+	/*
+	 * For each directory with files moved out of it, we find out which
+	 * target directory received the most files so we can declare it to
+	 * be the "winning" target location for the directory rename.  This
+	 * winner gets recorded in new_dir.  If there is no winner
+	 * (multiple target directories received the same number of files),
+	 * we set non_unique_new_dir.  Once we've determined the winner (or
+	 * that there is no winner), we no longer need possible_new_dirs.
+	 */
+	hashmap_iter_init(dir_renames, &iter);
+	while ((entry = hashmap_iter_next(&iter))) {
+		int max = 0;
+		int bad_max = 0;
+		char *best = NULL;
+
+		for (i = 0; i < entry->possible_new_dirs.nr; i++) {
+			int *count = entry->possible_new_dirs.items[i].util;
+
+			if (*count == max)
+				bad_max = max;
+			else if (*count > max) {
+				max = *count;
+				best = entry->possible_new_dirs.items[i].string;
+			}
+		}
+		if (bad_max == max)
+			entry->non_unique_new_dir = 1;
+		else {
+			assert(entry->new_dir.len == 0);
+			strbuf_addstr(&entry->new_dir, best);
+		}
+		/*
+		 * The relevant directory sub-portion of the original full
+		 * filepaths were xstrndup'ed before inserting into
+		 * possible_new_dirs, and instead of manually iterating the
+		 * list and free'ing each, just lie and tell
+		 * possible_new_dirs that it did the strdup'ing so that it
+		 * will free them for us.
+		 */
+		entry->possible_new_dirs.strdup_strings = 1;
+		string_list_clear(&entry->possible_new_dirs, 1);
+	}
+
+	return dir_renames;
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1668,8 +1869,21 @@ struct rename_info {
 	struct string_list *merge_renames;
 };
 
-static void initial_cleanup_rename(struct diff_queue_struct *pairs)
+static void initial_cleanup_rename(struct diff_queue_struct *pairs,
+				   struct hashmap *dir_renames)
 {
+	struct hashmap_iter iter;
+	struct dir_rename_entry *e;
+
+	hashmap_iter_init(dir_renames, &iter);
+	while ((e = hashmap_iter_next(&iter))) {
+		free(e->dir);
+		strbuf_release(&e->new_dir);
+		/* possible_new_dirs already cleared in get_directory_renames */
+	}
+	hashmap_free(dir_renames, 1);
+	free(dir_renames);
+
 	free(pairs->queue);
 	free(pairs);
 }
@@ -1682,6 +1896,7 @@ static int handle_renames(struct merge_options *o,
 			  struct rename_info *ri)
 {
 	struct diff_queue_struct *head_pairs, *merge_pairs;
+	struct hashmap *dir_re_head, *dir_re_merge;
 	int clean;
 
 	ri->head_renames = NULL;
@@ -1693,6 +1908,9 @@ static int handle_renames(struct merge_options *o,
 	head_pairs = get_diffpairs(o, common, head);
 	merge_pairs = get_diffpairs(o, common, merge);
 
+	dir_re_head = get_directory_renames(head_pairs, head);
+	dir_re_merge = get_directory_renames(merge_pairs, merge);
+
 	ri->head_renames  = get_renames(o, head_pairs, head,
 					 common, head, merge, entries);
 	ri->merge_renames = get_renames(o, merge_pairs, merge,
@@ -1704,8 +1922,8 @@ static int handle_renames(struct merge_options *o,
 	 * data structures are still needed and referenced in
 	 * process_entry().  But there are a few things we can free now.
 	 */
-	initial_cleanup_rename(head_pairs);
-	initial_cleanup_rename(merge_pairs);
+	initial_cleanup_rename(head_pairs, dir_re_head);
+	initial_cleanup_rename(merge_pairs, dir_re_merge);
 
 	return clean;
 }
diff --git a/merge-recursive.h b/merge-recursive.h
index 80d69d1401..fe64c78de4 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -29,6 +29,24 @@ struct merge_options {
 	struct string_list df_conflict_file_set;
 };
 
+/*
+ * For dir_rename_entry, directory names are stored as a full path from the
+ * toplevel of the repository and do not include a trailing '/'.  Also:
+ *
+ *   dir:                original name of directory being renamed
+ *   non_unique_new_dir: if true, could not determine new_dir
+ *   new_dir:            final name of directory being renamed
+ *   possible_new_dirs:  temporary used to help determine new_dir; see comments
+ *                       in get_directory_renames() for details
+ */
+struct dir_rename_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	char *dir;
+	unsigned non_unique_new_dir:1;
+	struct strbuf new_dir;
+	struct string_list possible_new_dirs;
+};
+
 /* merge_trees() but with recursive ancestor consolidation */
 int merge_recursive(struct merge_options *o,
 		    struct commit *h1,
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 19/36] merge-recursive: check for directory level conflicts
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (17 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 18/36] merge-recursive: add get_directory_renames() Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 20/36] merge-recursive: add computation of collisions due to dir rename & merging Elijah Newren
                   ` (18 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
directory level.  There will be additional checks at the individual
file level too, which will be added later.
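
Two of the directory-level checks can be sketched in isolation: dropping
a directory rename that both sides made identically, and dropping one
whose source directory still exists on the side that supposedly renamed
it.  The code below is a simplified standalone illustration, not the
patch's implementation; struct dir_rename, struct side and
prune_directory_renames() are hypothetical, and a plain array of
surviving directory names stands in for a tree lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dir_rename {
	const char *dir;	/* original directory */
	const char *new_dir;	/* target, or NULL if no unique target */
	int removed;		/* set when the rename is pruned */
};

struct side {
	struct dir_rename *renames;
	size_t nr;
	const char **surviving_dirs;	/* dirs still present on this side */
	size_t surviving_nr;
};

static struct dir_rename *find_rename(struct side *s, const char *dir)
{
	size_t i;

	for (i = 0; i < s->nr; i++)
		if (!strcmp(s->renames[i].dir, dir))
			return &s->renames[i];
	return NULL;
}

static int dir_survives(const struct side *s, const char *dir)
{
	size_t i;

	for (i = 0; i < s->surviving_nr; i++)
		if (!strcmp(s->surviving_dirs[i], dir))
			return 1;
	return 0;
}

static void prune_directory_renames(struct side *head, struct side *merge)
{
	size_t i;

	assert(head != NULL && merge != NULL);
	for (i = 0; i < head->nr; i++) {
		struct dir_rename *h = &head->renames[i];
		struct dir_rename *m = find_rename(merge, h->dir);

		if (m && h->new_dir && m->new_dir &&
		    !strcmp(h->new_dir, m->new_dir)) {
			/* Renamed identically on both sides: drop both. */
			h->removed = 1;
			m->removed = 1;
		} else if (dir_survives(head, h->dir)) {
			/* Source still exists: only a partial rename. */
			h->removed = 1;
		}
	}
	for (i = 0; i < merge->nr; i++) {
		struct dir_rename *m = &merge->renames[i];

		if (!m->removed && dir_survives(merge, m->dir))
			m->removed = 1;
	}
}
```

Only renames that survive both checks would go on to be applied to
individual paths in this sketch.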

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-recursive.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 119 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 22c5e8e5c9..3b0e509513 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1395,6 +1395,15 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
 	return ret;
 }
 
+static int tree_has_path(struct tree *tree, const char *path)
+{
+	struct object_id hashy;
+	unsigned int mode_o;
+
+	return !get_tree_entry(&tree->object.oid, path,
+			       &hashy, &mode_o);
+}
+
 static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 				    char **old_dir, char **new_dir)
 {
@@ -1450,6 +1459,112 @@ static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 	}
 }
 
+static void remove_hashmap_entries(struct hashmap *dir_renames,
+				   struct string_list *items_to_remove)
+{
+	int i;
+	struct dir_rename_entry *entry;
+
+	for (i = 0; i < items_to_remove->nr; i++) {
+		entry = items_to_remove->items[i].util;
+		hashmap_remove(dir_renames, entry, NULL);
+	}
+	string_list_clear(items_to_remove, 0);
+}
+
+/*
+ * There are a couple things we want to do at the directory level:
+ *   1. Check for both sides renaming to the same thing, in order to avoid
+ *      implicit renaming of files that should be left in place.  (See
+ *      testcase 6b in t6043 for details.)
+ *   2. Prune directory renames if there are still files left in the
+ *      original directory.  These represent a partial directory rename,
+ *      i.e. a rename where only some of the files within the directory
+ *      were renamed elsewhere.  (Technically, this could be done earlier
+ *      in get_directory_renames(), except that would prevent us from
+ *      doing the previous check and thus failing testcase 6b.)
+ *   3. Check for rename/rename(1to2) conflicts (at the directory level).
+ *      In the future, we could potentially record this info as well and
+ *      omit reporting rename/rename(1to2) conflicts for each path within
+ *      the affected directories, thus cleaning up the merge output.
+ *   NOTE: We do NOT check for rename/rename(2to1) conflicts at the
+ *         directory level, because merging directories is fine.  If it
+ *         causes conflicts for files within those merged directories, then
+ *         that should be detected at the individual path level.
+ */
+static void handle_directory_level_conflicts(struct merge_options *o,
+					     struct hashmap *dir_re_head,
+					     struct tree *head,
+					     struct hashmap *dir_re_merge,
+					     struct tree *merge)
+{
+	struct hashmap_iter iter;
+	struct dir_rename_entry *head_ent;
+	struct dir_rename_entry *merge_ent;
+
+	struct string_list remove_from_head = STRING_LIST_INIT_NODUP;
+	struct string_list remove_from_merge = STRING_LIST_INIT_NODUP;
+
+	hashmap_iter_init(dir_re_head, &iter);
+	while ((head_ent = hashmap_iter_next(&iter))) {
+		merge_ent = dir_rename_find_entry(dir_re_merge, head_ent->dir);
+		if (merge_ent &&
+		    !head_ent->non_unique_new_dir &&
+		    !merge_ent->non_unique_new_dir &&
+		    !strbuf_cmp(&head_ent->new_dir, &merge_ent->new_dir)) {
+			/* 1. Renamed identically; remove it from both sides */
+			string_list_append(&remove_from_head,
+					   head_ent->dir)->util = head_ent;
+			strbuf_release(&head_ent->new_dir);
+			string_list_append(&remove_from_merge,
+					   merge_ent->dir)->util = merge_ent;
+			strbuf_release(&merge_ent->new_dir);
+		} else if (tree_has_path(head, head_ent->dir)) {
+			/* 2. This wasn't a directory rename after all */
+			string_list_append(&remove_from_head,
+					   head_ent->dir)->util = head_ent;
+			strbuf_release(&head_ent->new_dir);
+		}
+	}
+
+	remove_hashmap_entries(dir_re_head, &remove_from_head);
+	remove_hashmap_entries(dir_re_merge, &remove_from_merge);
+
+	hashmap_iter_init(dir_re_merge, &iter);
+	while ((merge_ent = hashmap_iter_next(&iter))) {
+		head_ent = dir_rename_find_entry(dir_re_head, merge_ent->dir);
+		if (tree_has_path(merge, merge_ent->dir)) {
+			/* 2. This wasn't a directory rename after all */
+			string_list_append(&remove_from_merge,
+					   merge_ent->dir)->util = merge_ent;
+		} else if (head_ent &&
+			   !head_ent->non_unique_new_dir &&
+			   !merge_ent->non_unique_new_dir) {
+			/* 3. rename/rename(1to2) */
+			/*
+			 * We can assume it's not rename/rename(1to1) because
+			 * that was case (1), already checked above.  So we
+			 * know that head_ent->new_dir and merge_ent->new_dir
+			 * are different strings.
+			 */
+			output(o, 1, _("CONFLICT (rename/rename): "
+				       "Rename directory %s->%s in %s. "
+				       "Rename directory %s->%s in %s"),
+			       head_ent->dir, head_ent->new_dir.buf, o->branch1,
+			       head_ent->dir, merge_ent->new_dir.buf, o->branch2);
+			string_list_append(&remove_from_head,
+					   head_ent->dir)->util = head_ent;
+			strbuf_release(&head_ent->new_dir);
+			string_list_append(&remove_from_merge,
+					   merge_ent->dir)->util = merge_ent;
+			strbuf_release(&merge_ent->new_dir);
+		}
+	}
+
+	remove_hashmap_entries(dir_re_head, &remove_from_head);
+	remove_hashmap_entries(dir_re_merge, &remove_from_merge);
+}
+
 static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
 					     struct tree *tree)
 {
@@ -1911,6 +2026,10 @@ static int handle_renames(struct merge_options *o,
 	dir_re_head = get_directory_renames(head_pairs, head);
 	dir_re_merge = get_directory_renames(merge_pairs, merge);
 
+	handle_directory_level_conflicts(o,
+					 dir_re_head, head,
+					 dir_re_merge, merge);
+
 	ri->head_renames  = get_renames(o, head_pairs, head,
 					 common, head, merge, entries);
 	ri->merge_renames = get_renames(o, merge_pairs, merge,
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 20/36] merge-recursive: add computation of collisions due to dir rename & merging
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (18 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 19/36] merge-recursive: check for directory level conflicts Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 21/36] merge-recursive: check for file level conflicts then get new name Elijah Newren
                   ` (17 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Directory renaming and merging can cause one or more files to be moved to
where an existing file is, or cause several files to all be moved to
the same (otherwise vacant) location.  Add checking and reporting for such
cases, falling back to no-directory-rename handling for such paths.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
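Note for readers (not part of the commit): the following is a minimal
standalone sketch of the idea, not the code in this patch.  It applies a
directory rename to a path by swapping the leading directory, then counts
how many source paths land on one target.  The sketch_* names and the
simplified struct dir_rename are hypothetical; the real code uses
struct dir_rename_entry, strbufs and a hashmap of collision entries.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct dir_rename_entry. */
struct dir_rename {
	const char *dir;     /* old directory, no trailing slash */
	const char *new_dir; /* new directory, no trailing slash */
};

/*
 * Replace the leading rename->dir of old_path with rename->new_dir.
 * Returns a malloc'ed string the caller must free, or NULL if the
 * allocation fails.
 */
static char *sketch_apply_dir_rename(const struct dir_rename *rename,
				     const char *old_path)
{
	size_t oldlen = strlen(rename->dir);
	size_t newlen = strlen(rename->new_dir);
	size_t restlen;
	char *new_path;

	/* Same precondition as the patch: old_path starts with dir + '/'. */
	assert(!strncmp(old_path, rename->dir, oldlen));
	assert(old_path[oldlen] == '/');

	restlen = strlen(old_path + oldlen);
	new_path = malloc(newlen + restlen + 1);
	if (!new_path)
		return NULL;
	memcpy(new_path, rename->new_dir, newlen);
	/* Copy the remainder, including the terminating NUL. */
	memcpy(new_path + newlen, old_path + oldlen, restlen + 1);
	return new_path;
}

/* Find the rename with the longest directory that contains path. */
static const struct dir_rename *sketch_find_rename(const struct dir_rename *renames,
						   int nr, const char *path)
{
	const struct dir_rename *best = NULL;
	size_t best_len = 0;
	int i;

	for (i = 0; i < nr; i++) {
		size_t len = strlen(renames[i].dir);

		if (!strncmp(path, renames[i].dir, len) && path[len] == '/' &&
		    (!best || len > best_len)) {
			best = &renames[i];
			best_len = len;
		}
	}
	return best;
}

/*
 * Count how many paths map to target once directory renames are applied.
 * A result above 1 is the "several files moved to the same location" case.
 */
static int sketch_count_sources(const struct dir_rename *renames, int nr_renames,
				const char **paths, int nr_paths,
				const char *target)
{
	int i, count = 0;

	for (i = 0; i < nr_paths; i++) {
		const struct dir_rename *r =
			sketch_find_rename(renames, nr_renames, paths[i]);
		char *new_path;

		if (!r)
			continue;
		new_path = sketch_apply_dir_rename(r, paths[i]);
		if (new_path && !strcmp(new_path, target))
			count++;
		free(new_path);
	}
	return count;
}
```

With renames x -> z and y -> z, the paths x/a.c and y/a.c both map to
z/a.c, so sketch_count_sources() returns 2 for that target.  That is the
situation this patch detects before falling back to no-directory-rename
handling.
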
 merge-recursive.c | 146 +++++++++++++++++++++++++++++++++++++++++++++-
 merge-recursive.h |   7 +++
 2 files changed, 150 insertions(+), 3 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 3b0e509513..25ea6841fc 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -87,6 +87,29 @@ static void dir_rename_entry_init(struct dir_rename_entry *entry,
 	string_list_init(&entry->possible_new_dirs, 0);
 }
 
+static struct collision_entry *collision_find_entry(struct hashmap *hashmap,
+						    char *target_file)
+{
+	struct collision_entry key;
+
+	hashmap_entry_init(&key, strhash(target_file));
+	key.target_file = target_file;
+	return hashmap_get(hashmap, &key, NULL);
+}
+
+static int collision_cmp(void *unused_cmp_data,
+			 const struct collision_entry *e1,
+			 const struct collision_entry *e2,
+			 const void *unused_keydata)
+{
+	return strcmp(e1->target_file, e2->target_file);
+}
+
+static void collision_init(struct hashmap *map)
+{
+	hashmap_init(map, (hashmap_cmp_fn) collision_cmp, NULL, 0);
+}
+
 static void flush_output(struct merge_options *o)
 {
 	if (o->buffer_output < 2 && o->obuf.len) {
@@ -1404,6 +1427,31 @@ static int tree_has_path(struct tree *tree, const char *path)
 			       &hashy, &mode_o);
 }
 
+/*
+ * Return a new string that replaces the beginning portion (which matches
+ * entry->dir), with entry->new_dir.  In perl-speak:
+ *   new_path_name = (old_path =~ s/entry->dir/entry->new_dir/);
+ * NOTE:
+ *   Caller must ensure that old_path starts with entry->dir + '/'.
+ */
+static char *apply_dir_rename(struct dir_rename_entry *entry,
+			      const char *old_path)
+{
+	struct strbuf new_path = STRBUF_INIT;
+	int oldlen, newlen;
+
+	if (entry->non_unique_new_dir)
+		return NULL;
+
+	oldlen = strlen(entry->dir);
+	newlen = entry->new_dir.len + (strlen(old_path) - oldlen) + 1;
+	strbuf_grow(&new_path, newlen);
+	strbuf_addbuf(&new_path, &entry->new_dir);
+	strbuf_addstr(&new_path, &old_path[oldlen]);
+
+	return strbuf_detach(&new_path, NULL);
+}
+
 static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 				    char **old_dir, char **new_dir)
 {
@@ -1673,6 +1721,84 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
 	return dir_renames;
 }
 
+static struct dir_rename_entry *check_dir_renamed(const char *path,
+						  struct hashmap *dir_renames)
+{
+	char temp[PATH_MAX];
+	char *end;
+	struct dir_rename_entry *entry;
+
+	strcpy(temp, path);
+	while ((end = strrchr(temp, '/'))) {
+		*end = '\0';
+		entry = dir_rename_find_entry(dir_renames, temp);
+		if (entry)
+			return entry;
+	}
+	return NULL;
+}
+
+static void compute_collisions(struct hashmap *collisions,
+			       struct hashmap *dir_renames,
+			       struct diff_queue_struct *pairs)
+{
+	int i;
+
+	/*
+	 * Multiple files can be mapped to the same path due to directory
+	 * renames done by the other side of history.  Since that other
+	 * side of history could have merged multiple directories into one,
+	 * if our side of history added the same file basename to each of
+	 * those directories, then all N of them would get implicitly
+	 * renamed by the directory rename detection into the same path,
+	 * and we'd get an add/add/.../add conflict, and all those adds
+	 * from *this* side of history.  This is not representable in the
+	 * index, and users aren't going to easily be able to make sense of
+	 * it.  So we need to provide a good warning about what's
+	 * happening, and fall back to no-directory-rename detection
+	 * behavior for those paths.
+	 *
+	 * See testcases 9e and all of section 5 from t6043 for examples.
+	 */
+	collision_init(collisions);
+
+	for (i = 0; i < pairs->nr; ++i) {
+		struct dir_rename_entry *dir_rename_ent;
+		struct collision_entry *collision_ent;
+		char *new_path;
+		struct diff_filepair *pair = pairs->queue[i];
+
+		if (pair->status == 'D')
+			continue;
+		dir_rename_ent = check_dir_renamed(pair->two->path,
+						   dir_renames);
+		if (!dir_rename_ent)
+			continue;
+
+		new_path = apply_dir_rename(dir_rename_ent, pair->two->path);
+		if (!new_path)
+			/*
+			 * dir_rename_ent->non_unique_new_dir is true, which
+			 * means there is no directory rename for us to use,
+			 * which means it won't cause us any additional
+			 * collisions.
+			 */
+			continue;
+		collision_ent = collision_find_entry(collisions, new_path);
+		if (!collision_ent) {
+			collision_ent = xcalloc(1,
+						sizeof(struct collision_entry));
+			hashmap_entry_init(collision_ent, strhash(new_path));
+			hashmap_put(collisions, collision_ent);
+			collision_ent->target_file = new_path;
+		} else {
+			free(new_path);
+		}
+		string_list_insert(&collision_ent->source_files,
+				   pair->two->path);
+	}
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1682,6 +1808,7 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
  */
 static struct string_list *get_renames(struct merge_options *o,
 				       struct diff_queue_struct *pairs,
+				       struct hashmap *dir_renames,
 				       struct tree *tree,
 				       struct tree *o_tree,
 				       struct tree *a_tree,
@@ -1689,8 +1816,12 @@ static struct string_list *get_renames(struct merge_options *o,
 				       struct string_list *entries)
 {
 	int i;
+	struct hashmap collisions;
+	struct hashmap_iter iter;
+	struct collision_entry *e;
 	struct string_list *renames;
 
+	compute_collisions(&collisions, dir_renames, pairs);
 	renames = xcalloc(1, sizeof(struct string_list));
 
 	for (i = 0; i < pairs->nr; ++i) {
@@ -1721,6 +1852,13 @@ static struct string_list *get_renames(struct merge_options *o,
 		item = string_list_insert(renames, pair->one->path);
 		item->util = re;
 	}
+
+	hashmap_iter_init(&collisions, &iter);
+	while ((e = hashmap_iter_next(&iter))) {
+		free(e->target_file);
+		string_list_clear(&e->source_files, 0);
+	}
+	hashmap_free(&collisions, 1);
 	return renames;
 }
 
@@ -2030,9 +2168,11 @@ static int handle_renames(struct merge_options *o,
 					 dir_re_head, head,
 					 dir_re_merge, merge);
 
-	ri->head_renames  = get_renames(o, head_pairs, head,
-					 common, head, merge, entries);
-	ri->merge_renames = get_renames(o, merge_pairs, merge,
+	ri->head_renames  = get_renames(o, head_pairs,
+					dir_re_merge, head,
+					common, head, merge, entries);
+	ri->merge_renames = get_renames(o, merge_pairs,
+					dir_re_head, merge,
 					 common, head, merge, entries);
 	clean = process_renames(o, ri->head_renames, ri->merge_renames);
 
diff --git a/merge-recursive.h b/merge-recursive.h
index fe64c78de4..50a4e6af4e 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -47,6 +47,13 @@ struct dir_rename_entry {
 	struct string_list possible_new_dirs;
 };
 
+struct collision_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	char *target_file;
+	struct string_list source_files;
+	unsigned reported_already:1;
+};
+
 /* merge_trees() but with recursive ancestor consolidation */
 int merge_recursive(struct merge_options *o,
 		    struct commit *h1,
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 21/36] merge-recursive: check for file level conflicts then get new name
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (19 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 20/36] merge-recursive: add computation of collisions due to dir rename & merging Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 22/36] merge-recursive: when comparing files, don't include trees Elijah Newren
                   ` (16 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
file level either.  If there aren't any, then get the new name from
any directory renames.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
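Note for readers (not part of the commit): a minimal standalone sketch of
the order in which the file-level checks are made, assuming the collision
data has already been computed.  The sketch_* names and the reduced struct
are hypothetical; the real handle_path_level_conflicts() also prints the
conflict messages and frees new_path.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_verdict {
	SKETCH_USE_NEW_PATH,     /* no file-level conflict; apply the rename */
	SKETCH_ALREADY_REPORTED, /* conflict whose message was already printed */
	SKETCH_TARGET_OCCUPIED,  /* a file/dir already exists at the target */
	SKETCH_MULTIPLE_SOURCES  /* more than one path maps to the target */
};

/* Hypothetical reduction of struct collision_entry. */
struct sketch_collision {
	size_t nr_sources;            /* how many paths map to this target */
	unsigned reported_already:1;  /* set once a conflict has been reported */
};

/*
 * Decide whether an implicit directory rename may be used for one path.
 * target_exists says whether this side's tree already has the target.
 */
static enum sketch_verdict sketch_check_path(struct sketch_collision *c,
					     int target_exists)
{
	assert(c);

	if (c->reported_already)
		return SKETCH_ALREADY_REPORTED;
	if (target_exists) {
		c->reported_already = 1;
		return SKETCH_TARGET_OCCUPIED;
	}
	if (c->nr_sources > 1) {
		c->reported_already = 1;
		return SKETCH_MULTIPLE_SOURCES;
	}
	return SKETCH_USE_NEW_PATH;
}
```

Only SKETCH_USE_NEW_PATH corresponds to returning the new name; every
other verdict corresponds to returning NULL and marking the merge unclean.
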
 merge-recursive.c                   | 174 ++++++++++++++++++++++++++--
 strbuf.c                            |  16 +++
 strbuf.h                            |  16 +++
 t/t6043-merge-rename-directories.sh |   2 +-
 4 files changed, 199 insertions(+), 9 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 25ea6841fc..ecead3df4b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1520,6 +1520,91 @@ static void remove_hashmap_entries(struct hashmap *dir_renames,
 	string_list_clear(items_to_remove, 0);
 }
 
+/*
+ * See if there is a directory rename for path, and if there are any file
+ * level conflicts for the renamed location.  If there is a rename and
+ * there are no conflicts, return the new name.  Otherwise, return NULL.
+ */
+static char *handle_path_level_conflicts(struct merge_options *o,
+					 const char *path,
+					 struct dir_rename_entry *entry,
+					 struct hashmap *collisions,
+					 struct tree *tree)
+{
+	char *new_path = NULL;
+	struct collision_entry *collision_ent;
+	int clean = 1;
+	struct strbuf collision_paths = STRBUF_INIT;
+
+	/*
+	 * entry has the mapping of old directory name to new directory name
+	 * that we want to apply to path.
+	 */
+	new_path = apply_dir_rename(entry, path);
+
+	if (!new_path) {
+		/* Should only happen when entry->non_unique_new_dir is set */
+		if (!entry->non_unique_new_dir)
+			BUG("entry->non_unique_new_dir not set and !new_path");
+		output(o, 1, _("CONFLICT (directory rename split): "
+			       "Unclear where to place %s because directory "
+			       "%s was renamed to multiple other directories, "
+			       "with no destination getting a majority of the "
+			       "files."),
+		       path, entry->dir);
+		clean = 0;
+		return NULL;
+	}
+
+	/*
+	 * The caller needs to have ensured that it has pre-populated
+	 * collisions with all paths that map to new_path.  Do a quick check
+	 * to ensure that's the case.
+	 */
+	collision_ent = collision_find_entry(collisions, new_path);
+	if (collision_ent == NULL)
+		BUG("collision_ent is NULL");
+
+	/*
+	 * Check for one-sided add/add/.../add conflicts, i.e.
+	 * where implicit renames from the other side doing
+	 * directory rename(s) can affect this side of history
+	 * to put multiple paths into the same location.  Warn
+	 * and bail on directory renames for such paths.
+	 */
+	if (collision_ent->reported_already) {
+		clean = 0;
+	} else if (tree_has_path(tree, new_path)) {
+		collision_ent->reported_already = 1;
+		strbuf_add_separated_string_list(&collision_paths, ", ",
+						 &collision_ent->source_files);
+		output(o, 1, _("CONFLICT (implicit dir rename): Existing "
+			       "file/dir at %s in the way of implicit "
+			       "directory rename(s) putting the following "
+			       "path(s) there: %s."),
+		       new_path, collision_paths.buf);
+		clean = 0;
+	} else if (collision_ent->source_files.nr > 1) {
+		collision_ent->reported_already = 1;
+		strbuf_add_separated_string_list(&collision_paths, ", ",
+						 &collision_ent->source_files);
+		output(o, 1, _("CONFLICT (implicit dir rename): Cannot map "
+			       "more than one path to %s; implicit directory "
+			       "renames tried to put these paths there: %s"),
+		       new_path, collision_paths.buf);
+		clean = 0;
+	}
+
+	/* Free memory we no longer need */
+	strbuf_release(&collision_paths);
+	if (!clean && new_path) {
+		free(new_path);
+		return NULL;
+	}
+
+	return new_path;
+}
+
 /*
  * There are a couple things we want to do at the directory level:
  *   1. Check for both sides renaming to the same thing, in order to avoid
@@ -1799,6 +1884,59 @@ static void compute_collisions(struct hashmap *collisions,
 	}
 }
 
+static char *check_for_directory_rename(struct merge_options *o,
+					const char *path,
+					struct tree *tree,
+					struct hashmap *dir_renames,
+					struct hashmap *dir_rename_exclusions,
+					struct hashmap *collisions,
+					int *clean_merge)
+{
+	char *new_path = NULL;
+	struct dir_rename_entry *entry = check_dir_renamed(path, dir_renames);
+	struct dir_rename_entry *oentry = NULL;
+
+	if (!entry)
+		return new_path;
+
+	/*
+	 * This next part is a little weird.  We do not want to do an
+	 * implicit rename into a directory we renamed on our side, because
+	 * that will result in a spurious rename/rename(1to2) conflict.  An
+	 * example:
+	 *   Base commit: dumbdir/afile, otherdir/bfile
+	 *   Side 1:      smrtdir/afile, otherdir/bfile
+	 *   Side 2:      dumbdir/afile, dumbdir/bfile
+	 * Here, while working on Side 1, we could notice that otherdir was
+	 * renamed/merged to dumbdir, and change the diff_filepair for
+	 * otherdir/bfile into a rename into dumbdir/bfile.  However, Side
+	 * 2 will notice the rename from dumbdir to smrtdir, and do the
+	 * transitive rename to move it from dumbdir/bfile to
+	 * smrtdir/bfile.  That gives us bfile in dumbdir vs being in
+	 * smrtdir, a rename/rename(1to2) conflict.  We really just want
+	 * the file to end up in smrtdir.  And the way to achieve that is
+	 * to not let Side 1 do the rename to dumbdir, since we know that is
+	 * the source of one of our directory renames.
+	 *
+	 * That's why oentry and dir_rename_exclusions are here.
+	 *
+	 * As it turns out, this also prevents N-way transitive rename
+	 * confusion; see testcases 9c and 9d of t6043.
+	 */
+	oentry = dir_rename_find_entry(dir_rename_exclusions, entry->new_dir.buf);
+	if (oentry) {
+		output(o, 1, _("WARNING: Avoiding applying %s -> %s rename "
+			       "to %s, because %s itself was renamed."),
+		       entry->dir, entry->new_dir.buf, path, entry->new_dir.buf);
+	} else {
+		new_path = handle_path_level_conflicts(o, path, entry,
+						       collisions, tree);
+		*clean_merge &= (new_path != NULL);
+	}
+
+	return new_path;
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1809,11 +1947,13 @@ static void compute_collisions(struct hashmap *collisions,
 static struct string_list *get_renames(struct merge_options *o,
 				       struct diff_queue_struct *pairs,
 				       struct hashmap *dir_renames,
+				       struct hashmap *dir_rename_exclusions,
 				       struct tree *tree,
 				       struct tree *o_tree,
 				       struct tree *a_tree,
 				       struct tree *b_tree,
-				       struct string_list *entries)
+				       struct string_list *entries,
+				       int *clean_merge)
 {
 	int i;
 	struct hashmap collisions;
@@ -1828,11 +1968,22 @@ static struct string_list *get_renames(struct merge_options *o,
 		struct string_list_item *item;
 		struct rename *re;
 		struct diff_filepair *pair = pairs->queue[i];
+		char *new_path; /* non-NULL only with directory renames */
 
-		if (pair->status != 'R') {
+		if (pair->status == 'D') {
 			diff_free_filepair(pair);
 			continue;
 		}
+		new_path = check_for_directory_rename(o, pair->two->path, tree,
+						      dir_renames,
+						      dir_rename_exclusions,
+						      &collisions,
+						      clean_merge);
+		if (pair->status != 'R' && !new_path) {
+			diff_free_filepair(pair);
+			continue;
+		}
+
 		re = xmalloc(sizeof(*re));
 		re->processed = 0;
 		re->pair = pair;
@@ -2150,7 +2301,7 @@ static int handle_renames(struct merge_options *o,
 {
 	struct diff_queue_struct *head_pairs, *merge_pairs;
 	struct hashmap *dir_re_head, *dir_re_merge;
-	int clean;
+	int clean = 1;
 
 	ri->head_renames = NULL;
 	ri->merge_renames = NULL;
@@ -2169,13 +2320,20 @@ static int handle_renames(struct merge_options *o,
 					 dir_re_merge, merge);
 
 	ri->head_renames  = get_renames(o, head_pairs,
-					dir_re_merge, head,
-					common, head, merge, entries);
+					dir_re_merge, dir_re_head, head,
+					common, head, merge, entries,
+					&clean);
+	if (clean < 0)
+		goto cleanup;
 	ri->merge_renames = get_renames(o, merge_pairs,
-					dir_re_head, merge,
-					 common, head, merge, entries);
-	clean = process_renames(o, ri->head_renames, ri->merge_renames);
+					dir_re_head, dir_re_merge, merge,
+					common, head, merge, entries,
+					&clean);
+	if (clean < 0)
+		goto cleanup;
+	clean &= process_renames(o, ri->head_renames, ri->merge_renames);
 
+cleanup:
 	/*
 	 * Some cleanup is deferred until cleanup_renames() because the
 	 * data structures are still needed and referenced in
diff --git a/strbuf.c b/strbuf.c
index 43a840c67b..83d05024e6 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "refs.h"
+#include "string-list.h"
 #include "utf8.h"
 
 int starts_with(const char *str, const char *prefix)
@@ -171,6 +172,21 @@ struct strbuf **strbuf_split_buf(const char *str, size_t slen,
 	return ret;
 }
 
+void strbuf_add_separated_string_list(struct strbuf *str,
+				      const char *sep,
+				      struct string_list *slist)
+{
+	struct string_list_item *item;
+	int sep_needed = 0;
+
+	for_each_string_list_item(item, slist) {
+		if (sep_needed)
+			strbuf_addstr(str, sep);
+		strbuf_addstr(str, item->string);
+		sep_needed = 1;
+	}
+}
+
 void strbuf_list_free(struct strbuf **sbs)
 {
 	struct strbuf **s = sbs;
diff --git a/strbuf.h b/strbuf.h
index 4efa80c1de..c4de5e4588 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -1,6 +1,8 @@
 #ifndef STRBUF_H
 #define STRBUF_H
 
+struct string_list;
+
 /**
  * strbuf's are meant to be used with all the usual C string and memory
  * APIs. Given that the length of the buffer is known, it's often better to
@@ -537,6 +539,20 @@ static inline struct strbuf **strbuf_split(const struct strbuf *sb,
 	return strbuf_split_max(sb, terminator, 0);
 }
 
+/*
+ * Adds all strings of a string list to the strbuf, separated by the given
+ * separator.  For example, if sep is
+ *   ', '
+ * and slist contains
+ *   ['element1', 'element2', ..., 'elementN'],
+ * then write:
+ *   'element1, element2, ..., elementN'
+ * to str.  If only one element, just write "element1" to str.
+ */
+extern void strbuf_add_separated_string_list(struct strbuf *str,
+					     const char *sep,
+					     struct string_list *slist);
+
 /**
  * Free a NULL-terminated list of strbufs (for example, the return
  * values of the strbuf_split*() functions).
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 8ea9ec49bc..b24562b849 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -489,7 +489,7 @@ test_expect_success '2a-setup: Directory split into two on one side, with equal
 	)
 '
 
-test_expect_failure '2a-check: Directory split into two on one side, with equal numbers of paths' '
+test_expect_success '2a-check: Directory split into two on one side, with equal numbers of paths' '
 	(
 		cd 2a &&
 
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 22/36] merge-recursive: when comparing files, don't include trees
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (20 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 21/36] merge-recursive: check for file level conflicts then get new name Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 23/36] merge-recursive: apply necessary modifications for directory renames Elijah Newren
                   ` (15 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

get_renames() would look up stage data that already existed (populated
in get_unmerged(), taken from whatever unpack_trees() created), and if
it didn't exist, would call insert_stage_data() to create the necessary
entry for the given file.  The insert_stage_data() fallback becomes
much more important for directory rename detection, because that creates
a mechanism to have a file in the resulting merge that didn't exist on
either side of history.  However, insert_stage_data(), due to calling
get_tree_entry(), loaded up trees as readily as files.  We aren't
interested in comparing trees to files; the D/F conflict handling is
done elsewhere.  This code is just concerned with what entries existed
for a given path on the different sides of the merge, so create a
get_tree_entry_if_blob() helper function and use it.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
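Note for readers (not part of the commit): a minimal standalone sketch of
the "ignore trees" rule, assuming POSIX S_ISDIR() from <sys/stat.h>.  The
sketch_* names and the 20-byte id array are hypothetical stand-ins for
git's struct object_id and null_oid.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical stand-in for one stage slot (mode plus object id). */
struct sketch_entry {
	unsigned int mode;     /* e.g. 0100644 for a blob, 040000 for a tree */
	unsigned char id[20];  /* all zeros means "nothing at this path" */
};

/*
 * Blank out the entry when it names a directory, so that callers
 * comparing files see "no file here" instead of a tree.
 */
static void sketch_keep_only_blob(struct sketch_entry *e)
{
	assert(e);

	if (S_ISDIR(e->mode)) {
		memset(e->id, 0, sizeof(e->id));
		e->mode = 0;
	}
}
```

A tree entry (mode 040000) is reset to the empty state, while a regular
file entry (mode 0100644) is left untouched.
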
 merge-recursive.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index ecead3df4b..d569e3e893 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -421,6 +421,21 @@ static void get_files_dirs(struct merge_options *o, struct tree *tree)
 	read_tree_recursive(tree, "", 0, 0, &match_all, save_files_dirs, o);
 }
 
+static int get_tree_entry_if_blob(const struct object_id *tree,
+				  const char *path,
+				  struct object_id *hashy,
+				  unsigned int *mode_o)
+{
+	int ret;
+
+	ret = get_tree_entry(tree, path, hashy, mode_o);
+	if (S_ISDIR(*mode_o)) {
+		oidcpy(hashy, &null_oid);
+		*mode_o = 0;
+	}
+	return ret;
+}
+
 /*
  * Returns an index_entry instance which doesn't have to correspond to
  * a real cache entry in Git's index.
@@ -431,12 +446,12 @@ static struct stage_data *insert_stage_data(const char *path,
 {
 	struct string_list_item *item;
 	struct stage_data *e = xcalloc(1, sizeof(struct stage_data));
-	get_tree_entry(&o->object.oid, path,
-			&e->stages[1].oid, &e->stages[1].mode);
-	get_tree_entry(&a->object.oid, path,
-			&e->stages[2].oid, &e->stages[2].mode);
-	get_tree_entry(&b->object.oid, path,
-			&e->stages[3].oid, &e->stages[3].mode);
+	get_tree_entry_if_blob(&o->object.oid, path,
+			       &e->stages[1].oid, &e->stages[1].mode);
+	get_tree_entry_if_blob(&a->object.oid, path,
+			       &e->stages[2].oid, &e->stages[2].mode);
+	get_tree_entry_if_blob(&b->object.oid, path,
+			       &e->stages[3].oid, &e->stages[3].mode);
 	item = string_list_insert(entries, path);
 	item->util = e;
 	return e;
-- 
2.17.0.290.ge988e9ce2a


* [PATCH v10 23/36] merge-recursive: apply necessary modifications for directory renames
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (21 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 22/36] merge-recursive: when comparing files, don't include trees Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 24/36] merge-recursive: avoid clobbering untracked files with " Elijah Newren
                   ` (14 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

This commit hooks together all the directory rename logic by making the
necessary changes to the rename struct, its dst_entry, and the
diff_filepair under consideration.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 merge-recursive.c                   | 187 +++++++++++++++++++++++++++-
 t/t6043-merge-rename-directories.sh |  50 ++++----
 2 files changed, 211 insertions(+), 26 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index d569e3e893..ec0bbcc3f4 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -180,6 +180,7 @@ static int oid_eq(const struct object_id *a, const struct object_id *b)
 
 enum rename_type {
 	RENAME_NORMAL = 0,
+	RENAME_DIR,
 	RENAME_DELETE,
 	RENAME_ONE_FILE_TO_ONE,
 	RENAME_ONE_FILE_TO_TWO,
@@ -610,6 +611,7 @@ struct rename {
 	 */
 	struct stage_data *src_entry;
 	struct stage_data *dst_entry;
+	unsigned add_turned_into_rename:1;
 	unsigned processed:1;
 };
 
@@ -644,6 +646,27 @@ static int update_stages(struct merge_options *opt, const char *path,
 	return 0;
 }
 
+static int update_stages_for_stage_data(struct merge_options *opt,
+					const char *path,
+					const struct stage_data *stage_data)
+{
+	struct diff_filespec o, a, b;
+
+	o.mode = stage_data->stages[1].mode;
+	oidcpy(&o.oid, &stage_data->stages[1].oid);
+
+	a.mode = stage_data->stages[2].mode;
+	oidcpy(&a.oid, &stage_data->stages[2].oid);
+
+	b.mode = stage_data->stages[3].mode;
+	oidcpy(&b.oid, &stage_data->stages[3].oid);
+
+	return update_stages(opt, path,
+			     is_null_oid(&o.oid) ? NULL : &o,
+			     is_null_oid(&a.oid) ? NULL : &a,
+			     is_null_oid(&b.oid) ? NULL : &b);
+}
+
 static void update_entry(struct stage_data *entry,
 			 struct diff_filespec *o,
 			 struct diff_filespec *a,
@@ -1121,6 +1144,18 @@ static int merge_file_one(struct merge_options *o,
 	return merge_file_1(o, &one, &a, &b, branch1, branch2, mfi);
 }
 
+static int conflict_rename_dir(struct merge_options *o,
+			       struct diff_filepair *pair,
+			       const char *rename_branch,
+			       const char *other_branch)
+{
+	const struct diff_filespec *dest = pair->two;
+
+	if (update_file(o, 1, &dest->oid, dest->mode, dest->path))
+		return -1;
+	return 0;
+}
+
 static int handle_change_delete(struct merge_options *o,
 				 const char *path, const char *old_path,
 				 const struct object_id *o_oid, int o_mode,
@@ -1390,6 +1425,24 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 		if (!ret)
 			ret = update_file(o, 0, &mfi_c2.oid, mfi_c2.mode,
 					  new_path2);
+		/*
+		 * unpack_trees() actually populates the index for us for
+		 * "normal" rename/rename(2to1) situations so that the
+		 * correct entries are at the higher stages, which would
+		 * make the call below to update_stages_for_stage_data
+		 * unnecessary.  However, if either of the renames came
+		 * from a directory rename, then unpack_trees() will not
+		 * have gotten the right data loaded into the index, so we
+		 * need to do so now.  (While it'd be tempting to move this
+		 * call to update_stages_for_stage_data() to
+		 * apply_directory_rename_modifications(), that would break
+		 * our intermediate calls to would_lose_untracked() since
+		 * those rely on the current in-memory index.  See also the
+		 * big "NOTE" in update_stages()).
+		 */
+		if (update_stages_for_stage_data(o, path, ci->dst_entry1))
+			ret = -1;
+
 		free(new_path2);
 		free(new_path1);
 	}
@@ -1952,6 +2005,111 @@ static char *check_for_directory_rename(struct merge_options *o,
 	return new_path;
 }
 
+static void apply_directory_rename_modifications(struct merge_options *o,
+						 struct diff_filepair *pair,
+						 char *new_path,
+						 struct rename *re,
+						 struct tree *tree,
+						 struct tree *o_tree,
+						 struct tree *a_tree,
+						 struct tree *b_tree,
+						 struct string_list *entries,
+						 int *clean)
+{
+	struct string_list_item *item;
+	int stage = (tree == a_tree ? 2 : 3);
+
+	/*
+	 * In all cases where we can do directory rename detection,
+	 * unpack_trees() will have read pair->two->path into the
+	 * index and the working copy.  We need to remove it so that
+	 * we can instead place it at new_path.  It is guaranteed to
+	 * not be untracked (unpack_trees() would have errored out
+	 * saying the file would have been overwritten), but it might
+	 * be dirty, though.
+	 */
+	remove_file(o, 1, pair->two->path, 0 /* no_wd */);
+
+	/* Find or create a new re->dst_entry */
+	item = string_list_lookup(entries, new_path);
+	if (item) {
+		/*
+		 * Since we're renaming on this side of history, and it's
+		 * due to a directory rename on the other side of history
+		 * (which we only allow when the directory in question no
+		 * longer exists on the other side of history), the
+		 * original entry for re->dst_entry is no longer
+		 * necessary...
+		 */
+		re->dst_entry->processed = 1;
+
+		/*
+		 * ...because we'll be using this new one.
+		 */
+		re->dst_entry = item->util;
+	} else {
+		/*
+		 * re->dst_entry is for the before-dir-rename path, and we
+		 * need it to hold information for the after-dir-rename
+		 * path.  Before creating a new entry, we need to mark the
+		 * old one as unnecessary (...unless it is shared by
+		 * src_entry, i.e. this didn't use to be a rename, in which
+		 * case we can just allow the normal processing to happen
+		 * for it).
+		 */
+		if (pair->status == 'R')
+			re->dst_entry->processed = 1;
+
+		re->dst_entry = insert_stage_data(new_path,
+						  o_tree, a_tree, b_tree,
+						  entries);
+		item = string_list_insert(entries, new_path);
+		item->util = re->dst_entry;
+	}
+
+	/*
+	 * Update the stage_data with the information about the path we are
+	 * moving into place.  That slot will be empty and available for us
+	 * to write to because of the collision checks in
+	 * handle_path_level_conflicts().  In other words,
+	 * re->dst_entry->stages[stage].oid will be the null_oid, so it's
+	 * open for us to write to.
+	 *
+	 * It may be tempting to actually update the index at this point as
+	 * well, using update_stages_for_stage_data(), but as per the big
+	 * "NOTE" in update_stages(), doing so will modify the current
+	 * in-memory index which will break calls to would_lose_untracked()
+	 * that we need to make.  Instead, we need to just make sure that
+	 * the various conflict_rename_*() functions update the index
+	 * explicitly rather than relying on unpack_trees() to have done it.
+	 */
+	get_tree_entry(&tree->object.oid,
+		       pair->two->path,
+		       &re->dst_entry->stages[stage].oid,
+		       &re->dst_entry->stages[stage].mode);
+
+	/* Update pair status */
+	if (pair->status == 'A') {
+		/*
+		 * Recording rename information for this add makes it look
+		 * like a rename/delete conflict.  Make sure we can
+		 * correctly handle this as an add that was moved to a new
+		 * directory instead of reporting a rename/delete conflict.
+		 */
+		re->add_turned_into_rename = 1;
+	}
+	/*
+	 * We don't actually look at pair->status again, but it seems
+	 * pedagogically correct to adjust it.
+	 */
+	pair->status = 'R';
+
+	/*
+	 * Finally, record the new location.
+	 */
+	pair->two->path = new_path;
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -2001,6 +2159,7 @@ static struct string_list *get_renames(struct merge_options *o,
 
 		re = xmalloc(sizeof(*re));
 		re->processed = 0;
+		re->add_turned_into_rename = 0;
 		re->pair = pair;
 		item = string_list_lookup(entries, re->pair->one->path);
 		if (!item)
@@ -2017,6 +2176,12 @@ static struct string_list *get_renames(struct merge_options *o,
 			re->dst_entry = item->util;
 		item = string_list_insert(renames, pair->one->path);
 		item->util = re;
+		if (new_path)
+			apply_directory_rename_modifications(o, pair, new_path,
+							     re, tree, o_tree,
+							     a_tree, b_tree,
+							     entries,
+							     clean_merge);
 	}
 
 	hashmap_iter_init(&collisions, &iter);
@@ -2186,7 +2351,19 @@ static int process_renames(struct merge_options *o,
 			dst_other.mode = ren1->dst_entry->stages[other_stage].mode;
 			try_merge = 0;
 
-			if (oid_eq(&src_other.oid, &null_oid)) {
+			if (oid_eq(&src_other.oid, &null_oid) &&
+			    ren1->add_turned_into_rename) {
+				setup_rename_conflict_info(RENAME_DIR,
+							   ren1->pair,
+							   NULL,
+							   branch1,
+							   branch2,
+							   ren1->dst_entry,
+							   NULL,
+							   o,
+							   NULL,
+							   NULL);
+			} else if (oid_eq(&src_other.oid, &null_oid)) {
 				setup_rename_conflict_info(RENAME_DELETE,
 							   ren1->pair,
 							   NULL,
@@ -2603,6 +2780,14 @@ static int process_entry(struct merge_options *o,
 						    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
 						    conflict_info);
 			break;
+		case RENAME_DIR:
+			clean_merge = 1;
+			if (conflict_rename_dir(o,
+						conflict_info->pair1,
+						conflict_info->branch1,
+						conflict_info->branch2))
+				clean_merge = -1;
+			break;
 		case RENAME_DELETE:
 			clean_merge = 0;
 			if (conflict_rename_delete(o,
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index b24562b849..3525c54bb4 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -69,7 +69,7 @@ test_expect_success '1a-setup: Simple directory rename detection' '
 	)
 '
 
-test_expect_failure '1a-check: Simple directory rename detection' '
+test_expect_success '1a-check: Simple directory rename detection' '
 	(
 		cd 1a &&
 
@@ -136,7 +136,7 @@ test_expect_success '1b-setup: Merge a directory with another' '
 	)
 '
 
-test_expect_failure '1b-check: Merge a directory with another' '
+test_expect_success '1b-check: Merge a directory with another' '
 	(
 		cd 1b &&
 
@@ -194,7 +194,7 @@ test_expect_success '1c-setup: Transitive renaming' '
 	)
 '
 
-test_expect_failure '1c-check: Transitive renaming' '
+test_expect_success '1c-check: Transitive renaming' '
 	(
 		cd 1c &&
 
@@ -263,7 +263,7 @@ test_expect_success '1d-setup: Directory renames cause a rename/rename(2to1) con
 	)
 '
 
-test_expect_failure '1d-check: Directory renames cause a rename/rename(2to1) conflict' '
+test_expect_success '1d-check: Directory renames cause a rename/rename(2to1) conflict' '
 	(
 		cd 1d &&
 
@@ -342,7 +342,7 @@ test_expect_success '1e-setup: Renamed directory, with all files being renamed t
 	)
 '
 
-test_expect_failure '1e-check: Renamed directory, with all files being renamed too' '
+test_expect_success '1e-check: Renamed directory, with all files being renamed too' '
 	(
 		cd 1e &&
 
@@ -408,7 +408,7 @@ test_expect_success '1f-setup: Split a directory into two other directories' '
 	)
 '
 
-test_expect_failure '1f-check: Split a directory into two other directories' '
+test_expect_success '1f-check: Split a directory into two other directories' '
 	(
 		cd 1f &&
 
@@ -907,7 +907,7 @@ test_expect_success '5a-setup: Merge directories, other side adds files to origi
 	)
 '
 
-test_expect_failure '5a-check: Merge directories, other side adds files to original and target' '
+test_expect_success '5a-check: Merge directories, other side adds files to original and target' '
 	(
 		cd 5a &&
 
@@ -981,7 +981,7 @@ test_expect_success '5b-setup: Rename/delete in order to get add/add/add conflic
 	)
 '
 
-test_expect_failure '5b-check: Rename/delete in order to get add/add/add conflict' '
+test_expect_success '5b-check: Rename/delete in order to get add/add/add conflict' '
 	(
 		cd 5b &&
 
@@ -1061,7 +1061,7 @@ test_expect_success '5c-setup: Transitive rename would cause rename/rename/renam
 	)
 '
 
-test_expect_failure '5c-check: Transitive rename would cause rename/rename/rename/add/add/add' '
+test_expect_success '5c-check: Transitive rename would cause rename/rename/rename/add/add/add' '
 	(
 		cd 5c &&
 
@@ -1145,7 +1145,7 @@ test_expect_success '5d-setup: Directory/file/file conflict due to directory ren
 	)
 '
 
-test_expect_failure '5d-check: Directory/file/file conflict due to directory rename' '
+test_expect_success '5d-check: Directory/file/file conflict due to directory rename' '
 	(
 		cd 5d &&
 
@@ -1583,7 +1583,7 @@ test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) PLUS
 	)
 '
 
-test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
+test_expect_success '7a-check: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
 	(
 		cd 7a &&
 
@@ -1655,7 +1655,7 @@ test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive r
 	)
 '
 
-test_expect_failure '7b-check: rename/rename(2to1), but only due to transitive rename' '
+test_expect_success '7b-check: rename/rename(2to1), but only due to transitive rename' '
 	(
 		cd 7b &&
 
@@ -1731,7 +1731,7 @@ test_expect_success '7c-setup: rename/rename(1to...2or3); transitive rename may
 	)
 '
 
-test_expect_failure '7c-check: rename/rename(1to...2or3); transitive rename may add complexity' '
+test_expect_success '7c-check: rename/rename(1to...2or3); transitive rename may add complexity' '
 	(
 		cd 7c &&
 
@@ -1795,7 +1795,7 @@ test_expect_success '7d-setup: transitive rename involved in rename/delete; how
 	)
 '
 
-test_expect_failure '7d-check: transitive rename involved in rename/delete; how is it reported?' '
+test_expect_success '7d-check: transitive rename involved in rename/delete; how is it reported?' '
 	(
 		cd 7d &&
 
@@ -1885,7 +1885,7 @@ test_expect_success '7e-setup: transitive rename in rename/delete AND dirs in th
 	)
 '
 
-test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in the way' '
+test_expect_success '7e-check: transitive rename in rename/delete AND dirs in the way' '
 	(
 		cd 7e &&
 
@@ -1976,7 +1976,7 @@ test_expect_success '8a-setup: Dual-directory rename, one into the others way' '
 	)
 '
 
-test_expect_failure '8a-check: Dual-directory rename, one into the others way' '
+test_expect_success '8a-check: Dual-directory rename, one into the others way' '
 	(
 		cd 8a &&
 
@@ -2121,7 +2121,7 @@ test_expect_success '8c-setup: rename+modify/delete' '
 	)
 '
 
-test_expect_failure '8c-check: rename+modify/delete' '
+test_expect_success '8c-check: rename+modify/delete' '
 	(
 		cd 8c &&
 
@@ -2208,7 +2208,7 @@ test_expect_success '8d-setup: rename/delete...or not?' '
 	)
 '
 
-test_expect_failure '8d-check: rename/delete...or not?' '
+test_expect_success '8d-check: rename/delete...or not?' '
 	(
 		cd 8d &&
 
@@ -2283,7 +2283,7 @@ test_expect_success '8e-setup: Both sides rename, one side adds to original dire
 	)
 '
 
-test_expect_failure '8e-check: Both sides rename, one side adds to original directory' '
+test_expect_success '8e-check: Both sides rename, one side adds to original directory' '
 	(
 		cd 8e &&
 
@@ -2370,7 +2370,7 @@ test_expect_success '9a-setup: Inner renamed directory within outer renamed dire
 	)
 '
 
-test_expect_failure '9a-check: Inner renamed directory within outer renamed directory' '
+test_expect_success '9a-check: Inner renamed directory within outer renamed directory' '
 	(
 		cd 9a &&
 
@@ -2440,7 +2440,7 @@ test_expect_success '9b-setup: Transitive rename with content merge' '
 	)
 '
 
-test_expect_failure '9b-check: Transitive rename with content merge' '
+test_expect_success '9b-check: Transitive rename with content merge' '
 	(
 		cd 9b &&
 
@@ -2530,7 +2530,7 @@ test_expect_success '9c-setup: Doubly transitive rename?' '
 	)
 '
 
-test_expect_failure '9c-check: Doubly transitive rename?' '
+test_expect_success '9c-check: Doubly transitive rename?' '
 	(
 		cd 9c &&
 
@@ -2618,7 +2618,7 @@ test_expect_success '9d-setup: N-way transitive rename?' '
 	)
 '
 
-test_expect_failure '9d-check: N-way transitive rename?' '
+test_expect_success '9d-check: N-way transitive rename?' '
 	(
 		cd 9d &&
 
@@ -2700,7 +2700,7 @@ test_expect_success '9e-setup: N-to-1 whammo' '
 	)
 '
 
-test_expect_failure C_LOCALE_OUTPUT '9e-check: N-to-1 whammo' '
+test_expect_success C_LOCALE_OUTPUT '9e-check: N-to-1 whammo' '
 	(
 		cd 9e &&
 
@@ -2778,7 +2778,7 @@ test_expect_success '9f-setup: Renamed directory that only contained immediate s
 	)
 '
 
-test_expect_failure '9f-check: Renamed directory that only contained immediate subdirs' '
+test_expect_success '9f-check: Renamed directory that only contained immediate subdirs' '
 	(
 		cd 9f &&
 
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 24/36] merge-recursive: avoid clobbering untracked files with directory renames
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (22 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 23/36] merge-recursive: apply necessary modifications for directory renames Elijah Newren
@ 2018-04-19 17:58 ` " Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 25/36] merge-recursive: fix overwriting dirty files involved in renames Elijah Newren
                   ` (13 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
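For illustration only, the guard this patch adds in several places can be
modeled with a toy sketch. Nothing below is code from merge-recursive.c:
struct toy_path and the toy_* functions are hypothetical stand-ins for the
index and working tree, and the "name~branch" form is an assumption that
imitates the kind of alternate name unique_path() is understood to produce.
The rule it models is the one visible in the diff: a path is at risk when it
was not tracked but exists on disk, and in that case (outside of recursive
merges, i.e. when call_depth is 0) the content goes to an alternate path
rather than over the user's file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for what the index and worktree know about a path. */
struct toy_path {
	const char *name;
	bool tracked;   /* path was in the index before the merge */
	bool on_disk;   /* a file exists at this path in the worktree */
};

/* Models the rule "!was_tracked(path) && file_exists(path)". */
static bool toy_would_lose_untracked(const struct toy_path *p)
{
	return !p->tracked && p->on_disk;
}

/*
 * Choose the worktree name that merged content should be written to.
 * The "name~branch" alternate is an assumed naming scheme for this sketch.
 * The chosen name is written into buf, which is also returned.
 */
static const char *toy_worktree_target(int call_depth,
				       const struct toy_path *p,
				       const char *branch,
				       char *buf, size_t len)
{
	if (!call_depth && toy_would_lose_untracked(p))
		snprintf(buf, len, "%s~%s", p->name, branch);
	else
		snprintf(buf, len, "%s", p->name);
	return buf;
}
```

With an untracked file sitting at "y/d" and a rename coming from branch "B",
the sketch picks "y/d~B"; for a tracked path it keeps "y/d".
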
 merge-recursive.c                   | 42 +++++++++++++++++++++++++++--
 t/t6043-merge-rename-directories.sh |  6 ++---
 2 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index ec0bbcc3f4..c1c4faf61e 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1151,6 +1151,26 @@ static int conflict_rename_dir(struct merge_options *o,
 {
 	const struct diff_filespec *dest = pair->two;
 
+	if (!o->call_depth && would_lose_untracked(dest->path)) {
+		char *alt_path = unique_path(o, dest->path, rename_branch);
+
+		output(o, 1, _("Error: Refusing to lose untracked file at %s; "
+			       "writing to %s instead."),
+		       dest->path, alt_path);
+		/*
+		 * Write the file in worktree at alt_path, but not in the
+		 * index.  Instead, write to dest->path for the index but
+		 * only at the higher appropriate stage.
+		 */
+		if (update_file(o, 0, &dest->oid, dest->mode, alt_path))
+			return -1;
+		free(alt_path);
+		return update_stages(o, dest->path, NULL,
+				     rename_branch == o->branch1 ? dest : NULL,
+				     rename_branch == o->branch1 ? NULL : dest);
+	}
+
+	/* Update dest->path both in index and in worktree */
 	if (update_file(o, 1, &dest->oid, dest->mode, dest->path))
 		return -1;
 	return 0;
@@ -1169,7 +1189,8 @@ static int handle_change_delete(struct merge_options *o,
 	const char *update_path = path;
 	int ret = 0;
 
-	if (dir_in_way(path, !o->call_depth, 0)) {
+	if (dir_in_way(path, !o->call_depth, 0) ||
+	    (!o->call_depth && would_lose_untracked(path))) {
 		update_path = alt_path = unique_path(o, path, change_branch);
 	}
 
@@ -1295,6 +1316,12 @@ static int handle_file(struct merge_options *o,
 			dst_name = unique_path(o, rename->path, cur_branch);
 			output(o, 1, _("%s is a directory in %s adding as %s instead"),
 			       rename->path, other_branch, dst_name);
+		} else if (!o->call_depth &&
+			   would_lose_untracked(rename->path)) {
+			dst_name = unique_path(o, rename->path, cur_branch);
+			output(o, 1, _("Refusing to lose untracked file at %s; "
+				       "adding as %s instead"),
+			       rename->path, dst_name);
 		}
 	}
 	if ((ret = update_file(o, 0, &rename->oid, rename->mode, dst_name)))
@@ -1420,7 +1447,18 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 		char *new_path2 = unique_path(o, path, ci->branch2);
 		output(o, 1, _("Renaming %s to %s and %s to %s instead"),
 		       a->path, new_path1, b->path, new_path2);
-		remove_file(o, 0, path, 0);
+		if (would_lose_untracked(path))
+			/*
+			 * Only way we get here is if both renames were from
+			 * a directory rename AND user had an untracked file
+			 * at the location where both files end up after the
+			 * two directory renames.  See testcase 10d of t6043.
+			 */
+			output(o, 1, _("Refusing to lose untracked file at "
+				       "%s, even though it's in the way."),
+			       path);
+		else
+			remove_file(o, 0, path, 0);
 		ret = update_file(o, 0, &mfi_c1.oid, mfi_c1.mode, new_path1);
 		if (!ret)
 			ret = update_file(o, 0, &mfi_c2.oid, mfi_c2.mode,
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 3525c54bb4..0b60eb8053 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2992,7 +2992,7 @@ test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
 	)
 '
 
-test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+test_expect_success '10b-check: Overwrite untracked with dir rename + delete' '
 	(
 		cd 10b &&
 
@@ -3070,7 +3070,7 @@ test_expect_success '10c-setup: Overwrite untracked with dir rename/rename(1to2)
 	)
 '
 
-test_expect_failure '10c-check: Overwrite untracked with dir rename/rename(1to2)' '
+test_expect_success '10c-check: Overwrite untracked with dir rename/rename(1to2)' '
 	(
 		cd 10c &&
 
@@ -3145,7 +3145,7 @@ test_expect_success '10d-setup: Delete untracked with dir rename/rename(2to1)' '
 	)
 '
 
-test_expect_failure '10d-check: Delete untracked with dir rename/rename(2to1)' '
+test_expect_success '10d-check: Delete untracked with dir rename/rename(2to1)' '
 	(
 		cd 10d &&
 
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 25/36] merge-recursive: fix overwriting dirty files involved in renames
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (23 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 24/36] merge-recursive: avoid clobbering untracked files with " Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 20:48   ` Martin Ågren
  2018-04-19 17:58 ` [PATCH v10 26/36] merge-recursive: fix remaining directory rename + dirty overwrite cases Elijah Newren
                   ` (12 subsequent siblings)
  37 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

This fixes an issue that existed before my directory rename detection
patches that affects both normal renames and renames implied by
directory rename detection.  Additional codepaths that only affect
overwriting of dirty files that are involved in directory rename
detection will be added in a subsequent commit.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
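For illustration only, the dirty-file decision in this patch can be modeled
with a toy sketch. It is not the code from the diff: struct toy_entry and the
toy_* functions are hypothetical, and the matches_index flag stands in for
"verify_uptodate() returned 0". It shows two things: a path counts as dirty
only in a non-recursive merge, when it was tracked, has cached stat data, and
no longer matches the index; and an otherwise clean content merge is reported
as not clean when such a dirty file is in the way.

```c
#include <stdbool.h>

/* Hypothetical summary of what the index knows about one path. */
struct toy_entry {
	bool tracked;          /* path was in the index before the merge */
	long index_mtime_sec;  /* cached mtime; 0 means no stat data recorded */
	bool matches_index;    /* stand-in for verify_uptodate() == 0 */
};

/* Models the conditions checked by was_dirty() in the patch. */
static bool toy_was_dirty(int call_depth, const struct toy_entry *e)
{
	if (call_depth || !e->tracked)
		return false;
	return e->index_mtime_sec > 0 && !e->matches_index;
}

/*
 * Models the tail of conflict_rename_normal(): a clean content merge
 * (value > 0) is demoted to "not clean" (0) when a dirty file is in the
 * way; errors (negative values) pass through unchanged.
 */
static int toy_rename_normal_result(int content_clean, bool file_in_the_way)
{
	if (content_clean > 0 && file_in_the_way)
		return 0;
	return content_clean;
}
```

In the real code the merged content is still written out; it simply lands at
an alternate path so the user's unstaged changes survive.
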
 merge-recursive.c                   | 85 ++++++++++++++++++++++-------
 merge-recursive.h                   |  2 +
 t/t3501-revert-cherry-pick.sh       |  2 +-
 t/t6043-merge-rename-directories.sh |  2 +-
 t/t7607-merge-overwrite.sh          |  2 +-
 unpack-trees.c                      |  4 +-
 unpack-trees.h                      |  4 ++
 7 files changed, 77 insertions(+), 24 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index c1c4faf61e..7fdcba4f22 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -337,32 +337,37 @@ static void init_tree_desc_from_tree(struct tree_desc *desc, struct tree *tree)
 	init_tree_desc(desc, tree->buffer, tree->size);
 }
 
-static int git_merge_trees(int index_only,
+static int git_merge_trees(struct merge_options *o,
 			   struct tree *common,
 			   struct tree *head,
 			   struct tree *merge)
 {
 	int rc;
 	struct tree_desc t[3];
-	struct unpack_trees_options opts;
 
-	memset(&opts, 0, sizeof(opts));
-	if (index_only)
-		opts.index_only = 1;
+	memset(&o->unpack_opts, 0, sizeof(o->unpack_opts));
+	if (o->call_depth)
+		o->unpack_opts.index_only = 1;
 	else
-		opts.update = 1;
-	opts.merge = 1;
-	opts.head_idx = 2;
-	opts.fn = threeway_merge;
-	opts.src_index = &the_index;
-	opts.dst_index = &the_index;
-	setup_unpack_trees_porcelain(&opts, "merge");
+		o->unpack_opts.update = 1;
+	o->unpack_opts.merge = 1;
+	o->unpack_opts.head_idx = 2;
+	o->unpack_opts.fn = threeway_merge;
+	o->unpack_opts.src_index = &the_index;
+	o->unpack_opts.dst_index = &the_index;
+	setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
 
 	init_tree_desc_from_tree(t+0, common);
 	init_tree_desc_from_tree(t+1, head);
 	init_tree_desc_from_tree(t+2, merge);
 
-	rc = unpack_trees(3, t, &opts);
+	rc = unpack_trees(3, t, &o->unpack_opts);
+	/*
+	 * unpack_trees NULLifies src_index, but it's used in verify_uptodate,
+	 * so set to the new index which will usually have modification
+	 * timestamp info copied over.
+	 */
+	o->unpack_opts.src_index = &the_index;
 	cache_tree_free(&active_cache_tree);
 	return rc;
 }
@@ -795,6 +800,20 @@ static int would_lose_untracked(const char *path)
 	return !was_tracked(path) && file_exists(path);
 }
 
+static int was_dirty(struct merge_options *o, const char *path)
+{
+	struct cache_entry *ce;
+	int dirty = 1;
+
+	if (o->call_depth || !was_tracked(path))
+		return !dirty;
+
+	ce = cache_file_exists(path, strlen(path), ignore_case);
+	dirty = (ce->ce_stat_data.sd_mtime.sec > 0 &&
+		 verify_uptodate(ce, &o->unpack_opts) != 0);
+	return dirty;
+}
+
 static int make_room_for_path(struct merge_options *o, const char *path)
 {
 	int status, i;
@@ -2687,6 +2706,7 @@ static int handle_modify_delete(struct merge_options *o,
 
 static int merge_content(struct merge_options *o,
 			 const char *path,
+			 int file_in_way,
 			 struct object_id *o_oid, int o_mode,
 			 struct object_id *a_oid, int a_mode,
 			 struct object_id *b_oid, int b_mode,
@@ -2761,7 +2781,7 @@ static int merge_content(struct merge_options *o,
 				return -1;
 	}
 
-	if (df_conflict_remains) {
+	if (df_conflict_remains || file_in_way) {
 		char *new_path;
 		if (o->call_depth) {
 			remove_file_from_cache(path);
@@ -2795,6 +2815,30 @@ static int merge_content(struct merge_options *o,
 	return mfi.clean;
 }
 
+static int conflict_rename_normal(struct merge_options *o,
+				  const char *path,
+				  struct object_id *o_oid, unsigned int o_mode,
+				  struct object_id *a_oid, unsigned int a_mode,
+				  struct object_id *b_oid, unsigned int b_mode,
+				  struct rename_conflict_info *ci)
+{
+	int clean_merge;
+	int file_in_the_way = 0;
+
+	if (was_dirty(o, path)) {
+		file_in_the_way = 1;
+		output(o, 1, _("Refusing to lose dirty file at %s"), path);
+	}
+
+	/* Merge the content and write it out */
+	clean_merge = merge_content(o, path, file_in_the_way,
+				    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
+				    ci);
+	if (clean_merge > 0 && file_in_the_way)
+		clean_merge = 0;
+	return clean_merge;
+}
+
 /* Per entry merge function */
 static int process_entry(struct merge_options *o,
 			 const char *path, struct stage_data *entry)
@@ -2814,9 +2858,12 @@ static int process_entry(struct merge_options *o,
 		switch (conflict_info->rename_type) {
 		case RENAME_NORMAL:
 		case RENAME_ONE_FILE_TO_ONE:
-			clean_merge = merge_content(o, path,
-						    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
-						    conflict_info);
+			clean_merge = conflict_rename_normal(o,
+							     path,
+							     o_oid, o_mode,
+							     a_oid, a_mode,
+							     b_oid, b_mode,
+							     conflict_info);
 			break;
 		case RENAME_DIR:
 			clean_merge = 1;
@@ -2912,7 +2959,7 @@ static int process_entry(struct merge_options *o,
 	} else if (a_oid && b_oid) {
 		/* Case C: Added in both (check for same permissions) and */
 		/* case D: Modified in both, but differently. */
-		clean_merge = merge_content(o, path,
+		clean_merge = merge_content(o, path, 0 /* file_in_way */,
 					    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
 					    NULL);
 	} else if (!o_oid && !a_oid && !b_oid) {
@@ -2953,7 +3000,7 @@ int merge_trees(struct merge_options *o,
 		return 1;
 	}
 
-	code = git_merge_trees(o->call_depth, common, head, merge);
+	code = git_merge_trees(o, common, head, merge);
 
 	if (code != 0) {
 		if (show(o, 4) || o->call_depth)
diff --git a/merge-recursive.h b/merge-recursive.h
index 50a4e6af4e..d863cf8867 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_RECURSIVE_H
 #define MERGE_RECURSIVE_H
 
+#include "unpack-trees.h"
 #include "string-list.h"
 
 struct merge_options {
@@ -27,6 +28,7 @@ struct merge_options {
 	struct strbuf obuf;
 	struct hashmap current_file_dir_set;
 	struct string_list df_conflict_file_set;
+	struct unpack_trees_options unpack_opts;
 };
 
 /*
diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
index ccbc118514..c9a1f783f5 100755
--- a/t/t3501-revert-cherry-pick.sh
+++ b/t/t3501-revert-cherry-pick.sh
@@ -141,7 +141,7 @@ test_expect_success 'cherry-pick "-" works with arguments' '
 	test_cmp expect actual
 '
 
-test_expect_failure 'cherry-pick works with dirty renamed file' '
+test_expect_success 'cherry-pick works with dirty renamed file' '
 	test_commit to-rename &&
 	git checkout -b unrelated &&
 	test_commit unrelated &&
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 0b60eb8053..b94ba066fe 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3298,7 +3298,7 @@ test_expect_success '11a-setup: Avoid losing dirty contents with simple rename'
 	)
 '
 
-test_expect_failure '11a-check: Avoid losing dirty contents with simple rename' '
+test_expect_success '11a-check: Avoid losing dirty contents with simple rename' '
 	(
 		cd 11a &&
 
diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index 9c422bcd7c..dd8ab7ede1 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -92,7 +92,7 @@ test_expect_success 'will not overwrite removed file with staged changes' '
 	test_cmp important c1.c
 '
 
-test_expect_failure 'will not overwrite unstaged changes in renamed file' '
+test_expect_success 'will not overwrite unstaged changes in renamed file' '
 	git reset --hard c1 &&
 	git mv c1.c other.c &&
 	git commit -m rename &&
diff --git a/unpack-trees.c b/unpack-trees.c
index e73745051e..79fd97074e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1509,8 +1509,8 @@ static int verify_uptodate_1(const struct cache_entry *ce,
 		add_rejected_path(o, error_type, ce->name);
 }
 
-static int verify_uptodate(const struct cache_entry *ce,
-			   struct unpack_trees_options *o)
+int verify_uptodate(const struct cache_entry *ce,
+		    struct unpack_trees_options *o)
 {
 	if (!o->skip_sparse_checkout && (ce->ce_flags & CE_NEW_SKIP_WORKTREE))
 		return 0;
diff --git a/unpack-trees.h b/unpack-trees.h
index 6c48117b84..41178ada94 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -1,6 +1,7 @@
 #ifndef UNPACK_TREES_H
 #define UNPACK_TREES_H
 
+#include "tree-walk.h"
 #include "string-list.h"
 
 #define MAX_UNPACK_TREES 8
@@ -78,6 +79,9 @@ struct unpack_trees_options {
 extern int unpack_trees(unsigned n, struct tree_desc *t,
 		struct unpack_trees_options *options);
 
+int verify_uptodate(const struct cache_entry *ce,
+		    struct unpack_trees_options *o);
+
 int threeway_merge(const struct cache_entry * const *stages,
 		   struct unpack_trees_options *o);
 int twoway_merge(const struct cache_entry * const *src,
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 26/36] merge-recursive: fix remaining directory rename + dirty overwrite cases
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (24 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 25/36] merge-recursive: fix overwriting dirty files involved in renames Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 27/36] directory rename detection: new testcases showcasing a pair of bugs Elijah Newren
                   ` (11 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
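For illustration only, the double-negative reasoning spelled out in the
comment added to handle_file() can be written as a toy sketch. The struct and
function names are hypothetical. The two assignments that derive
update_index and update_worktree reflect an assumption about how
remove_file() interprets its "clean" and "no_wd" arguments; that function is
not shown in this patch. The point is simply that a dirty file is dropped
from the index when asked, but never deleted from the working tree.

```c
#include <stdbool.h>

/* Hypothetical description of what a removal would touch. */
struct toy_removal {
	bool update_index;     /* drop the path from the index */
	bool update_worktree;  /* delete the file on disk */
};

/*
 * Models the chain from the patch comment:
 *   1) update_wd iff !was_dirty
 *   2) no_wd iff !update_wd
 *   3) so no_wd == was_dirty
 * The mapping from (clean, no_wd) to index/worktree updates is assumed.
 */
static struct toy_removal toy_plan_removal(int call_depth, bool clean,
					   bool was_dirty)
{
	bool update_wd = !was_dirty;
	bool no_wd = !update_wd;
	struct toy_removal plan;

	plan.update_index = call_depth || clean;
	plan.update_worktree = !call_depth && !no_wd;
	return plan;
}
```
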
 merge-recursive.c                   | 25 ++++++++++++++++++++++---
 t/t6043-merge-rename-directories.sh |  8 ++++----
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 7fdcba4f22..238711b038 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1324,11 +1324,22 @@ static int handle_file(struct merge_options *o,
 
 	add = filespec_from_entry(&other, dst_entry, stage ^ 1);
 	if (add) {
+		int ren_src_was_dirty = was_dirty(o, rename->path);
 		char *add_name = unique_path(o, rename->path, other_branch);
 		if (update_file(o, 0, &add->oid, add->mode, add_name))
 			return -1;
 
-		remove_file(o, 0, rename->path, 0);
+		if (ren_src_was_dirty) {
+			output(o, 1, _("Refusing to lose dirty file at %s"),
+			       rename->path);
+		}
+		/*
+		 * Because the double negatives somehow keep confusing me...
+		 *    1) update_wd iff !ren_src_was_dirty.
+		 *    2) no_wd iff !update_wd
+		 *    3) so, no_wd == !!ren_src_was_dirty == ren_src_was_dirty
+		 */
+		remove_file(o, 0, rename->path, ren_src_was_dirty);
 		dst_name = unique_path(o, rename->path, cur_branch);
 	} else {
 		if (dir_in_way(rename->path, !o->call_depth, 0)) {
@@ -1466,7 +1477,10 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 		char *new_path2 = unique_path(o, path, ci->branch2);
 		output(o, 1, _("Renaming %s to %s and %s to %s instead"),
 		       a->path, new_path1, b->path, new_path2);
-		if (would_lose_untracked(path))
+		if (was_dirty(o, path))
+			output(o, 1, _("Refusing to lose dirty file at %s"),
+			       path);
+		else if (would_lose_untracked(path))
 			/*
 			 * Only way we get here is if both renames were from
 			 * a directory rename AND user had an untracked file
@@ -2075,6 +2089,7 @@ static void apply_directory_rename_modifications(struct merge_options *o,
 {
 	struct string_list_item *item;
 	int stage = (tree == a_tree ? 2 : 3);
+	int update_wd;
 
 	/*
 	 * In all cases where we can do directory rename detection,
@@ -2085,7 +2100,11 @@ static void apply_directory_rename_modifications(struct merge_options *o,
 	 * saying the file would have been overwritten), but it might
 	 * be dirty, though.
 	 */
-	remove_file(o, 1, pair->two->path, 0 /* no_wd */);
+	update_wd = !was_dirty(o, pair->two->path);
+	if (!update_wd)
+		output(o, 1, _("Refusing to lose dirty file at %s"),
+		       pair->two->path);
+	remove_file(o, 1, pair->two->path, !update_wd);
 
 	/* Find or create a new re->dst_entry */
 	item = string_list_lookup(entries, new_path);
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index b94ba066fe..33e2649824 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3370,7 +3370,7 @@ test_expect_success '11b-setup: Avoid losing dirty file involved in directory re
 	)
 '
 
-test_expect_failure '11b-check: Avoid losing dirty file involved in directory rename' '
+test_expect_success '11b-check: Avoid losing dirty file involved in directory rename' '
 	(
 		cd 11b &&
 
@@ -3512,7 +3512,7 @@ test_expect_success '11d-setup: Avoid losing not-uptodate with rename + D/F conf
 	)
 '
 
-test_expect_failure '11d-check: Avoid losing not-uptodate with rename + D/F conflict' '
+test_expect_success '11d-check: Avoid losing not-uptodate with rename + D/F conflict' '
 	(
 		cd 11d &&
 
@@ -3591,7 +3591,7 @@ test_expect_success '11e-setup: Avoid deleting not-uptodate with dir rename/rena
 	)
 '
 
-test_expect_failure '11e-check: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
+test_expect_success '11e-check: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
 	(
 		cd 11e &&
 
@@ -3667,7 +3667,7 @@ test_expect_success '11f-setup: Avoid deleting not-uptodate with dir rename/rena
 	)
 '
 
-test_expect_failure '11f-check: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
+test_expect_success '11f-check: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
 	(
 		cd 11f &&
 
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 27/36] directory rename detection: new testcases showcasing a pair of bugs
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (25 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 26/36] merge-recursive: fix remaining directory rename + dirty overwrite cases Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 28/36] merge-recursive: avoid spurious rename/rename conflict from dir renames Elijah Newren
                   ` (10 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Add a testcase showing spurious rename/rename(1to2) conflicts occurring
due to directory rename detection.

Also add a pair of testcases dealing with moving directory hierarchies
around that were suggested by Stefan Beller as "food for thought" during
his review of an earlier patch series, but which actually uncovered a
bug.  Round things out with a test that is a cross between the two
testcases that showed existing bugs in order to make sure we aren't
merely addressing problems in isolation but in general.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
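The NOTE for testcase 12b in the diff below describes applying each side's
directory rename to the paths produced by the other side.  As a rough
illustration only, that prefix rewrite can be sketched as follows.  The
helper apply_dir_rename() is hypothetical and is not a function in
merge-recursive.c; it assumes both directory arguments end in '/'.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of merge-recursive.c): if `path` lies
 * under `old_dir`, write the path with that prefix replaced by `new_dir`
 * into `out` and return 1; otherwise copy `path` unchanged and return 0.
 * Both directory arguments are assumed to end in '/'.
 */
static int apply_dir_rename(const char *path, const char *old_dir,
			    const char *new_dir, char *out, size_t outsz)
{
	size_t len = strlen(old_dir);

	if (strncmp(path, old_dir, len)) {
		/* Not under the renamed directory: leave the path alone. */
		snprintf(out, outsz, "%s", path);
		return 0;
	}
	/* Swap the old directory prefix for the new one. */
	snprintf(out, outsz, "%s%s", new_dir, path + len);
	return 1;
}
```

With 12b's inputs, side B's node2/node1/leaf1 combined with side A's
rename of node2/ to node1/node2/ gives node1/node2/node1/leaf1, which is
the path the 12b-check test expects.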
 t/t6043-merge-rename-directories.sh | 296 ++++++++++++++++++++++++++++
 1 file changed, 296 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 33e2649824..5b84591445 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -159,6 +159,7 @@ test_expect_success '1b-check: Merge a directory with another' '
 # Testcase 1c, Transitive renaming
 #   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
 #   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   (Related to testcase 12b -- joint-transitivity?)
 #   Commit O: z/{b,c},   x/d
 #   Commit A: y/{b,c},   x/d
 #   Commit B: z/{b,c,d}
@@ -2871,6 +2872,68 @@ test_expect_failure '9g-check: Renamed directory that only contained immediate s
 	)
 '
 
+# Testcase 9h, Avoid implicit rename if involved as source on other side
+#   (Extremely closely related to testcase 3a)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: z/{b,c,d_2}
+#   Commit B: y/{b,c}, x/d_1
+#   Expected: y/{b,c}, x/d_2
+#   NOTE: If we applied the z/ -> y/ rename to z/d, then we'd end up with
+#         a rename/rename(1to2) conflict (z/d -> y/d vs. x/d)
+test_expect_success '9h-setup: Avoid dir rename on merely modified path' '
+	test_create_repo 9h &&
+	(
+		cd 9h &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nd\n" >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		echo more >>z/d &&
+		git add z/d &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		mkdir x &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d x/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9h-check: Avoid dir rename on merely modified path' '
+	(
+		cd 9h &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:x/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    A:z/d &&
+		test_cmp expect actual
+	)
+'
+
 ###########################################################################
 # Rules suggested by section 9:
 #
@@ -3704,4 +3767,237 @@ test_expect_success '11f-check: Avoid deleting not-uptodate with dir rename/rena
 	)
 '
 
+###########################################################################
+# SECTION 12: Everything else
+#
+# Tests suggested by others.  Tests added after implementation completed
+# and submitted.  Grab bag.
+###########################################################################
+
+# Testcase 12a, Moving one directory hierarchy into another
+#   (Related to testcase 9a)
+#   Commit O: node1/{leaf1,leaf2}, node2/{leaf3,leaf4}
+#   Commit A: node1/{leaf1,leaf2,node2/{leaf3,leaf4}}
+#   Commit B: node1/{leaf1,leaf2,leaf5}, node2/{leaf3,leaf4,leaf6}
+#   Expected: node1/{leaf1,leaf2,leaf5,node2/{leaf3,leaf4,leaf6}}
+
+test_expect_success '12a-setup: Moving one directory hierarchy into another' '
+	test_create_repo 12a &&
+	(
+		cd 12a &&
+
+		mkdir -p node1 node2 &&
+		echo leaf1 >node1/leaf1 &&
+		echo leaf2 >node1/leaf2 &&
+		echo leaf3 >node2/leaf3 &&
+		echo leaf4 >node2/leaf4 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv node2/ node1/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo leaf5 >node1/leaf5 &&
+		echo leaf6 >node2/leaf6 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '12a-check: Moving one directory hierarchy into another' '
+	(
+		cd 12a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+
+		git rev-parse >actual \
+			HEAD:node1/leaf1 HEAD:node1/leaf2 HEAD:node1/leaf5 \
+			HEAD:node1/node2/leaf3 \
+			HEAD:node1/node2/leaf4 \
+			HEAD:node1/node2/leaf6 &&
+		git rev-parse >expect \
+			O:node1/leaf1    O:node1/leaf2    B:node1/leaf5 \
+			O:node2/leaf3 \
+			O:node2/leaf4 \
+			B:node2/leaf6 &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 12b, Moving two directory hierarchies into each other
+#   (Related to testcases 1c and 12c)
+#   Commit O: node1/{leaf1, leaf2}, node2/{leaf3, leaf4}
+#   Commit A: node1/{leaf1, leaf2, node2/{leaf3, leaf4}}
+#   Commit B: node2/{leaf3, leaf4, node1/{leaf1, leaf2}}
+#   Expected: node1/node2/node1/{leaf1, leaf2},
+#             node2/node1/node2/{leaf3, leaf4}
+#   NOTE: Without directory renames, we would expect
+#                   node2/node1/{leaf1, leaf2},
+#                   node1/node2/{leaf3, leaf4}
+#         with directory rename detection, we note that
+#             commit A renames node2/ -> node1/node2/
+#             commit B renames node1/ -> node2/node1/
+#         therefore, applying those directory renames to the initial result
+#         (making all four paths experience a transitive renaming), yields
+#         the expected result.
+#
+#         You may ask, is it weird to have two directories rename each other?
+#         To which, I can do no more than shrug my shoulders and say that
+#         even simple rules give weird results when given weird inputs.
+
+test_expect_success '12b-setup: Moving one directory hierarchy into another' '
+	test_create_repo 12b &&
+	(
+		cd 12b &&
+
+		mkdir -p node1 node2 &&
+		echo leaf1 >node1/leaf1 &&
+		echo leaf2 >node1/leaf2 &&
+		echo leaf3 >node2/leaf3 &&
+		echo leaf4 >node2/leaf4 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv node2/ node1/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv node1/ node2/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '12b-check: Moving one directory hierarchy into another' '
+	(
+		cd 12b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:node1/node2/node1/leaf1 \
+			HEAD:node1/node2/node1/leaf2 \
+			HEAD:node2/node1/node2/leaf3 \
+			HEAD:node2/node1/node2/leaf4 &&
+		git rev-parse >expect \
+			O:node1/leaf1 \
+			O:node1/leaf2 \
+			O:node2/leaf3 \
+			O:node2/leaf4 &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 12c, Moving two directory hierarchies into each other w/ content merge
+#   (Related to testcase 12b)
+#   Commit O: node1/{       leaf1_1, leaf2_1}, node2/{leaf3_1, leaf4_1}
+#   Commit A: node1/{       leaf1_2, leaf2_2,  node2/{leaf3_2, leaf4_2}}
+#   Commit B: node2/{node1/{leaf1_3, leaf2_3},        leaf3_3, leaf4_3}
+#   Expected: Content merge conflicts for each of:
+#               node1/node2/node1/{leaf1, leaf2},
+#               node2/node1/node2/{leaf3, leaf4}
+#   NOTE: This is *exactly* like 12b, except that every path is modified on
+#         each side of the merge.
+
+test_expect_success '12c-setup: Moving one directory hierarchy into another w/ content merge' '
+	test_create_repo 12c &&
+	(
+		cd 12c &&
+
+		mkdir -p node1 node2 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf1\n" >node1/leaf1 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf2\n" >node1/leaf2 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf3\n" >node2/leaf3 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf4\n" >node2/leaf4 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv node2/ node1/ &&
+		for i in `git ls-files`; do echo side A >>$i; done &&
+		git add -u &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv node1/ node2/ &&
+		for i in `git ls-files`; do echo side B >>$i; done &&
+		git add -u &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '12c-check: Moving one directory hierarchy into another w/ content merge' '
+	(
+		cd 12c &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 &&
+
+		git ls-files -u >out &&
+		test_line_count = 12 out &&
+
+		git rev-parse >actual \
+			:1:node1/node2/node1/leaf1 \
+			:1:node1/node2/node1/leaf2 \
+			:1:node2/node1/node2/leaf3 \
+			:1:node2/node1/node2/leaf4 \
+			:2:node1/node2/node1/leaf1 \
+			:2:node1/node2/node1/leaf2 \
+			:2:node2/node1/node2/leaf3 \
+			:2:node2/node1/node2/leaf4 \
+			:3:node1/node2/node1/leaf1 \
+			:3:node1/node2/node1/leaf2 \
+			:3:node2/node1/node2/leaf3 \
+			:3:node2/node1/node2/leaf4 &&
+		git rev-parse >expect \
+			O:node1/leaf1 \
+			O:node1/leaf2 \
+			O:node2/leaf3 \
+			O:node2/leaf4 \
+			A:node1/leaf1 \
+			A:node1/leaf2 \
+			A:node1/node2/leaf3 \
+			A:node1/node2/leaf4 \
+			B:node2/node1/leaf1 \
+			B:node2/node1/leaf2 \
+			B:node2/leaf3 \
+			B:node2/leaf4 &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 28/36] merge-recursive: avoid spurious rename/rename conflict from dir renames
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (26 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 27/36] directory rename detection: new testcases showcasing a pair of bugs Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 29/36] merge-recursive: improve add_cacheinfo error handling Elijah Newren
                   ` (9 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

If a file on one side of history was renamed, and merely modified on the
other side, then applying a directory rename to the modified side gives us
a rename/rename(1to2) conflict.  We should only apply directory renames to
pairs representing either adds or renames.

Making this change means that a directory rename testcase that was
previously reported as a rename/delete conflict will now be reported as a
modify/delete conflict.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
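The rule from the commit message (apply directory renames only to pairs
that are adds or renames) can be sketched as a standalone predicate.  The
function name below is hypothetical; the real change is the inline
`pair->status` check in the hunks that follow.  The status letters are
assumed to be the usual diff ones: 'A' added, 'R' renamed, 'M' modified,
'D' deleted.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate mirroring the new condition in
 * compute_collisions() and get_renames(): only added or renamed paths
 * are candidates for having a directory rename applied.
 */
static bool dir_rename_applies(char status)
{
	return status == 'A' || status == 'R';
}
```

Under this rule, a path that one side merely modified (status 'M', such
as z/d on side A in testcase 9h) is skipped, which is what avoids the
spurious rename/rename(1to2) conflict.  The old check only skipped 'D'.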
 merge-recursive.c                   |  4 +--
 t/t6043-merge-rename-directories.sh | 55 +++++++++++++----------------
 2 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 238711b038..27278d51bb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1992,7 +1992,7 @@ static void compute_collisions(struct hashmap *collisions,
 		char *new_path;
 		struct diff_filepair *pair = pairs->queue[i];
 
-		if (pair->status == 'D')
+		if (pair->status != 'A' && pair->status != 'R')
 			continue;
 		dir_rename_ent = check_dir_renamed(pair->two->path,
 						   dir_renames);
@@ -2219,7 +2219,7 @@ static struct string_list *get_renames(struct merge_options *o,
 		struct diff_filepair *pair = pairs->queue[i];
 		char *new_path; /* non-NULL only with directory renames */
 
-		if (pair->status == 'D') {
+		if (pair->status != 'A' && pair->status != 'R') {
 			diff_free_filepair(pair);
 			continue;
 		}
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 5b84591445..45f620633f 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2078,18 +2078,23 @@ test_expect_success '8b-check: Dual-directory rename, one into the others way, w
 	)
 '
 
-# Testcase 8c, rename+modify/delete
-#   (Related to testcases 5b and 8d)
+# Testcase 8c, modify/delete or rename+modify/delete?
+#   (Related to testcases 5b, 8d, and 9h)
 #   Commit O: z/{b,c,d}
 #   Commit A: y/{b,c}
 #   Commit B: z/{b,c,d_modified,e}
-#   Expected: y/{b,c,e}, CONFLICT(rename+modify/delete: x/d -> y/d or deleted)
+#   Expected: y/{b,c,e}, CONFLICT(modify/delete: on z/d)
 #
-#   Note: This testcase doesn't present any concerns for me...until you
-#         compare it with testcases 5b and 8d.  See notes in 8d for more
-#         details.
-
-test_expect_success '8c-setup: rename+modify/delete' '
+#   Note: It could easily be argued that the correct resolution here is
+#         y/{b,c,e}, CONFLICT(rename/delete: z/d -> y/d vs deleted)
+#         and that the modified version of d should be present in y/ after
+#         the merge, just marked as conflicted.  Indeed, I previously did
+#         argue that.  But applying directory renames to the side of
+#         history where a file is merely modified results in spurious
+#         rename/rename(1to2) conflicts -- see testcase 9h.  See also
+#         notes in 8d.
+
+test_expect_success '8c-setup: modify/delete or rename+modify/delete?' '
 	test_create_repo 8c &&
 	(
 		cd 8c &&
@@ -2122,32 +2127,32 @@ test_expect_success '8c-setup: rename+modify/delete' '
 	)
 '
 
-test_expect_success '8c-check: rename+modify/delete' '
+test_expect_success '8c-check: modify/delete or rename+modify/delete' '
 	(
 		cd 8c &&
 
 		git checkout A^0 &&
 
 		test_must_fail git merge -s recursive B^0 >out &&
-		test_i18ngrep "CONFLICT (rename/delete).* z/d.*y/d" out &&
+		test_i18ngrep "CONFLICT (modify/delete).* z/d" out &&
 
 		git ls-files -s >out &&
-		test_line_count = 4 out &&
+		test_line_count = 5 out &&
 		git ls-files -u >out &&
-		test_line_count = 1 out &&
+		test_line_count = 2 out &&
 		git ls-files -o >out &&
 		test_line_count = 1 out &&
 
 		git rev-parse >actual \
-			:0:y/b :0:y/c :0:y/e :3:y/d &&
+			:0:y/b :0:y/c :0:y/e :1:z/d :3:z/d &&
 		git rev-parse >expect \
-			 O:z/b  O:z/c  B:z/e  B:z/d &&
+			 O:z/b  O:z/c  B:z/e  O:z/d  B:z/d &&
 		test_cmp expect actual &&
 
-		test_must_fail git rev-parse :1:y/d &&
-		test_must_fail git rev-parse :2:y/d &&
-		git ls-files -s y/d | grep ^100755 &&
-		test_path_is_file y/d
+		test_must_fail git rev-parse :2:z/d &&
+		git ls-files -s z/d | grep ^100755 &&
+		test_path_is_file z/d &&
+		test_path_is_missing y/d
 	)
 '
 
@@ -2161,16 +2166,6 @@ test_expect_success '8c-check: rename+modify/delete' '
 #
 #   Note: It would also be somewhat reasonable to resolve this as
 #             y/{b,c,e}, CONFLICT(rename/delete: x/d -> y/d or deleted)
-#   The logic being that the only difference between this testcase and 8c
-#   is that there is no modification to d.  That suggests that instead of a
-#   rename/modify vs. delete conflict, we should just have a rename/delete
-#   conflict, otherwise we are being inconsistent.
-#
-#   However...as far as consistency goes, we didn't report a conflict for
-#   path d_1 in testcase 5b due to a different file being in the way.  So,
-#   we seem to be forced to have cases where users can change things
-#   slightly and get what they may perceive as inconsistent results.  It
-#   would be nice to avoid that, but I'm not sure I see how.
 #
 #   In this case, I'm leaning towards: commit A was the one that deleted z/d
 #   and it did the rename of z to y, so the two "conflicts" (rename vs.
@@ -2915,7 +2910,7 @@ test_expect_success '9h-setup: Avoid dir rename on merely modified path' '
 	)
 '
 
-test_expect_failure '9h-check: Avoid dir rename on merely modified path' '
+test_expect_success '9h-check: Avoid dir rename on merely modified path' '
 	(
 		cd 9h &&
 
@@ -3959,7 +3954,7 @@ test_expect_success '12c-setup: Moving one directory hierarchy into another w/ c
 	)
 '
 
-test_expect_failure '12c-check: Moving one directory hierarchy into another w/ content merge' '
+test_expect_success '12c-check: Moving one directory hierarchy into another w/ content merge' '
 	(
 		cd 12c &&
 
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 29/36] merge-recursive: improve add_cacheinfo error handling
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (27 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 28/36] merge-recursive: avoid spurious rename/rename conflict from dir renames Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 30/36] merge-recursive: move more is_dirty handling to merge_content Elijah Newren
                   ` (8 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Four closely related changes all with the purpose of fixing error handling
in this function:
  - fix reported function name in add_cacheinfo error messages
  - differentiate between the two error messages
  - abort early when we hit the error (stop ignoring return code)
  - mark a test which was hitting this error as failing until we get the
    right fix

In more detail...

In commit 0424138d5715 ("Fix bogus error message from merge-recursive
error path", 2007-04-01), it was noted that the name of the function which
the error message claimed it was reported from did not match the actual
function name.  This was changed to something closer to the real function
name, but it still didn't match the actual function name.  Fix the
reported name to match.

Second, the two errors in this function had identical messages, preventing
us from knowing which error had been triggered.  Add a couple words to the
second error message to differentiate the two.

Next, make sure callers do not ignore the return code so that it will stop
processing further entries (processing further entries could result in
more output which could cause the error to scroll off the screen, or at
least be missed by the user) and make it clear the error is the cause of
the early abort.  These errors should never be triggered in production; if
either one is, it represents a bug in the calling path somewhere and is
likely to have resulted in mis-merged content.  The combination of
ignoring the return code and continuing to print other standard
messages after hitting the error resulted in the following bug report from
Junio: "...the command pretends that everything went well and merged
cleanly in that path...[Behaving] in a buggy and unexplainable way is bad
enough, doing so silently is unexcusable."  Fix this.

Finally, there was one test in the testsuite that did hit this error path,
but was passing anyway.  This would have been easy to miss since it had a
test_must_fail and thus could have failed for the wrong reason, but in a
separate testing step I added an intentional NULL-dereference to the
codepath where these error messages are printed in order to flush out such
cases.  I could modify that test to explicitly check for this error and
fail the test if it is hit, but since this test operates in a bit of a
gray area and needed other changes, I went for a different fix.  The gray
area this test operates in is the following: If the merge of a certain
file results in the same version of the file that existed in HEAD, but
there are dirty modifications to the file, is that an error with a
"Refusing to overwrite existing file" expected, or a case where the merge
should succeed since we shouldn't have to touch the dirty file anyway?
Recent discussion on the list leaned towards saying it should be a
success.  Therefore, change the expected behavior of this test to match.
As a side effect, the test now fails visibly because it hits the
add_cacheinfo error, so we can mark it as test_expect_failure.  A
subsequent commit will implement the necessary changes to get this test
to pass again.

Signed-off-by: Elijah Newren <newren@gmail.com>
---

I thought the changes were small enough to just combine, but after
typing up the commit message and seeing how long it is, I'm wondering if
I should split this into two commits.  Thoughts?
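To illustrate the "stop ignoring return code" point, here is a minimal
standalone sketch of the pattern, not the actual merge-recursive code.
add_entry() and update_entries() are hypothetical stand-ins; the only
assumption carried over from the patch is the convention of returning 0
on success and a nonzero value on error.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for add_cacheinfo(): returns 0 on success and
 * -1 on error, after reporting which path failed.
 */
static int add_entry(const char *path)
{
	if (!path || !*path) {
		fprintf(stderr, "error: add_entry failed for path '%s'; aborting.\n",
			path ? path : "(null)");
		return -1;
	}
	return 0;
}

/*
 * Hypothetical caller: stop at the first failure instead of carrying on,
 * so that later output cannot bury the error message.  `processed`
 * receives the number of entries handled before any failure.
 */
static int update_entries(const char **paths, int n, int *processed)
{
	*processed = 0;
	for (int i = 0; i < n; i++) {
		if (add_entry(paths[i]))
			return -1;
		(*processed)++;
	}
	return 0;
}
```

Called with the paths "a", "" and "c", update_entries() returns -1 after
handling only the first entry; the third is never processed.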

 merge-recursive.c             | 13 ++++++++-----
 t/t3501-revert-cherry-pick.sh |  7 +++----
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 27278d51bb..b0f74cb243 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -316,7 +316,7 @@ static int add_cacheinfo(struct merge_options *o,
 
 	ce = make_cache_entry(mode, oid ? oid->hash : null_sha1, path, stage, 0);
 	if (!ce)
-		return err(o, _("addinfo_cache failed for path '%s'"), path);
+		return err(o, _("add_cacheinfo failed for path '%s'; merge aborting."), path);
 
 	ret = add_cache_entry(ce, options);
 	if (refresh) {
@@ -324,7 +324,7 @@ static int add_cacheinfo(struct merge_options *o,
 
 		nce = refresh_cache_entry(ce, CE_MATCH_REFRESH | CE_MATCH_IGNORE_MISSING);
 		if (!nce)
-			return err(o, _("addinfo_cache failed for path '%s'"), path);
+			return err(o, _("add_cacheinfo failed to refresh for path '%s'; merge aborting."), path);
 		if (nce != ce)
 			ret = add_cache_entry(nce, options);
 	}
@@ -942,7 +942,9 @@ static int update_file_flags(struct merge_options *o,
 	}
  update_index:
 	if (!ret && update_cache)
-		add_cacheinfo(o, mode, oid, path, 0, update_wd, ADD_CACHE_OK_TO_ADD);
+		if (add_cacheinfo(o, mode, oid, path, 0, update_wd,
+				  ADD_CACHE_OK_TO_ADD))
+			return -1;
 	return ret;
 }
 
@@ -2783,8 +2785,9 @@ static int merge_content(struct merge_options *o,
 		 */
 		path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
 		if (!path_renamed_outside_HEAD) {
-			add_cacheinfo(o, mfi.mode, &mfi.oid, path,
-				      0, (!o->call_depth), 0);
+			if (add_cacheinfo(o, mfi.mode, &mfi.oid, path,
+					  0, (!o->call_depth), 0))
+				return -1;
 			return mfi.clean;
 		}
 	} else
diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
index c9a1f783f5..3871807d09 100755
--- a/t/t3501-revert-cherry-pick.sh
+++ b/t/t3501-revert-cherry-pick.sh
@@ -141,7 +141,7 @@ test_expect_success 'cherry-pick "-" works with arguments' '
 	test_cmp expect actual
 '
 
-test_expect_success 'cherry-pick works with dirty renamed file' '
+test_expect_failure 'cherry-pick works with dirty renamed file' '
 	test_commit to-rename &&
 	git checkout -b unrelated &&
 	test_commit unrelated &&
@@ -150,9 +150,8 @@ test_expect_success 'cherry-pick works with dirty renamed file' '
 	test_tick &&
 	git commit -m renamed &&
 	echo modified >renamed &&
-	test_must_fail git cherry-pick refs/heads/unrelated >out &&
-	test_i18ngrep "Refusing to lose dirty file at renamed" out &&
-	test $(git rev-parse :0:renamed) = $(git rev-parse HEAD^:to-rename.t) &&
+	git cherry-pick refs/heads/unrelated >out &&
+	test $(git rev-parse :0:renamed) = $(git rev-parse HEAD~2:to-rename.t) &&
 	grep -q "^modified$" renamed
 '
 
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 30/36] merge-recursive: move more is_dirty handling to merge_content
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (28 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 29/36] merge-recursive: improve add_cacheinfo error handling Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 31/36] merge-recursive: avoid triggering add_cacheinfo error with dirty mod Elijah Newren
                   ` (7 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

conflict_rename_normal() was doing some handling for dirty files that
more naturally belonged in merge_content.  Move it, and rename a
parameter for clarity while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
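As a sanity check on the move, the old post-processing in
conflict_rename_normal() and the new return expression in merge_content()
can be compared side by side.  This is a sketch with hypothetical
function names; it assumes the clean flag is 0 or 1 by the time the final
return is reached, and it leaves aside the early `return -1` error
paths, which this patch does not touch.

```c
/*
 * Old behavior (sketch): the caller cleared a positive "clean" result
 * when a dirty file was in the way.
 */
static int old_clean_result(int merged_clean, int file_in_the_way)
{
	int clean_merge = merged_clean;

	if (clean_merge > 0 && file_in_the_way)
		clean_merge = 0;
	return clean_merge;
}

/*
 * New behavior (sketch): merge_content() folds the dirty check into its
 * own return value.
 */
static int new_clean_result(int merged_clean, int is_dirty)
{
	return !is_dirty && merged_clean;
}
```

For every combination of a 0/1 clean flag and a 0/1 dirty flag the two
functions agree, so callers of conflict_rename_normal() should see the
same result as before.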
 merge-recursive.c | 30 ++++++++++++------------------
 1 file changed, 12 insertions(+), 18 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index b0f74cb243..7b0081565a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2727,7 +2727,7 @@ static int handle_modify_delete(struct merge_options *o,
 
 static int merge_content(struct merge_options *o,
 			 const char *path,
-			 int file_in_way,
+			 int is_dirty,
 			 struct object_id *o_oid, int o_mode,
 			 struct object_id *a_oid, int a_mode,
 			 struct object_id *b_oid, int b_mode,
@@ -2803,7 +2803,7 @@ static int merge_content(struct merge_options *o,
 				return -1;
 	}
 
-	if (df_conflict_remains || file_in_way) {
+	if (df_conflict_remains || is_dirty) {
 		char *new_path;
 		if (o->call_depth) {
 			remove_file_from_cache(path);
@@ -2825,6 +2825,10 @@ static int merge_content(struct merge_options *o,
 
 		}
 		new_path = unique_path(o, path, rename_conflict_info->branch1);
+		if (is_dirty) {
+			output(o, 1, _("Refusing to lose dirty file at %s"),
+			       path);
+		}
 		output(o, 1, _("Adding as %s instead"), new_path);
 		if (update_file(o, 0, &mfi.oid, mfi.mode, new_path)) {
 			free(new_path);
@@ -2834,7 +2838,7 @@ static int merge_content(struct merge_options *o,
 		mfi.clean = 0;
 	} else if (update_file(o, mfi.clean, &mfi.oid, mfi.mode, path))
 		return -1;
-	return mfi.clean;
+	return !is_dirty && mfi.clean;
 }
 
 static int conflict_rename_normal(struct merge_options *o,
@@ -2844,21 +2848,10 @@ static int conflict_rename_normal(struct merge_options *o,
 				  struct object_id *b_oid, unsigned int b_mode,
 				  struct rename_conflict_info *ci)
 {
-	int clean_merge;
-	int file_in_the_way = 0;
-
-	if (was_dirty(o, path)) {
-		file_in_the_way = 1;
-		output(o, 1, _("Refusing to lose dirty file at %s"), path);
-	}
-
 	/* Merge the content and write it out */
-	clean_merge = merge_content(o, path, file_in_the_way,
-				    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
-				    ci);
-	if (clean_merge > 0 && file_in_the_way)
-		clean_merge = 0;
-	return clean_merge;
+	return merge_content(o, path, was_dirty(o, path),
+			     o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
+			     ci);
 }
 
 /* Per entry merge function */
@@ -2981,7 +2974,8 @@ static int process_entry(struct merge_options *o,
 	} else if (a_oid && b_oid) {
 		/* Case C: Added in both (check for same permissions) and */
 		/* case D: Modified in both, but differently. */
-		clean_merge = merge_content(o, path, 0 /* file_in_way */,
+		int is_dirty = 0; /* unpack_trees would have bailed if dirty */
+		clean_merge = merge_content(o, path, is_dirty,
 					    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
 					    NULL);
 	} else if (!o_oid && !a_oid && !b_oid) {
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 31/36] merge-recursive: avoid triggering add_cacheinfo error with dirty mod
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (29 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 30/36] merge-recursive: move more is_dirty handling to merge_content Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 32/36] t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
                   ` (6 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

If a cherry-pick or merge with a rename results in a skippable update
(due to the merged content matching what HEAD already had), but the
working directory is dirty, avoid trying to refresh the index as that
will fail.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
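The one-line change below amounts to a small truth table for the
`refresh` argument passed to add_cacheinfo().  A sketch, using a
hypothetical helper name:

```c
/*
 * Hypothetical helper mirroring the new argument expression: refresh
 * the index entry only for the outermost merge (call_depth == 0) and
 * only when the working-tree file has no dirty modifications.
 */
static int should_refresh(int call_depth, int is_dirty)
{
	return !call_depth && !is_dirty;
}
```

Before this patch the expression was just `!o->call_depth`, so a dirty
file in an outermost merge would still be refreshed, and that refresh is
what failed.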
 merge-recursive.c             | 2 +-
 t/t3501-revert-cherry-pick.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 7b0081565a..b32e8d817a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2786,7 +2786,7 @@ static int merge_content(struct merge_options *o,
 		path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
 		if (!path_renamed_outside_HEAD) {
 			if (add_cacheinfo(o, mfi.mode, &mfi.oid, path,
-					  0, (!o->call_depth), 0))
+					  0, (!o->call_depth && !is_dirty), 0))
 				return -1;
 			return mfi.clean;
 		}
diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
index 3871807d09..d1c68af8c5 100755
--- a/t/t3501-revert-cherry-pick.sh
+++ b/t/t3501-revert-cherry-pick.sh
@@ -141,7 +141,7 @@ test_expect_success 'cherry-pick "-" works with arguments' '
 	test_cmp expect actual
 '
 
-test_expect_failure 'cherry-pick works with dirty renamed file' '
+test_expect_success 'cherry-pick works with dirty renamed file' '
 	test_commit to-rename &&
 	git checkout -b unrelated &&
 	test_commit unrelated &&
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 32/36] t6046: testcases checking whether updates can be skipped in a merge
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (30 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 31/36] merge-recursive: avoid triggering add_cacheinfo error with dirty mod Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 20:26   ` SZEDER Gábor
  2018-04-19 17:58 ` [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths Elijah Newren
                   ` (5 subsequent siblings)
  37 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Add several tests checking whether updates can be skipped in a merge.
Also add several similar testcases for where updates cannot be skipped in
a merge to make sure that we skip if and only if we should.

In particular:

  * Testcase 1a (particularly 1a-check-L) would have pointed out the
    problem Linus has been dealing with for years with his merges[1].

  * Testcase 2a (particularly 2a-check-L) would have pointed out the
    problem with my directory-rename-series before it broke master[2].

  * Testcases 3[ab] (particularly 3a-check-L) provide a simpler testcase
    than 12b of t6043 making that one easier to understand.

  * There are several complementary testcases to make sure we're not just
    fixing those particular issues while regressing in the opposite
    direction.

  * There is also a pair of tests for the special case when a merge
    results in a skippable update AND the user has dirty modifications to
    the path.

[1] https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fXJRkvU1m_RHBG54NOoaZPA@mail.gmail.com/
[2] https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/

Signed-off-by: Elijah Newren <newren@gmail.com>
---

Stefan Beller reviewed an RFC version of this patch and gave his
Reviewed-by for it, but I've significantly modified it since then,
including:

  - new tests for dirty files being present
  - new test for a carefully crafted rename/add conflict (which I
    constructed to try to see if there was still a way to get mis-merges
    with the RFC patches)
  - better test cleanup/recovery (add git clean to go with git reset)
  - added modification timestamp checking to relevant tests to verify
    that when merge-recursive claims certain file updates were skipped
    during the merge that they really were skipped.
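
The modification-timestamp check mentioned in the last item can be
sketched in plain shell.  This is only an illustration: the tests
themselves use test-tool chmtime, whereas the was_rewritten helper, the
stamp file, and the use of find -newer below are assumptions made for
this sketch and are not part of git's test library.

```shell
#!/bin/sh
# Sketch only: was_rewritten is a hypothetical helper, not part of
# git's test-lib.

# Succeeds if file $1 has a newer modification time than reference file $2.
was_rewritten () {
	test -n "$(find "$1" -newer "$2")"
}

workdir=$(mktemp -d) && cd "$workdir" || exit 1

echo content >b
touch -t 200001010000 b    # pin b's mtime to a fixed, old value
touch -r b stamp           # remember that mtime in a reference file

# Case 1: an operation that legitimately skips the update leaves b alone.
if was_rewritten b stamp; then skipped=rewritten; else skipped=untouched; fi

# Case 2: rewriting b, even with identical content, bumps its mtime.
echo content >b
if was_rewritten b stamp; then rewrote=rewritten; else rewrote=untouched; fi

echo "skipped update: $skipped; real update: $rewrote"
```

Comparing content alone could not tell the two cases apart, since the
file holds the same bytes either way; only the timestamp reveals that
the file was written again.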

 t/t6046-merge-skip-unneeded-updates.sh | 761 +++++++++++++++++++++++++
 1 file changed, 761 insertions(+)
 create mode 100755 t/t6046-merge-skip-unneeded-updates.sh

diff --git a/t/t6046-merge-skip-unneeded-updates.sh b/t/t6046-merge-skip-unneeded-updates.sh
new file mode 100755
index 0000000000..911e2f87a4
--- /dev/null
+++ b/t/t6046-merge-skip-unneeded-updates.sh
@@ -0,0 +1,761 @@
+#!/bin/sh
+
+test_description="merge cases"
+
+# The setup for all of them, pictorially, is:
+#
+#      A
+#      o
+#     / \
+#  O o   ?
+#     \ /
+#      o
+#      B
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits O, A, and B contain.
+#
+# Notation:
+#    z/{b,c}   means  files z/b and z/c both exist
+#    x/d_1     means  file x/d exists with content d1.  (Purpose of the
+#                     underscore notation is to differentiate different
+#                     files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###########################################################################
+# SECTION 1: Cases involving no renames (one side has subset of changes of
+#            the other side)
+###########################################################################
+
+# Testcase 1a, Changes on A, subset of changes on B
+#   Commit O: b_1
+#   Commit A: b_2
+#   Commit B: b_3
+#   Expected: b_2
+
+test_expect_success '1a-setup: Modify(A)/Modify(B), change on B subset of A' '
+	test_create_repo 1a &&
+	(
+		cd 1a &&
+
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 10.5 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1a-check-L: Modify(A)/Modify(B), change on B subset of A' '
+	test_when_finished "git -C 1a reset --hard" &&
+	test_when_finished "git -C 1a clean -fd" &&
+	(
+		cd 1a &&
+
+		git checkout A^0 &&
+
+		test-tool chmtime =31337 b &&
+		test-tool chmtime -v +0 b >expected-mtime &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "Skipped b" out &&
+		test_must_be_empty err &&
+
+		test-tool chmtime -v +0 b >actual-mtime &&
+		test_cmp expected-mtime actual-mtime &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:b &&
+		git rev-parse >expect A:b &&
+		test_cmp expect actual &&
+
+		git hash-object b   >actual &&
+		git rev-parse   A:b >expect &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success '1a-check-R: Modify(A)/Modify(B), change on B subset of A' '
+	test_when_finished "git -C 1a reset --hard" &&
+	test_when_finished "git -C 1a clean -fd" &&
+	(
+		cd 1a &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep "Auto-merging b" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:b &&
+		git rev-parse >expect A:b &&
+		test_cmp expect actual &&
+
+		git hash-object b   >actual &&
+		git rev-parse   A:b >expect &&
+		test_cmp expect actual
+	)
+'
+
+
+###########################################################################
+# SECTION 2: Cases involving basic renames
+###########################################################################
+
+# Testcase 2a, Changes on A, rename on B
+#   Commit O: b_1
+#   Commit A: b_2
+#   Commit B: c_1
+#   Expected: c_2
+
+test_expect_success '2a-setup: Modify(A)/rename(B)' '
+	test_create_repo 2a &&
+	(
+		cd 2a &&
+
+		test_seq 1 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_seq 1 11 > b &&
+		git add b &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv b c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '2a-check-L: Modify/rename, merge into modify side' '
+	test_when_finished "git -C 2a reset --hard" &&
+	test_when_finished "git -C 2a clean -fd" &&
+	(
+		cd 2a &&
+
+		git checkout A^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep ! "Skipped c" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:c &&
+		git rev-parse >expect A:b &&
+		test_cmp expect actual &&
+
+		git hash-object c   >actual &&
+		git rev-parse   A:b >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:b &&
+		test_path_is_missing b
+	)
+'
+
+test_expect_success '2a-check-R: Modify/rename, merge into rename side' '
+	test_when_finished "git -C 2a reset --hard" &&
+	test_when_finished "git -C 2a clean -fd" &&
+	(
+		cd 2a &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep ! "Skipped c" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:c &&
+		git rev-parse >expect A:b &&
+		test_cmp expect actual &&
+
+		git hash-object c   >actual &&
+		git rev-parse   A:b >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:b &&
+		test_path_is_missing b
+	)
+'
+
+# Testcase 2b, Changed and renamed on A, subset of changes on B
+#   Commit O: b_1
+#   Commit A: c_2
+#   Commit B: b_3
+#   Expected: c_2
+
+test_expect_success '2b-setup: Rename+Mod(A)/Mod(B), B mods subset of A' '
+	test_create_repo 2b &&
+	(
+		cd 2b &&
+
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 10.5 >b &&
+		git add b &&
+		git mv b c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '2b-check-L: Rename+Mod(A)/Mod(B), B mods subset of A' '
+	test_when_finished "git -C 2b reset --hard" &&
+	test_when_finished "git -C 2b clean -fd" &&
+	(
+		cd 2b &&
+
+		git checkout A^0 &&
+
+		test-tool chmtime =31337 c &&
+		test-tool chmtime -v +0 c >expected-mtime &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "Skipped c" out &&
+		test_must_be_empty err &&
+
+		test-tool chmtime -v +0 c >actual-mtime &&
+		test_cmp expected-mtime actual-mtime &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:c &&
+		git rev-parse >expect A:c &&
+		test_cmp expect actual &&
+
+		git hash-object c   >actual &&
+		git rev-parse   A:c >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:b &&
+		test_path_is_missing b
+	)
+'
+
+test_expect_success '2b-check-R: Rename+Mod(A)/Mod(B), B mods subset of A' '
+	test_when_finished "git -C 2b reset --hard" &&
+	test_when_finished "git -C 2b clean -fd" &&
+	(
+		cd 2b &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep "Auto-merging c" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual HEAD:c &&
+		git rev-parse >expect A:c &&
+		test_cmp expect actual &&
+
+		git hash-object c   >actual &&
+		git rev-parse   A:c >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:b &&
+		test_path_is_missing b
+	)
+'
+
+# Testcase 2c, Changes on A, rename on B
+#   Commit O: b_1
+#   Commit A: b_2, c_3
+#   Commit B: c_1
+#   Expected: rename/add conflict c_2 vs c_3
+#
+#   NOTE: Since A modified b_1->b_2, and B renamed b_1->c_1, the threeway
+#         merge of those files should result in c_2.  We then should have a
+#         rename/add conflict between c_2 and c_3.  However, if we note in
+#         merge_content() that A had the right contents (b_2 has same
+#         contents as c_2, just at a different name), and that A had the
+#         right path present (c_3 existed) and thus decides that it can
+#         skip the update, then we're in trouble.  This test verifies we do
+#         not make that particular mistake.
+
+test_expect_success '2c-setup: Modify b & add c VS rename b->c' '
+	test_create_repo 2c &&
+	(
+		cd 2c &&
+
+		test_seq 1 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_seq 1 11 >b &&
+		echo whatever >c &&
+		git add b c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv b c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '2c-check: Modify b & add c VS rename b->c' '
+	(
+		cd 2c &&
+
+		git checkout A^0 &&
+
+		test_must_fail env GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "CONFLICT (rename/add): Rename b->c" out &&
+		test_i18ngrep ! "Skipped c" out &&
+		test_must_be_empty err
+
+		# FIXME: rename/add conflicts are horribly broken right now;
+		# when I get back to my patch series fixing it and
+		# rename/rename(2to1) conflicts to bring them in line with
+		# how add/add conflicts behave, then checks like the below
+		# could be added.  But that patch series is waiting until
+		# the rename-directory-detection series lands, which this
+		# is part of.  And in the mean time, I do not want to further
+		# enforce broken behavior.  So for now, the main test is the
+		# one above that err is an empty file.
+
+		#git ls-files -s >index_files &&
+		#test_line_count = 2 index_files &&
+
+		#git rev-parse >actual :2:c :3:c &&
+		#git rev-parse >expect A:b  A:c  &&
+		#test_cmp expect actual &&
+
+		#git cat-file -p A:b >>merged &&
+		#git cat-file -p A:c >>merge-me &&
+		#>empty &&
+		#test_must_fail git merge-file \
+		#	-L "Temporary merge branch 1" \
+		#	-L "" \
+		#	-L "Temporary merge branch 2" \
+		#	merged empty merge-me &&
+		#sed -e "s/^\([<=>]\)/\1\1\1/" merged >merged-internal &&
+
+		#git hash-object c               >actual &&
+		#git hash-object merged-internal >expect &&
+		#test_cmp expect actual &&
+
+		#test_path_is_missing b
+	)
+'
+
+
+###########################################################################
+# SECTION 3: Cases involving directory renames
+#
+# NOTE:
+#   Directory renames only apply when one side renames a directory, and the
+#   other side adds or renames a path into that directory.  Applying the
+#   directory rename to that new path creates a new pathname that didn't
+#   exist on either side of history.  Thus, it is impossible for the
+#   merge contents to already be at the right path, so all of these checks
+#   exist just to make sure that updates are not skipped.
+###########################################################################
+
+# Testcase 3a, Change + rename into dir foo on A, dir rename foo->bar on B
+#   Commit O: bq_1, foo/whatever
+#   Commit A: foo/{bq_2, whatever}
+#   Commit B: bq_1, bar/whatever
+#   Expected: bar/{bq_2, whatever}
+
+test_expect_success '3a-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_create_repo 3a &&
+	(
+		cd 3a &&
+
+		mkdir foo &&
+		test_seq 1 10 >bq &&
+		test_write_lines a b c d e f g h i j k >foo/whatever &&
+		git add bq foo/whatever &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_seq 1 11 > bq &&
+		git add bq &&
+		git mv bq foo/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv foo/ bar/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '3a-check-L: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_when_finished "git -C 3a reset --hard" &&
+	test_when_finished "git -C 3a clean -fd" &&
+	(
+		cd 3a &&
+
+		git checkout A^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep ! "Skipped bar/bq" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 2 index_files &&
+
+		git rev-parse >actual HEAD:bar/bq HEAD:bar/whatever &&
+		git rev-parse >expect A:foo/bq    A:foo/whatever &&
+		test_cmp expect actual &&
+
+		git hash-object bar/bq   bar/whatever   >actual &&
+		git rev-parse   A:foo/bq A:foo/whatever >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:bq HEAD:foo/bq &&
+		test_path_is_missing bq foo/bq foo/whatever
+	)
+'
+
+test_expect_success '3a-check-R: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_when_finished "git -C 3a reset --hard" &&
+	test_when_finished "git -C 3a clean -fd" &&
+	(
+		cd 3a &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep ! "Skipped bar/bq" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 2 index_files &&
+
+		git rev-parse >actual HEAD:bar/bq HEAD:bar/whatever &&
+		git rev-parse >expect A:foo/bq    A:foo/whatever &&
+		test_cmp expect actual &&
+
+		git hash-object bar/bq   bar/whatever   >actual &&
+		git rev-parse   A:foo/bq A:foo/whatever >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:bq HEAD:foo/bq &&
+		test_path_is_missing bq foo/bq foo/whatever
+	)
+'
+
+# Testcase 3b, rename into dir foo on A, dir rename foo->bar + change on B
+#   Commit O: bq_1, foo/whatever
+#   Commit A: foo/{bq_1, whatever}
+#   Commit B: bq_2, bar/whatever
+#   Expected: bar/{bq_2, whatever}
+
+test_expect_success '3b-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_create_repo 3b &&
+	(
+		cd 3b &&
+
+		mkdir foo &&
+		test_seq 1 10 >bq &&
+		test_write_lines a b c d e f g h i j k >foo/whatever &&
+		git add bq foo/whatever &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv bq foo/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_seq 1 11 > bq &&
+		git add bq &&
+		git mv foo/ bar/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '3b-check-L: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_when_finished "git -C 3b reset --hard" &&
+	test_when_finished "git -C 3b clean -fd" &&
+	(
+		cd 3b &&
+
+		git checkout A^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep ! "Skipped bar/bq" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 2 index_files &&
+
+		git rev-parse >actual HEAD:bar/bq HEAD:bar/whatever &&
+		git rev-parse >expect B:bq        A:foo/whatever &&
+		test_cmp expect actual &&
+
+		git hash-object bar/bq bar/whatever   >actual &&
+		git rev-parse   B:bq   A:foo/whatever >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:bq HEAD:foo/bq &&
+		test_path_is_missing bq foo/bq foo/whatever
+	)
+'
+
+test_expect_failure '3b-check-R: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+	test_when_finished "git -C 3b reset --hard" &&
+	test_when_finished "git -C 3b clean -fd" &&
+	(
+		cd 3b &&
+
+		git checkout B^0 &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive A^0 >out 2>err &&
+
+		test_i18ngrep ! "Skipped bar/bq" out &&
+		test_must_be_empty err &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 2 index_files &&
+
+		git rev-parse >actual HEAD:bar/bq HEAD:bar/whatever &&
+		git rev-parse >expect B:bq        A:foo/whatever &&
+		test_cmp expect actual &&
+
+		git hash-object bar/bq bar/whatever   >actual &&
+		git rev-parse   B:bq   A:foo/whatever >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:bq HEAD:foo/bq &&
+		test_path_is_missing bq foo/bq foo/whatever
+	)
+'
+
+###########################################################################
+# SECTION 4: Cases involving dirty changes
+###########################################################################
+
+# Testcase 4a, Changed on A, subset of changes on B, locally modified
+#   Commit O: b_1
+#   Commit A: b_2
+#   Commit B: b_3
+#   Working copy: b_4
+#   Expected: b_2 for merge, b_4 in working copy
+
+test_expect_success '4a-setup: Change on A, change on B subset of A, dirty mods present' '
+	test_create_repo 4a &&
+	(
+		cd 4a &&
+
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 10.5 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+# NOTE: For as long as we continue using unpack_trees() without index_only
+#   set to true, it will error out on a case like this claiming the locally
+#   modified file would be overwritten by the merge.  Getting this testcase
+#   correct requires doing the merge in-memory first, then realizing that no
+#   updates to the file are necessary, and thus that we can just leave the path
+#   alone.
+test_expect_failure '4a-check: Change on A, change on B subset of A, dirty mods present' '
+	test_when_finished "git -C 4a reset --hard" &&
+	test_when_finished "git -C 4a clean -fd" &&
+	(
+		cd 4a &&
+
+		git checkout A^0 &&
+		echo "File rewritten" >b &&
+
+		test-tool chmtime =31337 b &&
+		test-tool chmtime -v +0 b >expected-mtime &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "Skipped b" out &&
+		test_must_be_empty err &&
+
+		test-tool chmtime -v +0 b >actual-mtime &&
+		test_cmp expected-mtime actual-mtime &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual :0:b &&
+		git rev-parse >expect A:b &&
+		test_cmp expect actual &&
+
+		git hash-object b >actual &&
+		echo "File rewritten" | git hash-object --stdin >expect &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 4b, Changed+renamed on A, subset of changes on B, locally modified
+#   Commit O: b_1
+#   Commit A: c_2
+#   Commit B: b_3
+#   Working copy: c_4
+#   Expected: c_2
+
+test_expect_success '4b-setup: Rename+Mod(A)/Mod(B), change on B subset of A, dirty mods present' '
+	test_create_repo 4b &&
+	(
+		cd 4b &&
+
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 10.5 >b &&
+		git add b &&
+		git mv b c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_write_lines 1 2 3 4 5 5.5 6 7 8 9 10 >b &&
+		git add b &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '4b-check: Rename+Mod(A)/Mod(B), change on B subset of A, dirty mods present' '
+	test_when_finished "git -C 4b reset --hard" &&
+	test_when_finished "git -C 4b clean -fd" &&
+	(
+		cd 4b &&
+
+		git checkout A^0 &&
+		echo "File rewritten" >c &&
+
+		test-tool chmtime =31337 c &&
+		test-tool chmtime -v +0 c >expected-mtime &&
+
+		GIT_MERGE_VERBOSITY=3 git merge -s recursive B^0 >out 2>err &&
+
+		test_i18ngrep "Skipped c" out &&
+		test_must_be_empty err &&
+
+		test-tool chmtime -v +0 c >actual-mtime &&
+		test_cmp expected-mtime actual-mtime &&
+
+		git ls-files -s >index_files &&
+		test_line_count = 1 index_files &&
+
+		git rev-parse >actual :0:c &&
+		git rev-parse >expect A:c &&
+		test_cmp expect actual &&
+
+		git hash-object c >actual &&
+		echo "File rewritten" | git hash-object --stdin >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:b &&
+		test_path_is_missing b
+	)
+'
+
+test_done
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (31 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 32/36] t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 20:39   ` Martin Ågren
  2018-04-20 12:23   ` SZEDER Gábor
  2018-04-19 17:58 ` [PATCH v10 34/36] merge-recursive: fix remainder of was_dirty() to use original index Elijah Newren
                   ` (4 subsequent siblings)
  37 siblings, 2 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

In commit aacb82de3ff8 ("merge-recursive: Split was_tracked() out of
would_lose_untracked()", 2011-08-11), was_tracked() was split out of
would_lose_untracked() with the intent to provide a function that could
answer whether a path was tracked in the index before the merge.  Sadly,
it instead returned whether the path was in the working tree due to having
been tracked in the index before the merge OR having been written there by
unpack_trees().  The distinction is important when renames are involved,
e.g. for a merge where:

   HEAD:  modifies path b
   other: renames b->c

In this case, c was not tracked in the index before the merge, but would
have been added to the index at stage 0 and written to the working tree by
unpack_trees().  would_lose_untracked() is more interested in the
in-working-copy-for-either-reason behavior, while all other uses of
was_tracked() want just was-it-tracked-in-index-before-merge behavior.

Unsplit would_lose_untracked() and write a new was_tracked() function
which answers whether a path was tracked in the index before the merge
started.

This will also affect was_dirty(), helping it to return better results
since it can base answers off the original index rather than an index that
possibly only copied over some of the stat information.  However,
was_dirty() will need an additional change that will be made in a
subsequent patch.

Signed-off-by: Elijah Newren <newren@gmail.com>
---

This patch is nearly identical to one I sent out as an RFC and which
was previously reviewed by Junio at

  https://public-inbox.org/git/CABPp-BFPTJsTUVoPxxN=2u5jEqn1ngdDvMNhp+VLZKTgZaUkvw@mail.gmail.com/

It is not clear whether my responses in that thread were sufficient, but
I did make the two changes I mentioned there:
  - Fix the broken comment in git_merge_trees()
  - Add a note to the comment in would_lose_untracked() about the
    annoying worktree-first-then-index requirement
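
The difference between the two questions described in the commit message
can be illustrated with a toy model in shell.  It assumes each index is
a plain text file of "<stage> <path>" lines, which is not git's real
index format, and the helper names are made up for the sketch.  The data
mirrors the example above: HEAD modifies b and the other side renames
b->c.

```shell
#!/bin/sh
# Toy model only: each "index" is a text file of "<stage> <path>" lines.
# The file layout and helper names are hypothetical, not git's format.

orig_index=$(mktemp) || exit 1    # index before the merge started
curr_index=$(mktemp) || exit 1    # index after an unpack_trees()-like step

# Before the merge only b is tracked.  Afterwards b is conflicted
# (stages 1 and 2) and c has been added at stage 0.
printf '%s\n' "0 b" >"$orig_index"
printf '%s\n' "1 b" "2 b" "0 c" >"$curr_index"

# Was the path in the index before the merge started?
was_tracked () {
	awk -v p="$1" '$2 == p { found = 1 } END { exit !found }' "$orig_index"
}

# Is the path at stage 0 or stage 2 in the post-unpack index?
in_current_index () {
	awk -v p="$1" '$2 == p && ($1 == 0 || $1 == 2) { found = 1 }
		END { exit !found }' "$curr_index"
}

for path in b c
do
	if was_tracked "$path"; then before=yes; else before=no; fi
	if in_current_index "$path"; then after=yes; else after=no; fi
	echo "$path: tracked before merge=$before, in current index=$after"
done
```

In this model c is reported as not tracked before the merge but present
in the current index, which is the case where consulting only the
current index gives the wrong answer to "was it tracked?".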

 merge-recursive.c | 91 ++++++++++++++++++++++++++++++++++-------------
 merge-recursive.h |  1 +
 2 files changed, 68 insertions(+), 24 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index b32e8d817a..097de7e5a7 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -344,6 +344,7 @@ static int git_merge_trees(struct merge_options *o,
 {
 	int rc;
 	struct tree_desc t[3];
+	struct index_state tmp_index = { NULL };
 
 	memset(&o->unpack_opts, 0, sizeof(o->unpack_opts));
 	if (o->call_depth)
@@ -354,7 +355,7 @@ static int git_merge_trees(struct merge_options *o,
 	o->unpack_opts.head_idx = 2;
 	o->unpack_opts.fn = threeway_merge;
 	o->unpack_opts.src_index = &the_index;
-	o->unpack_opts.dst_index = &the_index;
+	o->unpack_opts.dst_index = &tmp_index;
 	setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
 
 	init_tree_desc_from_tree(t+0, common);
@@ -362,13 +363,18 @@ static int git_merge_trees(struct merge_options *o,
 	init_tree_desc_from_tree(t+2, merge);
 
 	rc = unpack_trees(3, t, &o->unpack_opts);
+	cache_tree_free(&active_cache_tree);
+
 	/*
-	 * unpack_trees NULLifies src_index, but it's used in verify_uptodate,
-	 * so set to the new index which will usually have modification
-	 * timestamp info copied over.
+	 * Update the_index to match the new results, AFTER saving a copy
+	 * in o->orig_index.  Update src_index to point to the saved copy.
+	 * (verify_uptodate() checks src_index, and the original index is
+	 * the one that had the necessary modification timestamps.)
 	 */
-	o->unpack_opts.src_index = &the_index;
-	cache_tree_free(&active_cache_tree);
+	o->orig_index = the_index;
+	the_index = tmp_index;
+	o->unpack_opts.src_index = &o->orig_index;
+
 	return rc;
 }
 
@@ -773,31 +779,59 @@ static int dir_in_way(const char *path, int check_working_copy, int empty_ok)
 		!(empty_ok && is_empty_dir(path));
 }
 
-static int was_tracked(const char *path)
+/*
+ * Returns whether path was tracked in the index before the merge started
+ */
+static int was_tracked(struct merge_options *o, const char *path)
 {
-	int pos = cache_name_pos(path, strlen(path));
+	int pos = index_name_pos(&o->orig_index, path, strlen(path));
 
 	if (0 <= pos)
-		/* we have been tracking this path */
+		/* we were tracking this path before the merge */
 		return 1;
 
-	/*
-	 * Look for an unmerged entry for the path,
-	 * specifically stage #2, which would indicate
-	 * that "our" side before the merge started
-	 * had the path tracked (and resulted in a conflict).
-	 */
-	for (pos = -1 - pos;
-	     pos < active_nr && !strcmp(path, active_cache[pos]->name);
-	     pos++)
-		if (ce_stage(active_cache[pos]) == 2)
-			return 1;
 	return 0;
 }
 
 static int would_lose_untracked(const char *path)
 {
-	return !was_tracked(path) && file_exists(path);
+	/*
+	 * This may look like it can be simplified to:
+	 *   return !was_tracked(o, path) && file_exists(path)
+	 * but it can't.  This function needs to know whether path was in
+	 * the working tree due to EITHER having been tracked in the index
+	 * before the merge OR having been put into the working copy and
+	 * index by unpack_trees().  Due to that either-or requirement, we
+	 * check the current index instead of the original one.
+	 *
+	 * Note that we do not need to worry about merge-recursive itself
+	 * updating the index after unpack_trees() and before calling this
+	 * function, because we strictly require all code paths in
+	 * merge-recursive to update the working tree first and the index
+	 * second.  Doing otherwise would break
+	 * update_file()/would_lose_untracked(); see every comment in this
+	 * file which mentions "update_stages".
+	 */
+	int pos = cache_name_pos(path, strlen(path));
+
+	if (pos < 0)
+		pos = -1 - pos;
+	while (pos < active_nr &&
+	       !strcmp(path, active_cache[pos]->name)) {
+		/*
+		 * If stage #0, it is definitely tracked.
+		 * If it has stage #2 then it was tracked
+		 * before this merge started.  In all other
+		 * cases the path was not tracked.
+		 */
+		switch (ce_stage(active_cache[pos])) {
+		case 0:
+		case 2:
+			return 0;
+		}
+		pos++;
+	}
+	return file_exists(path);
 }
 
 static int was_dirty(struct merge_options *o, const char *path)
@@ -805,7 +839,7 @@ static int was_dirty(struct merge_options *o, const char *path)
 	struct cache_entry *ce;
 	int dirty = 1;
 
-	if (o->call_depth || !was_tracked(path))
+	if (o->call_depth || !was_tracked(o, path))
 		return !dirty;
 
 	ce = cache_file_exists(path, strlen(path), ignore_case);
@@ -2419,7 +2453,7 @@ static int process_renames(struct merge_options *o,
 			 * add-source case).
 			 */
 			remove_file(o, 1, ren1_src,
-				    renamed_stage == 2 || !was_tracked(ren1_src));
+				    renamed_stage == 2 || !was_tracked(o, ren1_src));
 
 			oidcpy(&src_other.oid,
 			       &ren1->src_entry->stages[other_stage].oid);
@@ -2812,7 +2846,7 @@ static int merge_content(struct merge_options *o,
 				if (update_stages(o, path, &one, &a, &b))
 					return -1;
 			} else {
-				int file_from_stage2 = was_tracked(path);
+				int file_from_stage2 = was_tracked(o, path);
 				struct diff_filespec merged;
 				oidcpy(&merged.oid, &mfi.oid);
 				merged.mode = mfi.mode;
@@ -3081,6 +3115,15 @@ int merge_trees(struct merge_options *o,
 	else
 		clean = 1;
 
+	/* Free the extra index left from git_merge_trees() */
+	/*
+	 * FIXME: Need to also free data allocated by
+	 * setup_unpack_trees_porcelain() tucked away in o->unpack_opts.msgs,
+	 * but the problem is that only half of it refers to dynamically
+	 * allocated data, while the other half points at static strings.
+	 */
+	discard_index(&o->orig_index);
+
 	if (o->call_depth && !(*result = write_tree_from_memory(o)))
 		return -1;
 
diff --git a/merge-recursive.h b/merge-recursive.h
index d863cf8867..248093e407 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -29,6 +29,7 @@ struct merge_options {
 	struct hashmap current_file_dir_set;
 	struct string_list df_conflict_file_set;
 	struct unpack_trees_options unpack_opts;
+	struct index_state orig_index;
 };
 
 /*
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 34/36] merge-recursive: fix remainder of was_dirty() to use original index
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (32 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 35/36] merge-recursive: make "Auto-merging" comment show for other merges Elijah Newren
                   ` (3 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

was_dirty() uses was_tracked(), which has been updated to use the original
index rather than the current one.  However, was_dirty() also had a
separate call to cache_file_exists(), causing it to still implicitly use
the current index.  Update that to instead use index_file_exists().

Also, was_dirty() had a hack where it would mark any file as non-dirty if
we simply didn't know its modification time.  This was due to using the
current index rather than the original index, because D/F conflicts and
such would cause unpack_trees() to not copy the modification times from
the original index to the current one.  Now that we are using the original
index, we can dispense with this hack.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
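
The effect of dropping the mtime hack can be pictured with a toy model
in shell.  It is a deliberate simplification: plain integers stand in
for stat data and a simple inequality stands in for verify_uptodate(),
and the variable and function names are invented for the sketch.

```shell
#!/bin/sh
# Toy model only: integers stand in for recorded and on-disk mtimes.
# All names here are hypothetical; the real check is verify_uptodate().

orig_index_mtime=1000   # recorded in the index before the merge
curr_index_mtime=0      # not copied over by the unpack step (e.g. D/F conflict)
worktree_mtime=2000     # the user modified the file after staging it

# Old behavior: an unknown (zero) recorded mtime is treated as "not dirty".
old_was_dirty () {
	test "$1" -gt 0 && test "$1" -ne "$2"
}

# New behavior: compare against the original index, no special case.
new_was_dirty () {
	test "$1" -ne "$2"
}

if old_was_dirty "$curr_index_mtime" "$worktree_mtime"
then old=dirty; else old=clean; fi
if new_was_dirty "$orig_index_mtime" "$worktree_mtime"
then new=dirty; else new=clean; fi

echo "old check on current index: $old; new check on original index: $new"
```

In this model the old check calls the modified file clean only because
the timestamp was missing from the current index, while the check
against the original index sees the modification.
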
 merge-recursive.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 097de7e5a7..1a481fa3dc 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -842,9 +842,9 @@ static int was_dirty(struct merge_options *o, const char *path)
 	if (o->call_depth || !was_tracked(o, path))
 		return !dirty;
 
-	ce = cache_file_exists(path, strlen(path), ignore_case);
-	dirty = (ce->ce_stat_data.sd_mtime.sec > 0 &&
-		 verify_uptodate(ce, &o->unpack_opts) != 0);
+	ce = index_file_exists(o->unpack_opts.src_index,
+			       path, strlen(path), ignore_case);
+	dirty = verify_uptodate(ce, &o->unpack_opts) != 0;
 	return dirty;
 }
 
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 35/36] merge-recursive: make "Auto-merging" comment show for other merges
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (33 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 34/36] merge-recursive: fix remainder of was_dirty() to use original index Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 17:58 ` [PATCH v10 36/36] merge-recursive: fix check for skipability of working tree updates Elijah Newren
                   ` (2 subsequent siblings)
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

Previously, merge_content() would print "Auto-merging" whenever the final
content and mode weren't already available from HEAD.  There are a few
problems with this:

  1) There are other code paths doing merges that should probably have the
     same message printed, in particular rename/rename(2to1) which cannot
     call into the normal rename logic.

  2) If both sides of the merge have modifications, then a content merge
     is needed.  It may turn out that the end result matches one of the
     sides (because the other only had a subset of the same changes), but
     the merge was still needed.  Currently, the message will not print in
     that case, though it seems like it should.

Move the printing of this message to merge_file_1() in order to address
both issues.

Signed-off-by: Elijah Newren <newren@gmail.com>
---

Part of the size of the diff was due to fixing the alignment of
function arguments while I was adding another argument to the list...
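
To illustrate issue 2 above, a minimal standalone sketch follows.  It
models blob contents as plain strings and ignores modes;
toy_old_prints() and toy_new_prints() are hypothetical names, and
treating "a content merge was needed" as "both sides differ from the
base" is a simplifying assumption about merge_file_1(), not a copy of
its logic.

```c
#include <assert.h>
#include <string.h>

/*
 * Old condition (simplified): merge_content() printed the message unless
 * the clean result already matched HEAD and no D/F conflict remained.
 */
int toy_old_prints(int clean, int df_conflict_remains,
		   const char *result, const char *head)
{
	assert(result && head);
	return !(clean && !df_conflict_remains && !strcmp(result, head));
}

/*
 * New condition (assumed): print whenever a content merge was needed,
 * i.e. both sides changed relative to the merge base.
 */
int toy_new_prints(const char *base, const char *a, const char *b)
{
	assert(base && a && b);
	return strcmp(a, base) != 0 && strcmp(b, base) != 0;
}
```

For base b_1, side A b_2 and side B b_3 merging cleanly to b_2,
toy_old_prints(1, 0, "b_2", "b_2") is 0 (no message) while
toy_new_prints("b_1", "b_2", "b_3") is 1 (message shown).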

 merge-recursive.c | 65 ++++++++++++++++++++++++++++-------------------
 1 file changed, 39 insertions(+), 26 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1a481fa3dc..212d34d268 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1063,12 +1063,13 @@ static int merge_3way(struct merge_options *o,
 }
 
 static int merge_file_1(struct merge_options *o,
-					   const struct diff_filespec *one,
-					   const struct diff_filespec *a,
-					   const struct diff_filespec *b,
-					   const char *branch1,
-					   const char *branch2,
-					   struct merge_file_info *result)
+			const struct diff_filespec *one,
+			const struct diff_filespec *a,
+			const struct diff_filespec *b,
+			const char *filename,
+			const char *branch1,
+			const char *branch2,
+			struct merge_file_info *result)
 {
 	result->merge = 0;
 	result->clean = 1;
@@ -1148,18 +1149,22 @@ static int merge_file_1(struct merge_options *o,
 			die("BUG: unsupported object type in the tree");
 	}
 
+	if (result->merge)
+		output(o, 2, _("Auto-merging %s"), filename);
+
 	return 0;
 }
 
 static int merge_file_special_markers(struct merge_options *o,
-			   const struct diff_filespec *one,
-			   const struct diff_filespec *a,
-			   const struct diff_filespec *b,
-			   const char *branch1,
-			   const char *filename1,
-			   const char *branch2,
-			   const char *filename2,
-			   struct merge_file_info *mfi)
+				      const struct diff_filespec *one,
+				      const struct diff_filespec *a,
+				      const struct diff_filespec *b,
+				      const char *target_filename,
+				      const char *branch1,
+				      const char *filename1,
+				      const char *branch2,
+				      const char *filename2,
+				      struct merge_file_info *mfi)
 {
 	char *side1 = NULL;
 	char *side2 = NULL;
@@ -1170,22 +1175,23 @@ static int merge_file_special_markers(struct merge_options *o,
 	if (filename2)
 		side2 = xstrfmt("%s:%s", branch2, filename2);
 
-	ret = merge_file_1(o, one, a, b,
+	ret = merge_file_1(o, one, a, b, target_filename,
 			   side1 ? side1 : branch1,
 			   side2 ? side2 : branch2, mfi);
+
 	free(side1);
 	free(side2);
 	return ret;
 }
 
 static int merge_file_one(struct merge_options *o,
-					 const char *path,
-					 const struct object_id *o_oid, int o_mode,
-					 const struct object_id *a_oid, int a_mode,
-					 const struct object_id *b_oid, int b_mode,
-					 const char *branch1,
-					 const char *branch2,
-					 struct merge_file_info *mfi)
+			  const char *path,
+			  const struct object_id *o_oid, int o_mode,
+			  const struct object_id *a_oid, int a_mode,
+			  const struct object_id *b_oid, int b_mode,
+			  const char *branch1,
+			  const char *branch2,
+			  struct merge_file_info *mfi)
 {
 	struct diff_filespec one, a, b;
 
@@ -1196,7 +1202,7 @@ static int merge_file_one(struct merge_options *o,
 	a.mode = a_mode;
 	oidcpy(&b.oid, b_oid);
 	b.mode = b_mode;
-	return merge_file_1(o, &one, &a, &b, branch1, branch2, mfi);
+	return merge_file_1(o, &one, &a, &b, path, branch1, branch2, mfi);
 }
 
 static int conflict_rename_dir(struct merge_options *o,
@@ -1474,6 +1480,8 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 	struct diff_filespec *c1 = ci->pair1->two;
 	struct diff_filespec *c2 = ci->pair2->two;
 	char *path = c1->path; /* == c2->path */
+	char *path_side_1_desc;
+	char *path_side_2_desc;
 	struct merge_file_info mfi_c1;
 	struct merge_file_info mfi_c2;
 	int ret;
@@ -1487,13 +1495,19 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 	remove_file(o, 1, a->path, o->call_depth || would_lose_untracked(a->path));
 	remove_file(o, 1, b->path, o->call_depth || would_lose_untracked(b->path));
 
+	path_side_1_desc = xstrfmt("%s (was %s)", path, a->path);
+	path_side_2_desc = xstrfmt("%s (was %s)", path, b->path);
 	if (merge_file_special_markers(o, a, c1, &ci->ren1_other,
+				       path_side_1_desc,
 				       o->branch1, c1->path,
 				       o->branch2, ci->ren1_other.path, &mfi_c1) ||
 	    merge_file_special_markers(o, b, &ci->ren2_other, c2,
+				       path_side_2_desc,
 				       o->branch1, ci->ren2_other.path,
 				       o->branch2, c2->path, &mfi_c2))
 		return -1;
+	free(path_side_1_desc);
+	free(path_side_2_desc);
 
 	if (o->call_depth) {
 		/*
@@ -2802,7 +2816,7 @@ static int merge_content(struct merge_options *o,
 			       S_ISGITLINK(pair1->two->mode)))
 			df_conflict_remains = 1;
 	}
-	if (merge_file_special_markers(o, &one, &a, &b,
+	if (merge_file_special_markers(o, &one, &a, &b, path,
 				       o->branch1, path1,
 				       o->branch2, path2, &mfi))
 		return -1;
@@ -2824,8 +2838,7 @@ static int merge_content(struct merge_options *o,
 				return -1;
 			return mfi.clean;
 		}
-	} else
-		output(o, 2, _("Auto-merging %s"), path);
+	}
 
 	if (!mfi.clean) {
 		if (S_ISGITLINK(mfi.mode))
-- 
2.17.0.290.ge988e9ce2a



* [PATCH v10 36/36] merge-recursive: fix check for skipability of working tree updates
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (34 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 35/36] merge-recursive: make "Auto-merging" comment show for other merges Elijah Newren
@ 2018-04-19 17:58 ` Elijah Newren
  2018-04-19 18:35 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
  2018-04-23 17:28 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
  37 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 17:58 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, torvalds, Elijah Newren

The can-working-tree-updates-be-skipped check has had a long and blemished
history.  The update can be skipped iff:
  a) The merge is clean
  b) The merge matches what was in HEAD (content, mode, pathname)
  c) The target path is usable (i.e. not involved in D/F conflict)

Traditionally, we split b into parts:
  b1) The merged result matches the content and mode found in HEAD
  b2) The merged target path existed in HEAD

Steps a & b1 are easy to check; we have always gotten those right.  While
it is easy to overlook step c, this was fixed seven years ago with commit
4ab9a157d069 ("merge_content(): Check whether D/F conflicts are still
present", 2010-09-20).  merge-recursive didn't have a readily available
way to directly check step b2, so various approximations were used:

  * In commit b2c8c0a76274 ("merge-recursive: When we detect we can skip
    an update, actually skip it", 2011-02-28), it was noted that although
    the code claimed it was skipping the update, it did not actually skip
    the update.  The code was made to skip it, but used lstat(path, ...)
    as an approximation to path-was-tracked-in-index-before-merge.

  * In commit 5b448b853030 ("merge-recursive: When we detect we can skip
    an update, actually skip it", 2011-08-11), the problem with using
    lstat was noted.  It was changed to the approximation
       path2 && strcmp(path, path2)
    which is also wrong.  !path2 || strcmp(path, path2) would have been
    better, but would have fallen short with directory renames.

  * In c5b761fb2711 ("merge-recursive: ensure we write updates for
    directory-renamed file", 2018-02-14), the problem with the previous
    approximation was noted and changed to
       was_tracked(path)
    That looks close to what we were trying to answer, but was_tracked()
    as implemented at the time should have been named is_tracked(); it
    returned something different than what we were looking for.

  * To make matters more complex, fixing was_tracked() isn't sufficient
    because the splitting of b into b1 and b2 is wrong.  Consider the
    following merge with a rename/add conflict:
       side A: modify foo, add unrelated bar
       side B: rename foo->bar (but don't modify the mode or contents)
    In this case, the three-way merge of original foo, A's foo, and B's
    bar will result in a desired pathname of bar with the same
    mode/contents that A had for foo.  Thus, A had the right mode and
    contents for the file, and it had the right pathname present (namely,
    bar), but the bar that was present was unrelated to the contents, so
    the working tree update was not skippable.

Fix this by introducing a new function:
   was_tracked_and_matches(o, path, &mfi.oid, mfi.mode)
and use it to directly check for condition b.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
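
The rename/add example from the commit message can be replayed with a
small standalone sketch.  It is only a toy model: object ids are plain
strings, the index is an unsorted array, and toy_entry, toy_index,
toy_find(), toy_split_check() and toy_can_skip_update() are
hypothetical names that do not exist in git.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cache entry. */
struct toy_entry {
	const char *path;
	const char *oid;	/* content id, modeled as a string */
	unsigned mode;
};

struct toy_index {
	const struct toy_entry *cache;
	size_t nr;
};

/* Linear lookup; returns NULL when the path is not tracked. */
const struct toy_entry *toy_find(const struct toy_index *idx, const char *path)
{
	size_t i;

	assert(idx && path);
	for (i = 0; i < idx->nr; i++)
		if (!strcmp(idx->cache[i].path, path))
			return &idx->cache[i];
	return NULL;
}

/* Split check: b1 (content/mode match HEAD's source file) and b2 (path existed). */
int toy_split_check(const struct toy_index *orig, const char *path,
		    const char *merged_oid, unsigned merged_mode,
		    const char *a_oid, unsigned a_mode)
{
	int b1 = !strcmp(merged_oid, a_oid) && merged_mode == a_mode;
	int b2 = toy_find(orig, path) != NULL;

	return b1 && b2;
}

/* Direct check of condition b: the entry at path itself must match. */
int toy_was_tracked_and_matches(const struct toy_index *orig, const char *path,
				const char *oid, unsigned mode)
{
	const struct toy_entry *ce = toy_find(orig, path);

	if (!ce)
		return 0;	/* path was not tracked before the merge */
	return !strcmp(ce->oid, oid) && ce->mode == mode;
}

/* Conditions a, b and c combined. */
int toy_can_skip_update(int clean, int df_conflict_remains,
			const struct toy_index *orig, const char *path,
			const char *oid, unsigned mode)
{
	return clean &&
	       toy_was_tracked_and_matches(orig, path, oid, mode) &&
	       !df_conflict_remains;
}

/* HEAD is side A: modified foo plus an unrelated bar. */
const struct toy_entry toy_head_cache[] = {
	{ "bar", "unrelated", 0100644 },
	{ "foo", "foo_v2", 0100644 },
};
const struct toy_index toy_head = { toy_head_cache, 2 };
```

The merged result is path "bar" with content "foo_v2".  Here
toy_split_check(&toy_head, "bar", "foo_v2", 0100644, "foo_v2", 0100644)
wrongly says the update can be skipped, while
toy_can_skip_update(1, 0, &toy_head, "bar", "foo_v2", 0100644) says it
cannot, because the bar tracked in HEAD holds unrelated content.
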
 merge-recursive.c                      | 48 +++++++++++++++++---------
 t/t6022-merge-rename.sh                |  2 +-
 t/t6043-merge-rename-directories.sh    |  2 +-
 t/t6046-merge-skip-unneeded-updates.sh | 10 +++---
 4 files changed, 39 insertions(+), 23 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 212d34d268..1de8dc1c53 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -779,6 +779,25 @@ static int dir_in_way(const char *path, int check_working_copy, int empty_ok)
 		!(empty_ok && is_empty_dir(path));
 }
 
+/*
+ * Returns whether path was tracked in the index before the merge started,
+ * and its oid and mode match the specified values
+ */
+static int was_tracked_and_matches(struct merge_options *o, const char *path,
+				   const struct object_id *oid, unsigned mode)
+{
+	int pos = index_name_pos(&o->orig_index, path, strlen(path));
+	struct cache_entry *ce;
+
+	if (0 > pos)
+		/* we were not tracking this path before the merge */
+		return 0;
+
+	/* See if the file we were tracking before matches */
+	ce = o->orig_index.cache[pos];
+	return (oid_eq(&ce->oid, oid) && ce->ce_mode == mode);
+}
+
 /*
  * Returns whether path was tracked in the index before the merge started
  */
@@ -2821,23 +2840,20 @@ static int merge_content(struct merge_options *o,
 				       o->branch2, path2, &mfi))
 		return -1;
 
-	if (mfi.clean && !df_conflict_remains &&
-	    oid_eq(&mfi.oid, a_oid) && mfi.mode == a_mode) {
-		int path_renamed_outside_HEAD;
+	/*
+	 * We can skip updating the working tree file iff:
+	 *   a) The merge is clean
+	 *   b) The merge matches what was in HEAD (content, mode, pathname)
+	 *   c) The target path is usable (i.e. not involved in D/F conflict)
+	 */
+	if (mfi.clean &&
+	    was_tracked_and_matches(o, path, &mfi.oid, mfi.mode) &&
+	    !df_conflict_remains) {
 		output(o, 3, _("Skipped %s (merged same as existing)"), path);
-		/*
-		 * The content merge resulted in the same file contents we
-		 * already had.  We can return early if those file contents
-		 * are recorded at the correct path (which may not be true
-		 * if the merge involves a rename).
-		 */
-		path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
-		if (!path_renamed_outside_HEAD) {
-			if (add_cacheinfo(o, mfi.mode, &mfi.oid, path,
-					  0, (!o->call_depth && !is_dirty), 0))
-				return -1;
-			return mfi.clean;
-		}
+		if (add_cacheinfo(o, mfi.mode, &mfi.oid, path,
+				  0, (!o->call_depth && !is_dirty), 0))
+			return -1;
+		return mfi.clean;
 	}
 
 	if (!mfi.clean) {
diff --git a/t/t6022-merge-rename.sh b/t/t6022-merge-rename.sh
index a1fad6980b..6df2650c03 100755
--- a/t/t6022-merge-rename.sh
+++ b/t/t6022-merge-rename.sh
@@ -247,7 +247,7 @@ test_expect_success 'merge of identical changes in a renamed file' '
 	git reset --hard HEAD^ &&
 	git checkout change &&
 	GIT_MERGE_VERBOSITY=3 git merge change+rename >out &&
-	test_i18ngrep "^Skipped B" out
+	test_i18ngrep ! "^Skipped B" out
 '
 
 test_expect_success 'setup for rename + d/f conflicts' '
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 45f620633f..2e28f2908d 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3884,7 +3884,7 @@ test_expect_success '12b-setup: Moving one directory hierarchy into another' '
 	)
 '
 
-test_expect_failure '12b-check: Moving one directory hierarchy into another' '
+test_expect_success '12b-check: Moving one directory hierarchy into another' '
 	(
 		cd 12b &&
 
diff --git a/t/t6046-merge-skip-unneeded-updates.sh b/t/t6046-merge-skip-unneeded-updates.sh
index 911e2f87a4..880cd782d7 100755
--- a/t/t6046-merge-skip-unneeded-updates.sh
+++ b/t/t6046-merge-skip-unneeded-updates.sh
@@ -64,7 +64,7 @@ test_expect_success '1a-setup: Modify(A)/Modify(B), change on B subset of A' '
 	)
 '
 
-test_expect_failure '1a-check-L: Modify(A)/Modify(B), change on B subset of A' '
+test_expect_success '1a-check-L: Modify(A)/Modify(B), change on B subset of A' '
 	test_when_finished "git -C 1a reset --hard" &&
 	test_when_finished "git -C 1a clean -fd" &&
 	(
@@ -160,7 +160,7 @@ test_expect_success '2a-setup: Modify(A)/rename(B)' '
 	)
 '
 
-test_expect_failure '2a-check-L: Modify/rename, merge into modify side' '
+test_expect_success '2a-check-L: Modify/rename, merge into modify side' '
 	test_when_finished "git -C 2a reset --hard" &&
 	test_when_finished "git -C 2a clean -fd" &&
 	(
@@ -360,7 +360,7 @@ test_expect_success '2c-setup: Modify b & add c VS rename b->c' '
 	)
 '
 
-test_expect_failure '2c-check: Modify b & add c VS rename b->c' '
+test_expect_success '2c-check: Modify b & add c VS rename b->c' '
 	(
 		cd 2c &&
 
@@ -456,7 +456,7 @@ test_expect_success '3a-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
 	)
 '
 
-test_expect_failure '3a-check-L: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+test_expect_success '3a-check-L: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
 	test_when_finished "git -C 3a reset --hard" &&
 	test_when_finished "git -C 3a clean -fd" &&
 	(
@@ -579,7 +579,7 @@ test_expect_success '3b-check-L: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
 	)
 '
 
-test_expect_failure '3b-check-R: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
+test_expect_success '3b-check-R: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
 	test_when_finished "git -C 3b reset --hard" &&
 	test_when_finished "git -C 3b clean -fd" &&
 	(
-- 
2.17.0.290.ge988e9ce2a



* Re: [PATCH v10 00/36] Add directory rename detection to git
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (35 preceding siblings ...)
  2018-04-19 17:58 ` [PATCH v10 36/36] merge-recursive: fix check for skipability of working tree updates Elijah Newren
@ 2018-04-19 18:35 ` Elijah Newren
  2018-04-19 18:41   ` Stefan Beller
                     ` (2 more replies)
  2018-04-23 17:28 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
  37 siblings, 3 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 18:35 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Junio C Hamano, Elijah Newren, Derrick Stolee, Paul-Sebastian Ungureanu

On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <newren@gmail.com> wrote:
> This series is a reboot of the directory rename detection series that was
> merged to master and then reverted due to the final patch having a buggy
> can-skip-update check, as noted at
>   https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
> This series based on top of master.

...and merges cleanly to next but apparently has some minor conflicts
with both ds/lazy-load-trees and ps/test-chmtime-get from pu.

What's the preferred way to resolve this?  Rebase and resubmit my
series on pu, or something else?


* Re: [PATCH v10 00/36] Add directory rename detection to git
  2018-04-19 18:35 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
@ 2018-04-19 18:41   ` Stefan Beller
  2018-04-19 19:54     ` Derrick Stolee
  2018-04-19 20:22   ` Elijah Newren
  2018-04-20  3:05   ` Junio C Hamano
  2 siblings, 1 reply; 78+ messages in thread
From: Stefan Beller @ 2018-04-19 18:41 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Junio C Hamano, Derrick Stolee,
	Paul-Sebastian Ungureanu

On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren <newren@gmail.com> wrote:
> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <newren@gmail.com> wrote:
>> This series is a reboot of the directory rename detection series that was
>> merged to master and then reverted due to the final patch having a buggy
>> can-skip-update check, as noted at
>>   https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
>> This series based on top of master.
>
> ...and merges cleanly to next but apparently has some minor conflicts
> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>
> What's the preferred way to resolve this?  Rebase and resubmit my
> series on pu, or something else?

If you were to base it off of pu, this series would depend on all other
series that pu contains. This is bad for the progress of this series.
(If it were to be merged to next, all other series would automatically
merge to next as well)

If the conflicts are minor, then Junio resolves them; if you want to be
nice, pick your merge point as

    git checkout origin/master
    git merge ds/lazy-load-trees
    git merge ps/test-chmtime-get
    git tag my-anchor

and put the series on top of that anchor.

If you do this, you'd want to be reasonably sure that
those two series are not in too much flux.

Thanks,
Stefan


* Re: [PATCH v10 00/36] Add directory rename detection to git
  2018-04-19 18:41   ` Stefan Beller
@ 2018-04-19 19:54     ` Derrick Stolee
  0 siblings, 0 replies; 78+ messages in thread
From: Derrick Stolee @ 2018-04-19 19:54 UTC (permalink / raw)
  To: Stefan Beller, Elijah Newren
  Cc: Git Mailing List, Junio C Hamano, Paul-Sebastian Ungureanu

On 4/19/2018 2:41 PM, Stefan Beller wrote:
> On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren <newren@gmail.com> wrote:
>> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <newren@gmail.com> wrote:
>>> This series is a reboot of the directory rename detection series that was
>>> merged to master and then reverted due to the final patch having a buggy
>>> can-skip-update check, as noted at
>>>    https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
>>> This series based on top of master.
>> ...and merges cleanly to next but apparently has some minor conflicts
>> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>>
>> What's the preferred way to resolve this?  Rebase and resubmit my
>> series on pu, or something else?
> If you were to base it off of pu, this series would depend on all other
> series that pu contains. This is bad for the progress of this series.
> (If it were to be merged to next, all other series would automatically
> merge to next as well)
>
> If the conflicts are minor, then Junio resolves them; if you want to be
> nice, pick your merge point as
>
>      git checkout origin/master
>      git merge ds/lazy-load-trees
>      git merge ps/test-chmtime-get
>      git tag my-anchor
>
> and put the series on top of that anchor.
>
> If you do this, you'd want to be reasonably sure that
> those two series are not in too much flux.

I believe ds/lazy-load-trees is queued for 'next'. I'm not surprised 
that there are some conflicts here. Any reference to the 'tree' member 
of a commit should be replaced with 'get_commit_tree(c)', or 
'get_commit_tree_oid(c)' if you only care about the tree's object id.

I think Stefan's suggestion is the best approach to get the right 
conflicts out of the way.

Thanks,
-Stolee


* Re: [PATCH v10 00/36] Add directory rename detection to git
  2018-04-19 18:35 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
  2018-04-19 18:41   ` Stefan Beller
@ 2018-04-19 20:22   ` Elijah Newren
  2018-04-20  3:05   ` Junio C Hamano
  2 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 20:22 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Junio C Hamano, Elijah Newren, Derrick Stolee, Paul-Sebastian Ungureanu

On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren <newren@gmail.com> wrote:
> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <newren@gmail.com> wrote:
>> This series is a reboot of the directory rename detection series that was
>> merged to master and then reverted due to the final patch having a buggy
>> can-skip-update check, as noted at
>>   https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
>> This series based on top of master.
>
> ...and merges cleanly to next but apparently has some minor conflicts
> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>
> What's the preferred way to resolve this?  Rebase and resubmit my
> series on pu, or something else?

Sorry, user error; there are no conflicts with my series.

(I accidentally included Junio's interim round of my own series and
while trying to spot problems I saw commits from these other series
touching relevant files in what looked like nearby areas.  Directly
merging with these other two series or even merging all of pu before
en/rename-directory-detection-reboot followed by individually merging
later series has no conflicts with any of my changes.)


* Re: [PATCH v10 32/36] t6046: testcases checking whether updates can be skipped in a merge
  2018-04-19 17:58 ` [PATCH v10 32/36] t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
@ 2018-04-19 20:26   ` SZEDER Gábor
  2018-04-19 20:55     ` Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: SZEDER Gábor @ 2018-04-19 20:26 UTC (permalink / raw)
  To: Elijah Newren; +Cc: SZEDER Gábor, git, sbeller, gitster, torvalds

Just a couple of minor things:

> +###########################################################################
> +# SECTION 1: Cases involving no renames (one side has subset of changes of
> +#            the other side)
> +###########################################################################
> +
> +# Testcase 1a, Changes on A, subset of changes on B
> +#   Commit O: b_1
> +#   Commit A: b_2
> +#   Commit B: b_3
> +#   Expected: b_2
> +
> +test_expect_success '1a-setup: Modify(A)/Modify(B), change on B subset of A' '
> +	test_create_repo 1a &&
> +	(
> +		cd 1a &&
> +
> +		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b

Broken && chain.

<...>

> +###########################################################################
> +# SECTION 2: Cases involving basic renames
> +###########################################################################
> +
> +# Testcase 2a, Changes on A, rename on B
> +#   Commit O: b_1
> +#   Commit A: b_2
> +#   Commit B: c_1
> +#   Expected: c_2
> +
> +test_expect_success '2a-setup: Modify(A)/rename(B)' '
> +	test_create_repo 2a &&
> +	(
> +		cd 2a &&
> +
> +		test_seq 1 10 >b

Broken && chain.

> +		git add b &&
> +		test_tick &&
> +		git commit -m "O" &&
> +
> +		git branch O &&
> +		git branch A &&
> +		git branch B &&
> +
> +		git checkout A &&
> +		test_seq 1 11 > b &&

Nit: space between redirection operator and filename.

<...>

> +# Testcase 2b, Changed and renamed on A, subset of changes on B
> +#   Commit O: b_1
> +#   Commit A: c_2
> +#   Commit B: b_3
> +#   Expected: c_2
> +
> +test_expect_success '2b-setup: Rename+Mod(A)/Mod(B), B mods subset of A' '
> +	test_create_repo 2b &&
> +	(
> +		cd 2b &&
> +
> +		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b

Broken && chain.

<...>

> +# Testcase 2c, Changes on A, rename on B
> +#   Commit O: b_1
> +#   Commit A: b_2, c_3
> +#   Commit B: c_1
> +#   Expected: rename/add conflict c_2 vs c_3
> +#
> +#   NOTE: Since A modified b_1->b_2, and B renamed b_1->c_1, the threeway
> +#         merge of those files should result in c_2.  We then should have a
> +#         rename/add conflict between c_2 and c_3.  However, if we note in
> +#         merge_content() that A had the right contents (b_2 has same
> +#         contents as c_2, just at a different name), and that A had the
> +#         right path present (c_3 existed) and thus decides that it can
> +#         skip the update, then we're in trouble.  This test verifies we do
> +#         not make that particular mistake.
> +
> +test_expect_success '2c-setup: Modify b & add c VS rename b->c' '
> +	test_create_repo 2c &&
> +	(
> +		cd 2c &&
> +
> +		test_seq 1 10 >b

Broken && chain.

<...>

> +###########################################################################
> +# SECTION 3: Cases involving directory renames
> +#
> +# NOTE:
> +#   Directory renames only apply when one side renames a directory, and the
> +#   other side adds or renames a path into that directory.  Applying the
> +#   directory rename to that new path creates a new pathname that didn't
> +#   exist on either side of history.  Thus, it is impossible for the
> +#   merge contents to already be at the right path, so all of these checks
> +#   exist just to make sure that updates are not skipped.
> +###########################################################################
> +
> +# Testcase 3a, Change + rename into dir foo on A, dir rename foo->bar on B
> +#   Commit O: bq_1, foo/whatever
> +#   Commit A: foo/{bq_2, whatever}
> +#   Commit B: bq_1, bar/whatever
> +#   Expected: bar/{bq_2, whatever}
> +
> +test_expect_success '3a-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
> +	test_create_repo 3a &&
> +	(
> +		cd 3a &&
> +
> +		mkdir foo &&
> +		test_seq 1 10 >bq &&
> +		test_write_lines a b c d e f g h i j k >foo/whatever &&
> +		git add bq foo/whatever &&
> +		test_tick &&
> +		git commit -m "O" &&
> +
> +		git branch O &&
> +		git branch A &&
> +		git branch B &&
> +
> +		git checkout A &&
> +		test_seq 1 11 > bq &&

Space between redirection operator and filename.

<...>

> +# Testcase 3b, rename into dir foo on A, dir rename foo->bar + change on B
> +#   Commit O: bq_1, foo/whatever
> +#   Commit A: foo/{bq_1, whatever}
> +#   Commit B: bq_2, bar/whatever
> +#   Expected: bar/{bq_2, whatever}
> +
> +test_expect_success '3b-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
> +	test_create_repo 3b &&
> +	(
> +		cd 3b &&
> +
> +		mkdir foo &&
> +		test_seq 1 10 >bq &&
> +		test_write_lines a b c d e f g h i j k >foo/whatever &&
> +		git add bq foo/whatever &&
> +		test_tick &&
> +		git commit -m "O" &&
> +
> +		git branch O &&
> +		git branch A &&
> +		git branch B &&
> +
> +		git checkout A &&
> +		git mv bq foo/ &&
> +		test_tick &&
> +		git commit -m "A" &&
> +
> +		git checkout B &&
> +		test_seq 1 11 > bq &&

Space between redirection operator and filename.

<...>

> +###########################################################################
> +# SECTION 4: Cases involving dirty changes
> +###########################################################################
> +
> +# Testcase 4a, Changed on A, subset of changes on B, locally modified
> +#   Commit O: b_1
> +#   Commit A: b_2
> +#   Commit B: b_3
> +#   Working copy: b_4
> +#   Expected: b_2 for merge, b_4 in working copy
> +
> +test_expect_success '4a-setup: Change on A, change on B subset of A, dirty mods present' '
> +	test_create_repo 4a &&
> +	(
> +		cd 4a &&
> +
> +		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b

Broken && chain.

<...>

> +# Testcase 4b, Changed+renamed on A, subset of changes on B, locally modified
> +#   Commit O: b_1
> +#   Commit A: c_2
> +#   Commit B: b_3
> +#   Working copy: c_4
> +#   Expected: c_2
> +
> +test_expect_success '4b-setup: Rename+Mod(A)/Mod(B), change on B subset of A, dirty mods present' '
> +	test_create_repo 4b &&
> +	(
> +		cd 4b &&
> +
> +		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b

Broken && chain.



* Re: [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths
  2018-04-19 17:58 ` [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths Elijah Newren
@ 2018-04-19 20:39   ` Martin Ågren
  2018-04-19 20:54     ` Elijah Newren
  2018-04-20 12:23   ` SZEDER Gábor
  1 sibling, 1 reply; 78+ messages in thread
From: Martin Ågren @ 2018-04-19 20:39 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano, Linus Torvalds

On 19 April 2018 at 19:58, Elijah Newren <newren@gmail.com> wrote:
> +       /* Free the extra index left from git_merge_trees() */
> +       /*
> +        * FIXME: Need to also data allocated by setup_unpack_trees_porcelain()
> +        * tucked away in o->unpack_opts.msgs, but the problem is that only
> +        * half of it refers to dynamically allocated data, while the other
> +        * half points at static strings.
> +        */

Timing. I've been preparing a patch that provides
`clear_unpack_trees_porcelain()` and fixes all such leaks. (About 10% of
all the leaks that are reported when I run the test-suite!) My patch
conflicts with this series for obvious reasons. Figuring out the
conflict resolution might be non-trivial, and I suspect it would even be
an evil merge. I'll be holding off on that patch until this has landed.

BTW: s/also data/also free data/. But since I'm promising to get rid of
this TODO quite soon after this is merged... ;-)

Martin


* Re: [PATCH v10 25/36] merge-recursive: fix overwriting dirty files involved in renames
  2018-04-19 17:58 ` [PATCH v10 25/36] merge-recursive: fix overwriting dirty files involved in renames Elijah Newren
@ 2018-04-19 20:48   ` Martin Ågren
  2018-04-19 20:54     ` Martin Ågren
  2018-04-19 21:06     ` Elijah Newren
  0 siblings, 2 replies; 78+ messages in thread
From: Martin Ågren @ 2018-04-19 20:48 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano, Linus Torvalds

On 19 April 2018 at 19:58, Elijah Newren <newren@gmail.com> wrote:
> This fixes an issue that existed before my directory rename detection
> patches that affects both normal renames and renames implied by
> directory rename detection.  Additional codepaths that only affect
> overwriting of dirty files that are involved in directory rename
> detection will be added in a subsequent commit.
>
> Reviewed-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  merge-recursive.c                   | 85 ++++++++++++++++++++++-------
>  merge-recursive.h                   |  2 +
>  t/t3501-revert-cherry-pick.sh       |  2 +-
>  t/t6043-merge-rename-directories.sh |  2 +-
>  t/t7607-merge-overwrite.sh          |  2 +-
>  unpack-trees.c                      |  4 +-
>  unpack-trees.h                      |  4 ++
>  7 files changed, 77 insertions(+), 24 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index c1c4faf61e..7fdcba4f22 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -337,32 +337,37 @@ static void init_tree_desc_from_tree(struct tree_desc *desc, struct tree *tree)
>         init_tree_desc(desc, tree->buffer, tree->size);
>  }
>
> -static int git_merge_trees(int index_only,
> +static int git_merge_trees(struct merge_options *o,
>                            struct tree *common,
>                            struct tree *head,
>                            struct tree *merge)
>  {
>         int rc;
>         struct tree_desc t[3];
> -       struct unpack_trees_options opts;
>
> -       memset(&opts, 0, sizeof(opts));
> -       if (index_only)
> -               opts.index_only = 1;
> +       memset(&o->unpack_opts, 0, sizeof(o->unpack_opts));
> +       if (o->call_depth)
> +               o->unpack_opts.index_only = 1;
>         else
> -               opts.update = 1;
> -       opts.merge = 1;
> -       opts.head_idx = 2;
> -       opts.fn = threeway_merge;
> -       opts.src_index = &the_index;
> -       opts.dst_index = &the_index;
> -       setup_unpack_trees_porcelain(&opts, "merge");
> +               o->unpack_opts.update = 1;
> +       o->unpack_opts.merge = 1;
> +       o->unpack_opts.head_idx = 2;
> +       o->unpack_opts.fn = threeway_merge;
> +       o->unpack_opts.src_index = &the_index;
> +       o->unpack_opts.dst_index = &the_index;
> +       setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
>
>         init_tree_desc_from_tree(t+0, common);
>         init_tree_desc_from_tree(t+1, head);
>         init_tree_desc_from_tree(t+2, merge);
>
> -       rc = unpack_trees(3, t, &opts);
> +       rc = unpack_trees(3, t, &o->unpack_opts);
> +       /*
> +        * unpack_trees NULLifies src_index, but it's used in verify_uptodate,
> +        * so set to the new index which will usually have modification
> +        * timestamp info copied over.
> +        */
> +       o->unpack_opts.src_index = &the_index;
>         cache_tree_free(&active_cache_tree);
>         return rc;
>  }

As mentioned in a reply to patch 33/36 [1], I've got a patch to add
`clear_unpack_trees_porcelain()` which frees the resources allocated by
`setup_unpack_trees_porcelain()`. Before this patch, I could easily call
it at the end of this function. After this, the ownership is less
obvious to me.

It turns out that the only user of `unpack_opts` outside this function
can indeed end up wanting to use the error messages that `clear_...()`
would set out to free. So yes, the call to `clear_...()` will need to go
elsewhere.

It does sort of make me wonder if we should memset `unpack_opts` to zero
somewhere early, so that we can then `clear_...()` it early here before
zeroizing it. So yes, we'd be constantly allocating and freeing those
strings. Am I right to assume that the code after your series would do
(roughly) the same number of calls to `setup_unpack_trees_porcelain()`,
i.e., `git_merge_trees()` as it did before?

All of this is arguably irrelevant for this series. It might be better
if I clarify this memory ownership and do any adjustments as part of my
patch (series), rather than you shuffling things around at this time.

Mostly thinking out loud. If you have any thoughts, feel free to share.

Martin

[1] https://public-inbox.org/git/CAN0heSquJboMMgay+5XomqXCGoHtXxf1mJBmY_L7y+AA4eG0KA@mail.gmail.com/

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths
  2018-04-19 20:39   ` Martin Ågren
@ 2018-04-19 20:54     ` Elijah Newren
  0 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 20:54 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano, Linus Torvalds

On Thu, Apr 19, 2018 at 1:39 PM, Martin Ågren <martin.agren@gmail.com> wrote:
> On 19 April 2018 at 19:58, Elijah Newren <newren@gmail.com> wrote:
>> +       /* Free the extra index left from git_merge_trees() */
>> +       /*
>> +        * FIXME: Need to also data allocated by setup_unpack_trees_porcelain()
>> +        * tucked away in o->unpack_opts.msgs, but the problem is that only
>> +        * half of it refers to dynamically allocated data, while the other
>> +        * half points at static strings.
>> +        */
>
> Timing. I've been preparing a patch that provides
> `clear_unpack_trees_porcelain()` and fixes all such leaks. (About 10% of
> all the leaks that are reported when I run the test-suite!) My patch

Nice!

> conflicts with this series for obvious reasons. Figuring out the
> conflict resolution might be non-trivial, and I suspect it would even be
> an evil merge. I'll be holding off on that patch until this has landed.
>
> BTW: s/also data/also free data/. But since I'm promising to get rid of
> this TODO quite soon after this is merged... ;-)

Oops, good catch.  I can fix it up since I need to fix the issues
SZEDER found, but yeah if you're just going to implement the fix and
rip this comment out then it's not that critical.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 25/36] merge-recursive: fix overwriting dirty files involved in renames
  2018-04-19 20:48   ` Martin Ågren
@ 2018-04-19 20:54     ` Martin Ågren
  2018-04-19 21:06     ` Elijah Newren
  1 sibling, 0 replies; 78+ messages in thread
From: Martin Ågren @ 2018-04-19 20:54 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano, Linus Torvalds

On 19 April 2018 at 22:48, Martin Ågren <martin.agren@gmail.com> wrote:
> On 19 April 2018 at 19:58, Elijah Newren <newren@gmail.com> wrote:
>> -static int git_merge_trees(int index_only,
>> +static int git_merge_trees(struct merge_options *o,
>>                            struct tree *common,
>>                            struct tree *head,
>>                            struct tree *merge)
>>  {
[...]
>> +       memset(&o->unpack_opts, 0, sizeof(o->unpack_opts));
[...]
>> +       setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
[...]
>>  }
>
> As mentioned in a reply to patch 33/36 [1], I've got a patch to add
> `clear_unpack_trees_porcelain()` which frees the resources allocated by
> `setup_unpack_trees_porcelain()`. Before this patch, I could easily call
> it at the end of this function. After this, the ownership is less
> obvious to me.
>
> It turns out that the only user of `unpack_opts` outside this function
> can indeed end up wanting to use the error messages that `clear_...()`
> would set out to free. So yes, the call to `clear_...()` will need to go
> elsewhere.
>
> It does sort of make me wonder if we should memset `unpack_opts` to zero
> somewhere early, so that we can then `clear_...()` it early here before
> zeroizing it. So yes, we'd be constantly allocating and freeing those
> strings. Am I right to assume that the code after your series would do
> (roughly) the same number of calls to `setup_unpack_trees_porcelain()`,
> i.e., `git_merge_trees()` as it did before?

Or, of course, both `setup_...` and `clear_...` would go outside this
function to churn less memory... Anyway, this still holds:

> All of this is arguably irrelevant for this series. It might be better
> if I clarify this memory ownership and do any adjustments as part of my
> patch (series), rather than you shuffling things around at this time.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 32/36] t6046: testcases checking whether updates can be skipped in a merge
  2018-04-19 20:26   ` SZEDER Gábor
@ 2018-04-19 20:55     ` Elijah Newren
  0 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 20:55 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano, Linus Torvalds

On Thu, Apr 19, 2018 at 1:26 PM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> Just a couple of minor things:

Sweet, thanks for taking a look; will get these all fixed up.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 25/36] merge-recursive: fix overwriting dirty files involved in renames
  2018-04-19 20:48   ` Martin Ågren
  2018-04-19 20:54     ` Martin Ågren
@ 2018-04-19 21:06     ` Elijah Newren
  1 sibling, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-19 21:06 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano, Linus Torvalds

On Thu, Apr 19, 2018 at 1:48 PM, Martin Ågren <martin.agren@gmail.com> wrote:
> On 19 April 2018 at 19:58, Elijah Newren <newren@gmail.com> wrote:
>> This fixes an issue that existed before my directory rename detection
>> patches that affects both normal renames and renames implied by
>> directory rename detection.  Additional codepaths that only affect
>> overwriting of dirty files that are involved in directory rename
>> detection will be added in a subsequent commit.
>>
>> Reviewed-by: Stefan Beller <sbeller@google.com>
>> Signed-off-by: Elijah Newren <newren@gmail.com>
>> Signed-off-by: Junio C Hamano <gitster@pobox.com>
>> ---
>>  merge-recursive.c                   | 85 ++++++++++++++++++++++-------
>>  merge-recursive.h                   |  2 +
>>  t/t3501-revert-cherry-pick.sh       |  2 +-
>>  t/t6043-merge-rename-directories.sh |  2 +-
>>  t/t7607-merge-overwrite.sh          |  2 +-
>>  unpack-trees.c                      |  4 +-
>>  unpack-trees.h                      |  4 ++
>>  7 files changed, 77 insertions(+), 24 deletions(-)
>>
>> diff --git a/merge-recursive.c b/merge-recursive.c
>> index c1c4faf61e..7fdcba4f22 100644
>> --- a/merge-recursive.c
>> +++ b/merge-recursive.c
>> @@ -337,32 +337,37 @@ static void init_tree_desc_from_tree(struct tree_desc *desc, struct tree *tree)
>>         init_tree_desc(desc, tree->buffer, tree->size);
>>  }
>>
>> -static int git_merge_trees(int index_only,
>> +static int git_merge_trees(struct merge_options *o,
>>                            struct tree *common,
>>                            struct tree *head,
>>                            struct tree *merge)
>>  {
>>         int rc;
>>         struct tree_desc t[3];
>> -       struct unpack_trees_options opts;
>>
>> -       memset(&opts, 0, sizeof(opts));
>> -       if (index_only)
>> -               opts.index_only = 1;
>> +       memset(&o->unpack_opts, 0, sizeof(o->unpack_opts));
>> +       if (o->call_depth)
>> +               o->unpack_opts.index_only = 1;
>>         else
>> -               opts.update = 1;
>> -       opts.merge = 1;
>> -       opts.head_idx = 2;
>> -       opts.fn = threeway_merge;
>> -       opts.src_index = &the_index;
>> -       opts.dst_index = &the_index;
>> -       setup_unpack_trees_porcelain(&opts, "merge");
>> +               o->unpack_opts.update = 1;
>> +       o->unpack_opts.merge = 1;
>> +       o->unpack_opts.head_idx = 2;
>> +       o->unpack_opts.fn = threeway_merge;
>> +       o->unpack_opts.src_index = &the_index;
>> +       o->unpack_opts.dst_index = &the_index;
>> +       setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
>>
>>         init_tree_desc_from_tree(t+0, common);
>>         init_tree_desc_from_tree(t+1, head);
>>         init_tree_desc_from_tree(t+2, merge);
>>
>> -       rc = unpack_trees(3, t, &opts);
>> +       rc = unpack_trees(3, t, &o->unpack_opts);
>> +       /*
>> +        * unpack_trees NULLifies src_index, but it's used in verify_uptodate,
>> +        * so set to the new index which will usually have modification
>> +        * timestamp info copied over.
>> +        */
>> +       o->unpack_opts.src_index = &the_index;
>>         cache_tree_free(&active_cache_tree);
>>         return rc;
>>  }
>
> As mentioned in a reply to patch 33/36 [1], I've got a patch to add
> `clear_unpack_trees_porcelain()` which frees the resources allocated by
> `setup_unpack_trees_porcelain()`. Before this patch, I could easily call
> it at the end of this function. After this, the ownership is less
> obvious to me.

I wouldn't put the call to clear_unpack_trees_porcelain() at the end
of this function, but rather at the end of merge_trees().
merge_trees() is the only caller of git_merge_trees() and it continues
using o->unpack_opts until the end of that function.  At the end of
that function, there is no further need for o->unpack_opts.
Basically, put it right where I put the "FIXME: Need to also free data
allocated by setup_unpack_trees_porcelain()" comment.
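
(Illustration, not part of the original message.) The placement described above can be sketched as follows: the inner step sets up the state, the outer caller keeps using it after the inner step returns, and the release happens once at the end of the outer caller. All names here (`struct merge_state`, `inner_step()`, `outer_merge()`, `clear_unpack_state()`) are hypothetical and only mirror the shape of git_merge_trees() and merge_trees(); this is not the actual patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not git's real structs. */
struct unpack_state {
	char *msg;          /* heap-allocated error text */
};

struct merge_state {
	struct unpack_state unpack;
	int setups;         /* bookkeeping for the example only */
	int clears;
};

/* Inner step, shaped like git_merge_trees(): zero the state, then set it up. */
static int inner_step(struct merge_state *o)
{
	static const char text[] = "would be overwritten by merge";

	memset(&o->unpack, 0, sizeof(o->unpack));
	o->unpack.msg = malloc(sizeof(text));
	if (!o->unpack.msg)
		return -1;
	memcpy(o->unpack.msg, text, sizeof(text));
	o->setups++;
	return 0;
}

/* Release what inner_step() allocated; safe to call on zeroed state. */
static void clear_unpack_state(struct merge_state *o)
{
	free(o->unpack.msg);
	o->unpack.msg = NULL;
	o->clears++;
}

/* Outer driver, shaped like merge_trees(). */
static int outer_merge(struct merge_state *o, char *report, size_t report_len)
{
	int rc = inner_step(o);

	assert(report_len > 0);
	report[0] = '\0';
	/* The message is still needed here, after inner_step() returned. */
	if (!rc)
		snprintf(report, report_len, "%s", o->unpack.msg);

	/* Only now is there no further need for the unpack state. */
	clear_unpack_state(o);
	return rc;
}
```

Clearing inside `inner_step()` instead would free the message before `outer_merge()` gets to read it.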

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 00/36] Add directory rename detection to git
  2018-04-19 18:35 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
  2018-04-19 18:41   ` Stefan Beller
  2018-04-19 20:22   ` Elijah Newren
@ 2018-04-20  3:05   ` Junio C Hamano
  2018-04-23 17:50     ` Elijah Newren
  2018-04-24 20:20     ` [PATCH v10 1/2] fixup! merge-recursive: fix was_tracked() to quit lying with some renamed paths Elijah Newren
  2 siblings, 2 replies; 78+ messages in thread
From: Junio C Hamano @ 2018-04-20  3:05 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List, Derrick Stolee, Paul-Sebastian Ungureanu

Elijah Newren <newren@gmail.com> writes:

> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <newren@gmail.com> wrote:
>> This series is a reboot of the directory rename detection series that was
>> merged to master and then reverted due to the final patch having a buggy
>> can-skip-update check, as noted at
>>   https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
>> This series based on top of master.
>
> ...and merges cleanly to next but apparently has some minor conflicts
> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>
> What's the preferred way to resolve this?  Rebase and resubmit my
> series on pu, or something else?

The series as-is is fine, I think, from the maintainer's point of
view.  Thanks.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths
  2018-04-19 17:58 ` [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths Elijah Newren
  2018-04-19 20:39   ` Martin Ågren
@ 2018-04-20 12:23   ` SZEDER Gábor
  2018-04-20 15:23     ` Elijah Newren
  2018-04-21 19:37     ` [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst Elijah Newren
  1 sibling, 2 replies; 78+ messages in thread
From: SZEDER Gábor @ 2018-04-20 12:23 UTC (permalink / raw)
  To: Elijah Newren
  Cc: SZEDER Gábor, git, sbeller, gitster, torvalds, martin.agren

> In commit aacb82de3ff8 ("merge-recursive: Split was_tracked() out of
> would_lose_untracked()", 2011-08-11), was_tracked() was split out of
> would_lose_untracked() with the intent to provide a function that could
> answer whether a path was tracked in the index before the merge.  Sadly,
> it instead returned whether the path was in the working tree due to having
> been tracked in the index before the merge OR having been written there by
> unpack_trees().  The distinction is important when renames are involved,
> e.g. for a merge where:
> 
>    HEAD:  modifies path b
>    other: renames b->c
> 
> In this case, c was not tracked in the index before the merge, but would
> have been added to the index at stage 0 and written to the working tree by
> unpack_trees().  would_lose_untracked() is more interested in the
> in-working-copy-for-either-reason behavior, while all other uses of
> was_tracked() want just was-it-tracked-in-index-before-merge behavior.
> 
> Unsplit would_lose_untracked() and write a new was_tracked() function
> which answers whether a path was tracked in the index before the merge
> started.
> 
> This will also affect was_dirty(), helping it to return better results
> since it can base answers off the original index rather than an index that
> possibly only copied over some of the stat information.  However,
> was_dirty() will need an additional change that will be made in a
> subsequent patch.
> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---

This patch causes memory corruption when the split index feature is in
use, making several tests fail.  Now, while the split index feature
sure has its own set of problems, AFAIK those are not bad enough to
cause memory corruption; they "only" tend to cause transient test
failures due to a variant of the classic racy git issue [1].

Here is a test failure:

  $ GIT_TEST_SPLIT_INDEX=DareISayYes ./t3030-merge-recursive.sh
  <...>
  ok 31 - merge-recursive simple w/submodule result
  *** Error in `/home/szeder/src/git/git': free(): invalid pointer: 0x0000000001f646d0 ***
  ======= Backtrace: =========
  /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f84e0c5b7e5]
  /lib/x86_64-linux-gnu/libc.so.6(+0x7f72a)[0x7f84e0c6372a]
  /lib/x86_64-linux-gnu/libc.so.6(cfree+0xf7)[0x7f84e0c685e7]
  /home/szeder/src/git/git[0x5181ee]
  /home/szeder/src/git/git[0x4f1e82]
  /home/szeder/src/git/git[0x4f394b]
  /home/szeder/src/git/git[0x44a37f]
  /home/szeder/src/git/git[0x44afa9]
  /home/szeder/src/git/git[0x406640]
  /home/szeder/src/git/git[0x4070f0]
  /home/szeder/src/git/git[0x4062a7]
  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f84e0c04830]
  /home/szeder/src/git/git[0x4062f9]
  ======= Memory map: ========
  00400000-00616000 r-xp 00000000 08:06 2255502                            /home/szeder/src/git/git
  00815000-00816000 r--p 00215000 08:06 2255502                            /home/szeder/src/git/git
  00816000-00823000 rw-p 00216000 08:06 2255502                            /home/szeder/src/git/git
  00823000-00866000 rw-p 00000000 00:00 0 
  01f63000-01fa6000 rw-p 00000000 00:00 0                                  [heap]
  7f84e09ce000-7f84e09e4000 r-xp 00000000 08:06 921674                     /lib/x86_64-linux-gnu/libgcc_s.so.1
  7f84e09e4000-7f84e0be3000 ---p 00016000 08:06 921674                     /lib/x86_64-linux-gnu/libgcc_s.so.1
  7f84e0be3000-7f84e0be4000 rw-p 00015000 08:06 921674                     /lib/x86_64-linux-gnu/libgcc_s.so.1
  7f84e0be4000-7f84e0da4000 r-xp 00000000 08:06 917791                     /lib/x86_64-linux-gnu/libc-2.23.so
  7f84e0da4000-7f84e0fa4000 ---p 001c0000 08:06 917791                     /lib/x86_64-linux-gnu/libc-2.23.so
  7f84e0fa4000-7f84e0fa8000 r--p 001c0000 08:06 917791                     /lib/x86_64-linux-gnu/libc-2.23.so
  7f84e0fa8000-7f84e0faa000 rw-p 001c4000 08:06 917791                     /lib/x86_64-linux-gnu/libc-2.23.so
  7f84e0faa000-7f84e0fae000 rw-p 00000000 00:00 0 
  7f84e0fae000-7f84e0fb5000 r-xp 00000000 08:06 917825                     /lib/x86_64-linux-gnu/librt-2.23.so
  7f84e0fb5000-7f84e11b4000 ---p 00007000 08:06 917825                     /lib/x86_64-linux-gnu/librt-2.23.so
  7f84e11b4000-7f84e11b5000 r--p 00006000 08:06 917825                     /lib/x86_64-linux-gnu/librt-2.23.so
  7f84e11b5000-7f84e11b6000 rw-p 00007000 08:06 917825                     /lib/x86_64-linux-gnu/librt-2.23.so
  7f84e11b6000-7f84e11ce000 r-xp 00000000 08:06 917789                     /lib/x86_64-linux-gnu/libpthread-2.23.so
  7f84e11ce000-7f84e13cd000 ---p 00018000 08:06 917789                     /lib/x86_64-linux-gnu/libpthread-2.23.so
  7f84e13cd000-7f84e13ce000 r--p 00017000 08:06 917789                     /lib/x86_64-linux-gnu/libpthread-2.23.so
  7f84e13ce000-7f84e13cf000 rw-p 00018000 08:06 917789                     /lib/x86_64-linux-gnu/libpthread-2.23.so
  7f84e13cf000-7f84e13d3000 rw-p 00000000 00:00 0 
  7f84e13d3000-7f84e13ec000 r-xp 00000000 08:06 918601                     /lib/x86_64-linux-gnu/libz.so.1.2.8
  7f84e13ec000-7f84e15eb000 ---p 00019000 08:06 918601                     /lib/x86_64-linux-gnu/libz.so.1.2.8
  7f84e15eb000-7f84e15ec000 r--p 00018000 08:06 918601                     /lib/x86_64-linux-gnu/libz.so.1.2.8
  7f84e15ec000-7f84e15ed000 rw-p 00019000 08:06 918601                     /lib/x86_64-linux-gnu/libz.so.1.2.8
  7f84e15ed000-7f84e1613000 r-xp 00000000 08:06 917787                     /lib/x86_64-linux-gnu/ld-2.23.so
  7f84e1760000-7f84e17e5000 rw-p 00000000 00:00 0 
  7f84e1811000-7f84e1812000 rw-p 00000000 00:00 0 
  7f84e1812000-7f84e1813000 r--p 00025000 08:06 917787                     /lib/x86_64-linux-gnu/ld-2.23.so
  7f84e1813000-7f84e1814000 rw-p 00026000 08:06 917787                     /lib/x86_64-linux-gnu/ld-2.23.so
  7f84e1814000-7f84e1815000 rw-p 00000000 00:00 0 
  7ffff14d9000-7ffff14fa000 rw-p 00000000 00:00 0                          [stack]
  7ffff15cf000-7ffff15d2000 r--p 00000000 00:00 0                          [vvar]
  7ffff15d2000-7ffff15d4000 r-xp 00000000 00:00 0                          [vdso]
  ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
  not ok 32 - merge-recursive copy vs. rename
  #       
  #               git checkout -f copy &&
  #               git merge rename &&
  #               ( git ls-tree -r HEAD && git ls-files -s ) >actual &&
  #               (
  #                       echo "100644 blob $o0   b"
  #                       echo "100644 blob $o0   c"
  #                       echo "100644 blob $o0   d/e"
  #                       echo "100644 blob $o0   e"
  #                       echo "100644 $o0 0      b"
  #                       echo "100644 $o0 0      c"
  #                       echo "100644 $o0 0      d/e"
  #                       echo "100644 $o0 0      e"
  #               ) >expected &&
  #               test_cmp expected actual
  #       

And the gdb backtrace of that 'git merge rename' command:

  Program received signal SIGABRT, Aborted.
  0x00007ffff7403428 in __GI_raise (sig=sig@entry=6)
      at ../sysdeps/unix/sysv/linux/raise.c:54
  54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
  (gdb) bt
  #0  0x00007ffff7403428 in __GI_raise (sig=sig@entry=6)
      at ../sysdeps/unix/sysv/linux/raise.c:54
  #1  0x00007ffff740502a in __GI_abort () at abort.c:89
  #2  0x00007ffff74457ea in __libc_message (do_abort=do_abort@entry=2, 
      fmt=fmt@entry=0x7ffff755eed8 "*** Error in `%s': %s: 0x%s ***\n")
      at ../sysdeps/posix/libc_fatal.c:175
  #3  0x00007ffff744d72a in malloc_printerr (
      ar_ptr=0x7ffff7792b20 <main_arena>, ptr=<optimized out>, 
      str=0x7ffff755bcaf "free(): invalid pointer", action=<optimized out>)
      at malloc.c:5006
  #4  free_check (mem=<optimized out>, caller=<optimized out>) at hooks.c:314
  #5  0x00007ffff74525e7 in __GI___libc_free (mem=<optimized out>)
      at malloc.c:2942
  #6  0x00000000005181ee in discard_index (istate=istate@entry=0x7fffffffcc10)
      at read-cache.c:1934
  #7  0x00000000004f1e82 in merge_trees (o=o@entry=0x7fffffffc850, 
      head=<optimized out>, merge=<optimized out>, common=<optimized out>, 
      result=result@entry=0x7fffffffc7f0) at merge-recursive.c:3125
  #8  0x00000000004f394b in merge_recursive (o=o@entry=0x7fffffffc850, 
      h1=h1@entry=0x86efa0, h2=0x86f020, ca=0x0, 
      result=result@entry=0x7fffffffc840) at merge-recursive.c:3220
  #9  0x000000000044a37f in try_merge_strategy (strategy=<optimized out>, 
      strategy@entry=0x597ded "recursive", common=common@entry=0x8674c0, 
      remoteheads=remoteheads@entry=0x8673f0, head=head@entry=0x86efa0)
      at builtin/merge.c:690
  #10 0x000000000044afa9 in cmd_merge (argc=<optimized out>, 
      argv=<optimized out>, prefix=<optimized out>) at builtin/merge.c:1533
  #11 0x0000000000406640 in run_builtin (argv=<optimized out>, 
      argc=<optimized out>, p=<optimized out>) at git.c:350
  #12 handle_builtin (argc=2, argv=0x7fffffffdc30) at git.c:562
  #13 0x00000000004070f0 in run_argv (argv=0x7fffffffd9d0, argcp=0x7fffffffd9dc)
      at git.c:614
  #14 cmd_main (argc=2, argc@entry=3, argv=0x7fffffffdc30, 
      argv@entry=0x7fffffffdc28) at git.c:691
  #15 0x00000000004062a7 in main (argc=3, argv=0x7fffffffdc28)
      at common-main.c:45

Other failing tests are:

  t3030-merge-recursive.sh
  t3402-rebase-merge.sh
  t3501-revert-cherry-pick.sh
  t6022-merge-rename.sh
  t6032-merge-large-rename.sh
  t6034-merge-rename-nocruft.sh
  t6042-merge-rename-corner-cases.sh
  t6043-merge-rename-directories.sh
  t6046-merge-skip-unneeded-updates.sh
  t7003-filter-branch.sh
  t7601-merge-pull-config.sh

> diff --git a/merge-recursive.c b/merge-recursive.c
> index b32e8d817a..097de7e5a7 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c

> @@ -3081,6 +3115,15 @@ int merge_trees(struct merge_options *o,
>  	else
>  		clean = 1;
>  
> +	/* Free the extra index left from git_merge_trees() */
> +	/*
> +	 * FIXME: Need to also data allocated by setup_unpack_trees_porcelain()
> +	 * tucked away in o->unpack_opts.msgs, but the problem is that only
> +	 * half of it refers to dynamically allocated data, while the other
> +	 * half points at static strings.
> +	 */
> +	discard_index(&o->orig_index);

Removing this discard_index() call makes all those test failures go
away...  but I guess that isn't the right solution, is it.

And even with that call removed, the next patch will cause a
segmentation fault in 't6043-merge-rename-directories.sh's '72 -
9f-check: Renamed directory that only contained immediate subdirs'.


[1] Working on it: https://github.com/szeder/git split-index-racy

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths
  2018-04-20 12:23   ` SZEDER Gábor
@ 2018-04-20 15:23     ` Elijah Newren
  2018-04-21 19:37     ` [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst Elijah Newren
  1 sibling, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-20 15:23 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano, Linus Torvalds,
	Martin Ågren

On Fri, Apr 20, 2018 at 5:23 AM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> This patch causes memory corruption when the split index feature is in
> use, making several tests fail.  Now, while the split index feature
> sure has its own set of problems, AFAIK those are not bad enough to
> cause memory corruption; they "only" tend to cause transient test
> failures due to a variant of the classic racy git issue [1].
>
> Here is a test failure:
>
>   $ GIT_TEST_SPLIT_INDEX=DareISayYes ./t3030-merge-recursive.sh

Running under valgrind shows that merge-recursive.c's add_cacheinfo()
(which calls add_cache_entry()) results in data used by o->orig_index
getting free()'d.  That means that anything trying to use that memory
(whether a later call to discard_index(), or just a call to was_dirty()
or was_tracked()) will be accessing free()'d memory.  (The exact same
tests run clean under valgrind when GIT_TEST_SPLIT_INDEX is not turned on.)

The fact that add_cacheinfo() frees data used by o->orig_index
surprises me.  add_cacheinfo() is only supposed to modify the_index.
Are o->orig_index and the_index sharing data somehow?  Did I do
something wrong or incomplete for the split index case when swapping
indexes?  My swapping logic, as shown in this patch, was:

    /*
     * Update the_index to match the new results, AFTER saving a copy
     * in o->orig_index.  Update src_index to point to the saved copy.
     * (verify_uptodate() checks src_index, and the original index is
     * the one that had the necessary modification timestamps.)
     */
    o->orig_index = the_index;
    the_index = tmp_index;
    o->unpack_opts.src_index = &o->orig_index;

Do I need to do more?
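
(Illustration, not part of the original message.) As general C background for the question above: struct assignment copies pointer members but not the data they point at. If the temporary index was built with a borrowed pointer into the live index's data, the swap leaves two structs referring to the same object. The sketch below uses hypothetical types (`struct mini_index`, `struct base_data`) and does not claim that this is what happens inside git; it only shows the aliasing pattern.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of an index that points at separately owned data. */
struct base_data {
	int nr_entries;
};

struct mini_index {
	int version;
	struct base_data *base;
};

/* Returns 1 when both structs point at the same base_data object. */
static int shares_base(const struct mini_index *a, const struct mini_index *b)
{
	return a->base != NULL && a->base == b->base;
}

/*
 * Mimics the shape of the swap: save the live index, then install a
 * temporary one.  `tmp` is assumed to have been built with a borrowed
 * pointer to the live index's base data.
 */
static int swap_and_check(void)
{
	struct mini_index live = { 2, NULL };
	struct mini_index tmp, saved;
	int shared;

	live.base = calloc(1, sizeof(*live.base));
	if (!live.base)
		return -1;

	tmp = live;         /* shallow copy: tmp.base aliases live.base */
	tmp.version = 3;

	saved = live;       /* like "o->orig_index = the_index" */
	live = tmp;         /* like "the_index = tmp_index" */
	assert(saved.version == 2 && live.version == 3);

	shared = shares_base(&saved, &live);
	free(live.base);    /* one free; saved.base must not be used afterwards */
	return shared;
}
```

Once two such structs alias the same data, freeing or replacing it through one of them invalidates the other, which would be consistent with the valgrind report.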

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst
  2018-04-20 12:23   ` SZEDER Gábor
  2018-04-20 15:23     ` Elijah Newren
@ 2018-04-21 19:37     ` Elijah Newren
  2018-04-21 20:13       ` Elijah Newren
  2018-04-22 12:38       ` Duy Nguyen
  1 sibling, 2 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-21 19:37 UTC (permalink / raw)
  To: git; +Cc: sbeller, gitster, szeder.dev, martin.agren, pclouds, Elijah Newren

Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
Since we create a temporary index in o->result, then discard o->dst_index
and overwrite it with o->result, when o->src_index == o->dst_index it is
safe to just reuse o->src_index's split_index for o->result.  However,
o->src_index and o->dst_index are specified separately in order to allow
callers to have these be different.  In such a case, reusing
o->src_index's split_index for o->result will cause the split_index to be
shared.  If either index then has entries replaced or removed, it will
result in the other index referring to free()'d memory.

Signed-off-by: Elijah Newren <newren@gmail.com>
---

I still haven't wrapped my head around the split_index stuff entirely, so
it's possible that

  - the performance optimization isn't even valid when src == dst.  Could
    the original index be different enough from the result that we don't
    want its split_index?

  - there's a better, more performant fix or there is some way to actually
    share a split_index between two independent index_state objects.

However, with this fix, all the tests pass both normally and under
GIT_TEST_SPLIT_INDEX=DareISayYes.  Without this patch, when
GIT_TEST_SPLIT_INDEX is set, my directory rename detection series will fail
several tests, as reported by SZEDER.

 unpack-trees.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 79fd97074e..b670415d4c 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1284,9 +1284,20 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->result.timestamp.sec = o->src_index->timestamp.sec;
 	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
 	o->result.version = o->src_index->version;
-	o->result.split_index = o->src_index->split_index;
-	if (o->result.split_index)
+	if (!o->src_index->split_index) {
+		o->result.split_index = NULL;
+	} else if (o->src_index == o->dst_index) {
+		/*
+		 * o->dst_index (and thus o->src_index) will be discarded
+		 * and overwritten with o->result at the end of this function,
+		 * so just use src_index's split_index to avoid having to
+		 * create a new one.
+		 */
+		o->result.split_index = o->src_index->split_index;
 		o->result.split_index->refcount++;
+	} else {
+		o->result.split_index = init_split_index(&o->result);
+	}
 	hashcpy(o->result.sha1, o->src_index->sha1);
 	o->merge_size = len;
 	mark_all_ce_unused(o->src_index);
-- 
2.17.0.296.gaac25b4b81


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst
  2018-04-21 19:37     ` [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst Elijah Newren
@ 2018-04-21 20:13       ` Elijah Newren
  2018-04-22 12:38       ` Duy Nguyen
  1 sibling, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-21 20:13 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Stefan Beller, Junio C Hamano, SZEDER Gábor,
	Martin Ågren, Nguyễn Thái Ngọc,
	Elijah Newren

On Sat, Apr 21, 2018 at 12:37 PM, Elijah Newren <newren@gmail.com> wrote:
> Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
> Since we create a temporary index in o->result, then discard o->dst_index
> and overwrite it with o->result, when o->src_index == o->dst_index it is
> safe to just reuse o->src_index's split_index for o->result.  However,
> o->src_index and o->dst_index are specified separately in order to allow
> callers to have these be different.  In such a case, reusing
> o->src_index's split_index for o->result will cause the split_index to be
> shared.  If either index then has entries replaced or removed, it will
> result in the other index referring to free()'d memory.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---

Also, I probably shouldn't have made this look like part of my series
(by marking it as "RFC PATCH v10 32.5/36").  It doesn't depend on my
series and is an independently valuable bugfix, though to avoid
breaking SZEDER and other split_index users, this patch should
probably go in before my series does.
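
For anyone skimming the thread, here is a toy model of the three-way choice the patch makes.  It is only a sketch: `toy_index`, `toy_split_index`, `toy_new_split_index()`, `toy_release_split_index()` and `toy_pick_split_index()` are made-up names for illustration, not git's real structures or functions, and the refcounting is reduced to the bare minimum needed to show when sharing is safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for illustration; these are NOT git's real structures. */
struct toy_split_index {
	int refcount;
};

struct toy_index {
	struct toy_split_index *split_index;
};

/* Hypothetical helper: allocate a split index with a single owner. */
static struct toy_split_index *toy_new_split_index(void)
{
	struct toy_split_index *si = calloc(1, sizeof(*si));

	assert(si);
	si->refcount = 1;
	return si;
}

/* Drop one reference; free the data once the last owner lets go. */
static void toy_release_split_index(struct toy_index *istate)
{
	if (istate->split_index && --istate->split_index->refcount == 0)
		free(istate->split_index);
	istate->split_index = NULL;
}

/*
 * Same shape as the patched code: no split index at all, a shared one
 * when src and dst are the same index (dst is about to be replaced by
 * the result anyway), or a brand new one when src and dst are
 * independent and must not point into each other's data.
 */
static void toy_pick_split_index(struct toy_index *result,
				 struct toy_index *src,
				 struct toy_index *dst)
{
	if (!src->split_index) {
		result->split_index = NULL;
	} else if (src == dst) {
		result->split_index = src->split_index;
		result->split_index->refcount++;
	} else {
		result->split_index = toy_new_split_index();
	}
}
```

In this model the result only ever shares data with src when src is about to be thrown away and replaced; in the src != dst case each index keeps its own split index, so replacing or removing entries in one cannot leave the other pointing at freed memory.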

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst
  2018-04-21 19:37     ` [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst Elijah Newren
  2018-04-21 20:13       ` Elijah Newren
@ 2018-04-22 12:38       ` Duy Nguyen
  2018-04-23 17:09         ` Elijah Newren
  1 sibling, 1 reply; 78+ messages in thread
From: Duy Nguyen @ 2018-04-22 12:38 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano,
	SZEDER Gábor, Martin Ågren

On Sat, Apr 21, 2018 at 9:37 PM, Elijah Newren <newren@gmail.com> wrote:
> Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
> Since we create a temporary index in o->result, then discard o->dst_index
> and overwrite it with o->result, when o->src_index == o->dst_index it is
> safe to just reuse o->src_index's split_index for o->result.  However,
> o->src_index and o->dst_index are specified separately in order to allow
> callers to have these be different.  In such a case, reusing
> o->src_index's split_index for o->result will cause the split_index to be
> shared.  If either index then has entries replaced or removed, it will
> result in the other index referring to free()'d memory.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>
> I still haven't wrapped my head around the split_index stuff entirely, so
> it's possible that
>
>   - the performance optimization isn't even valid when src == dst.  Could
>     the original index be different enough from the result that we don't
>     want its split_index?

This really depends on the use case of course. But when git checkout
is used for switching branches, unpack-trees will be used, and unless
you switch between two vastly different branches, the updated entries
may be few enough compared to the entire index that sharing is still
good. If the result index is so different that it results in a huge
index file anyway, I believe we have code to recreate a new shared
index to keep its size down next time.

>   - there's a better, more performant fix or there is some way to actually
>     share a split_index between two independent index_state objects.

A cleaner way of doing this would be something along the lines of [1]

    move_index_extensions(&o->result, o->dst_index);

near the end of this function. This could be where we compare the
result index with the source index's shared file and see if it's worth
keeping the shared index or not. Shared index is designed to work with
huge index files, though, so any operation that goes through all index
entries will usually not be cheap. But at least it's safer.

> However, with this fix, all the tests pass both normally and under
> GIT_TEST_SPLIT_INDEX=DareISayYes.  Without this patch, when
> GIT_TEST_SPLIT_INDEX is set, my directory rename detection series will fail
> several tests, as reported by SZEDER.

Yes, the change looks good.

[1] To me the second parameter should be src_index, not dst_index.
We're copying entries from _source_ index to "result" and we should
also copy extensions from the source index. That line happens to work
only when dst_index is the same as src_index, which is the common use
case so far.

>  unpack-trees.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 79fd97074e..b670415d4c 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1284,9 +1284,20 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>         o->result.timestamp.sec = o->src_index->timestamp.sec;
>         o->result.timestamp.nsec = o->src_index->timestamp.nsec;
>         o->result.version = o->src_index->version;
> -       o->result.split_index = o->src_index->split_index;
> -       if (o->result.split_index)
> +       if (!o->src_index->split_index) {
> +               o->result.split_index = NULL;
> +       } else if (o->src_index == o->dst_index) {
> +               /*
> +                * o->dst_index (and thus o->src_index) will be discarded
> +                * and overwritten with o->result at the end of this function,
> +                * so just use src_index's split_index to avoid having to
> +                * create a new one.
> +                */
> +               o->result.split_index = o->src_index->split_index;
>                 o->result.split_index->refcount++;
> +       } else {
> +               o->result.split_index = init_split_index(&o->result);
> +       }
>         hashcpy(o->result.sha1, o->src_index->sha1);
>         o->merge_size = len;
>         mark_all_ce_unused(o->src_index);
> --
> 2.17.0.296.gaac25b4b81
>



-- 
Duy

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst
  2018-04-22 12:38       ` Duy Nguyen
@ 2018-04-23 17:09         ` Elijah Newren
  2018-04-23 17:37           ` Duy Nguyen
  0 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2018-04-23 17:09 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano,
	SZEDER Gábor, Martin Ågren

Hi,

On Sun, Apr 22, 2018 at 5:38 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>>   - there's a better, more performant fix or there is some way to actually
>>     share a split_index between two independent index_state objects.
>
> A cleaner way of doing this would be something along the lines of [1]
>
>     move_index_extensions(&o->result, o->dst_index);
>
> near the end of this function. This could be where we compare the
> result index with the source index's shared file and see if it's worth
> keeping the shared index or not. Shared index is designed to work with
> huge index files though, any operations that go through all index
> entries will usually not be cheap. But at least it's safer.

Yeah, it looks like move_index_extensions() currently has no logic for
the split_index.  Adding it sounds to me like a patch series of its
own, and I'm keen to limit additional changes since my patch series
has already broken things pretty badly once.

>> However, with this fix, all the tests pass both normally and under
>> GIT_TEST_SPLIT_INDEX=DareISayYes.  Without this patch, when
>> GIT_TEST_SPLIT_INDEX is set, my directory rename detection series will fail
>> several tests, as reported by SZEDER.
>
> Yes, the change looks good.

Great, thanks for looking over it.

> [1] To me the second parameter should be src_index, not dst_index.
> We're copying entries from _source_ index to "result" and we should
> also copy extensions from the source index. That line happens to work
> only when dst_index is the same as src_index, which is the common use
> case so far.

That makes sense; this sounds like another fix that should be
submitted.  Did you want to submit a patch making that change?  Do you
want me to?

Elijah

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 00/36] Add directory rename detection to git
  2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
                   ` (36 preceding siblings ...)
  2018-04-19 18:35 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
@ 2018-04-23 17:28 ` Elijah Newren
  2018-04-23 23:46   ` Junio C Hamano
  37 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2018-04-23 17:28 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Stefan Beller, Junio C Hamano, Linus Torvalds, Elijah Newren

On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <newren@gmail.com> wrote:
> Additional testing:
>
>   * I've re-merged all ~13k merge commits in git.git with both
>     git-2.17.0 and this version of git, comparing the results to each
>     other in detail.  (Including stdout & stderr, as well as the output
>     of subsequent commands like `git status`, `git ls-files -s`, `git
>     diff -M`, `git diff -M --staged`).  The only differences were in 23
>     merges of either git-gui or gitk which involved directory renames
>     (e.g. git-2.17.0's merge would result in files like 'lib/tools.tcl'
>     or 'po/ru.po' instead of the expected 'git-gui/lib/tools.tcl' or
>     'gitk-git/po/ru.po')
>
>   * I'm trying to do the same with linux.git, but it looks like that will
>     take nearly a week to complete...

Results after restarting[1] and throwing some big hardware at it to
get faster completion:

Out of 53288 merge commits with exactly two parents in linux.git:
  - 48491 merged identically
  - 4737 merged the same other than a few different "Auto-merging
    <filename>" output lines (as expected due to patch 35/36)
  - 53 merged the same other than different "Checking out files: ..."
    output (I just did a plain merge; no flags like --no-progress)
  - the remaining 7 commits had non-trivial merge differences, all
    attributable to directory rename detection kicking in

So, it looks good to me.  If anyone has suggestions for other testing
to do, let me know.

[1] Restarted so it could include my unpack_trees fix (from
Message-Id: 20180421193736.12722-1-newren@gmail.com) plus a couple of
minor fixup commits (fixing some testcase nits and a comment typo).

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst
  2018-04-23 17:09         ` Elijah Newren
@ 2018-04-23 17:37           ` Duy Nguyen
  2018-04-23 18:05             ` Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: Duy Nguyen @ 2018-04-23 17:37 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano,
	SZEDER Gábor, Martin Ågren

On Mon, Apr 23, 2018 at 7:09 PM, Elijah Newren <newren@gmail.com> wrote:
> Hi,
>
> On Sun, Apr 22, 2018 at 5:38 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>>>   - there's a better, more performant fix or there is some way to actually
>>>     share a split_index between two independent index_state objects.
>>
>> A cleaner way of doing this would be something along the lines of [1]
>>
>>     move_index_extensions(&o->result, o->dst_index);
>>
>> near the end of this function. This could be where we compare the
>> result index with the source index's shared file and see if it's worth
>> keeping the shared index or not. Shared index is designed to work with
>> huge index files though, any operations that go through all index
>> entries will usually not be cheap. But at least it's safer.
>
> Yeah, it looks like move_index_extensions() currently has no logic for
> the split_index.  Adding it sounds to me like a patch series of its
> own, and I'm keen to limit additional changes since my patch series
> already broke things pretty badly once already.

Oh I'm not suggesting that you do it. I was simply pointing out
something I saw while I looked at this patch and the surrounding area.
And it definitely should be done separately (by whoever), since the
merge logic is quite twisted as I understand it (and then topped off
with rename logic).

>> [1] To me the second parameter should be src_index, not dst_index.
>> We're copying entries from _source_ index to "result" and we should
>> also copy extensions from the source index. That line happens to work
>> only when dst_index is the same as src_index, which is the common use
>> case so far.
>
> That makes sense; this sounds like another fix that should be
> submitted.  Did you want to submit a patch making that change?  Do you
> want me to?

I did not look carefully enough to make sure it was right and submit a
patch. But it sounds like it could be another regression if dst_index
is now not the same as src_index (sorry, I didn't look at your whole
series and don't know if dst_index != src_index is a new thing or
not). If dst_index is new, moving extensions from that to the result
index is basically a no-op; in other words, we fail to copy the
necessary extensions over.
-- 
Duy

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 00/36] Add directory rename detection to git
  2018-04-20  3:05   ` Junio C Hamano
@ 2018-04-23 17:50     ` Elijah Newren
  2018-04-24 20:20     ` [PATCH v10 1/2] fixup! merge-recursive: fix was_tracked() to quit lying with some renamed paths Elijah Newren
  1 sibling, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-23 17:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

Hi Junio,

On Thu, Apr 19, 2018 at 8:05 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren <newren@gmail.com> wrote:
>>> This series is a reboot of the directory rename detection series that was
>>> merged to master and then reverted due to the final patch having a buggy
>>> can-skip-update check, as noted at
>>>   https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
>>> This series based on top of master.
>
> The series as-is is fine, I think, from the maintainer's point of
> view.  Thanks.

Sorry to be a pest, but now I'm unsure how I should handle the next
round.  I've got:
- two minor fixup commits that can be trivially squashed in (not yet
sent), affecting just the final few patches
- a "year" vs "years" typo in commit message of patch 32 (which is now
in pu as commit 3daa9b3eb6dd)
- an (independent-ish) unpack_trees fix (Message-ID:
20180421193736.12722-1-newren@gmail.com), possibly to be supplemented
by another fix/improvement suggested by Duy

Should I...
- send out a reroll of everything, and include the unpack_trees
fix(es) in the series?
- just resend patches 32-36 with the fixes, and renumber the patches
to include the unpack_trees stuff in the middle?
- just send the two fixup commits, ignore the minor typo, and keep the
unpack_trees fix(es) as a separate topic that we'll just want to
advance first?
- something else?

Thanks,
Elijah

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst
  2018-04-23 17:37           ` Duy Nguyen
@ 2018-04-23 18:05             ` Elijah Newren
  2018-04-24  0:24               ` [PATCH v2] unpack_trees: fix breakage when o->src_index != o->dst_index Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2018-04-23 18:05 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano,
	SZEDER Gábor, Martin Ågren

On Mon, Apr 23, 2018 at 10:37 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>>> [1] To me the second parameter should be src_index, not dst_index.
>>> We're copying entries from _source_ index to "result" and we should
>>> also copy extensions from the source index. That line happens to work
>>> only when dst_index is the same as src_index, which is the common use
>>> case so far.
>>
>> That makes sense; this sounds like another fix that should be
>> submitted.  Did you want to submit a patch making that change?  Do you
>> want me to?
>
> I did not look carefully enough to make sure it was right and submit a
> patch. But it sounds like it could be another regression if dst_index
> is now not the same as src_index (sorry, I didn't look at your whole
> series and don't know if dst_index != src_index is a new thing or
> not). If dst_index is new, moving extensions from that to the result
> index is basically a no-op; in other words, we fail to copy the
> necessary extensions over.

Ah, got it, sounds like it should be included in this patch.

A quick summary for you so you don't have to review my whole series:
- All callers of unpack_trees() have dst_index == src_index, until my
  series.
- My series makes merge-recursive.c call unpack_trees() with
  dst_index != src_index (all other callsites unchanged)
- In merge-recursive.c, dst_index points to an entirely new index, so
  yeah we'd be dropping the extensions from the original src_index.

I think the only parts of my series relevant to this change are the
first few diff hunks at
  https://public-inbox.org/git/20180419175823.7946-34-newren@gmail.com/
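
To make the dropped-extensions point concrete, here is a minimal sketch.  `toy_ext_index` and `toy_move_extensions()` are made-up stand-ins rather than git's real `index_state` or `move_index_extensions()`, and the single `extension` pointer is assumed to represent whatever extension data gets handed over.

```c
#include <stddef.h>

/* Toy stand-in for illustration; this is NOT git's real index_state. */
struct toy_ext_index {
	const char *extension;	/* stands for any extension data */
};

/*
 * Hypothetical analogue of a "move extensions" helper: hand whatever
 * `from` carries over to `to`, leaving `from` empty.
 */
static void toy_move_extensions(struct toy_ext_index *to,
				struct toy_ext_index *from)
{
	to->extension = from->extension;
	from->extension = NULL;
}
```

If `from` is a freshly initialized dst index, the result ends up with no extension at all, which is the no-op Duy describes; passing the src index instead carries the data over.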

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 00/36] Add directory rename detection to git
  2018-04-23 17:28 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
@ 2018-04-23 23:46   ` Junio C Hamano
  2018-04-24  0:15     ` Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2018-04-23 23:46 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List, Stefan Beller, Linus Torvalds

Elijah Newren <newren@gmail.com> writes:

> Out of 53288 merge commits with exactly two parents in linux.git:
>   - 48491 merged identically
>   - 4737 merged the same other than a few different "Auto-merging
>     <filename>" output lines (as expected due to patch 35/36)
>   - 53 merged the same other than different "Checking out files: ..."
>     output (I just did a plain merge; no flags like --no-progress)
>   - the remaining 7 commits had non-trivial merge differences, all
>     attributable to directory rename detection kicking in
>
> So, it looks good to me.  If anyone has suggestions for other testing
> to do, let me know.

There must have been some merges that stopped due to conflicts among
those 50k, and I am interested to hear how they were different.  Or
are they included in the above numbers (e.g. among 48491 there were
ones that stopped with conflicts, but the results these conflicted
merges left in the working tree and the index were identical)?

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 00/36] Add directory rename detection to git
  2018-04-23 23:46   ` Junio C Hamano
@ 2018-04-24  0:15     ` Elijah Newren
  0 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-24  0:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Stefan Beller, Linus Torvalds

On Mon, Apr 23, 2018 at 4:46 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Elijah Newren <newren@gmail.com> writes:
>
>> Out of 53288 merge commits with exactly two parents in linux.git:
>>   - 48491 merged identically
>>   - 4737 merged the same other than a few different "Auto-merging
>>     <filename>" output lines (as expected due to patch 35/36)
>>   - 53 merged the same other than different "Checking out files: ..."
>>     output (I just did a plain merge; no flags like --no-progress)
>>   - the remaining 7 commits had non-trivial merge differences, all
>>     attributable to directory rename detection kicking in
>>
>> So, it looks good to me.  If anyone has suggestions for other testing
>> to do, let me know.
>
> There must have been some merges that stopped due to conflicts among
> those 50k, and I am interested to hear how they were different.  Or
> are they included in the above numbers (e.g. among 48491 there were
> ones that stopped with conflicts, but the results these conflicted
> merges left in the working tree and the index were identical)?

They are included in the categories listed above.  What my comparison
did was for each of the 53288 commits:

1) Do the merge, capture stdout and stderr, and the exit status
2) Record output of 'git ls-files -s'
3) Record output of 'git status | grep -v detached'
4) Record contents of every untracked file (could be created e.g. due
to D/F conflicts)
5) Record contents of 'git diff -M --staged'
6) Record contents of 'git diff -M'
(all of this stuff in 1-6 is recorded into a single text file with
some nice headers to split the sections up).

7) Repeat steps 1-6 with the new version of git, but recording into a
different filename
8) Compare the two text files to see what was different between the
two merges, if anything.
(If they are different, save the files somewhere for me to look at later.)


Then after each merge, there's a bunch of cleanup to make sure things
are in a pristine state for the next merge.
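
Step 8 is the only part that is not just running git commands.  As a rough sketch of that comparison (not the script I actually used), assume the two recordings from steps 1-6 and step 7 have already been read into memory as strings; `first_differing_line()` is a made-up helper name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the 1-based number of the first line at which the two
 * recordings differ, or 0 if they match.  A missing newline at the
 * very end of one recording is not treated as a difference.
 */
static size_t first_differing_line(const char *a, const char *b)
{
	size_t line = 1;

	while (*a && *b) {
		const char *end_a = strchr(a, '\n');
		const char *end_b = strchr(b, '\n');
		size_t len_a = end_a ? (size_t)(end_a - a) : strlen(a);
		size_t len_b = end_b ? (size_t)(end_b - b) : strlen(b);

		if (len_a != len_b || memcmp(a, b, len_a))
			return line;
		/* Skip past the line and its newline, if it has one. */
		a += len_a + (end_a ? 1 : 0);
		b += len_b + (end_b ? 1 : 0);
		line++;
	}
	/* One recording has extra lines left over. */
	return (*a || *b) ? line : 0;
}
```

A return value of 0 corresponds to the "merged identically" bucket; anything else means the pair of recordings gets saved for manual inspection.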

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH v2] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-23 18:05             ` Elijah Newren
@ 2018-04-24  0:24               ` Elijah Newren
  2018-04-24  1:51                 ` Junio C Hamano
  2018-04-24  3:05                 ` Junio C Hamano
  0 siblings, 2 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-24  0:24 UTC (permalink / raw)
  To: git; +Cc: pclouds, gitster, Elijah Newren

Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
The code in unpack_trees() does not correctly handle them being different.
There are two separate issues:

First, there is the possibility of memory corruption.  Since
unpack_trees() creates a temporary index in o->result and then discards
o->dst_index and overwrites it with o->result, in the special case that
o->src_index == o->dst_index, it is safe to just reuse o->src_index's
split_index for o->result.  However, when src and dst are different,
reusing o->src_index's split_index for o->result will cause the
split_index to be shared.  If either index then has entries replaced or
removed, it will result in the other index referring to free()'d memory.

Second, we can lose the index extensions.  Previously, we were moving
index extensions from o->dst_index to o->result.  Since o->src_index is
the one that will have the necessary extensions (o->dst_index is likely to
be a new temporary index created to store the results), we should be
moving the index extensions from there.  (Thanks to Duy Nguyen for
noticing and suggesting this fix.)

Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---

Marked as PATCH v2, though I marked the previous one as "RFC PATCH v10
32.5/36" because I thought I was going to put it in my series.  But it is
an independent fix that my series needs.

Also, I reran my merge-all-linux.git merges comparison[1]; as expected,
this updated patch didn't change the results.

[1] https://public-inbox.org/git/CABPp-BHMt1Hjr8A_wkxvSExV9ALgG5032vV5uEE2-HtpYuA9QQ@mail.gmail.com/

 unpack-trees.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index e73745051e..08f6cab82e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1284,9 +1284,20 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->result.timestamp.sec = o->src_index->timestamp.sec;
 	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
 	o->result.version = o->src_index->version;
-	o->result.split_index = o->src_index->split_index;
-	if (o->result.split_index)
+	if (!o->src_index->split_index) {
+		o->result.split_index = NULL;
+	} else if (o->src_index == o->dst_index) {
+		/*
+		 * o->dst_index (and thus o->src_index) will be discarded
+		 * and overwritten with o->result at the end of this function,
+		 * so just use src_index's split_index to avoid having to
+		 * create a new one.
+		 */
+		o->result.split_index = o->src_index->split_index;
 		o->result.split_index->refcount++;
+	} else {
+		o->result.split_index = init_split_index(&o->result);
+	}
 	hashcpy(o->result.sha1, o->src_index->sha1);
 	o->merge_size = len;
 	mark_all_ce_unused(o->src_index);
@@ -1412,7 +1423,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 						  WRITE_TREE_SILENT |
 						  WRITE_TREE_REPAIR);
 		}
-		move_index_extensions(&o->result, o->dst_index);
+		move_index_extensions(&o->result, o->src_index);
 		discard_index(o->dst_index);
 		*o->dst_index = o->result;
 	} else {
-- 
2.17.0.296.gaac25b4b81


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-24  0:24               ` [PATCH v2] unpack_trees: fix breakage when o->src_index != o->dst_index Elijah Newren
@ 2018-04-24  1:51                 ` Junio C Hamano
  2018-04-24  3:05                 ` Junio C Hamano
  1 sibling, 0 replies; 78+ messages in thread
From: Junio C Hamano @ 2018-04-24  1:51 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, pclouds

Elijah Newren <newren@gmail.com> writes:

> Marked as PATCH v2, though I marked the previous one as "RFC PATCH v10
> 32.5/36" because I thought I was going to put it in my series.  But it is
> an independent fix that my series needs.

Thanks.  Let's take this before the remainder of the series ;-)

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v2] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-24  0:24               ` [PATCH v2] unpack_trees: fix breakage when o->src_index != o->dst_index Elijah Newren
  2018-04-24  1:51                 ` Junio C Hamano
@ 2018-04-24  3:05                 ` Junio C Hamano
  2018-04-24  6:50                   ` [PATCH v3] " Elijah Newren
  1 sibling, 1 reply; 78+ messages in thread
From: Junio C Hamano @ 2018-04-24  3:05 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, pclouds

Elijah Newren <newren@gmail.com> writes:

>  unpack-trees.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index e73745051e..08f6cab82e 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1284,9 +1284,20 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  	o->result.timestamp.sec = o->src_index->timestamp.sec;
>  	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
>  	o->result.version = o->src_index->version;
> -	o->result.split_index = o->src_index->split_index;
> -	if (o->result.split_index)
> +	if (!o->src_index->split_index) {
> +		o->result.split_index = NULL;
> +	} else if (o->src_index == o->dst_index) {
> +		/*
> +		 * o->dst_index (and thus o->src_index) will be discarded
> +		 * and overwritten with o->result at the end of this function,
> +		 * so just use src_index's split_index to avoid having to
> +		 * create a new one.
> +		 */
> +		o->result.split_index = o->src_index->split_index;
>  		o->result.split_index->refcount++;
> +	} else {
> +		o->result.split_index = init_split_index(&o->result);
> +	}
>  	hashcpy(o->result.sha1, o->src_index->sha1);
>  	o->merge_size = len;
>  	mark_all_ce_unused(o->src_index);
> @@ -1412,7 +1423,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  						  WRITE_TREE_SILENT |
>  						  WRITE_TREE_REPAIR);
>  		}
> -		move_index_extensions(&o->result, o->dst_index);
> +		move_index_extensions(&o->result, o->src_index);

Can src_index be NULL here?  I am getting segfaults everywhere,
starting from t0000-basic that populates the index by reading one
tree object via read-tree.

>  		discard_index(o->dst_index);
>  		*o->dst_index = o->result;
>  	} else {

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH v3] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-24  3:05                 ` Junio C Hamano
@ 2018-04-24  6:50                   ` " Elijah Newren
  2018-04-29 18:05                     ` Duy Nguyen
  0 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2018-04-24  6:50 UTC (permalink / raw)
  To: gitster; +Cc: git, pclouds, Elijah Newren

Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
The code in unpack_trees() does not correctly handle them being different.
There are two separate issues:

First, there is the possibility of memory corruption.  Since
unpack_trees() creates a temporary index in o->result and then discards
o->dst_index and overwrites it with o->result, in the special case that
o->src_index == o->dst_index, it is safe to just reuse o->src_index's
split_index for o->result.  However, when src and dst are different,
reusing o->src_index's split_index for o->result will cause the
split_index to be shared.  If either index then has entries replaced or
removed, it will result in the other index referring to free()'d memory.

Second, we can lose the index extensions.  Previously, we were moving
index extensions from o->dst_index to o->result.  Since o->src_index is
the one that will have the necessary extensions (o->dst_index is likely to
be a new temporary index created to store the results), we should be
moving the index extensions from there.

Signed-off-by: Elijah Newren <newren@gmail.com>
---

Differences from v2:
  - Don't NULLify src_index until we're done using it
  - Actually built and tested[1]

But it now passes the testsuite on both linux and mac[2], and I even re-merged
all 53288 merge commits in linux.git (with a merge of this patch together with
the directory rename detection series) for good measure.  [Only 7 commits
showed a difference, all due to directory rename detection kicking in.]

[1] Turns out that getting all fancy with an m4.10xlarge and nice levels of
parallelization is great until you realize that your new setup omitted a
critical step, leaving you running a slightly stale version of git instead...
:-(

[2] Actually, I get two test failures on mac from t0050-filesystem.sh, both
with unicode normalization tests, but those two tests fail before my changes
too.  All the other tests pass.

 unpack-trees.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index e73745051e..49526d70aa 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1284,9 +1284,20 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	o->result.timestamp.sec = o->src_index->timestamp.sec;
 	o->result.timestamp.nsec = o->src_index->timestamp.nsec;
 	o->result.version = o->src_index->version;
-	o->result.split_index = o->src_index->split_index;
-	if (o->result.split_index)
+	if (!o->src_index->split_index) {
+		o->result.split_index = NULL;
+	} else if (o->src_index == o->dst_index) {
+		/*
+		 * o->dst_index (and thus o->src_index) will be discarded
+		 * and overwritten with o->result at the end of this function,
+		 * so just use src_index's split_index to avoid having to
+		 * create a new one.
+		 */
+		o->result.split_index = o->src_index->split_index;
 		o->result.split_index->refcount++;
+	} else {
+		o->result.split_index = init_split_index(&o->result);
+	}
 	hashcpy(o->result.sha1, o->src_index->sha1);
 	o->merge_size = len;
 	mark_all_ce_unused(o->src_index);
@@ -1401,7 +1412,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		}
 	}
 
-	o->src_index = NULL;
 	ret = check_updates(o) ? (-2) : 0;
 	if (o->dst_index) {
 		if (!ret) {
@@ -1412,12 +1422,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 						  WRITE_TREE_SILENT |
 						  WRITE_TREE_REPAIR);
 		}
-		move_index_extensions(&o->result, o->dst_index);
+		move_index_extensions(&o->result, o->src_index);
 		discard_index(o->dst_index);
 		*o->dst_index = o->result;
 	} else {
 		discard_index(&o->result);
 	}
+	o->src_index = NULL;
 
 done:
 	clear_exclude_list(&el);
-- 
2.17.0.253.g32393f1d0a


* [PATCH v10 1/2] fixup! merge-recursive: fix was_tracked() to quit lying with some renamed paths
  2018-04-20  3:05   ` Junio C Hamano
  2018-04-23 17:50     ` Elijah Newren
@ 2018-04-24 20:20     ` Elijah Newren
  2018-04-24 20:21       ` [PATCH v10 2/2] fixup! t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
  1 sibling, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2018-04-24 20:20 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, martin.agren, Elijah Newren

---
 merge-recursive.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1de8dc1c53..f2cbad4f10 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3146,10 +3146,10 @@ int merge_trees(struct merge_options *o,
 
 	/* Free the extra index left from git_merge_trees() */
 	/*
-	 * FIXME: Need to also data allocated by setup_unpack_trees_porcelain()
-	 * tucked away in o->unpack_opts.msgs, but the problem is that only
-	 * half of it refers to dynamically allocated data, while the other
-	 * half points at static strings.
+	 * FIXME: Need to also free data allocated by
+	 * setup_unpack_trees_porcelain() tucked away in o->unpack_opts.msgs,
+	 * but the problem is that only half of it refers to dynamically
+	 * allocated data, while the other half points at static strings.
 	 */
 	discard_index(&o->orig_index);
 
-- 
2.17.0.295.g791b7256b2.dirty


* [PATCH v10 2/2] fixup! t6046: testcases checking whether updates can be skipped in a merge
  2018-04-24 20:20     ` [PATCH v10 1/2] fixup! merge-recursive: fix was_tracked() to quit lying with some renamed paths Elijah Newren
@ 2018-04-24 20:21       ` Elijah Newren
  0 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-04-24 20:21 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, martin.agren, Elijah Newren

---
 t/t6046-merge-skip-unneeded-updates.sh | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/t/t6046-merge-skip-unneeded-updates.sh b/t/t6046-merge-skip-unneeded-updates.sh
index 880cd782d7..fcefffcaec 100755
--- a/t/t6046-merge-skip-unneeded-updates.sh
+++ b/t/t6046-merge-skip-unneeded-updates.sh
@@ -41,7 +41,7 @@ test_expect_success '1a-setup: Modify(A)/Modify(B), change on B subset of A' '
 	(
 		cd 1a &&
 
-		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b &&
 		git add b &&
 		test_tick &&
 		git commit -m "O" &&
@@ -138,7 +138,7 @@ test_expect_success '2a-setup: Modify(A)/rename(B)' '
 	(
 		cd 2a &&
 
-		test_seq 1 10 >b
+		test_seq 1 10 >b &&
 		git add b &&
 		test_tick &&
 		git commit -m "O" &&
@@ -148,7 +148,7 @@ test_expect_success '2a-setup: Modify(A)/rename(B)' '
 		git branch B &&
 
 		git checkout A &&
-		test_seq 1 11 > b &&
+		test_seq 1 11 >b &&
 		git add b &&
 		test_tick &&
 		git commit -m "A" &&
@@ -229,7 +229,7 @@ test_expect_success '2b-setup: Rename+Mod(A)/Mod(B), B mods subset of A' '
 	(
 		cd 2b &&
 
-		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b &&
 		git add b &&
 		test_tick &&
 		git commit -m "O" &&
@@ -337,7 +337,7 @@ test_expect_success '2c-setup: Modify b & add c VS rename b->c' '
 	(
 		cd 2c &&
 
-		test_seq 1 10 >b
+		test_seq 1 10 >b &&
 		git add b &&
 		test_tick &&
 		git commit -m "O" &&
@@ -443,7 +443,7 @@ test_expect_success '3a-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
 		git branch B &&
 
 		git checkout A &&
-		test_seq 1 11 > bq &&
+		test_seq 1 11 >bq &&
 		git add bq &&
 		git mv bq foo/ &&
 		test_tick &&
@@ -542,7 +542,7 @@ test_expect_success '3b-setup: bq_1->foo/bq_2 on A, foo/->bar/ on B' '
 		git commit -m "A" &&
 
 		git checkout B &&
-		test_seq 1 11 > bq &&
+		test_seq 1 11 >bq &&
 		git add bq &&
 		git mv foo/ bar/ &&
 		test_tick &&
@@ -624,7 +624,7 @@ test_expect_success '4a-setup: Change on A, change on B subset of A, dirty mods
 	(
 		cd 4a &&
 
-		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b &&
 		git add b &&
 		test_tick &&
 		git commit -m "O" &&
@@ -698,7 +698,7 @@ test_expect_success '4b-setup: Rename+Mod(A)/Mod(B), change on B subset of A, di
 	(
 		cd 4b &&
 
-		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b
+		test_write_lines 1 2 3 4 5 6 7 8 9 10 >b &&
 		git add b &&
 		test_tick &&
 		git commit -m "O" &&
-- 
2.17.0.295.g791b7256b2.dirty


* Re: [PATCH v3] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-24  6:50                   ` [PATCH v3] " Elijah Newren
@ 2018-04-29 18:05                     ` Duy Nguyen
  2018-04-29 20:53                       ` Johannes Schindelin
  0 siblings, 1 reply; 78+ messages in thread
From: Duy Nguyen @ 2018-04-29 18:05 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Junio C Hamano, Git Mailing List

On Tue, Apr 24, 2018 at 8:50 AM, Elijah Newren <newren@gmail.com> wrote:
> Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
> The code in unpack_trees() does not correctly handle them being different.
> There are two separate issues:
>
> First, there is the possibility of memory corruption.  Since
> unpack_trees() creates a temporary index in o->result and then discards
> o->dst_index and overwrites it with o->result, in the special case that
> o->src_index == o->dst_index, it is safe to just reuse o->src_index's
> split_index for o->result.  However, when src and dst are different,
> reusing o->src_index's split_index for o->result will cause the
> split_index to be shared.  If either index then has entries replaced or
> removed, it will result in the other index referring to free()'d memory.
>
> Second, we can drop the index extensions.  Previously, we were moving
> index extensions from o->dst_index to o->result.  Since o->src_index is
> the one that will have the necessary extensions (o->dst_index is likely to
> be a new temporary index created to store the results), we should be
> moving the index extensions from there.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>
> Differences from v2:
>   - Don't NULLify src_index until we're done using it
>   - Actually built and tested[1]
>
> But it now passes the testsuite on both linux and mac[2], and I even re-merged
> all 53288 merge commits in linux.git (with a merge of this patch together with
> the directory rename detection series) for good measure.  [Only 7 commits
> showed a difference, all due to directory rename detection kicking in.]
>
> [1] Turns out that getting all fancy with an m4.10xlarge and nice levels of
> parallelization are great until you realize that your new setup omitted a
> critical step, leaving you running a slightly stale version of git instead...
> :-(
>
> [2] Actually, I get two test failures on mac from t0050-filesystem.sh, both
> with unicode normalization tests, but those two tests fail before my changes
> too.  All the other tests pass.
>
>  unpack-trees.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index e73745051e..49526d70aa 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1284,9 +1284,20 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>         o->result.timestamp.sec = o->src_index->timestamp.sec;
>         o->result.timestamp.nsec = o->src_index->timestamp.nsec;
>         o->result.version = o->src_index->version;
> -       o->result.split_index = o->src_index->split_index;
> -       if (o->result.split_index)
> +       if (!o->src_index->split_index) {
> +               o->result.split_index = NULL;
> +       } else if (o->src_index == o->dst_index) {
> +               /*
> +                * o->dst_index (and thus o->src_index) will be discarded
> +                * and overwritten with o->result at the end of this function,
> +                * so just use src_index's split_index to avoid having to
> +                * create a new one.
> +                */
> +               o->result.split_index = o->src_index->split_index;
>                 o->result.split_index->refcount++;
> +       } else {
> +               o->result.split_index = init_split_index(&o->result);
> +       }
>         hashcpy(o->result.sha1, o->src_index->sha1);
>         o->merge_size = len;
>         mark_all_ce_unused(o->src_index);
> @@ -1401,7 +1412,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>                 }
>         }
>
> -       o->src_index = NULL;
>         ret = check_updates(o) ? (-2) : 0;
>         if (o->dst_index) {
>                 if (!ret) {
> @@ -1412,12 +1422,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>                                                   WRITE_TREE_SILENT |
>                                                   WRITE_TREE_REPAIR);
>                 }
> -               move_index_extensions(&o->result, o->dst_index);
> +               move_index_extensions(&o->result, o->src_index);

While this looks like the right thing to do on paper, I believe it's
actually broken for a specific case of untracked cache. In short,
please do not touch this line. I will send a patch to revert
edf3b90553 (unpack-trees: preserve index extensions - 2017-05-08),
which essentially deletes this line, with proper explanation and
perhaps a test if I could come up with one.

When we update the index, we depend on the fact that all updates must
invalidate the right untracked cache correctly. In this unpack
operation, we start copying entries over from src to result. Since
'result' (at least from the beginning) does not have an untracked
cache, it has nothing to invalidate when we copy entries over. By the
time we are done preparing 'result', what's recorded in src's (or
dst's for that matter) untracked cache may or may not apply to the
'result' index anymore. This copying only leads to more problems when
untracked cache is used.

Sorry I didn't notice this earlier :(

>                 discard_index(o->dst_index);
>                 *o->dst_index = o->result;
>         } else {
>                 discard_index(&o->result);
>         }
> +       o->src_index = NULL;
>
>  done:
>         clear_exclude_list(&el);
> --
> 2.17.0.253.g32393f1d0a
>
-- 
Duy

* Re: [PATCH v3] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-29 18:05                     ` Duy Nguyen
@ 2018-04-29 20:53                       ` Johannes Schindelin
  2018-04-30 14:42                         ` Duy Nguyen
  0 siblings, 1 reply; 78+ messages in thread
From: Johannes Schindelin @ 2018-04-29 20:53 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Elijah Newren, Junio C Hamano, Git Mailing List

Hi Duy,

On Sun, 29 Apr 2018, Duy Nguyen wrote:

> On Tue, Apr 24, 2018 at 8:50 AM, Elijah Newren <newren@gmail.com> wrote:
> > Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
> > The code in unpack_trees() does not correctly handle them being different.
> > There are two separate issues:
> >
> > First, there is the possibility of memory corruption.  Since
> > unpack_trees() creates a temporary index in o->result and then discards
> > o->dst_index and overwrites it with o->result, in the special case that
> > o->src_index == o->dst_index, it is safe to just reuse o->src_index's
> > split_index for o->result.  However, when src and dst are different,
> > reusing o->src_index's split_index for o->result will cause the
> > split_index to be shared.  If either index then has entries replaced or
> > removed, it will result in the other index referring to free()'d memory.
> >
> > Second, we can drop the index extensions.  Previously, we were moving
> > index extensions from o->dst_index to o->result.  Since o->src_index is
> > the one that will have the necessary extensions (o->dst_index is likely to
> > be a new temporary index created to store the results), we should be
> > moving the index extensions from there.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >
> > Differences from v2:
> >   - Don't NULLify src_index until we're done using it
> >   - Actually built and tested[1]
> >
> > But it now passes the testsuite on both linux and mac[2], and I even re-merged
> > all 53288 merge commits in linux.git (with a merge of this patch together with
> > the directory rename detection series) for good measure.  [Only 7 commits
> > showed a difference, all due to directory rename detection kicking in.]
> >
> > [1] Turns out that getting all fancy with an m4.10xlarge and nice levels of
> > parallelization are great until you realize that your new setup omitted a
> > critical step, leaving you running a slightly stale version of git instead...
> > :-(
> >
> > [2] Actually, I get two test failures on mac from t0050-filesystem.sh, both
> > with unicode normalization tests, but those two tests fail before my changes
> > too.  All the other tests pass.
> >
> >  unpack-trees.c | 19 +++++++++++++++----
> >  1 file changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/unpack-trees.c b/unpack-trees.c
> > index e73745051e..49526d70aa 100644
> > --- a/unpack-trees.c
> > +++ b/unpack-trees.c
> > @@ -1284,9 +1284,20 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
> >         o->result.timestamp.sec = o->src_index->timestamp.sec;
> >         o->result.timestamp.nsec = o->src_index->timestamp.nsec;
> >         o->result.version = o->src_index->version;
> > -       o->result.split_index = o->src_index->split_index;
> > -       if (o->result.split_index)
> > +       if (!o->src_index->split_index) {
> > +               o->result.split_index = NULL;
> > +       } else if (o->src_index == o->dst_index) {
> > +               /*
> > +                * o->dst_index (and thus o->src_index) will be discarded
> > +                * and overwritten with o->result at the end of this function,
> > +                * so just use src_index's split_index to avoid having to
> > +                * create a new one.
> > +                */
> > +               o->result.split_index = o->src_index->split_index;
> >                 o->result.split_index->refcount++;
> > +       } else {
> > +               o->result.split_index = init_split_index(&o->result);
> > +       }
> >         hashcpy(o->result.sha1, o->src_index->sha1);
> >         o->merge_size = len;
> >         mark_all_ce_unused(o->src_index);
> > @@ -1401,7 +1412,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
> >                 }
> >         }
> >
> > -       o->src_index = NULL;
> >         ret = check_updates(o) ? (-2) : 0;
> >         if (o->dst_index) {
> >                 if (!ret) {
> > @@ -1412,12 +1422,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
> >                                                   WRITE_TREE_SILENT |
> >                                                   WRITE_TREE_REPAIR);
> >                 }
> > -               move_index_extensions(&o->result, o->dst_index);
> > +               move_index_extensions(&o->result, o->src_index);
> 
> While this looks like the right thing to do on paper, I believe it's
> actually broken for a specific case of untracked cache. In short,
> please do not touch this line. I will send a patch to revert
> edf3b90553 (unpack-trees: preserve index extensions - 2017-05-08),
> which essentially deletes this line, with proper explanation and
> perhaps a test if I could come up with one.
> 
> When we update the index, we depend on the fact that all updates must
> invalidate the right untracked cache correctly. In this unpack
> operations, we start copying entries over from src to result. Since
> 'result' (at least from the beginning) does not have an untracked
> cache, it has nothing to invalidate when we copy entries over. By the
> time we have done preparing 'result', what's recorded in src's (or
> dst's for that matter) untracked cache may or may not apply to
> 'result'  index anymore. This copying only leads to more problems when
> untracked cache is used.

Is there really no way to invalidate just individual entries?

I have a couple of worktrees which are *huge*. And edf3b90553 really
helped relieve the pain a bit when running `git status`. Now you say that
even a `git checkout -b new-branch` would blow the untracked cache away
again?

It would be *really* nice if we could prevent that performance regression
somehow.

Ciao,
Dscho

* Re: [PATCH v3] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-29 20:53                       ` Johannes Schindelin
@ 2018-04-30 14:42                         ` Duy Nguyen
  2018-04-30 14:45                           ` Duy Nguyen
  0 siblings, 1 reply; 78+ messages in thread
From: Duy Nguyen @ 2018-04-30 14:42 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Elijah Newren, Junio C Hamano, Git Mailing List

On Sun, Apr 29, 2018 at 10:53 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>> > @@ -1412,12 +1422,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>> >                                                   WRITE_TREE_SILENT |
>> >                                                   WRITE_TREE_REPAIR);
>> >                 }
>> > -               move_index_extensions(&o->result, o->dst_index);
>> > +               move_index_extensions(&o->result, o->src_index);
>>
>> While this looks like the right thing to do on paper, I believe it's
>> actually broken for a specific case of untracked cache. In short,
>> please do not touch this line. I will send a patch to revert
>> edf3b90553 (unpack-trees: preserve index extensions - 2017-05-08),
>> which essentially deletes this line, with proper explanation and
>> perhaps a test if I could come up with one.
>>
>> When we update the index, we depend on the fact that all updates must
>> invalidate the right untracked cache correctly. In this unpack
>> operations, we start copying entries over from src to result. Since
>> 'result' (at least from the beginning) does not have an untracked
>> cache, it has nothing to invalidate when we copy entries over. By the
>> time we have done preparing 'result', what's recorded in src's (or
>> dst's for that matter) untracked cache may or may not apply to
>> 'result'  index anymore. This copying only leads to more problems when
>> untracked cache is used.
>
> Is there really no way to invalidate just individual entries?

Grr.... the short answer is the current code (i.e. without Elijah's
changes) works but in a twisted way. So you get to keep untracked
cache in the end.

I was right about the invalidation stuff. I knew about
invalidate_ce_path() in this file. What I didn't remember was this
function actually invalidates entries from the _source_ index, not the
result one. What kind of logic is that? You copy/move entries from
source to result, then you go invalidate the source. Since the original
move_index_extensions() call moves extensions from the source, these
are already properly invalidated (both untracked cache and cache
tree), so it looks like it does the right thing. Two wrongs make a
right, I guess.

Sorry for venting. I was not happy with what I found. And sorry for
wasting your time making this move_index.. change then remove it.

> I have a couple of worktrees which are *huge*. And edf3b90553 really
> helped relieve the pain a bit when running `git status`. Now you say that
> even a `git checkout -b new-branch` would blow the untracked cache away
> again?
>
> It would be *really* nice if we could prevent that performance regression
> somehow.
>
> Ciao,
> Dscho



-- 
Duy

* Re: [PATCH v3] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-30 14:42                         ` Duy Nguyen
@ 2018-04-30 14:45                           ` Duy Nguyen
  2018-04-30 16:19                             ` Elijah Newren
  0 siblings, 1 reply; 78+ messages in thread
From: Duy Nguyen @ 2018-04-30 14:45 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Elijah Newren, Junio C Hamano, Git Mailing List

On Mon, Apr 30, 2018 at 4:42 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Sun, Apr 29, 2018 at 10:53 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>>> > @@ -1412,12 +1422,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>>> >                                                   WRITE_TREE_SILENT |
>>> >                                                   WRITE_TREE_REPAIR);
>>> >                 }
>>> > -               move_index_extensions(&o->result, o->dst_index);
>>> > +               move_index_extensions(&o->result, o->src_index);
>>>
>>> While this looks like the right thing to do on paper, I believe it's
>>> actually broken for a specific case of untracked cache. In short,
>>> please do not touch this line. I will send a patch to revert
>>> edf3b90553 (unpack-trees: preserve index extensions - 2017-05-08),
>>> which essentially deletes this line, with proper explanation and
>>> perhaps a test if I could come up with one.
>>>
>>> When we update the index, we depend on the fact that all updates must
>>> invalidate the right untracked cache correctly. In this unpack
>>> operations, we start copying entries over from src to result. Since
>>> 'result' (at least from the beginning) does not have an untracked
>>> cache, it has nothing to invalidate when we copy entries over. By the
>>> time we have done preparing 'result', what's recorded in src's (or
>>> dst's for that matter) untracked cache may or may not apply to
>>> 'result'  index anymore. This copying only leads to more problems when
>>> untracked cache is used.
>>
>> Is there really no way to invalidate just individual entries?
>
> Grr.... the short answer is the current code (i.e. without Elijah's
> changes) works but in a twisted way. So you get to keep untracked
> cache in the end.

GAAAHH.. it works _with_ Elijah's changes (since he made the change
from dst to src) not without (and no performance regression). This
file really messes my brain up.
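
A toy model may make the dst-versus-src point easier to follow. This is
only a sketch under loud assumptions: toy_cache, toy_index,
toy_invalidate(), toy_move_extension() and toy_unpack() are hypothetical
names invented for illustration and are far simpler than the real
structures in unpack-trees.c. The only behavior borrowed from the
discussion above is that invalidation is applied to the source index's
cache while 'result' is being built.

```c
#include <stddef.h>

/* Toy stand-ins; hypothetical, not git's real types. */
struct toy_cache { int valid; };
struct toy_index { struct toy_cache *untracked; };

/* As described in the thread: invalidation hits the source's cache. */
static void toy_invalidate(struct toy_index *src)
{
	if (src->untracked)
		src->untracked->valid = 0;
}

/* Move the cache pointer to 'result', leaving 'from' without one. */
static void toy_move_extension(struct toy_index *result,
			       struct toy_index *from)
{
	result->untracked = from->untracked;
	from->untracked = NULL;
}

/*
 * Returns 1 if 'result' ends up holding a cache that has seen the
 * invalidation, 0 if it holds a cache that still claims to be valid.
 */
static int toy_unpack(int src_is_dst, int move_from_src)
{
	struct toy_cache src_cache = { 1 }, dst_cache = { 1 };
	struct toy_index src = { &src_cache }, dst = { &dst_cache };
	struct toy_index result = { NULL };
	struct toy_index *dstp = src_is_dst ? &src : &dst;

	toy_invalidate(&src);	/* an entry changed while building result */
	toy_move_extension(&result, move_from_src ? &src : dstp);
	return result.untracked && !result.untracked->valid;
}
```

In this model toy_unpack(1, 0) and toy_unpack(0, 1) both hand 'result'
the invalidated cache, while toy_unpack(0, 0) hands it a cache that never
saw the invalidation. That matches the observation that moving from dst
only works while src and dst are the same index.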
-- 
Duy

* Re: [PATCH v3] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-30 14:45                           ` Duy Nguyen
@ 2018-04-30 16:19                             ` Elijah Newren
  2018-04-30 16:29                               ` Duy Nguyen
  0 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2018-04-30 16:19 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Johannes Schindelin, Junio C Hamano, Git Mailing List

Hi Duy,

On Mon, Apr 30, 2018 at 7:45 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Mon, Apr 30, 2018 at 4:42 PM, Duy Nguyen <pclouds@gmail.com> wrote:
>> On Sun, Apr 29, 2018 at 10:53 PM, Johannes Schindelin
>> <Johannes.Schindelin@gmx.de> wrote:
>>>> > @@ -1412,12 +1422,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>>>> >                                                   WRITE_TREE_SILENT |
>>>> >                                                   WRITE_TREE_REPAIR);
>>>> >                 }
>>>> > -               move_index_extensions(&o->result, o->dst_index);
>>>> > +               move_index_extensions(&o->result, o->src_index);
>>>>
>>>> While this looks like the right thing to do on paper, I believe it's
>>>> actually broken for a specific case of untracked cache. In short,
>>>> please do not touch this line. I will send a patch to revert
>>>> edf3b90553 (unpack-trees: preserve index extensions - 2017-05-08),
>>>> which essentially deletes this line, with proper explanation and
>>>> perhaps a test if I could come up with one.
>>>>
>>>> When we update the index, we depend on the fact that all updates must
>>>> invalidate the right untracked cache correctly. In this unpack
>>>> operations, we start copying entries over from src to result. Since
>>>> 'result' (at least from the beginning) does not have an untracked
>>>> cache, it has nothing to invalidate when we copy entries over. By the
>>>> time we have done preparing 'result', what's recorded in src's (or
>>>> dst's for that matter) untracked cache may or may not apply to
>>>> 'result'  index anymore. This copying only leads to more problems when
>>>> untracked cache is used.
>>>
>>> Is there really no way to invalidate just individual entries?
>>
>> Grr.... the short answer is the current code (i.e. without Elijah's
>> changes) works but in a twisted way. So you get to keep untracked
>> cache in the end.
>
> GAAAHH.. it works _with_ Elijah's changes (since he made the change
> from dst to src) not without (and no performance regression).

So...is that an Acked-by for the patch, or does the "two wrongs make a
right, I guess" comment suggest that we should still drop the
move_index_extensions change (essentially reverting to v1 of the PATCH
as found at 20180421193736.12722-1-newren@gmail.com), and you'll fix
things up further in a separate series?

> This file really messes my brain up.

I'm glad I'm not the only one.  :-)


Elijah

* Re: [PATCH v3] unpack_trees: fix breakage when o->src_index != o->dst_index
  2018-04-30 16:19                             ` Elijah Newren
@ 2018-04-30 16:29                               ` Duy Nguyen
  0 siblings, 0 replies; 78+ messages in thread
From: Duy Nguyen @ 2018-04-30 16:29 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Johannes Schindelin, Junio C Hamano, Git Mailing List

On Mon, Apr 30, 2018 at 6:19 PM, Elijah Newren <newren@gmail.com> wrote:
> Hi Duy,
>
> On Mon, Apr 30, 2018 at 7:45 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>> On Mon, Apr 30, 2018 at 4:42 PM, Duy Nguyen <pclouds@gmail.com> wrote:
>>> On Sun, Apr 29, 2018 at 10:53 PM, Johannes Schindelin
>>> <Johannes.Schindelin@gmx.de> wrote:
>>>>> > @@ -1412,12 +1422,13 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>>>>> >                                                   WRITE_TREE_SILENT |
>>>>> >                                                   WRITE_TREE_REPAIR);
>>>>> >                 }
>>>>> > -               move_index_extensions(&o->result, o->dst_index);
>>>>> > +               move_index_extensions(&o->result, o->src_index);
>>>>>
>>>>> While this looks like the right thing to do on paper, I believe it's
>>>>> actually broken for a specific case of untracked cache. In short,
>>>>> please do not touch this line. I will send a patch to revert
>>>>> edf3b90553 (unpack-trees: preserve index extensions - 2017-05-08),
>>>>> which essentially deletes this line, with proper explanation and
>>>>> perhaps a test if I could come up with one.
>>>>>
>>>>> When we update the index, we depend on the fact that all updates must
>>>>> invalidate the right untracked cache correctly. In this unpack
>>>>> operations, we start copying entries over from src to result. Since
>>>>> 'result' (at least from the beginning) does not have an untracked
>>>>> cache, it has nothing to invalidate when we copy entries over. By the
>>>>> time we have done preparing 'result', what's recorded in src's (or
>>>>> dst's for that matter) untracked cache may or may not apply to
>>>>> 'result'  index anymore. This copying only leads to more problems when
>>>>> untracked cache is used.
>>>>
>>>> Is there really no way to invalidate just individual entries?
>>>
>>> Grr.... the short answer is the current code (i.e. without Elijah's
>>> changes) works but in a twisted way. So you get to keep untracked
>>> cache in the end.
>>
>> GAAAHH.. it works _with_ Elijah's changes (since he made the change
>> from dst to src) not without (and no performance regression).
>
> So...is that an Acked-by for the patch

Yes, Acked-by: me.

> or does the "two wrongs make a
> right, I guess" comment suggest that we should still drop the
> move_index_extensions change (essentially reverting to v1 of the PATCH
> as found at 20180421193736.12722-1-newren@gmail.com), and you'll fix
> things up further in a separate series?

I think I'll stay away from this file for a while. When I gather
enough courage, I'll need to read it through since it sounds like a
minefield.
-- 
Duy

* Re: [PATCH v10 18/36] merge-recursive: add get_directory_renames()
  2018-04-19 17:58 ` [PATCH v10 18/36] merge-recursive: add get_directory_renames() Elijah Newren
@ 2018-05-06 23:41   ` SZEDER Gábor
  2018-05-07 15:45     ` [PATCH] fixup! " Elijah Newren
  2019-10-09 20:38   ` [PATCH v10 18/36] " Johannes Schindelin
  1 sibling, 1 reply; 78+ messages in thread
From: SZEDER Gábor @ 2018-05-06 23:41 UTC (permalink / raw)
  To: Elijah Newren; +Cc: SZEDER Gábor, git, sbeller, gitster, torvalds

> diff --git a/merge-recursive.c b/merge-recursive.c
> index 30894c1cc7..22c5e8e5c9 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c

> +static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
> +					     struct tree *tree)
> +{
> +	struct hashmap *dir_renames;
> +	struct hashmap_iter iter;
> +	struct dir_rename_entry *entry;
> +	int i;
> +
> +	/*
> +	 * Typically, we think of a directory rename as all files from a
> +	 * certain directory being moved to a target directory.  However,
> +	 * what if someone first moved two files from the original
> +	 * directory in one commit, and then renamed the directory
> +	 * somewhere else in a later commit?  At merge time, we just know
> +	 * that files from the original directory went to two different
> +	 * places, and that the bulk of them ended up in the same place.
> +	 * We want each directory rename to represent where the bulk of the
> +	 * files from that directory end up; this function exists to find
> +	 * where the bulk of the files went.
> +	 *
> +	 * The first loop below simply iterates through the list of file
> +	 * renames, finding out how often each directory rename pair
> +	 * possibility occurs.
> +	 */
> +	dir_renames = xmalloc(sizeof(struct hashmap));

Please use xmalloc(sizeof(*dir_renames)) instead, to avoid repeating the
data type.
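
For readers unfamiliar with the idiom, a minimal standalone sketch
follows. It uses plain malloc() rather than git's xmalloc(), and struct
demo_entry, demo_entry_new() and sizes_agree() are hypothetical names
made up for this illustration. The point is only that sizeof(*ptr)
tracks the pointer's declared type, so the allocation stays correct if
that declaration changes later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a struct whose definition may grow later. */
struct demo_entry {
	int count;
	char name[64];
};

/* Allocate using the pointee's size, so the type is named only once. */
static struct demo_entry *demo_entry_new(void)
{
	struct demo_entry *entry = malloc(sizeof(*entry));

	assert(entry != NULL);
	entry->count = 0;
	entry->name[0] = '\0';
	return entry;
}

/* Returns 1 when both spellings of sizeof agree for this pointer. */
static int sizes_agree(void)
{
	struct demo_entry *entry = NULL;

	/* sizeof does not evaluate its operand, so NULL is safe here. */
	return sizeof(*entry) == sizeof(struct demo_entry);
}
```

Both spellings yield the same size today; the difference is that the
sizeof(*entry) form cannot silently drift out of sync with the
declaration of 'entry'.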

> +	dir_rename_init(dir_renames);
> +	for (i = 0; i < pairs->nr; ++i) {
> +		struct string_list_item *item;
> +		int *count;
> +		struct diff_filepair *pair = pairs->queue[i];
> +		char *old_dir, *new_dir;
> +
> +		/* File not part of directory rename if it wasn't renamed */
> +		if (pair->status != 'R')
> +			continue;
> +
> +		get_renamed_dir_portion(pair->one->path, pair->two->path,
> +					&old_dir,        &new_dir);
> +		if (!old_dir)
> +			/* Directory didn't change at all; ignore this one. */
> +			continue;
> +
> +		entry = dir_rename_find_entry(dir_renames, old_dir);
> +		if (!entry) {
> +			entry = xmalloc(sizeof(struct dir_rename_entry));

Similarly: xmalloc(sizeof(*entry))


* [PATCH] fixup! merge-recursive: add get_directory_renames()
  2018-05-06 23:41   ` SZEDER Gábor
@ 2018-05-07 15:45     ` " Elijah Newren
  0 siblings, 0 replies; 78+ messages in thread
From: Elijah Newren @ 2018-05-07 15:45 UTC (permalink / raw)
  To: gitster; +Cc: git, szeder.dev, Elijah Newren

---
 merge-recursive.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 5f42c677d5..9b9a4b8213 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1851,7 +1851,7 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
 	 * renames, finding out how often each directory rename pair
 	 * possibility occurs.
 	 */
-	dir_renames = xmalloc(sizeof(struct hashmap));
+	dir_renames = xmalloc(sizeof(*dir_renames));
 	dir_rename_init(dir_renames);
 	for (i = 0; i < pairs->nr; ++i) {
 		struct string_list_item *item;
@@ -1871,7 +1871,7 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
 
 		entry = dir_rename_find_entry(dir_renames, old_dir);
 		if (!entry) {
-			entry = xmalloc(sizeof(struct dir_rename_entry));
+			entry = xmalloc(sizeof(*entry));
 			dir_rename_entry_init(entry, old_dir);
 			hashmap_put(dir_renames, entry);
 		} else {
-- 
2.16.0.32.gc5b761fb27.dirty


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 18/36] merge-recursive: add get_directory_renames()
  2018-04-19 17:58 ` [PATCH v10 18/36] merge-recursive: add get_directory_renames() Elijah Newren
  2018-05-06 23:41   ` SZEDER Gábor
@ 2019-10-09 20:38   ` " Johannes Schindelin
  2019-10-11 20:02     ` Elijah Newren
  1 sibling, 1 reply; 78+ messages in thread
From: Johannes Schindelin @ 2019-10-09 20:38 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, sbeller, gitster, torvalds

Hi Elijah,

sorry about the blast from the past, but I just stumbled over something
I could not even find any discussion about:

On Thu, 19 Apr 2018, Elijah Newren wrote:

> This populates a set of directory renames for us.  The set of directory
> renames is not yet used, but will be in subsequent commits.
>
> Note that the use of a string_list for possible_new_dirs in the new
> dir_rename_entry struct implies an O(n^2) algorithm; however, in practice
> I expect the number of distinct directories that files were renamed into
> from a single original directory to be O(1).  My guess is that n has a
> mode of 1 and a mean of less than 2, so, for now, string_list seems good
> enough for possible_new_dirs.
>
> Reviewed-by: Stefan Beller <sbeller@google.com>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  merge-recursive.c | 224 +++++++++++++++++++++++++++++++++++++++++++++-
>  merge-recursive.h |  18 ++++
>  2 files changed, 239 insertions(+), 3 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 30894c1cc7..22c5e8e5c9 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> [...]
> @@ -1357,6 +1395,169 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
>  	return ret;
>  }
>
> +static void get_renamed_dir_portion(const char *old_path, const char *new_path,
> +				    char **old_dir, char **new_dir)
> +{
> +	char *end_of_old, *end_of_new;
> +	int old_len, new_len;
> +
> +	*old_dir = NULL;
> +	*new_dir = NULL;
> +
> +	/*
> +	 * For
> +	 *    "a/b/c/d/e/foo.c" -> "a/b/some/thing/else/e/foo.c"
> +	 * the "e/foo.c" part is the same, we just want to know that
> +	 *    "a/b/c/d" was renamed to "a/b/some/thing/else"
> +	 * so, for this example, this function returns "a/b/c/d" in
> +	 * *old_dir and "a/b/some/thing/else" in *new_dir.
> +	 *
> +	 * Also, if the basename of the file changed, we don't care.  We
> +	 * want to know which portion of the directory, if any, changed.
> +	 */
> +	end_of_old = strrchr(old_path, '/');
> +	end_of_new = strrchr(new_path, '/');
> +
> +	if (end_of_old == NULL || end_of_new == NULL)
> +		return;
> +	while (*--end_of_new == *--end_of_old &&
> +	       end_of_old != old_path &&
> +	       end_of_new != new_path)
> +		; /* Do nothing; all in the while loop */
> +	/*
> +	 * We've found the first non-matching character in the directory
> +	 * paths.  That means the current directory we were comparing
> +	 * represents the rename.  Move end_of_old and end_of_new back
> +	 * to the full directory name.
> +	 */
> +	if (*end_of_old == '/')
> +		end_of_old++;
> +	if (*end_of_old != '/')
> +		end_of_new++;

Is this intentional? Even after thinking about it for fifteen minutes, I
think it was probably meant to test for `*end_of_new == '/'` instead of
`*end_of_old != '/'`. And...

> +	end_of_old = strchr(end_of_old, '/');
> +	end_of_new = strchr(end_of_new, '/');

... while I satisfied myself that these calls cannot return `NULL` at
this point, it took quite a few minutes of reasoning.

So I think we might want to rewrite these past 6 lines, to make
everything quite a bit more obvious, like this:

	if (end_of_old != old_path)
		while (*(++end_of_old) != '/')
			; /* keep looking */
	if (end_of_new != new_path)
		while (*(++end_of_new) != '/')
			; /* keep looking */

There is _still_ one thing that makes this harder than trivial to reason
about: the case where one of `*end_of_old` and `*end_of_new` is a slash.
At this point, we assume that `*end_of_old != *end_of_new` (more about
that assumption in the next paragraph), therefore only one of them can
be a slash, and we want to advance beyond it. But even if the pointer
does not point at a slash, we want to look for one, so we want to
advance beyond it.

I also think that we need an extra guard: we do not handle the case
`a/b/c` -> `a/b/d` well. As stated a few lines above, "if the basename
of the file changed, we don't care". So we start looking at the last
slash, then go backwards, and since everything matches, end up with
`end_of_old == old_path` and `end_of_new == new_path`. The current code
will advance `end_of_new` (which I think is wrong) and then look for
the next slash in both `end_of_new` and `end_of_old` (which is also
wrong).

Is my reading correct?

Ciao,
Dscho

> +
> +	/*
> +	 * It may have been the case that old_path and new_path were the same
> +	 * directory all along.  Don't claim a rename if they're the same.
> +	 */
> +	old_len = end_of_old - old_path;
> +	new_len = end_of_new - new_path;
> +
> +	if (old_len != new_len || strncmp(old_path, new_path, old_len)) {
> +		*old_dir = xstrndup(old_path, old_len);
> +		*new_dir = xstrndup(new_path, new_len);
> +	}
> +}
> [...]

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 18/36] merge-recursive: add get_directory_renames()
  2019-10-09 20:38   ` [PATCH v10 18/36] " Johannes Schindelin
@ 2019-10-11 20:02     ` Elijah Newren
  2019-10-12 19:23       ` Johannes Schindelin
  0 siblings, 1 reply; 78+ messages in thread
From: Elijah Newren @ 2019-10-11 20:02 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List

// Dropping a few folks from the cc list as the thread is so old that
I think it should just be the normal git mailing list.

Hi Dscho,

On Wed, Oct 9, 2019 at 1:39 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
> sorry about the blast from the past, but I just stumbled over something
> I could not even find any discussion about:

I'm curious what brought you to this part of the codebase, but either
way, thanks for sending an email with your findings.

More comments below...

[...]
> > @@ -1357,6 +1395,169 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
> >       return ret;
> >  }
> >
> > +static void get_renamed_dir_portion(const char *old_path, const char *new_path,
> > +                                 char **old_dir, char **new_dir)
> > +{
> > +     char *end_of_old, *end_of_new;
> > +     int old_len, new_len;
> > +
> > +     *old_dir = NULL;
> > +     *new_dir = NULL;
> > +
> > +     /*
> > +      * For
> > +      *    "a/b/c/d/e/foo.c" -> "a/b/some/thing/else/e/foo.c"
> > +      * the "e/foo.c" part is the same, we just want to know that
> > +      *    "a/b/c/d" was renamed to "a/b/some/thing/else"
> > +      * so, for this example, this function returns "a/b/c/d" in
> > +      * *old_dir and "a/b/some/thing/else" in *new_dir.
> > +      *
> > +      * Also, if the basename of the file changed, we don't care.  We
> > +      * want to know which portion of the directory, if any, changed.
> > +      */
> > +     end_of_old = strrchr(old_path, '/');
> > +     end_of_new = strrchr(new_path, '/');
> > +
> > +     if (end_of_old == NULL || end_of_new == NULL)
> > +             return;
> > +     while (*--end_of_new == *--end_of_old &&
> > +            end_of_old != old_path &&
> > +            end_of_new != new_path)
> > +             ; /* Do nothing; all in the while loop */
> > +     /*
> > +      * We've found the first non-matching character in the directory
> > +      * paths.  That means the current directory we were comparing
> > +      * represents the rename.  Move end_of_old and end_of_new back
> > +      * to the full directory name.
> > +      */
> > +     if (*end_of_old == '/')
> > +             end_of_old++;
> > +     if (*end_of_old != '/')
> > +             end_of_new++;
>
> Is this intentional? Even after thinking about it for fifteen minutes, I
> think it was probably meant to test for `*end_of_new == '/'` instead of
> `*end_of_old != '/'`. And...

Yeah, looks like a mess-up, and yes your suspicion is correct about
what was intended.

Hilariously, though, no bug results from this.  Since these are paths,
as canonicalized by git (i.e. not as specified by the user where they
might accidentally type multiple consecutive slashes), there will
never be two slashes in a row (because we can't have directories with
an empty name).  Thus, it is guaranteed at this point that *end_of_old
!= '/', and end_of_new is thus unconditionally advanced.  Further,
since we wanted to find the _next_ '/' character after end_of_new,
then there were two cases: (1) end_of_new already pointed at a slash
character in which case we needed it to be advanced, or (2) end_of_new
didn't point to a slash character so it wouldn't hurt at all to
advance it.
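
The equivalence argued above can be checked with a small standalone
sketch.  It is not git's code: renamed_dir_lengths() and its
use_intended_check flag are hypothetical names, and it reports directory
lengths instead of allocating strings with xstrndup().  It assumes
git-style paths (no leading slash, no empty path components).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * use_intended_check selects between the check as committed
 * (*end_of_old != '/') and the check as intended (*end_of_new == '/').
 * Returns 1 and fills in the directory lengths if the directory
 * prefixes differ, 0 otherwise.
 */
static int renamed_dir_lengths(const char *old_path, const char *new_path,
			       int use_intended_check,
			       size_t *old_len, size_t *new_len)
{
	const char *end_of_old = strrchr(old_path, '/');
	const char *end_of_new = strrchr(new_path, '/');

	if (!end_of_old || !end_of_new)
		return 0;
	/* A leading slash would make the scan below step before the path. */
	assert(end_of_old != old_path && end_of_new != new_path);

	while (*--end_of_new == *--end_of_old &&
	       end_of_old != old_path &&
	       end_of_new != new_path)
		; /* walk left while the directory parts match */

	if (*end_of_old == '/')
		end_of_old++;
	if (use_intended_check ? *end_of_new == '/' : *end_of_old != '/')
		end_of_new++;
	end_of_old = strchr(end_of_old, '/');
	end_of_new = strchr(end_of_new, '/');

	*old_len = (size_t)(end_of_old - old_path);
	*new_len = (size_t)(end_of_new - new_path);
	return *old_len != *new_len ||
	       strncmp(old_path, new_path, *old_len) != 0;
}
```

For "x/a/foo.c" -> "ya/foo.c" and for the reverse rename, both settings
of the flag give the same lengths, which matches the reasoning that
advancing end_of_new is either required or harmless.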

> > +     end_of_old = strchr(end_of_old, '/');
> > +     end_of_new = strchr(end_of_new, '/');
>
> ... while I satisfied myself that these calls cannot return `NULL` at
> this point, it took quite a few minutes of reasoning.
>
> So I think we might want to rewrite these past 6 lines, to make
> everything quite a bit more obvious, like this:
>
>         if (end_of_old != old_path)
>                 while (*(++end_of_old) != '/')
>                         ; /* keep looking */
>         if (end_of_new != new_path)
>                 while (*(++end_of_new) != '/')
>                         ; /* keep looking */

I think your if-checks here are not correct.  Let's say that old_path
was "tar/foo.c" and new_path was "star/foo.c".  The initial strrchr
will bring both end_of_* variables back to the slash.  The moving left
while equal will move end_of_old back to old_path (i.e. pointing to
the "t") and end_of_new back to pointing at "t" as well.  Here's where
your six alternate lines would kick in, and would leave end_of_old at
old_path, while moving end_of_new to the '/', making it look like we
had a rename of "" (the empty string or root directory) to "star"
instead of a rename of "tar" to "star".  If you dropped your if-checks
(just having the while loops), then I think it does the right thing.
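
The "tar" vs. "star" case can be reproduced with a standalone sketch.
The helper name dir_lengths_with_loops() and its guarded flag are
hypothetical; the function only reports how long each directory prefix
ends up, and it assumes git-style paths that both contain a slash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Scan back from the last '/' as in the patch, then advance each pointer
 * to the next '/' with a while loop.  If guarded is nonzero, a pointer
 * that reached the start of its path is left alone (the proposed
 * if-checks); otherwise both pointers always advance.
 */
static void dir_lengths_with_loops(const char *old_path, const char *new_path,
				   int guarded,
				   size_t *old_len, size_t *new_len)
{
	const char *end_of_old = strrchr(old_path, '/');
	const char *end_of_new = strrchr(new_path, '/');

	assert(end_of_old && end_of_new);
	assert(end_of_old != old_path && end_of_new != new_path);

	while (*--end_of_new == *--end_of_old &&
	       end_of_old != old_path &&
	       end_of_new != new_path)
		; /* walk left while the directory parts match */

	if (!guarded || end_of_old != old_path)
		while (*(++end_of_old) != '/')
			; /* keep looking */
	if (!guarded || end_of_new != new_path)
		while (*(++end_of_new) != '/')
			; /* keep looking */

	*old_len = (size_t)(end_of_old - old_path);
	*new_len = (size_t)(end_of_new - new_path);
}
```

With guarded set, "tar/foo.c" -> "star/foo.c" yields lengths 0 and 4
("" -> "star"); without the guards it yields 3 and 4 ("tar" -> "star").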

> There is _still_ one thing that makes this harder than trivial to reason
> about: the case where one of `*end_of_old` and `*end_of_new` is a slash.
> At this point, we assume that `*end_of_old != *end_of_new` (more about
> that assumption in the next paragraph), therefore only one of them can
> be a slash, and we want to advance beyond it. But even if the pointer
> does not point at a slash, we want to look for one, so we want to
> advance beyond it.

I should probably add a comment that we want to advance BOTH to the
next slash.  I would have just used strchr() but it wouldn't advance
the string if it already points to what I'm looking for.  Actually, I
guess I could simplify the code by unconditionally advancing by one
character, then calling strchr().  In other words, simplifying these
six lines to just

       end_of_old = strchr(++end_of_old, '/');
       end_of_new = strchr(++end_of_new, '/');

> I also think that we need an extra guard: we do not handle the case
> `a/b/c` -> `a/b/d` well. As stated a few lines above, "if the basename
> of the file changed, we don't care". So we start looking at the last
> slash, then go backwards, and since everything matches, end up with
> `end_of_old == old_path` and `end_of_new == new_path`. The current code
> will advance `end_of_new` (which I think is wrong) and then look for
> the next slash in both `end_of_new` and `end_of_old` (which is also
> wrong).

The current code is slightly convoluted, but I would say it's not
wrong for this case.  If we renamed a/b/c -> a/b/d, then there isn't a
directory rename; the leading directory (a/b/) is the same for both.
You are right that the advancing of end_of_old and end_of_new to the
next slash would result in what looks like a rename of "a" to "a", but
the checks at the end checked for this case and only returned
something for *old_dir and *new_dir if these didn't match; in fact,
it's the part of the code at the end of your email that you didn't
comment on, here:

> > +
> > +     /*
> > +      * It may have been the case that old_path and new_path were the same
> > +      * directory all along.  Don't claim a rename if they're the same.
> > +      */
> > +     old_len = end_of_old - old_path;
> > +     new_len = end_of_new - new_path;
> > +
> > +     if (old_len != new_len || strncmp(old_path, new_path, old_len)) {
> > +             *old_dir = xstrndup(old_path, old_len);
> > +             *new_dir = xstrndup(new_path, new_len);
> > +     }
> > +}
> > [...]

However, we could drop this late check by just doing a simpler earlier
check to see if end_of_old == old_path and end_of_new == new_path
after the "find first non-equal character" step and before advancing
to the next '/', and if that condition is found, then return early
with no match.
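
A rough sketch of that shape, combined with the unconditional advance
before strchr(), might look like the following.  This is an illustration
under assumptions, not the patch itself: renamed_dir_portion_sketch() and
dup_prefix() are hypothetical names, dup_prefix() stands in for
xstrndup(), and the early check additionally compares the two characters,
because both pointers can also reach the start of their paths on a
mismatch (e.g. "a/foo.c" -> "b/foo.c").

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for git's xstrndup(); aborts instead of calling die(). */
static char *dup_prefix(const char *s, size_t len)
{
	char *copy = malloc(len + 1);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	copy[len] = '\0';
	return copy;
}

/*
 * Callers free *old_dir and *new_dir.  Paths must be git-style
 * (no leading slash, no doubled slashes).
 */
static void renamed_dir_portion_sketch(const char *old_path,
				       const char *new_path,
				       char **old_dir, char **new_dir)
{
	const char *end_of_old = strrchr(old_path, '/');
	const char *end_of_new = strrchr(new_path, '/');

	*old_dir = NULL;
	*new_dir = NULL;
	if (!end_of_old || !end_of_new)
		return;
	assert(end_of_old != old_path && end_of_new != new_path);

	/* Find the first non-matching character, scanning backwards. */
	while (*--end_of_new == *--end_of_old &&
	       end_of_old != old_path &&
	       end_of_new != new_path)
		;

	/* Both back at the start and still equal: same directory. */
	if (end_of_old == old_path && end_of_new == new_path &&
	    *end_of_old == *end_of_new)
		return;

	/* Advance BOTH pointers to the next '/' after their position. */
	end_of_old = strchr(end_of_old + 1, '/');
	end_of_new = strchr(end_of_new + 1, '/');

	*old_dir = dup_prefix(old_path, (size_t)(end_of_old - old_path));
	*new_dir = dup_prefix(new_path, (size_t)(end_of_new - new_path));
}
```

In this form the late length/strncmp comparison is no longer needed: once
the early return has been passed, the two directory prefixes differ.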


However, since you highlighted this code, there are two other special
cases that might be interesting:

1) What if we are renaming e.g. foo/bar/baz.c ->
leading/dir/foo/bar/baz.c?  Then after trying to find the first
non-matching char we'll have end_of_old == old_path and *end_of_new ==
'f', and the advancing makes it look like "foo" being renamed to
"leading/dir/foo".  Since the root directory cannot be renamed (it
always exists on both sides of history), this probably makes sense as
the right thing to return.
2) What if the renaming went the other way, from
leading/dir/foo/bar/baz.c -> foo/bar/baz.c?  The whole advancing thing
makes this look like "leading/dir/foo" being renamed to "foo", instead
of "leading/dir" being renamed to "" (the root directory).  If we
don't detect it as "leading/dir" being renamed to (merged into) the root
directory, then new files added directly within leading/dir/ on the
other side of history won't be moved by directory rename detection
into the root directory.
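
Both cases can be traced with a standalone copy of the scan-and-advance
steps as committed.  committed_dir_lengths() is a hypothetical name, the
function is not git's code, and it reports directory lengths rather than
allocating strings; git-style paths are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the two directory prefixes differ, 0 otherwise. */
static int committed_dir_lengths(const char *old_path, const char *new_path,
				 size_t *old_len, size_t *new_len)
{
	const char *end_of_old = strrchr(old_path, '/');
	const char *end_of_new = strrchr(new_path, '/');

	if (!end_of_old || !end_of_new)
		return 0;
	assert(end_of_old != old_path && end_of_new != new_path);

	while (*--end_of_new == *--end_of_old &&
	       end_of_old != old_path &&
	       end_of_new != new_path)
		; /* walk left while the directory parts match */

	if (*end_of_old == '/')
		end_of_old++;
	if (*end_of_old != '/') /* as committed; always true here */
		end_of_new++;
	end_of_old = strchr(end_of_old, '/');
	end_of_new = strchr(end_of_new, '/');

	*old_len = (size_t)(end_of_old - old_path);
	*new_len = (size_t)(end_of_new - new_path);
	return *old_len != *new_len ||
	       strncmp(old_path, new_path, *old_len) != 0;
}
```

For case 1 this gives lengths 3 and 15 ("foo" -> "leading/dir/foo"); for
case 2 it gives 15 and 3 ("leading/dir/foo" -> "foo"), not the
"leading/dir" -> "" result that a merge into the root directory would
need.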

> Is my reading correct?

I'm not sure if I've answered your questions; let me know if not.  I
have generated a couple patches to (1) make the code easier to follow,
and (2) support the rename/merge of a subdirectory into the root
directory.  They're waiting for the gitgitgadget CI checks right now,
then I'll send them to the list.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH v10 18/36] merge-recursive: add get_directory_renames()
  2019-10-11 20:02     ` Elijah Newren
@ 2019-10-12 19:23       ` Johannes Schindelin
  0 siblings, 0 replies; 78+ messages in thread
From: Johannes Schindelin @ 2019-10-12 19:23 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

Hi Elijah,

On Fri, 11 Oct 2019, Elijah Newren wrote:

> On Wed, Oct 9, 2019 at 1:39 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > sorry about the blast from the past, but I just stumbled over something
> > I could not even find any discussion about:
>
> I'm curious what brought you to this part of the codebase, but either
> way, thanks for sending an email with your findings.

Well, you know, it's a loooooong story.

> More comments below...

Thank you so much, they unpuzzled me quite a bit.

Ciao,
Dscho


^ permalink raw reply	[flat|nested] 78+ messages in thread

Thread overview: 78+ messages
2018-04-19 17:57 [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
2018-04-19 17:57 ` [PATCH v10 01/36] directory rename detection: basic testcases Elijah Newren
2018-04-19 17:57 ` [PATCH v10 02/36] directory rename detection: directory splitting testcases Elijah Newren
2018-04-19 17:57 ` [PATCH v10 03/36] directory rename detection: testcases to avoid taking detection too far Elijah Newren
2018-04-19 17:57 ` [PATCH v10 04/36] directory rename detection: partially renamed directory testcase/discussion Elijah Newren
2018-04-19 17:57 ` [PATCH v10 05/36] directory rename detection: files/directories in the way of some renames Elijah Newren
2018-04-19 17:57 ` [PATCH v10 06/36] directory rename detection: testcases checking which side did the rename Elijah Newren
2018-04-19 17:57 ` [PATCH v10 07/36] directory rename detection: more involved edge/corner testcases Elijah Newren
2018-04-19 17:57 ` [PATCH v10 08/36] directory rename detection: testcases exploring possibly suboptimal merges Elijah Newren
2018-04-19 17:57 ` [PATCH v10 09/36] directory rename detection: miscellaneous testcases to complete coverage Elijah Newren
2018-04-19 17:57 ` [PATCH v10 10/36] directory rename detection: tests for handling overwriting untracked files Elijah Newren
2018-04-19 17:57 ` [PATCH v10 11/36] directory rename detection: tests for handling overwriting dirty files Elijah Newren
2018-04-19 17:57 ` [PATCH v10 12/36] merge-recursive: move the get_renames() function Elijah Newren
2018-04-19 17:58 ` [PATCH v10 13/36] merge-recursive: introduce new functions to handle rename logic Elijah Newren
2018-04-19 17:58 ` [PATCH v10 14/36] merge-recursive: fix leaks of allocated renames and diff_filepairs Elijah Newren
2018-04-19 17:58 ` [PATCH v10 15/36] merge-recursive: make !o->detect_rename codepath more obvious Elijah Newren
2018-04-19 17:58 ` [PATCH v10 16/36] merge-recursive: split out code for determining diff_filepairs Elijah Newren
2018-04-19 17:58 ` [PATCH v10 17/36] merge-recursive: make a helper function for cleanup for handle_renames Elijah Newren
2018-04-19 17:58 ` [PATCH v10 18/36] merge-recursive: add get_directory_renames() Elijah Newren
2018-05-06 23:41   ` SZEDER Gábor
2018-05-07 15:45     ` [PATCH] fixup! " Elijah Newren
2019-10-09 20:38   ` [PATCH v10 18/36] " Johannes Schindelin
2019-10-11 20:02     ` Elijah Newren
2019-10-12 19:23       ` Johannes Schindelin
2018-04-19 17:58 ` [PATCH v10 19/36] merge-recursive: check for directory level conflicts Elijah Newren
2018-04-19 17:58 ` [PATCH v10 20/36] merge-recursive: add computation of collisions due to dir rename & merging Elijah Newren
2018-04-19 17:58 ` [PATCH v10 21/36] merge-recursive: check for file level conflicts then get new name Elijah Newren
2018-04-19 17:58 ` [PATCH v10 22/36] merge-recursive: when comparing files, don't include trees Elijah Newren
2018-04-19 17:58 ` [PATCH v10 23/36] merge-recursive: apply necessary modifications for directory renames Elijah Newren
2018-04-19 17:58 ` [PATCH v10 24/36] merge-recursive: avoid clobbering untracked files with " Elijah Newren
2018-04-19 17:58 ` [PATCH v10 25/36] merge-recursive: fix overwriting dirty files involved in renames Elijah Newren
2018-04-19 20:48   ` Martin Ågren
2018-04-19 20:54     ` Martin Ågren
2018-04-19 21:06     ` Elijah Newren
2018-04-19 17:58 ` [PATCH v10 26/36] merge-recursive: fix remaining directory rename + dirty overwrite cases Elijah Newren
2018-04-19 17:58 ` [PATCH v10 27/36] directory rename detection: new testcases showcasing a pair of bugs Elijah Newren
2018-04-19 17:58 ` [PATCH v10 28/36] merge-recursive: avoid spurious rename/rename conflict from dir renames Elijah Newren
2018-04-19 17:58 ` [PATCH v10 29/36] merge-recursive: improve add_cacheinfo error handling Elijah Newren
2018-04-19 17:58 ` [PATCH v10 30/36] merge-recursive: move more is_dirty handling to merge_content Elijah Newren
2018-04-19 17:58 ` [PATCH v10 31/36] merge-recursive: avoid triggering add_cacheinfo error with dirty mod Elijah Newren
2018-04-19 17:58 ` [PATCH v10 32/36] t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
2018-04-19 20:26   ` SZEDER Gábor
2018-04-19 20:55     ` Elijah Newren
2018-04-19 17:58 ` [PATCH v10 33/36] merge-recursive: fix was_tracked() to quit lying with some renamed paths Elijah Newren
2018-04-19 20:39   ` Martin Ågren
2018-04-19 20:54     ` Elijah Newren
2018-04-20 12:23   ` SZEDER Gábor
2018-04-20 15:23     ` Elijah Newren
2018-04-21 19:37     ` [RFC PATCH v10 32.5/36] unpack_trees: fix memory corruption with split_index when src != dst Elijah Newren
2018-04-21 20:13       ` Elijah Newren
2018-04-22 12:38       ` Duy Nguyen
2018-04-23 17:09         ` Elijah Newren
2018-04-23 17:37           ` Duy Nguyen
2018-04-23 18:05             ` Elijah Newren
2018-04-24  0:24               ` [PATCH v2] unpack_trees: fix breakage when o->src_index != o->dst_index Elijah Newren
2018-04-24  1:51                 ` Junio C Hamano
2018-04-24  3:05                 ` Junio C Hamano
2018-04-24  6:50                   ` [PATCH v3] " Elijah Newren
2018-04-29 18:05                     ` Duy Nguyen
2018-04-29 20:53                       ` Johannes Schindelin
2018-04-30 14:42                         ` Duy Nguyen
2018-04-30 14:45                           ` Duy Nguyen
2018-04-30 16:19                             ` Elijah Newren
2018-04-30 16:29                               ` Duy Nguyen
2018-04-19 17:58 ` [PATCH v10 34/36] merge-recursive: fix remainder of was_dirty() to use original index Elijah Newren
2018-04-19 17:58 ` [PATCH v10 35/36] merge-recursive: make "Auto-merging" comment show for other merges Elijah Newren
2018-04-19 17:58 ` [PATCH v10 36/36] merge-recursive: fix check for skipability of working tree updates Elijah Newren
2018-04-19 18:35 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
2018-04-19 18:41   ` Stefan Beller
2018-04-19 19:54     ` Derrick Stolee
2018-04-19 20:22   ` Elijah Newren
2018-04-20  3:05   ` Junio C Hamano
2018-04-23 17:50     ` Elijah Newren
2018-04-24 20:20     ` [PATCH v10 1/2] fixup! merge-recursive: fix was_tracked() to quit lying with some renamed paths Elijah Newren
2018-04-24 20:21       ` [PATCH v10 2/2] fixup! t6046: testcases checking whether updates can be skipped in a merge Elijah Newren
2018-04-23 17:28 ` [PATCH v10 00/36] Add directory rename detection to git Elijah Newren
2018-04-23 23:46   ` Junio C Hamano
2018-04-24  0:15     ` Elijah Newren
