git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 00/30] Add directory rename detection to git
@ 2017-11-10 19:05 Elijah Newren
  2017-11-10 19:05 ` [PATCH 01/30] Tighten and correct a few testcases for merging and cherry-picking Elijah Newren
                   ` (30 more replies)
  0 siblings, 31 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

[This series is entirely independent of my rename detection limits series.
However, I have a separate rename detection performance series that depends
on both this series and the rename detection limits series.]

In this patchset, I introduce directory rename detection to merge-recursive,
predominantly so that when files are added to directories on one side of
history and those directories are renamed on the other side of history, the
files will end up in the proper location after a merge or cherry-pick.

However, this isn't limited to that simplistic case.  More interesting
possibilities exist, such as:

  * a file being renamed into a directory which is renamed on the other
    side of history, causing the need for a transitive rename.

  * two (or three or N) directories being merged (with no conflicts so
    long as files/directories within the merged directory have different
    names), and the "merging" being detected as a directory rename for
    each original directory.

  * not all files in a directory being renamed to the same location;
    i.e. perhaps the directory was renamed, but some files within it were
    renamed to a different location.

  * a directory being renamed, which also contained a subdirectory that
    was renamed to some entirely different location.  (And perhaps the
    inner directory itself contained inner directories that were renamed
    to yet other locations).
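
As a rough illustration of how a directory rename can still be inferred when
files scatter, here is a small sketch.  It is not code from this series:
dir_rename_winner is a hypothetical helper, and the "old new" pair format on
stdin is an assumption made for the example (paths containing whitespace are
not handled).  It follows the notes in the new t6043 later in this thread:
the destination receiving the most renames "wins", and an even split yields
no winner.

```shell
# Hypothetical helper, not part of git.  Reads "old_path new_path" pairs
# from stdin and prints the directory that received the most files renamed
# out of directory $1.  Prints nothing when there is a tie or no rename.
dir_rename_winner () {
	awk -v src="$1" '
	# Return the parent directory of a path ("." for top-level paths).
	function parent(p) {
		if (!sub(/\/[^\/]*$/, "", p))
			return "."
		return p
	}
	# Count only files that actually left the source directory.
	parent($1) == src && parent($2) != src { count[parent($2)]++ }
	END {
		best = ""; max = 0; tie = 0
		for (d in count) {
			if (count[d] > max) {
				max = count[d]; best = d; tie = 0
			} else if (count[d] == max) {
				tie = 1
			}
		}
		if (max > 0 && !tie)
			print best
	}'
}
```

Fed the renames from testcase 1f (z/b and z/c to y/, z/d, z/e and z/f to
x/), the helper prints x; fed the even split from testcase 2a, it prints
nothing.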

Also, I found it useful to allow all files within the directory being
renamed to themselves be renamed and still detect the directory rename.
For example, if goal/a and goal/b are renamed to priority/alpha and
priority/bravo, we can detect that goal/ was renamed to priority/, so that
if someone adds goal/c on the other side of history, after the merge we'll
end up with priority/c.  (In the absence of a readily available
libmindread.so library that I can link to, we can't rename directly from
goal/c to priority/charlie automatically, and will need to have priority/c
suffice.)
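
The path rewriting described above can be pictured with a tiny sketch.  This
is only an illustration under stated assumptions, not the implementation in
merge-recursive.c: apply_dir_rename is a hypothetical helper that takes the
old directory, the new directory, and a path, and prints where the path
would land.

```shell
# Hypothetical helper, not part of git: print the location PATH would get
# if the directory OLD_DIR/ were implicitly renamed to NEW_DIR/.
apply_dir_rename () {
	old_dir=$1 new_dir=$2 path=$3
	case "$path" in
	"$old_dir"/*)
		# Strip the old directory prefix and graft on the new one.
		rest=${path#"$old_dir"/}
		printf '%s\n' "$new_dir/$rest"
		;;
	*)
		# Not under the renamed directory: leave the path alone.
		printf '%s\n' "$path"
		;;
	esac
}
```

With the example above, `apply_dir_rename goal priority goal/c` prints
priority/c.  The transitive case mentioned earlier works the same way: if
one side renames x/d to z/d while the other renames z/ to y/, then
`apply_dir_rename z y z/d` prints y/d.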

Naturally, an attempt to do all of the above brings up all kinds of
interesting edge and corner cases, some of which result in conflicts
that cannot be represented in the index, and others of which might be
considered too complex for users to understand and resolve.  For
example:

  * An add/add/add/.../add conflict, all on one side of history (see
    testcase 9e in the new t6043, or any of the testcases in section 5)

  * Doubly, triply, or N-fold transitive renames (testcases 9c & 9d)

In order to prevent such problems, I introduce a couple of basic rules that
limit when directory rename detection applies:

  1) If a subset of to-be-renamed files have a file or directory in the
     way (or would be in the way of each other), "turn off" the directory
     rename for those specific sub-paths and report the conflict to the
     user.

  2) If the other side of history did a directory rename to a path that
     your side of history renamed away, then ignore that particular
     rename from the other side of history for any implicit directory
     renames (but warn the user).

Further, there's a basic question about when directory rename detection
should be applied at all.  I have a simple rule:

  3) If a given directory still exists on both sides of a merge, we do
     not consider it to have been renamed.

Rule 3 may sound obvious at first, but it will probably arise as a
question for some users -- what if someone "mostly" moved a directory but
still left some files around, or, equivalently (from the perspective of the
three-way merge that merge-recursive performs), fully renamed a directory
in one commit and then recreated that directory in a later commit adding
some new files and then tried to merge?  See the big comment in section 4
of the new t6043 for further discussion of this rule.
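
Rule 3 can be sketched as a simple check.  The helpers below are
hypothetical and not taken from this series; they assume each side of the
merge is given as a file listing one path per line (for instance, output
saved from `git ls-tree -r --name-only <commit>`), and that directory names
contain no backslashes.

```shell
# Hypothetical helper: succeed if any path read from stdin lies under
# directory $1.
dir_in_listing () {
	awk -v prefix="$1/" '
	index($0, prefix) == 1 { found = 1 }
	END { exit !found }'
}

# Hypothetical helper: succeed only if DIR may be treated as renamed,
# i.e. it is not still present in both listing files (rule 3).
dir_rename_allowed () {
	dir=$1 ours=$2 theirs=$3
	if dir_in_listing "$dir" <"$ours" &&
	   dir_in_listing "$dir" <"$theirs"
	then
		return 1
	fi
	return 0
}
```

If one listing holds y/b and y/c and the other holds z/b, z/c and z/d, then
z/ is gone from one side and the check succeeds.  If the first side had
recreated z/ with a new file, z/ would exist on both sides and the check
would fail, however many files were moved out of it.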

This set of rules seems to be reasonably easy to explain, is
self-consistent, allows all conflict cases to be represented without
changing any on-disk data structures or introducing new terminology or
commands for users, prevents excessively complex conflicts that users
might struggle to understand, and brings peace to the middle east.
Actually, maybe not that last one.

While I feel that this directory rename detection reduces the number of
suboptimal merges and cherry-picks that git performs, there are sadly
still a number of cases that remain suboptimal, or that even newly appear
to be not-quite-consistent with other cases.  The fact that one file
layout might trigger some of the rules above while another "slightly"
different file layout doesn't might occasionally cause some user
grumblings.  I've tried to explore and document these cases in section 8
of the new t6043-merge-rename-directories.sh.

Finally, from an implementation perspective, there's another strong
advantage to the ruleset above: it means that any path to which we want
to apply an implicit directory rename will have a free and open spot
for us to move it into.  Thus, we can just adjust the diff_filepair
from an add or modify into a rename (or adjust a rename diff_filepair
to change the target a little more), and then let process_renames and
process_entry do all their magic.  That allows us to rely on all the
heavy testing already done for those code paths to handle a large
variety of edge and corner cases (e.g. D/F, rename/rename, criss-cross
merges, etc.).  The big trick is just making sure to do all the
necessary checks that we can apply directory rename detection, and then
fixing things up to put it in the expected format, with enough test
cases to make sure we actually got it into the right format.

Okay, the last paragraph had a small lie (though I didn't know that when
I originally wrote it).  unpack_trees() aborts early if it detects that an
untracked or dirty file would be overwritten by a merge; otherwise it
immediately starts modifying the working tree before passing control back
to merge-recursive.  That behavior causes some problems.  Not only
has it always made the code more complex, but the fact that
unpack_trees() doesn't understand renames means that it can't
appropriately abort early if a path involved in a rename has untracked
or dirty contents in the way of the merge.  But by the time we detect
renames, it's too late to abort early.  So we have to instead figure out
ways of emitting warning messages and writing something sensible to the
working copy without overwriting any of their data.  This was a problem
before directory rename detection, but directory rename detection
increases the number of places where we have to worry about this.

Elijah Newren (30):
  Tighten and correct a few testcases for merging and cherry-picking
  merge-recursive: Fix logic ordering issue
  merge-recursive: Add explanation for src_entry and dst_entry

These three patches provide a few miscellaneous fixups that could be
submitted independent of this series, though the series partially
depends on the fixes in the first one, and the second fix becomes more
important with the rest of the changes in this series.

  directory rename detection: basic testcases
  directory rename detection: directory splitting testcases
  directory rename detection: testcases to avoid taking detection too
    far
  directory rename detection: partially renamed directory
    testcase/discussion
  directory rename detection: files/directories in the way of some
    renames
  directory rename detection: testcases checking which side did the
    rename
  directory rename detection: more involved edge/corner testcases
  directory rename detection: testcases exploring possibly suboptimal
    merges
  directory rename detection: miscellaneous testcases to complete
    coverage
  directory rename detection: tests for handling overwriting untracked
    files
  directory rename detection: tests for handling overwriting dirty files

These patches add testcases for directory rename detection, trying to
cover the space of possibilities as exhaustively as I can while trying
to avoid excessive overlap in testcases.

  merge-recursive: Move the get_renames() function
  merge-recursive: Introduce new functions to handle rename logic
  merge-recursive: Fix leaks of allocated renames and diff_filepairs
  merge-recursive: Make !o->detect_rename codepath more obvious
  merge-recursive: Split out code for determining diff_filepairs

These five patches make small code reorganizations in preparation for
further changes, though they include some memory leak fixes.

  merge-recursive: Add a new hashmap for storing directory renames
  merge-recursive: Add get_directory_renames()
  merge-recursive: Check for directory level conflicts
  merge-recursive: Add a new hashmap for storing file collisions
  merge-recursive: Add computation of collisions due to dir rename &
    merging
  merge-recursive: Check for file level conflicts then get new name
  merge-recursive: When comparing files, don't include trees
  merge-recursive: Apply necessary modifications for directory renames

These eight patches implement the directory rename detection logic.
  
  merge-recursive: Avoid clobbering untracked files with directory
    renames
  merge-recursive: Fix overwriting dirty files involved in renames
  merge-recursive: Fix remaining directory rename + dirty overwrite
    cases

These last three deal with untracked and dirty file overwriting
headaches.  The middle patch in particular isn't just a fix for
directory rename detection but fixes a bug in current versions of git
in overwriting dirty files that are involved in a rename.  That patch
could be backported and submitted independent of this series, but the
final patch depends heavily on it.

 merge-recursive.c                   | 1212 +++++++++++--
 merge-recursive.h                   |   17 +
 t/t3501-revert-cherry-pick.sh       |    5 +-
 t/t6043-merge-rename-directories.sh | 3277 +++++++++++++++++++++++++++++++++++
 t/t7607-merge-overwrite.sh          |    7 +-
 unpack-trees.c                      |    4 +-
 unpack-trees.h                      |    4 +
 7 files changed, 4413 insertions(+), 113 deletions(-)
 create mode 100755 t/t6043-merge-rename-directories.sh

-- 
2.15.0.5.g9567be9905


* [PATCH 01/30] Tighten and correct a few testcases for merging and cherry-picking
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-13 19:32   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 02/30] merge-recursive: Fix logic ordering issue Elijah Newren
                   ` (29 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

t3501 had a testcase originally added to ensure cherry-pick wouldn't
segfault when working with a dirty file involved in a rename.  While
the segfault was fixed, there was another problem this test demonstrated:
namely, that git would overwrite a dirty file involved in a rename.
Further, the test encoded a "successful merge" and overwriting of this
file as correct behavior.  Modify the test so that it still catches the
segfault but also requires the correct behavior.

t7607 had a test specific to looking for a merge overwriting a dirty file
involved in a rename, but it too actually encoded what I would term
incorrect behavior: it expected the merge to succeed.  Fix that, and add
a few more checks to make sure that the merge really does produce the
expected results.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t3501-revert-cherry-pick.sh | 7 +++++--
 t/t7607-merge-overwrite.sh    | 5 ++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
index 4f2a263b63..783bdbf59d 100755
--- a/t/t3501-revert-cherry-pick.sh
+++ b/t/t3501-revert-cherry-pick.sh
@@ -141,7 +141,7 @@ test_expect_success 'cherry-pick "-" works with arguments' '
 	test_cmp expect actual
 '
 
-test_expect_success 'cherry-pick works with dirty renamed file' '
+test_expect_failure 'cherry-pick works with dirty renamed file' '
 	test_commit to-rename &&
 	git checkout -b unrelated &&
 	test_commit unrelated &&
@@ -150,7 +150,10 @@ test_expect_success 'cherry-pick works with dirty renamed file' '
 	test_tick &&
 	git commit -m renamed &&
 	echo modified >renamed &&
-	git cherry-pick refs/heads/unrelated
+	test_must_fail git cherry-pick refs/heads/unrelated >out &&
+	test_i18ngrep "Refusing to lose dirty file at renamed" out &&
+	test $(git rev-parse :0:renamed) = $(git rev-parse HEAD^:to-rename.t) &&
+	grep -q "^modified$" renamed
 '
 
 test_done
diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index 9444d6a9b9..00617dadf8 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -97,7 +97,10 @@ test_expect_failure 'will not overwrite unstaged changes in renamed file' '
 	git mv c1.c other.c &&
 	git commit -m rename &&
 	cp important other.c &&
-	git merge c1a &&
+	test_must_fail git merge c1a >out &&
+	test_i18ngrep "Refusing to lose dirty file at other.c" out &&
+	test -f other.c~HEAD &&
+	test $(git hash-object other.c~HEAD) = $(git rev-parse c1a:c1.c) &&
 	test_cmp important other.c
 '
 
-- 
2.15.0.5.g9567be9905



* [PATCH 02/30] merge-recursive: Fix logic ordering issue
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
  2017-11-10 19:05 ` [PATCH 01/30] Tighten and correct a few testcases for merging and cherry-picking Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-13 19:48   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry Elijah Newren
                   ` (28 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

merge_trees() did a variety of work, including:
  * Calling get_unmerged() to get unmerged entries
  * Calling record_df_conflict_files() with all unmerged entries to
    do some work to ensure we could handle D/F conflicts correctly
  * Calling get_renames() to check for renames.

An easily overlooked issue is that get_renames() can create more
unmerged entries and add them to the list, which have the possibility of
being involved in D/F conflicts.  So the call to
record_df_conflict_files() should really be moved after all the rename
detection.  I didn't come up with any testcases demonstrating any bugs
with the old ordering, but I suspect there were some for both normal
renames and for directory renames.  Fix the ordering.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1d3f8f0d22..52521faf09 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1981,10 +1981,10 @@ int merge_trees(struct merge_options *o,
 		get_files_dirs(o, merge);
 
 		entries = get_unmerged();
-		record_df_conflict_files(o, entries);
 		re_head  = get_renames(o, head, common, head, merge, entries);
 		re_merge = get_renames(o, merge, common, head, merge, entries);
 		clean = process_renames(o, re_head, re_merge);
+		record_df_conflict_files(o, entries);
 		if (clean < 0)
 			goto cleanup;
 		for (i = entries->nr-1; 0 <= i; i--) {
-- 
2.15.0.5.g9567be9905



* [PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
  2017-11-10 19:05 ` [PATCH 01/30] Tighten and correct a few testcases for merging and cherry-picking Elijah Newren
  2017-11-10 19:05 ` [PATCH 02/30] merge-recursive: Fix logic ordering issue Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-13 21:06   ` Stefan Beller
  2017-11-14  1:26   ` Junio C Hamano
  2017-11-10 19:05 ` [PATCH 04/30] directory rename detection: basic testcases Elijah Newren
                   ` (27 subsequent siblings)
  30 siblings, 2 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

If I have to walk through the debugger and inspect the values found in
here in order to figure out their meaning, despite having known these
things inside and out some years back, then they probably need a comment
for the casual reader to explain their purpose.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 52521faf09..3526c8d0b8 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -513,6 +513,28 @@ static void record_df_conflict_files(struct merge_options *o,
 
 struct rename {
 	struct diff_filepair *pair;
+	/*
+	 * Because I keep forgetting every few years what src_entry and
+	 * dst_entry are and have to walk through a debugger and puzzle
+	 * through it to remind myself...
+	 *
+	 * If 'before' is renamed to 'after' then src_entry will contain
+	 * the versions of 'before' from the merge_base, HEAD, and MERGE in
+	 * stages 1, 2, and 3; dst_entry will contain the versions of
+	 * 'after' from the merge_base, HEAD, and MERGE in stages 1, 2, and
+	 * 3.  Thus, we have a total of six modes and oids, though some
+	 * will be null.  (Stage 0 is ignored; we're interested in handling
+	 * conflicts.)
+	 *
+	 * Since we don't turn on break-rewrites by default, neither
+	 * src_entry nor dst_entry can have all three of their stages have
+	 * non-null oids, meaning at most four of the six will be non-null.
+	 * Also, since this is a rename, both src_entry and dst_entry will
+	 * have at least one non-null oid, meaning at least two will be
+	 * non-null.  Of the six oids, a typical rename will have three be
+	 * non-null.  Only two implies a rename/delete, and four implies a
+	 * rename/add.
+	 */
 	struct stage_data *src_entry;
 	struct stage_data *dst_entry;
 	unsigned processed:1;
-- 
2.15.0.5.g9567be9905



* [PATCH 04/30] directory rename detection: basic testcases
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (2 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-13 22:04   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 05/30] directory rename detection: directory splitting testcases Elijah Newren
                   ` (26 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 391 ++++++++++++++++++++++++++++++++++++
 1 file changed, 391 insertions(+)
 create mode 100755 t/t6043-merge-rename-directories.sh

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
new file mode 100755
index 0000000000..b737b0a105
--- /dev/null
+++ b/t/t6043-merge-rename-directories.sh
@@ -0,0 +1,391 @@
+#!/bin/sh
+
+test_description="recursive merge with directory renames"
+# includes checking of many corner cases, with a similar methodology to:
+#   t6042: corner cases with renames but not criss-cross merges
+#   t6036: corner cases with both renames and criss-cross merges
+#
+# The setup for all of them, pictorially, is:
+#
+#      B
+#      o
+#     / \
+#  A o   ?
+#     \ /
+#      o
+#      C
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits A, B, and C contain.
+#
+# Notation:
+#    z/{b,c}   means  files z/b and z/c both exist
+#    x/d_1     means  file x/d exists with content d1.  (Purpose of the
+#                     underscore notation is to differentiate different
+#                     files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###########################################################################
+# SECTION 1: Basic cases we should be able to handle
+###########################################################################
+
+# Testcase 1a, Basic directory rename.
+#   Commit A: z/{b,c}
+#   Commit B: y/{b,c}
+#   Commit C: z/{b,c,d,e/f}
+#   Expected: y/{b,c,d,e/f}
+
+test_expect_success '1a-setup: Simple directory rename detection' '
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo d >z/d &&
+	mkdir z/e &&
+	echo f >z/e/f &&
+	git add z/d z/e/f &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '1a-check: Simple directory rename detection' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 4 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:y/d) = $(git rev-parse C:z/d) &&
+	test "$(git hash-object y/d)" = $(git rev-parse C:z/d) &&
+	test $(git rev-parse HEAD:y/e/f) = $(git rev-parse C:z/e/f) &&
+	test_must_fail git rev-parse HEAD:z/d &&
+	test_must_fail git rev-parse HEAD:z/e/f &&
+	test ! -e z/d &&
+	test ! -e z/e/f
+'
+
+# Testcase 1b, Merge a directory with another
+#   Commit A: z/{b,c},   y/d
+#   Commit B: z/{b,c,e}, y/d
+#   Commit C: y/{b,c,d}
+#   Expected: y/{b,c,d,e}
+
+test_expect_success '1b-setup: Merge a directory with another' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir y &&
+	echo d >y/d &&
+	git add z y &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	echo e >z/e &&
+	git add z/e &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv z/b y &&
+	git mv z/c y &&
+	rmdir z &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '1b-check: Merge a directory with another' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 4 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:y/d) = $(git rev-parse A:y/d) &&
+	test $(git rev-parse HEAD:y/e) = $(git rev-parse B:z/e) &&
+	test_must_fail git rev-parse HEAD:z/e
+'
+
+# Testcase 1c, Transitive renaming
+#   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
+#   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   Commit A: z/{b,c},   x/d
+#   Commit B: y/{b,c},   x/d
+#   Commit C: z/{b,c,d}
+#   Expected: y/{b,c,d}  (because x/d -> z/d -> y/d)
+
+test_expect_success '1c-setup: Transitive renaming' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir x &&
+	echo d >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/d z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '1c-check: Transitive renaming' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:y/d) = $(git rev-parse A:x/d) &&
+	test_must_fail git rev-parse HEAD:x/d &&
+	test_must_fail git rev-parse HEAD:z/d &&
+	test ! -f z/d
+'
+
+# Testcase 1d, Directory renames (merging two directories into one new one)
+#              cause a rename/rename(2to1) conflict
+#   (Related to testcases 1c and 7b)
+#   Commit A: z/{b,c},        y/{d,e}
+#   Commit B: x/{b,c},        y/{d,e,m,wham}
+#   Commit C: z/{b,c,n,wham}, x/{d,e}
+#   Expected: x/{b,c,d,e,m,n}, CONFLICT:(y/wham & z/wham -> x/wham)
+#   Note: y/m & z/n should definitely move into x.  By the same token, both
+#         y/wham & z/wham should too...giving us a conflict.
+
+test_expect_success '1d-setup: Directory renames cause a rename/rename(2to1) conflict' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir y &&
+	echo d >y/d &&
+	echo e >y/e &&
+	git add z y &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z x &&
+	echo m >y/m &&
+	echo wham1 >y/wham &&
+	git add y &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv y x &&
+	echo n >z/n &&
+	echo wham2 >z/wham &&
+	git add z &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '1d-check: Directory renames cause a rename/rename(2to1) conflict' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (rename/rename)" out &&
+
+	test 8 -eq $(git ls-files -s | wc -l) &&
+	test 2 -eq $(git ls-files -u | wc -l) &&
+	test 3 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:x/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:x/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :0:x/d) = $(git rev-parse A:y/d) &&
+	test $(git rev-parse :0:x/e) = $(git rev-parse A:y/e) &&
+	test $(git rev-parse :0:x/m) = $(git rev-parse B:y/m) &&
+	test $(git rev-parse :0:x/n) = $(git rev-parse C:z/n) &&
+
+	test_must_fail git rev-parse :0:x/wham &&
+	test $(git rev-parse :2:x/wham) = $(git rev-parse B:y/wham) &&
+	test $(git rev-parse :3:x/wham) = $(git rev-parse C:z/wham) &&
+
+	test ! -f x/wham &&
+	test -f x/wham~HEAD &&
+	test -f x/wham~C^0 &&
+
+	test $(git hash-object x/wham~HEAD) = $(git rev-parse B:y/wham) &&
+	test $(git hash-object x/wham~C^0) = $(git rev-parse C:z/wham)
+'
+
+# Testcase 1e, Renamed directory, with all filenames being renamed too
+#   Commit A: z/{oldb,oldc}
+#   Commit B: y/{newb,newc}
+#   Commit C: z/{oldb,oldc,d}
+#   Expected: y/{newb,newc,d}
+
+test_expect_success '1e-setup: Renamed directory, with all files being renamed too' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/oldb &&
+	echo c >z/oldc &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	mkdir y &&
+	git mv z/oldb y/newb &&
+	git mv z/oldc y/newc &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo d >z/d &&
+	git add z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '1e-check: Renamed directory, with all files being renamed too' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:y/newb) = $(git rev-parse A:z/oldb) &&
+	test $(git rev-parse HEAD:y/newc) = $(git rev-parse A:z/oldc) &&
+	test $(git rev-parse HEAD:y/d)    = $(git rev-parse C:z/d) &&
+	test_must_fail git rev-parse HEAD:z/d
+'
+
+# Testcase 1f, Split a directory into two other directories
+#   (Related to testcases 3a, all of section 2, and all of section 4)
+#   Commit A: z/{b,c,d,e,f}
+#   Commit B: z/{b,c,d,e,f,g}
+#   Commit C: y/{b,c}, x/{d,e,f}
+#   Expected: y/{b,c}, x/{d,e,f,g}
+
+test_expect_success '1f-setup: Split a directory into two other directories' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	echo d >z/d &&
+	echo e >z/e &&
+	echo f >z/f &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	echo g >z/g &&
+	git add z/g &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	mkdir y &&
+	mkdir x &&
+	git mv z/b y/ &&
+	git mv z/c y/ &&
+	git mv z/d x/ &&
+	git mv z/e x/ &&
+	git mv z/f x/ &&
+	rmdir z &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '1f-check: Split a directory into two other directories' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 6 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:x/d) = $(git rev-parse A:z/d) &&
+	test $(git rev-parse HEAD:x/e) = $(git rev-parse A:z/e) &&
+	test $(git rev-parse HEAD:x/f) = $(git rev-parse A:z/f) &&
+	test $(git rev-parse HEAD:x/g) = $(git rev-parse B:z/g) &&
+	test ! -f z/g &&
+	test_must_fail git rev-parse HEAD:z/g
+'
+
+###########################################################################
+# Rules suggested by testcases in section 1:
+#
+#   We should still detect the directory rename even if it wasn't just
+#   the directory renamed, but the files within it. (see 1b)
+#
+#   If renames split a directory into two or more others, the directory
+#   with the most renames "wins" (see 1f).  However, see the testcases
+#   in section 2, plus testcases 3a and 4a.
+###########################################################################
+
+test_done
-- 
2.15.0.5.g9567be9905



* [PATCH 05/30] directory rename detection: directory splitting testcases
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (3 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 04/30] directory rename detection: basic testcases Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-13 23:20   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 06/30] directory rename detection: testcases to avoid taking detection too far Elijah Newren
                   ` (25 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 125 ++++++++++++++++++++++++++++++++++++
 1 file changed, 125 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index b737b0a105..00811f512a 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -388,4 +388,129 @@ test_expect_failure '1f-check: Split a directory into two other directories' '
 #   in section 2, plus testcases 3a and 4a.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 2: Split into multiple directories, with equal number of paths
+#
+# Explore the splitting-a-directory rules a bit; what happens in the
+# edge cases?
+#
+# Note that there is a closely related case of a directory not being
+# split on either side of history, but being renamed differently on
+# each side.  See testcase 8e for that.
+###########################################################################
+
+# Testcase 2a, Directory split into two on one side, with equal numbers of paths
+#   Commit A: z/{b,c}
+#   Commit B: y/b, w/c
+#   Commit C: z/{b,c,d}
+#   Expected: y/b, w/c, z/d, with warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2a-setup: Directory split into two on one side, with equal numbers of paths' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	mkdir y &&
+	mkdir w &&
+	git mv z/b y/ &&
+	git mv z/c w/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo d >z/d &&
+	git add z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '2a-check: Directory split into two on one side, with equal numbers of paths' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:w/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :0:z/d) = $(git rev-parse C:z/d) &&
+	test_i18ngrep "CONFLICT.*directory rename split" out
+'
+
+# Testcase 2b, Directory split into two on one side, with equal numbers of paths
+#   Commit A: z/{b,c}
+#   Commit B: y/b, w/c
+#   Commit C: z/{b,c}, x/d
+#   Expected: y/b, w/c, x/d; No warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2b-setup: Directory split into two on one side, with equal numbers of paths' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	mkdir y &&
+	mkdir w &&
+	git mv z/b y/ &&
+	git mv z/c w/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	mkdir x &&
+	echo d >x/d &&
+	git add x/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '2b-check: Directory split into two on one side, with equal numbers of paths' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 >out &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:w/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :0:x/d) = $(git rev-parse C:x/d) &&
+	test_i18ngrep ! "CONFLICT.*directory rename split" out
+'
+
+###########################################################################
+# Rules suggested by section 2:
+#
+#   None; the rule was already covered in section 1.  These testcases are
+#   here just to make sure the conflict resolution and necessary warning
+#   messages are handled correctly.
+###########################################################################
+
 test_done
-- 
2.15.0.5.g9567be9905


* [PATCH 06/30] directory rename detection: testcases to avoid taking detection too far
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (4 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 05/30] directory rename detection: directory splitting testcases Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-13 23:25   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 07/30] directory rename detection: partially renamed directory testcase/discussion Elijah Newren
                   ` (24 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
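Illustration only, not part of the patch: the rule at the end of section 3
amounts to a membership check before an implicit directory rename is
applied to a path.  The sketch below models that check in plain shell.
The function name and the renames file (one "old new" pair per line,
collected from both sides of the merge) are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of the section 3 rule.
# eligible_for_dir_rename <path> <renames-file>
# Succeeds only when <path> is NOT the source of any recorded file rename.
eligible_for_dir_rename () {
	path=$1
	renames=$2
	while read -r old new
	do
		# A path that is already a rename source must be left alone.
		test "$old" = "$path" && return 1
	done <"$renames"
	return 0
}
```

In testcase 3b, z/d is renamed to x/d on one side and to w/d on the other,
so the check fails for z/d and it stays out of the z/ -> y/ directory
rename.
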
 t/t6043-merge-rename-directories.sh | 137 ++++++++++++++++++++++++++++++++++++
 1 file changed, 137 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 00811f512a..021513ec00 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -513,4 +513,141 @@ test_expect_success '2b-check: Directory split into two on one side, with equal
 #   messages are handled correctly.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 3: Path in question is the source path for some rename already
+#
+# Combining cases from Section 1 and trying to handle them could lead to
+# directory renaming detection being over-applied.  So, this section
+# provides some good testcases to check that the implementation doesn't go
+# too far.
+###########################################################################
+
+# Testcase 3a, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 1c and 1f)
+#   Commit A: z/{b,c,d}
+#   Commit B: z/{b,c,d} (no change)
+#   Commit C: y/{b,c}, x/d
+#   Expected: y/{b,c}, x/d
+test_expect_success '3a-setup: Avoid implicit rename if involved as source on other side' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	echo d >z/d &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	test_tick &&
+	git commit --allow-empty -m "B" &&
+
+	git checkout C &&
+	mkdir y &&
+	mkdir x &&
+	git mv z/b y/ &&
+	git mv z/c y/ &&
+	git mv z/d x/ &&
+	rmdir z &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '3a-check: Avoid implicit rename if involved as source on other side' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:x/d) = $(git rev-parse A:z/d)
+'
+
+# Testcase 3b, Avoid implicit rename if involved as source on current side
+#   (Related to testcases 5c and 7c, also kind of 1e and 1f)
+#   Commit A: z/{b,c,d}
+#   Commit B: y/{b,c}, x/d
+#   Commit C: z/{b,c}, w/d
+#   Expected: y/{b,c}, CONFLICT:(z/d -> x/d vs. w/d)
+#   NOTE: We're particularly checking that, since z/d is already involved as
+#         a source in a file rename on the same side of history, we don't
+#         get it involved in directory rename detection.  If it were, we might
+#         end up with CONFLICT:(z/d -> y/d vs. x/d vs. w/d), i.e. a
+#         rename/rename/rename(1to3) conflict, which is just weird.
+test_expect_success '3b-setup: Avoid implicit rename if involved as source on current side' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	echo d >z/d &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	mkdir y &&
+	mkdir x &&
+	git mv z/b y/ &&
+	git mv z/c y/ &&
+	git mv z/d x/ &&
+	rmdir z &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	mkdir w &&
+	git mv z/d w/ &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '3b-check: Avoid implicit rename if involved as source on current side' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+
+	test 5 -eq $(git ls-files -s | wc -l) &&
+	test 3 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+
+	test $(git rev-parse :1:z/d) = $(git rev-parse A:z/d) &&
+	test $(git rev-parse :2:x/d) = $(git rev-parse A:z/d) &&
+	test $(git rev-parse :3:w/d) = $(git rev-parse A:z/d) &&
+	test ! -f z/d &&
+	test $(git hash-object x/d) = $(git rev-parse A:z/d) &&
+	test $(git hash-object w/d) = $(git rev-parse A:z/d) &&
+
+	test_i18ngrep "CONFLICT.*rename/rename.*z/d.*x/d.*w/d" out &&
+	test_i18ngrep ! "CONFLICT.*rename/rename.*y/d" out
+'
+
+###########################################################################
+# Rules suggested by section 3:
+#
+#   Avoid directory-rename-detection for a path, if that path is the source
+#   of a rename on either side of a merge.
+###########################################################################
+
 test_done
-- 
2.15.0.5.g9567be9905


* [PATCH 07/30] directory rename detection: partially renamed directory testcase/discussion
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (5 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 06/30] directory rename detection: testcases to avoid taking detection too far Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14  0:07   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 08/30] directory rename detection: files/directories in the way of some renames Elijah Newren
                   ` (23 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
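Illustration only, not part of the patch: the rule suggested by section 4
can be modeled as a check on the two sides' path lists.  The sketch below
is plain shell over files containing one path per line; has_dir and
dir_rename_allowed are hypothetical names, and nothing here reflects how
merge-recursive stores its data.

```shell
#!/bin/sh
# Hypothetical sketch of the section 4 rule.

# has_dir <dir> <path-list-file>
# Succeeds if any listed path lives under <dir>/.
has_dir () {
	dir=$1
	list=$2
	while read -r p
	do
		case "$p" in
		"$dir"/*) return 0 ;;
		esac
	done <"$list"
	return 1
}

# dir_rename_allowed <source-dir> <side1-paths> <side2-paths>
# Fails when the source directory still exists on both sides.
dir_rename_allowed () {
	if has_dir "$1" "$2" && has_dir "$1" "$3"
	then
		return 1
	fi
	return 0
}
```

For testcase 4a, side B still has z/e and side C has all of z/, so the
check fails for z and z/f is not moved to y/.
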
 t/t6043-merge-rename-directories.sh | 100 ++++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 021513ec00..ec054b210a 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -650,4 +650,104 @@ test_expect_success '3b-check: Avoid implicit rename if involved as source on cu
 #   of a rename on either side of a merge.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 4: Partially renamed directory; still exists on both sides of merge
+#
+# What if we were to attempt to do directory rename detection when someone
+# "mostly" moved a directory but still left some files around, or,
+# equivalently, fully renamed a directory in one commit and then recreated
+# that directory in a later commit adding some new files and then tried to
+# merge?
+#
+# It's hard to divine user intent in these cases: depending on the
+# intermediate history of the side being merged, some users will want
+# files in that directory to be detected and renamed automatically,
+# while users with a different intermediate history wouldn't want that
+# rename to happen.
+#
+# I think that it is best to simply not have directory rename detection
+# apply to such cases.  My reasoning for this is four-fold: (1) it's
+# easiest for users in general to figure out what happened if we don't
+# apply directory rename detection in any such case, (2) it's an easy rule
+# to explain ["We don't do directory rename detection if the directory
+# still exists on both sides of the merge"], (3) we can get some hairy
+# edge/corner cases that would be really confusing and possibly not even
+# representable in the index if we were to even try, and [related to 3] (4)
+# attempting to resolve this issue of divining user intent by examining
+# intermediate history goes against the spirit of three-way merges and is a
+# path towards crazy corner cases that are far more complex than what we're
+# already dealing with.
+#
+# This section contains a test for this partially-renamed-directory case.
+###########################################################################
+
+# Testcase 4a, Directory split, with original directory still present
+#   (Related to testcase 1f)
+#   Commit A: z/{b,c,d,e}
+#   Commit B: y/{b,c,d}, z/e
+#   Commit C: z/{b,c,d,e,f}
+#   Expected: y/{b,c,d}, z/{e,f}
+#   NOTE: Even though most files from z moved to y, we don't want f to follow.
+
+test_expect_success '4a-setup: Directory split, with original directory still present' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	echo d >z/d &&
+	echo e >z/e &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	mkdir y &&
+	git mv z/b y/ &&
+	git mv z/c y/ &&
+	git mv z/d y/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo f >z/f &&
+	git add z/f &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '4a-check: Directory split, with original directory still present' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 5 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:y/d) = $(git rev-parse A:z/d) &&
+	test $(git rev-parse HEAD:z/e) = $(git rev-parse A:z/e) &&
+	test $(git rev-parse HEAD:z/f) = $(git rev-parse C:z/f)
+'
+
+###########################################################################
+# Rules suggested by section 4:
+#
+#   Directory-rename-detection should be turned off for any directories (as
+#   a source for renames) that exist on both sides of the merge.  (The "as
+#   a source for renames" clarification is due to cases like 1c where
+#   the target directory exists on both sides and we do want the rename
+#   detection.)  But, sadly, see testcase 8b.
+###########################################################################
+
 test_done
-- 
2.15.0.5.g9567be9905


* [PATCH 08/30] directory rename detection: files/directories in the way of some renames
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (6 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 07/30] directory rename detection: partially renamed directory testcase/discussion Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14  0:15   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 09/30] directory rename detection: testcases checking which side did the rename Elijah Newren
                   ` (22 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
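Illustration only, not part of the patch: the rule suggested by section 5
is applied per path, so one blocked path does not stop its siblings from
following the directory rename.  The sketch below models that in plain
shell.  The function names and the flat "occupied" list (one path per
line) are hypothetical simplifications; which side of history a blocking
path comes from is ignored here.

```shell
#!/bin/sh
# Hypothetical sketch of the section 5 rule.

# in_the_way <target> <occupied-list-file>
# Succeeds if <target> is taken by a file, or by a directory of that name.
in_the_way () {
	while read -r q
	do
		case "$q" in
		"$1"|"$1"/*) return 0 ;;
		esac
	done <"$2"
	return 1
}

# apply_dir_rename <old-dir> <new-dir> <occupied-list-file>
# stdin: candidate paths; stdout: where each path ends up.
apply_dir_rename () {
	old=$1
	new=$2
	occupied=$3
	while read -r p
	do
		case "$p" in
		"$old"/*)
			target=$new/${p#"$old"/}
			if in_the_way "$target" "$occupied"
			then
				echo "$p"       # blocked: fall back to the old path
			else
				echo "$target"  # free: follow the directory rename
			fi
			;;
		*)
			echo "$p"           # not under the renamed directory
			;;
		esac
	done
}
```

For testcase 5a, with y/b, y/c, y/d and y/e occupied, z/e stays where it
is and z/f becomes y/f.  For 5d, y/d/e counts as being in the way of
z/d -> y/d, while z/f still moves.
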
 t/t6043-merge-rename-directories.sh | 303 ++++++++++++++++++++++++++++++++++++
 1 file changed, 303 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index ec054b210a..d15153c652 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -750,4 +750,307 @@ test_expect_success '4a-check: Directory split, with original directory still pr
 #   detection.)  But, sadly, see testcase 8b.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 5: Files/directories in the way of subset of to-be-renamed paths
+#
+# Implicitly renaming files due to a detected directory rename could run
+# into problems if there are files or directories in the way of the paths
+# we want to rename.  Explore such cases in this section.
+###########################################################################
+
+# Testcase 5a, Merge directories, other side adds files to original and target
+#   Commit A: z/{b,c},       y/d
+#   Commit B: z/{b,c,e_1,f}, y/{d,e_2}
+#   Commit C: y/{b,c,d}
+#   Expected: z/e_1, y/{b,c,d,e_2,f} + CONFLICT warning
+#   NOTE: While directory rename detection is active here causing z/f to
+#         become y/f, we did not apply this for z/e_1 because that would
+#         give us an add/add conflict for y/e_1 vs y/e_2.  The problem with
+#         this add/add is that both versions of y/e are from the same side
+#         of history, giving us no way to represent this conflict in the
+#         index.
+
+test_expect_success '5a-setup: Merge directories, other side adds files to original and target' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir y &&
+	echo d >y/d &&
+	git add z y &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	echo e1 >z/e &&
+	echo f >z/f &&
+	echo e2 >y/e &&
+	git add z/e z/f y/e &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv z/b y/ &&
+	git mv z/c y/ &&
+	rmdir z &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '5a-check: Merge directories, other side adds files to original and target' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+
+	test 6 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :0:y/d) = $(git rev-parse A:y/d) &&
+
+	test $(git rev-parse :0:y/e) = $(git rev-parse B:y/e) &&
+	test $(git rev-parse :0:z/e) = $(git rev-parse B:z/e) &&
+
+	test $(git rev-parse :0:y/f) = $(git rev-parse B:z/f) &&
+
+	test_i18ngrep "CONFLICT.*implicit dir rename" out
+'
+
+# Testcase 5b, Rename/delete in order to get add/add/add conflict
+#   (Related to testcase 8d; these may appear slightly inconsistent to users.
+#    Also related to testcases 7d and 7e)
+#   Commit A: z/{b,c,d_1}
+#   Commit B: y/{b,c,d_2}
+#   Commit C: z/{b,c,d_1,e}, y/d_3
+#   Expected: y/{b,c,e}, CONFLICT(add/add: y/d_2 vs. y/d_3)
+#   NOTE: If z/d_1 in commit C were to be involved in dir rename detection, as
+#         we normally would since z/ is being renamed to y/, then this would be
+#         a rename/delete (z/d_1 -> y/d_1 vs. deleted) AND an add/add/add
+#         conflict of y/d_1 vs. y/d_2 vs. y/d_3.  Add/add/add is not
+#         representable in the index, so the existence of y/d_3 needs to
+#         cause us to bail on directory rename detection for that path, falling
+#         back to git behavior without the directory rename detection.
+
+test_expect_success '5b-setup: Rename/delete in order to get add/add/add conflict' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	echo d1 >z/d &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git rm z/d &&
+	git mv z y &&
+	echo d2 >y/d &&
+	git add y/d &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	mkdir y &&
+	echo d3 >y/d &&
+	echo e >z/e &&
+	git add y/d z/e &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '5b-check: Rename/delete in order to get add/add/add conflict' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (add/add).* y/d" out &&
+
+	test 5 -eq $(git ls-files -s | wc -l) &&
+	test 2 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :0:y/e) = $(git rev-parse C:z/e) &&
+
+	test_must_fail git rev-parse :1:y/d &&
+	test $(git rev-parse :2:y/d) = $(git rev-parse B:y/d) &&
+	test $(git rev-parse :3:y/d) = $(git rev-parse C:y/d) &&
+	test -f y/d
+'
+
+# Testcase 5c, Transitive rename would cause rename/rename/rename/add/add/add
+#   (Directory rename detection would result in transitive rename vs.
+#    rename/rename(1to2) and turn it into a rename/rename(1to3).  Further,
+#    rename paths conflict with separate adds on the other side)
+#   (Related to testcases 3b and 7c)
+#   Commit A: z/{b,c}, x/d_1
+#   Commit B: y/{b,c,d_2}, w/d_1
+#   Commit C: z/{b,c,d_1,e}, w/d_3, y/d_4
+#   Expected: A mess, but only a rename/rename(1to2)/add/add mess.  Use the
+#             presence of y/d_4 in C to avoid doing transitive rename of
+#             x/d_1 -> z/d_1 -> y/d_1, so that the only paths we have at
+#             y/d are y/d_2 and y/d_4.  We still do the move from z/e to y/e,
+#             though, because it doesn't have anything in the way.
+
+test_expect_success '5c-setup: Transitive rename would cause rename/rename/rename/add/add/add' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir x &&
+	echo d1 >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	echo d2 >y/d &&
+	git add y/d &&
+	git mv x w &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/d z/ &&
+	mkdir w &&
+	mkdir y &&
+	echo d3 >w/d &&
+	echo d4 >y/d &&
+	echo e >z/e &&
+	git add w/ y/ z/e &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '5c-check: Transitive rename would cause rename/rename/rename/add/add/add' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (rename/rename).*x/d.*w/d.*z/d" out &&
+	test_i18ngrep "CONFLICT (add/add).* y/d" out &&
+
+	test 9 -eq $(git ls-files -s | wc -l) &&
+	test 6 -eq $(git ls-files -u | wc -l) &&
+	test 3 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :0:y/e) = $(git rev-parse C:z/e) &&
+
+	test_must_fail git rev-parse :1:y/d &&
+	test $(git rev-parse :2:w/d) = $(git rev-parse A:x/d) &&
+	test $(git rev-parse :3:w/d) = $(git rev-parse C:w/d) &&
+	test $(git rev-parse :1:x/d) = $(git rev-parse A:x/d) &&
+	test $(git rev-parse :2:y/d) = $(git rev-parse B:y/d) &&
+	test $(git rev-parse :3:y/d) = $(git rev-parse C:y/d) &&
+	test $(git rev-parse :3:z/d) = $(git rev-parse A:x/d) &&
+
+	test $(git hash-object w/d~HEAD) = $(git rev-parse A:x/d) &&
+	test $(git hash-object w/d~C^0) = $(git rev-parse C:w/d) &&
+	test ! -f x/d &&
+	test -f y/d &&
+	grep -q "<<<<" y/d &&  # conflict markers should be present
+	test $(git hash-object z/d) = $(git rev-parse A:x/d)
+'
+
+# Testcase 5d, Directory/file/file conflict due to directory rename
+#   Commit A: z/{b,c}
+#   Commit B: y/{b,c,d_1}
+#   Commit C: z/{b,c,d_2,f}, y/d/e
+#   Expected: y/{b,c,d/e,f}, z/d_2, CONFLICT(file/directory), y/d_1~HEAD
+#   Note: The fact that y/d/ exists in C makes us bail on directory rename
+#         detection for z/d_2, but that doesn't prevent us from applying the
+#         directory rename detection for z/f -> y/f.
+
+test_expect_success '5d-setup: Directory/file/file conflict due to directory rename' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	echo d1 >y/d &&
+	git add y/d &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	mkdir -p y/d &&
+	echo e >y/d/e &&
+	echo d2 >z/d &&
+	echo f >z/f &&
+	git add y/d/e z/d z/f &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '5d-check: Directory/file/file conflict due to directory rename' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (file/directory).*y/d" out &&
+
+	test 6 -eq $(git ls-files -s | wc -l) &&
+	test 1 -eq $(git ls-files -u | wc -l) &&
+	test 2 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :0:z/d) = $(git rev-parse C:z/d) &&
+	test $(git rev-parse :0:y/f) = $(git rev-parse C:z/f) &&
+
+	test $(git rev-parse :2:y/d) = $(git rev-parse B:y/d) &&
+	test $(git rev-parse :0:y/d/e) = $(git rev-parse C:y/d/e) &&
+
+	test $(git hash-object y/d~HEAD) = $(git rev-parse B:y/d)
+'
+
+###########################################################################
+# Rules suggested by section 5:
+#
+#   If a subset of to-be-renamed files have a file or directory in the way,
+#   "turn off" the directory rename for those specific sub-paths, falling
+#   back to old handling.  But, sadly, see testcases 8a and 8b.
+###########################################################################
+
 test_done
-- 
2.15.0.5.g9567be9905


* [PATCH 09/30] directory rename detection: testcases checking which side did the rename
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (7 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 08/30] directory rename detection: files/directories in the way of some renames Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14  0:25   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 10/30] directory rename detection: more involved edge/corner testcases Elijah Newren
                   ` (21 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 283 ++++++++++++++++++++++++++++++++++++
 1 file changed, 283 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index d15153c652..157299105f 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1053,4 +1053,287 @@ test_expect_failure '5d-check: Directory/file/file conflict due to directory ren
 #   back to old handling.  But, sadly, see testcases 8a and 8b.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 6: Same side of the merge was the one that did the rename
+#
+# It may sound obvious that you only want to apply implicit directory
+# renames to directories if the _other_ side of history did the renaming.
+# If you did make an implementation that didn't explicitly enforce this
+# rule, the majority of cases that would fall under this section would
+# also be solved by following the rules from the above sections.  But
+# there are still a few that stick out, so this section covers them just
+# to make sure we also get them right.
+###########################################################################
+
+# Testcase 6a, Tricky rename/delete
+#   Commit A: z/{b,c,d}
+#   Commit B: z/b
+#   Commit C: y/{b,c}, z/d
+#   Expected: y/b, CONFLICT(rename/delete, z/c -> y/c vs. NULL)
+#   Note: We're just checking here that the rename of z/b and z/c to put
+#         them under y/ doesn't accidentally catch z/d and make it look like
+#         it is also involved in a rename/delete conflict.
+
+test_expect_success '6a-setup: Tricky rename/delete' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	echo d >z/d &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git rm z/c &&
+	git rm z/d &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	mkdir y &&
+	git mv z/b y/ &&
+	git mv z/c y/ &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '6a-check: Tricky rename/delete' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (rename/delete).*z/c.*y/c" out &&
+
+	test 2 -eq $(git ls-files -s | wc -l) &&
+	test 1 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :3:y/c) = $(git rev-parse A:z/c)
+'
+
+# Testcase 6b, Same rename done on both sides
+#   (Related to testcases 6c and 8e)
+#   Commit A: z/{b,c}
+#   Commit B: y/{b,c}
+#   Commit C: y/{b,c}, z/d
+#   Note: If we did directory rename detection here, we'd move z/d into y/,
+#         but C did that rename and still decided to put the file into z/,
+#         so we probably shouldn't apply directory rename detection for it.
+
+test_expect_success '6b-setup: Same rename done on both sides' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv z y &&
+	mkdir z &&
+	echo d >z/d &&
+	git add z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '6b-check: Same rename done on both sides' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:z/d) = $(git rev-parse C:z/d)
+'
+
+# Testcase 6c, Rename only done on same side
+#   (Related to testcases 6b and 8e)
+#   Commit A: z/{b,c}
+#   Commit B: z/{b,c} (no change)
+#   Commit C: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   NOTE: Seems obvious, but just checking that the implementation doesn't
+#         "accidentally detect a rename" and give us y/{b,c,d}.
+
+test_expect_success '6c-setup: Rename only done on same side' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	test_tick &&
+	git commit --allow-empty -m "B" &&
+
+	git checkout C &&
+	git mv z y &&
+	mkdir z &&
+	echo d >z/d &&
+	git add z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '6c-check: Rename only done on same side' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:z/d) = $(git rev-parse C:z/d)
+'
+
+# Testcase 6d, We don't always want transitive renaming
+#   (Related to testcase 1c)
+#   Commit A: z/{b,c}, x/d
+#   Commit B: z/{b,c}, x/d (no change)
+#   Commit C: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   NOTE: Again, this seems obvious but just checking that the implementation
+#         doesn't "accidentally detect a rename" and give us y/{b,c,d}.
+
+test_expect_success '6d-setup: We do not always want transitive renaming' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir x &&
+	echo d >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	test_tick &&
+	git commit --allow-empty -m "B" &&
+
+	git checkout C &&
+	git mv z y &&
+	git mv x z &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '6d-check: We do not always want transitive renaming' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:z/d) = $(git rev-parse A:x/d)
+'
+
+# Testcase 6e, Add/add from one-side
+#   Commit A: z/{b,c}
+#   Commit B: z/{b,c} (no change)
+#   Commit C: y/{b,c,d_1}, z/d_2
+#   Expected: y/{b,c,d_1}, z/d_2
+#   NOTE: Again, this seems obvious but just checking that the implementation
+#         doesn't "accidentally detect a rename" and give us y/{b,c} +
+#         add/add conflict on y/d_1 vs y/d_2.
+
+test_expect_success '6e-setup: Add/add from one side' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	test_tick &&
+	git commit --allow-empty -m "B" &&
+
+	git checkout C &&
+	git mv z y &&
+	echo d1 > y/d &&
+	mkdir z &&
+	echo d2 > z/d &&
+	git add y/d z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '6e-check: Add/add from one side' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 4 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:y/d) = $(git rev-parse C:y/d) &&
+	test $(git rev-parse HEAD:z/d) = $(git rev-parse C:z/d)
+'
+
 test_done
-- 
2.15.0.5.g9567be9905



* [PATCH 10/30] directory rename detection: more involved edge/corner testcases
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (8 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 09/30] directory rename detection: testcases checking which side did the rename Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14  0:42   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges Elijah Newren
                   ` (20 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
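Several testcases in this patch (7b, 7c, 7d, 7e) rely on a "transitive
rename": one side renames a file into a directory, and the other side
renames that directory.  A minimal sketch of the path rewrite involved,
assuming a single known directory rename; apply_dir_rename is a
hypothetical helper for illustration and is not part of git or of the
test suite:

```shell
# Hypothetical helper: rewrite a path according to one directory rename
# (old_dir -> new_dir).  Paths outside old_dir are printed unchanged.
apply_dir_rename () {
	path=$1
	old_dir=$2
	new_dir=$3
	case "$path" in
	"$old_dir"/*)
		# Strip the "old_dir/" prefix and graft the rest onto new_dir.
		printf '%s\n' "$new_dir/${path#"$old_dir"/}"
		;;
	*)
		printf '%s\n' "$path"
		;;
	esac
}

# Testcase 7d: commit C renames x/d -> z/d, commit B renames z/ -> y/,
# so the conflict should be reported against y/d rather than z/d.
apply_dir_rename z/d z y
```

The match is on a whole leading path component, so a path such as zz/d
is left alone when the rename is z -> y.
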
 t/t6043-merge-rename-directories.sh | 347 ++++++++++++++++++++++++++++++++++++
 1 file changed, 347 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 157299105f..115d0d2622 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1336,4 +1336,351 @@ test_expect_success '6e-check: Add/add from one side' '
 	test $(git rev-parse HEAD:z/d) = $(git rev-parse C:z/d)
 '
 
+
+###########################################################################
+# SECTION 7: More involved Edge/Corner cases
+#
+# The ruleset we have generated in the above sections seems to provide
+# well-defined merges.  But can we find edge/corner cases that either (a)
+# are harder for users to understand, or (b) have a resolution that is
+# non-intuitive or suboptimal?
+#
+# The testcases in this section dive into cases that I've tried to craft in
+# a way to find some that might be surprising to users or difficult for
+# them to understand (the next section will look at non-intuitive or
+# suboptimal merge results).  Some of the testcases are similar to ones
+# from past sections, but have been simplified to try to highlight error
+# messages using a "modified" path (due to the directory rename).  Are
+# users okay with these?
+#
+# In my opinion, any testcases in this section that are difficult to
+# understand are difficult because of the testcase itself rather than the
+# directory renaming (similar to how t6042 and t6036 have difficult
+# resolutions due to the problem setup itself being complex).  And I don't
+# think the error messages are a problem.
+#
+# On the other hand, the testcases in section 8 worry me slightly more...
+###########################################################################
+
+# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
+#   Commit A: z/{b,c}
+#   Commit B: y/{b,c}
+#   Commit C: w/b, x/c, z/d
+#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
+#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.
+
+test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	mkdir w &&
+	mkdir x &&
+	git mv z/b w/ &&
+	git mv z/c x/ &&
+	echo d > z/d &&
+	git add z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (rename/rename).*z/b.*y/b.*w/b" out &&
+	test_i18ngrep "CONFLICT (rename/rename).*z/c.*y/c.*x/c" out &&
+
+	test 7 -eq $(git ls-files -s | wc -l) &&
+	test 6 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/d) = $(git rev-parse C:z/d) &&
+
+	test $(git rev-parse :1:z/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :2:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :3:w/b) = $(git rev-parse A:z/b) &&
+
+	test $(git rev-parse :1:z/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :2:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :3:x/c) = $(git rev-parse A:z/c) &&
+
+	test $(git hash-object y/b) = $(git rev-parse A:z/b) &&
+	test $(git hash-object w/b) = $(git rev-parse A:z/b) &&
+	test $(git hash-object y/c) = $(git rev-parse A:z/c) &&
+	test $(git hash-object x/c) = $(git rev-parse A:z/c)
+'
+
+# Testcase 7b, rename/rename(2to1), but only due to transitive rename
+#   (Related to testcase 1d)
+#   Commit A: z/{b,c},     x/d_1, w/d_2
+#   Commit B: y/{b,c,d_2}, x/d_1
+#   Commit C: z/{b,c,d_1},        w/d_2
+#   Expected: y/{b,c}, CONFLICT(rename/rename(2to1): x/d_1, w/d_2 -> y/d)
+
+test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive rename' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	mkdir x &&
+	mkdir w &&
+	echo b >z/b &&
+	echo c >z/c &&
+	echo d1 > x/d &&
+	echo d2 > w/d &&
+	git add z x w &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	git mv w/d y/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/d z/ &&
+	rmdir x &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '7b-check: rename/rename(2to1), but only due to transitive rename' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (rename/rename)" out &&
+
+	test 4 -eq $(git ls-files -s | wc -l) &&
+	test 2 -eq $(git ls-files -u | wc -l) &&
+	test 3 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+
+	test $(git rev-parse :2:y/d) = $(git rev-parse A:w/d) &&
+	test $(git rev-parse :3:y/d) = $(git rev-parse A:x/d) &&
+
+	test ! -f y/d &&
+	test -f y/d~HEAD &&
+	test -f y/d~C^0 &&
+
+	test $(git hash-object y/d~HEAD) = $(git rev-parse A:w/d) &&
+	test $(git hash-object y/d~C^0) = $(git rev-parse A:x/d)
+'
+
+# Testcase 7c, rename/rename(1to...2or3); transitive rename may add complexity
+#   (Related to testcases 3b and 5c)
+#   Commit A: z/{b,c}, x/d
+#   Commit B: y/{b,c}, w/d
+#   Commit C: z/{b,c,d}
+#   Expected: y/{b,c}, CONFLICT(x/d -> w/d vs. y/d)
+#   NOTE: z/ was renamed to y/ so we do not want to report
+#         either CONFLICT(x/d -> w/d vs. z/d)
+#         or CONFLICT(x/d -> w/d vs. y/d vs. z/d)
+
+test_expect_success '7c-setup: rename/rename(1to...2or3); transitive rename may add complexity' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir x &&
+	echo d >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	git mv x w &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/d z/ &&
+	rmdir x &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '7c-check: rename/rename(1to...2or3); transitive rename may add complexity' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (rename/rename).*x/d.*w/d.*y/d" out &&
+
+	test 5 -eq $(git ls-files -s | wc -l) &&
+	test 3 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+
+	test $(git rev-parse :1:x/d) = $(git rev-parse A:x/d) &&
+	test $(git rev-parse :2:w/d) = $(git rev-parse A:x/d) &&
+	test $(git rev-parse :3:y/d) = $(git rev-parse A:x/d)
+'
+
+# Testcase 7d, transitive rename involved in rename/delete; how is it reported?
+#   (Related somewhat to testcases 5b and 8d)
+#   Commit A: z/{b,c}, x/d
+#   Commit B: y/{b,c}
+#   Commit C: z/{b,c,d}
+#   Expected: y/{b,c}, CONFLICT(delete x/d vs rename to y/d)
+#   NOTE: z->y so NOT CONFLICT(delete x/d vs rename to z/d)
+
+test_expect_success '7d-setup: transitive rename involved in rename/delete; how is it reported?' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir x &&
+	echo d >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	git rm -rf x &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/d z/ &&
+	rmdir x &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '7d-check: transitive rename involved in rename/delete; how is it reported?' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (rename/delete).*x/d.*y/d" out &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 1 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :3:y/d) = $(git rev-parse A:x/d)
+'
+
+# Testcase 7e, transitive rename in rename/delete AND dirs in the way
+#   (Very similar to 'both rename source and destination involved in D/F conflict' from t6022-merge-rename.sh)
+#   (Also related to testcases 9c and 9d)
+#   Commit A: z/{b,c},     x/d_1
+#   Commit B: y/{b,c,d/g}, x/d/f
+#   Commit C: z/{b,c,d_1}
+#   Expected: rename/delete(x/d_1->y/d_1 vs. None) + D/F conflict on y/d
+#             y/{b,c,d/g}, y/d_1~C^0, x/d/f
+#   NOTE: x/d/f may be slightly confusing here.  x/d_1 -> z/d_1 implies
+#         there is a directory rename from x/ -> z/, performed by commit C.
+#         However, commit B renamed z/ -> y/, so applying a rename from
+#         x/ -> z/ on a side that was getting rid of z/ seems nonsensical.
+#         Further, putting x/d/f into y/d/f also doesn't make a lot of
+#         sense because commit B both renamed z to y and created x/d/f,
+#         clearly keeping these things separate, so it doesn't make much
+#         sense to push them together.
+
+test_expect_success '7e-setup: transitive rename in rename/delete AND dirs in the way' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir x &&
+	echo d1 >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	git rm x/d &&
+	mkdir -p x/d &&
+	mkdir -p y/d &&
+	echo f >x/d/f &&
+	echo g >y/d/g &&
+	git add x/d/f y/d/g &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/d z/ &&
+	rmdir x &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in the way' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (rename/delete).*x/d.*y/d" out &&
+
+	test 5 -eq $(git ls-files -s | wc -l) &&
+	test 1 -eq $(git ls-files -u | wc -l) &&
+	test 2 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:x/d/f) = $(git rev-parse B:x/d/f) &&
+	test $(git rev-parse :0:y/d/g) = $(git rev-parse B:y/d/g) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :3:y/d) = $(git rev-parse A:x/d) &&
+
+	test $(git hash-object y/d~C^0) = $(git rev-parse A:x/d)
+'
+
 test_done
-- 
2.15.0.5.g9567be9905



* [PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (9 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 10/30] directory rename detection: more involved edge/corner testcases Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14 20:33   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 12/30] directory rename detection: miscellaneous testcases to complete coverage Elijah Newren
                   ` (19 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
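The notes for testcase 8e hinge on one observation: a directory only
counts as renamed on a given side if it no longer exists on that side.
A minimal sketch of that check over a plain list of paths;
dir_still_present is a hypothetical helper for illustration, not
something git or the test suite provides:

```shell
# Hypothetical helper: read one path per line on stdin and succeed if
# any of them lives under the given directory.
dir_still_present () {
	dir=$1
	while IFS= read -r path
	do
		case "$path" in
		"$dir"/*)
			return 0
			;;
		esac
	done
	return 1
}

# Paths in commit C of testcase 8e: w/{b,c}, z/d
if printf '%s\n' w/b w/c z/d | dir_still_present z
then
	echo "z/ still exists on this side, so z/ was not renamed here"
fi
```

With commit B's paths (y/b, y/c) the check fails, which is consistent
with z/ having been renamed to y/ on that side.
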
 t/t6043-merge-rename-directories.sh | 371 ++++++++++++++++++++++++++++++++++++
 1 file changed, 371 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 115d0d2622..bdfd943c88 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1683,4 +1683,375 @@ test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in th
 	test $(git hash-object y/d~C^0) = $(git rev-parse A:x/d)
 '
 
+
+###########################################################################
+# SECTION 8: Suboptimal merges
+#
+# As alluded to in the last section, the ruleset we have built up for
+# detecting directory renames unfortunately has some special cases where it
+# results in slightly suboptimal or non-intuitive behavior.  This section
+# explores these cases.
+#
+# To be fair, we already had non-intuitive or suboptimal behavior for most
+# of these cases in git before introducing implicit directory rename
+# detection, but it'd be nice if there was a modified ruleset out there
+# that handled these cases a bit better.
+###########################################################################
+
+# Testcase 8a, Dual-directory rename, one into the other's way
+#   Commit A: x/{a,b},   y/{c,d}
+#   Commit B: x/{a,b,e}, y/{c,d,f}
+#   Commit C: y/{a,b},   z/{c,d}
+#
+# Possible Resolutions:
+#   Previous git: y/{a,b,f},   z/{c,d},   x/e
+#   Expected:     y/{a,b,e,f}, z/{c,d}
+#   Preferred:    y/{a,b,e},   z/{c,d,f}
+#
+# Note: Both x and y got renamed and it'd be nice to detect both, and we do
+# better with directory rename detection than git did previously, but the
+# simple rule from section 5 prevents me from handling this as optimally as
+# we potentially could.
+
+test_expect_success '8a-setup: Dual-directory rename, one into the others way' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir x &&
+	mkdir y &&
+	echo a >x/a &&
+	echo b >x/b &&
+	echo c >y/c &&
+	echo d >y/d &&
+	git add x y &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	echo e >x/e &&
+	echo f >y/f &&
+	git add x/e y/f &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv y z &&
+	git mv x y &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '8a-check: Dual-directory rename, one into the others way' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 6 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:y/a) = $(git rev-parse A:x/a) &&
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:x/b) &&
+	test $(git rev-parse HEAD:y/e) = $(git rev-parse B:x/e) &&
+	test $(git rev-parse HEAD:y/f) = $(git rev-parse B:y/f) &&
+	test $(git rev-parse HEAD:z/c) = $(git rev-parse A:y/c) &&
+	test $(git rev-parse HEAD:z/d) = $(git rev-parse A:y/d)
+'
+
+# Testcase 8b, Dual-directory rename, one into the other's way, with conflicting filenames
+#   Commit A: x/{a_1,b_1},     y/{a_2,b_2}
+#   Commit B: x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
+#   Commit C: y/{a_1,b_1},     z/{a_2,b_2}
+#
+# Possible Resolutions:
+#   Previous git: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
+#   Scary:        y/{a_1,b_1},     z/{a_2,b_2}, CONFLICT(add/add, e_1 vs. e_2)
+#   Preferred:    y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
+#
+# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
+# y, both are named 'e'.  Without directory rename detection, neither file
+# moves directories.  Implement directory rename detection suboptimally, and
+# you get an add/add conflict, but both files were added in commit B, so this
+# is an add/add conflict where one side of history added both files --
+# something we can't represent in the index.  Obviously, we'd prefer the last
+# resolution, but our previous rules are too coarse to allow it.  Using the
+# rules from both section 4 and section 5 saves us from the Scary resolution,
+# making us fall back to pre-directory-rename-detection behavior for both
+# e_1 and e_2.
+
+test_expect_success '8b-setup: Dual-directory rename, one into the others way, with conflicting filenames' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir x &&
+	mkdir y &&
+	echo a1 >x/a &&
+	echo b1 >x/b &&
+	echo a2 >y/a &&
+	echo b2 >y/b &&
+	git add x y &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	echo e1 >x/e &&
+	echo e2 >y/e &&
+	git add x/e y/e &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv y z &&
+	git mv x y &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '8b-check: Dual-directory rename, one into the others way, with conflicting filenames' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 6 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:y/a) = $(git rev-parse A:x/a) &&
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:x/b) &&
+	test $(git rev-parse HEAD:z/a) = $(git rev-parse A:y/a) &&
+	test $(git rev-parse HEAD:z/b) = $(git rev-parse A:y/b) &&
+	test $(git rev-parse HEAD:x/e) = $(git rev-parse B:x/e) &&
+	test $(git rev-parse HEAD:y/e) = $(git rev-parse B:y/e)
+'
+
+# Testcase 8c, rename+modify/delete
+#   (Related to testcases 5b and 8d)
+#   Commit A: z/{b,c,d}
+#   Commit B: y/{b,c}
+#   Commit C: z/{b,c,d_modified,e}
+#   Expected: y/{b,c,e}, CONFLICT(rename+modify/delete: z/d -> y/d or deleted)
+#
+#   Note: This testcase doesn't present any concerns for me...until you
+#         compare it with testcases 5b and 8d.  See notes in 8d for more
+#         details.
+
+test_expect_success '8c-setup: rename+modify/delete' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	test_seq 1 10 >z/d &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git rm z/d &&
+	git mv z y &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo 11 >z/d &&
+	test_chmod +x z/d &&
+	echo e >z/e &&
+	git add z/d z/e &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '8c-check: rename+modify/delete' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (rename/delete).* z/d.*y/d" out &&
+
+	test 4 -eq $(git ls-files -s | wc -l) &&
+	test 1 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :0:y/e) = $(git rev-parse C:z/e) &&
+
+	test_must_fail git rev-parse :1:y/d &&
+	test_must_fail git rev-parse :2:y/d &&
+	test $(git rev-parse :3:y/d) = $(git rev-parse C:z/d) &&
+	git ls-files -s y/d | grep ^100755 &&
+	test -f y/d
+'
+
+# Testcase 8d, rename/delete...or not?
+#   (Related to testcase 5b; these may appear slightly inconsistent to users;
+#    Also related to testcases 7d and 7e)
+#   Commit A: z/{b,c,d}
+#   Commit B: y/{b,c}
+#   Commit C: z/{b,c,d,e}
+#   Expected: y/{b,c,e}
+#
+#   Note: It would also be somewhat reasonable to resolve this as
+#             y/{b,c,e}, CONFLICT(rename/delete: z/d -> y/d or deleted)
+#   The logic being that the only difference between this testcase and 8c
+#   is that there is no modification to d.  That suggests that instead of a
+#   rename/modify vs. delete conflict, we should just have a rename/delete
+#   conflict, otherwise we are being inconsistent.
+#
+#   However...as far as consistency goes, we didn't report a conflict for
+#   path d_1 in testcase 5b due to a different file being in the way.  So,
+#   we seem to be forced to have cases where users can change things
+#   slightly and get what they may perceive as inconsistent results.  It
+#   would be nice to avoid that, but I'm not sure I see how.
+#
+#   In this case, I'm leaning towards: commit B was the one that deleted z/d
+#   and it did the rename of z to y, so the two "conflicts" (rename vs.
+#   delete) are both coming from commit B, which is nonsensical.  Conflicts
+#   during merging are supposed to be about opposite sides doing things
+#   differently.
+
+test_expect_success '8d-setup: rename/delete...or not?' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	test_seq 1 10 >z/d &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git rm z/d &&
+	git mv z y &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo e >z/e &&
+	git add z/e &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '8d-check: rename/delete...or not?' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:y/e) = $(git rev-parse C:z/e)
+'
+
+# Testcase 8e, Both sides rename, one side adds to original directory
+#   Commit A: z/{b,c}
+#   Commit B: y/{b,c}
+#   Commit C: w/{b,c}, z/d
+#
+# Possible Resolutions:
+#   Previous git: z/d, CONFLICT(z/b -> y/b vs. w/b), CONFLICT(z/c -> y/c vs. w/c)
+#   Expected:     y/d, CONFLICT(z/b -> y/b vs. w/b), CONFLICT(z/c -> y/c vs. w/c)
+#   Preferred:    ??
+#
+# Notes: In commit B, directory z got renamed to y.  In commit C, directory z
+#        did NOT get renamed; the directory is still present; instead it is
+#        considered to have just renamed a subset of paths in directory z
+#        elsewhere.  Therefore, the directory rename done in commit B to z/
+#        applies to z/d and maps it to y/d.
+#
+#        It's possible that users would get confused about this, but what
+#        should we do instead?  Silently leaving it at z/d seems just as bad
+#        or maybe even worse.  Perhaps we could print a big warning about
+#        z/d and how we're moving it to y/d in this case, but when I started
+#        thinking about the ramifications of doing that, I didn't know how to
+#        rule out opening other weird edge and corner cases, so I just punted.
+
+test_expect_success '8e-setup: Both sides rename, one side adds to original directory' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv z w &&
+	mkdir z &&
+	echo d >z/d &&
+	git add z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '8e-check: Both sides rename, one side adds to original directory' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+
+	test 7 -eq $(git ls-files -s | wc -l) &&
+	test 6 -eq $(git ls-files -u | wc -l) &&
+	test 2 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/d) = $(git rev-parse C:z/d) &&
+
+	test $(git rev-parse :1:z/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :2:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :3:w/b) = $(git rev-parse A:z/b) &&
+	test ! -f z/b &&
+	test $(git hash-object y/b) = $(git rev-parse A:z/b) &&
+	test $(git hash-object w/b) = $(git rev-parse A:z/b) &&
+
+	test $(git rev-parse :1:z/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :2:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :3:w/c) = $(git rev-parse A:z/c) &&
+	test ! -f z/c &&
+	test $(git hash-object y/c) = $(git rev-parse A:z/c) &&
+	test $(git hash-object w/c) = $(git rev-parse A:z/c) &&
+
+	test_i18ngrep CONFLICT.*rename/rename.*z/c.*y/c.*w/c out &&
+	test_i18ngrep CONFLICT.*rename/rename.*z/b.*y/b.*w/b out
+'
+
 test_done
-- 
2.15.0.5.g9567be9905



* [PATCH 12/30] directory rename detection: miscellaneous testcases to complete coverage
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (10 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-15 20:03   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 13/30] directory rename detection: tests for handling overwriting untracked files Elijah Newren
                   ` (18 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
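Testcase 9a depends on how the target of a split directory is chosen:
the destination that received the most paths wins.  A minimal sketch of
that counting, assuming rename pairs are given as "old-path new-path"
lines with no spaces in the paths; best_rename_target is a hypothetical
helper for illustration, not git's implementation.  It counts only
files directly inside the source directory, which is the detail the 9a
note is about, and it does not handle ties (it just prints whichever
candidate sorts first):

```shell
# Hypothetical helper: read "old new" rename pairs on stdin and print
# the directory that received the most files from directly inside the
# given source directory.
best_rename_target () {
	src_dir=$1
	while read -r old new
	do
		# Ignore files in subdirectories of src_dir; those belong
		# to the subdirectory's own rename.
		if test "$(dirname "$old")" = "$src_dir"
		then
			dirname "$new"
		fi
	done |
	sort | uniq -c | sort -rn | head -n 1 | awk '{ print $2 }'
}

# Renames made by commit B in testcase 9a.
printf '%s\n' \
	"z/b y/b" "z/c y/c" \
	"z/d/e x/w/e" "z/d/f x/w/f" "z/d/g x/w/g" |
best_rename_target z
```

Counting every path below z/ instead would give x/w three votes against
two for y, which is the naive behavior the 9a note warns about.
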
 t/t6043-merge-rename-directories.sh | 505 ++++++++++++++++++++++++++++++++++++
 1 file changed, 505 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index bdfd943c88..bb179b16c8 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -265,6 +265,7 @@ test_expect_failure '1d-check: Directory renames cause a rename/rename(2to1) con
 '
 
 # Testcase 1e, Renamed directory, with all filenames being renamed too
+#   (Related to testcases 9f & 9g)
 #   Commit A: z/{oldb,oldc}
 #   Commit B: y/{newb,newc}
 #   Commit C: z/{oldb,oldc,d}
@@ -2054,4 +2055,508 @@ test_expect_failure '8e-check: Both sides rename, one side adds to original dire
 	test_i18ngrep CONFLICT.*rename/rename.*z/b.*y/b.*w/b out
 '
 
+###########################################################################
+# SECTION 9: Other testcases
+#
+# I came up with the testcases in the first eight sections before coding up
+# the implementation.  The testcases in this section were mostly ones I
+# thought of while coding/debugging, and which I was too lazy to insert
+# into the previous sections because I didn't want to re-label with all the
+# testcase references.  :-)
+###########################################################################
+
+# Testcase 9a, Inner renamed directory within outer renamed directory
+#   (Related to testcase 1f)
+#   Commit A: z/{b,c,d/{e,f,g}}
+#   Commit B: y/{b,c}, x/w/{e,f,g}
+#   Commit C: z/{b,c,d/{e,f,g,h},i}
+#   Expected: y/{b,c,i}, x/w/{e,f,g,h}
+#   NOTE: The only reason this one is interesting is because when a directory
+#         is split into multiple other directories, we pick the target
+#         directory that had the most paths going to it.  A naive implementation
+#         of that could take the new file in commit C at z/i to x/w/i or x/i.
+
+test_expect_success '9a-setup: Inner renamed directory within outer renamed directory' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir -p z/d &&
+	echo b >z/b &&
+	echo c >z/c &&
+	echo e >z/d/e &&
+	echo f >z/d/f &&
+	echo g >z/d/g &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	mkdir x &&
+	git mv z/d x/w &&
+	git mv z y &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo h >z/d/h &&
+	echo i >z/i &&
+	git add z &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '9a-check: Inner renamed directory within outer renamed directory' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 7 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:y/i) = $(git rev-parse C:z/i) &&
+
+	test $(git rev-parse HEAD:x/w/e) = $(git rev-parse A:z/d/e) &&
+	test $(git rev-parse HEAD:x/w/f) = $(git rev-parse A:z/d/f) &&
+	test $(git rev-parse HEAD:x/w/g) = $(git rev-parse A:z/d/g) &&
+	test $(git rev-parse HEAD:x/w/h) = $(git rev-parse C:z/d/h)
+'
+
+# Testcase 9b, Transitive rename with content merge
+#   (Related to testcase 1c)
+#   Commit A: z/{b,c},   x/d_1
+#   Commit B: y/{b,c},   x/d_2
+#   Commit C: z/{b,c,d_3}
+#   Expected: y/{b,c,d_merged}
+
+test_expect_success '9b-setup: Transitive rename with content merge' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir x &&
+	test_seq 1 10 >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	test_seq 1 11 >x/d &&
+	git add x/d &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	test_seq 0 10 >x/d &&
+	git mv x/d z/d &&
+	git add z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '9b-check: Transitive rename with content merge' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test_must_fail git rev-parse HEAD:x/d &&
+	test_must_fail git rev-parse HEAD:z/d &&
+	test ! -f z/d &&
+
+	test $(git rev-parse HEAD:y/d) != $(git rev-parse A:x/d) &&
+	test $(git rev-parse HEAD:y/d) != $(git rev-parse B:x/d) &&
+	test $(git rev-parse HEAD:y/d) != $(git rev-parse C:z/d) &&
+	test_seq 0 11 >expected &&
+	git add expected &&
+	test $(git rev-parse HEAD:y/d) = $(git rev-parse :0:expected) &&
+	test_cmp expected y/d
+'
+
+# Testcase 9c, Doubly transitive rename?
+#   (Related to testcases 1c, 7e, and 9d)
+#   Commit A: z/{b,c},     x/{d,e},    w/f
+#   Commit B: y/{b,c},     x/{d,e,f,g}
+#   Commit C: z/{b,c,d,e},             w/f
+#   Expected: y/{b,c,d,e}, x/{f,g}
+#
+#   NOTE: x/f and x/g may be slightly confusing here.  The rename from w/f to
+#         x/f is clear.  Let's look beyond that.  Here's the logic:
+#            Commit C renamed x/ -> z/
+#            Commit B renamed z/ -> y/
+#         So, we could possibly further rename x/f to z/f to y/f, a doubly
+#         transitive rename.  However, where does it end?  We can chain these
+#         indefinitely (see testcase 9d).  What if there is a D/F conflict
+#         at z/f/ or y/f/?  Or just another file conflict at one of those
+#         paths?  In the case of an N-long chain of transitive renamings,
+#         where do we "abort" the rename at?  Can the user make sense of
+#         the resulting conflict and resolve it?
+#
+#         To avoid this confusion I use the simple rule that if the other side
+#         of history did a directory rename to a path that your side renamed
+#         away, then ignore that particular rename from the other side of
+#         history for any implicit directory renames.
+
+test_expect_success '9c-setup: Doubly transitive rename?' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	mkdir x &&
+	echo d >x/d &&
+	echo e >x/e &&
+	mkdir w &&
+	echo f >w/f &&
+	git add z x w &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z y &&
+	git mv w/f x/ &&
+	echo g >x/g &&
+	git add x/g &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/d z/d &&
+	git mv x/e z/e &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '9c-check: Doubly transitive rename?' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 >out &&
+	test_i18ngrep "WARNING: Avoiding applying x -> z rename to x/f" out &&
+
+	test 6 -eq $(git ls-files -s | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse HEAD:y/d) = $(git rev-parse A:x/d) &&
+	test $(git rev-parse HEAD:y/e) = $(git rev-parse A:x/e) &&
+	test $(git rev-parse HEAD:x/f) = $(git rev-parse A:w/f) &&
+	test $(git rev-parse HEAD:x/g) = $(git rev-parse B:x/g)
+'
+
+# Testcase 9d, N-fold transitive rename?
+#   (Related to testcase 9c...and 1c and 7e)
+#   Commit A: z/a, y/b, x/c, w/d, v/e, u/f
+#   Commit B:  y/{a,b},  w/{c,d},  u/{e,f}
+#   Commit C: z/{a,t}, x/{b,c}, v/{d,e}, u/f
+#   Expected: <see NOTE first>
+#
+#   NOTE: z/ -> y/ (in commit B)
+#         y/ -> x/ (in commit C)
+#         x/ -> w/ (in commit B)
+#         w/ -> v/ (in commit C)
+#         v/ -> u/ (in commit B)
+#         So, if we add a file to z, say z/t, where should it end up?  In u?
+#         What if there's another file or directory named 't' in one of the
+#         intervening directories and/or in u itself?  Also, shouldn't the
+#         same logic that places 't' in u/ also move ALL other files to u/?
+#         What if there are file or directory conflicts in any of them?  If
+#         we attempted to do N-way (N-fold? N-ary? N-uple?) transitive renames
+#         like this, would the user have any hope of understanding any
+#         conflicts or how their working tree ended up?  I think not, so I'm
+#         ruling out N-ary transitive renames for N>1.
+#
+#   Therefore our expected result is:
+#     z/t, y/a, x/b, w/c, u/d, u/e, u/f
+#   The reason that v/d DOES get transitively renamed to u/d is that u/ isn't
+#   renamed somewhere.  A slightly sub-optimal result, but it uses fairly
+#   simple rules that are consistent with what we need for all the other
+#   testcases and simplifies things for the user.
+
+test_expect_success '9d-setup: N-way transitive rename?' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z y x w v u &&
+	echo a >z/a &&
+	echo b >y/b &&
+	echo c >x/c &&
+	echo d >w/d &&
+	echo e >v/e &&
+	echo f >u/f &&
+	git add z y x w v u &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z/a y/ &&
+	git mv x/c w/ &&
+	git mv v/e u/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo t >z/t &&
+	git mv y/b x/ &&
+	git mv w/d v/ &&
+	git add z/t &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '9d-check: N-way transitive rename?' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 >out &&
+	test_i18ngrep "WARNING: Avoiding applying z -> y rename to z/t" out &&
+	test_i18ngrep "WARNING: Avoiding applying y -> x rename to y/a" out &&
+	test_i18ngrep "WARNING: Avoiding applying x -> w rename to x/b" out &&
+	test_i18ngrep "WARNING: Avoiding applying w -> v rename to w/c" out &&
+
+	test 7 -eq $(git ls-files -s | wc -l) &&
+	test 1 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD:z/t) = $(git rev-parse C:z/t) &&
+	test $(git rev-parse HEAD:y/a) = $(git rev-parse A:z/a) &&
+	test $(git rev-parse HEAD:x/b) = $(git rev-parse A:y/b) &&
+	test $(git rev-parse HEAD:w/c) = $(git rev-parse A:x/c) &&
+	test $(git rev-parse HEAD:u/d) = $(git rev-parse A:w/d) &&
+	test $(git rev-parse HEAD:u/e) = $(git rev-parse A:v/e) &&
+	test $(git rev-parse HEAD:u/f) = $(git rev-parse B:u/f)
+'
+
+# Testcase 9e, N-to-1 whammo
+#   (Related to testcase 9c...and 1c and 7e)
+#   Commit A: dir1/{a,b}, dir2/{d,e}, dir3/{g,h}, dirN/{j,k}
+#   Commit B: dir1/{a,b,c,yo}, dir2/{d,e,f,yo}, dir3/{g,h,i,yo}, dirN/{j,k,l,yo}
+#   Commit C: combined/{a,b,d,e,g,h,j,k}
+#   Expected: combined/{a,b,c,d,e,f,g,h,i,j,k,l}, CONFLICT(Nto1) warnings,
+#             dir1/yo, dir2/yo, dir3/yo, dirN/yo
+
+test_expect_success '9e-setup: N-to-1 whammo' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir dir1 dir2 dir3 dirN &&
+	echo a >dir1/a &&
+	echo b >dir1/b &&
+	echo d >dir2/d &&
+	echo e >dir2/e &&
+	echo g >dir3/g &&
+	echo h >dir3/h &&
+	echo j >dirN/j &&
+	echo k >dirN/k &&
+	git add dir* &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	echo c  >dir1/c &&
+	echo yo >dir1/yo &&
+	echo f  >dir2/f &&
+	echo yo >dir2/yo &&
+	echo i  >dir3/i &&
+	echo yo >dir3/yo &&
+	echo l  >dirN/l &&
+	echo yo >dirN/yo &&
+	git add dir* &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv dir1 combined &&
+	git mv dir2/* combined/ &&
+	git mv dir3/* combined/ &&
+	git mv dirN/* combined/ &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '9e-check: N-to-1 whammo' '
+	git checkout B^0 &&
+
+	test_must_fail git merge -s recursive C^0 >out &&
+	test_i18ngrep "CONFLICT (implicit dir rename): Cannot map more than one path to combined/yo" out >error_line &&
+	grep -q dir1/yo error_line &&
+	grep -q dir2/yo error_line &&
+	grep -q dir3/yo error_line &&
+	grep -q dirN/yo error_line &&
+
+	test 16 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 2 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:combined/a) = $(git rev-parse A:dir1/a) &&
+	test $(git rev-parse :0:combined/b) = $(git rev-parse A:dir1/b) &&
+	test $(git rev-parse :0:combined/c) = $(git rev-parse B:dir1/c) &&
+	test $(git rev-parse :0:combined/d) = $(git rev-parse A:dir2/d) &&
+	test $(git rev-parse :0:combined/e) = $(git rev-parse A:dir2/e) &&
+	test $(git rev-parse :0:combined/f) = $(git rev-parse B:dir2/f) &&
+	test $(git rev-parse :0:combined/g) = $(git rev-parse A:dir3/g) &&
+	test $(git rev-parse :0:combined/h) = $(git rev-parse A:dir3/h) &&
+	test $(git rev-parse :0:combined/i) = $(git rev-parse B:dir3/i) &&
+	test $(git rev-parse :0:combined/j) = $(git rev-parse A:dirN/j) &&
+	test $(git rev-parse :0:combined/k) = $(git rev-parse A:dirN/k) &&
+	test $(git rev-parse :0:combined/l) = $(git rev-parse B:dirN/l) &&
+
+	test $(git rev-parse :0:dir1/yo) = $(git rev-parse B:dir1/yo) &&
+	test $(git rev-parse :0:dir2/yo) = $(git rev-parse B:dir2/yo) &&
+	test $(git rev-parse :0:dir3/yo) = $(git rev-parse B:dir3/yo) &&
+	test $(git rev-parse :0:dirN/yo) = $(git rev-parse B:dirN/yo)
+'
+
+# Testcase 9f, Renamed directory that only contained immediate subdirs
+#   (Related to testcases 1e & 9g)
+#   Commit A: goal/{a,b}/$more_files
+#   Commit B: priority/{a,b}/$more_files
+#   Commit C: goal/{a,b}/$more_files, goal/c
+#   Expected: priority/{a,b}/$more_files, priority/c
+
+test_expect_success '9f-setup: Renamed directory that only contained immediate subdirs' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir -p goal/a &&
+	mkdir -p goal/b &&
+	echo foo >goal/a/foo &&
+	echo bar >goal/b/bar &&
+	echo baz >goal/b/baz &&
+	git add goal &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv goal/ priority &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo c >goal/c &&
+	git add goal/c &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '9f-check: Renamed directory that only contained immediate subdirs' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 4 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:priority/a/foo) = $(git rev-parse A:goal/a/foo) &&
+	test $(git rev-parse HEAD:priority/b/bar) = $(git rev-parse A:goal/b/bar) &&
+	test $(git rev-parse HEAD:priority/b/baz) = $(git rev-parse A:goal/b/baz) &&
+	test $(git rev-parse HEAD:priority/c)     = $(git rev-parse C:goal/c) &&
+	test_must_fail git rev-parse HEAD:goal/c
+'
+
+# Testcase 9g, Renamed directory that only contained immediate subdirs, immediate subdirs renamed
+#   (Related to testcases 1e & 9f)
+#   Commit A: goal/{a,b}/$more_files
+#   Commit B: priority/{alpha,beta}/$more_files
+#   Commit C: goal/{a,b}/$more_files, goal/c
+#   Expected: priority/{alpha,beta}/$more_files, priority/c
+
+test_expect_success '9g-setup: Renamed directory that only contained immediate subdirs, immediate subdirs renamed' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir -p goal/a &&
+	mkdir -p goal/b &&
+	echo foo >goal/a/foo &&
+	echo bar >goal/b/bar &&
+	echo baz >goal/b/baz &&
+	git add goal &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	mkdir priority &&
+	git mv goal/a/ priority/alpha &&
+	git mv goal/b/ priority/beta &&
+	rmdir goal/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo c >goal/c &&
+	git add goal/c &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '9g-check: Renamed directory that only contained immediate subdirs, immediate subdirs renamed' '
+	git checkout B^0 &&
+
+	git merge -s recursive C^0 &&
+
+	test 4 -eq $(git ls-files -s | wc -l) &&
+
+	test $(git rev-parse HEAD:priority/alpha/foo) = $(git rev-parse A:goal/a/foo) &&
+	test $(git rev-parse HEAD:priority/beta/bar) = $(git rev-parse A:goal/b/bar) &&
+	test $(git rev-parse HEAD:priority/beta/baz) = $(git rev-parse A:goal/b/baz) &&
+	test $(git rev-parse HEAD:priority/c)     = $(git rev-parse C:goal/c) &&
+	test_must_fail git rev-parse HEAD:goal/c
+'
+
+###########################################################################
+# Rules suggested by section 9:
+#
+#   If the other side of history did a directory rename to a path that your
+#   side renamed away, then ignore that particular rename from the other
+#   side of history for any implicit directory renames.
+###########################################################################
+
 test_done
-- 
2.15.0.5.g9567be9905
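
The rule that section 9 settles on can be modelled in a few lines of plain
shell.  This is only a toy sketch, not how merge-recursive implements it:
the helper names (lookup_rename, map_path) and the "old:new" list encoding
are made up for illustration, and it only looks at a path's immediate
parent directory.

```shell
# lookup_rename MAP DIR
#   MAP is a space-separated list of "old:new" directory renames.
#   Prints the new name of DIR and returns 0 if DIR was renamed in MAP;
#   prints nothing and returns 1 otherwise.
lookup_rename () {
	for pair in $1
	do
		if test "${pair%%:*}" = "$2"
		then
			echo "${pair#*:}"
			return 0
		fi
	done
	return 1
}

# map_path PATH OTHER_RENAMES OWN_RENAMES
#   PATH was added or moved into its directory by one side of history.
#   OTHER_RENAMES are the directory renames done by the other side,
#   OWN_RENAMES those done by the side that placed PATH.
#   The other side's rename is applied only if the side that placed PATH
#   did not itself rename the target directory away.
map_path () {
	dir=${1%/*}
	base=${1##*/}
	if new_dir=$(lookup_rename "$2" "$dir") &&
	   ! lookup_rename "$3" "$new_dir" >/dev/null
	then
		echo "$new_dir/$base"
	else
		echo "$1"
	fi
}

# Testcase 9d: B renamed z->y, x->w, v->u; C renamed y->x, w->v.
B_renames="z:y x:w v:u"
C_renames="y:x w:v"
map_path z/t "$B_renames" "$C_renames"	# z/t added by C; prints z/t
map_path v/d "$B_renames" "$C_renames"	# w/d moved to v/ by C; prints u/d
map_path y/a "$C_renames" "$B_renames"	# z/a moved to y/ by B; prints y/a
```

Because the check looks only one step ahead, every chain stops after a
single hop, which is why v/d still lands in u/ while z/t stays put.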



* [PATCH 13/30] directory rename detection: tests for handling overwriting untracked files
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (11 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 12/30] directory rename detection: miscellaneous testcases to complete coverage Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [PATCH 14/30] directory rename detection: tests for handling overwriting dirty files Elijah Newren
                   ` (17 subsequent siblings)
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
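
Testcase 10a in the diff below can also be tried outside the test harness.
The following is a rough standalone sketch of its setup and check, assuming
a git binary on PATH; the function name repro_10a, the throwaway identity
and the use of mktemp are illustrative choices, not part of the patch.

```shell
# repro_10a: rebuild the 10a history in a scratch repository, drop
# untracked files at z/c and z/d, and attempt the merge.  Prints
# "refused" or "merged", followed by the contents of the two untracked
# files.  The scratch directory is left behind for inspection.
repro_10a () {
	(
		cd "$(mktemp -d)" &&
		git init -q &&
		git config user.name "A U Thor" &&
		git config user.email author@example.com &&

		# Commit A: z/{b,c}
		mkdir z &&
		echo b >z/b &&
		echo c >z/c &&
		git add z &&
		git commit -q -m A &&
		git branch B &&
		git branch C &&

		# Commit B deletes z/c; commit C renames z/c to z/d.
		git checkout -q B &&
		git rm -q z/c &&
		git commit -q -m B &&
		git checkout -q C &&
		git mv z/c z/d &&
		git commit -q -m C &&

		# Untracked files sitting on both paths involved in the rename.
		git checkout -q B^0 &&
		echo very >z/c &&
		echo important >z/d &&

		if git merge -s recursive C^0 >out 2>err
		then
			echo merged
		else
			echo refused
		fi &&
		cat z/c z/d
	)
}
```

The 10a check expects the merge to be refused and both untracked files to
survive untouched, so calling repro_10a should report "refused" followed by
the two file contents; exact error wording may vary between git versions.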
 t/t6043-merge-rename-directories.sh | 314 ++++++++++++++++++++++++++++++++++++
 1 file changed, 314 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index bb179b16c8..7af8962512 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2559,4 +2559,318 @@ test_expect_failure '9g-check: Renamed directory that only contained immediate s
 #   side of history for any implicit directory renames.
 ###########################################################################
 
+###########################################################################
+# SECTION 10: Handling untracked files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling, at least in the case of directory renames.
+###########################################################################
+
+# Testcase 10a, Overwrite untracked: normal rename/delete
+#   Commit A: z/{b,c_1}
+#   Commit B: z/b + untracked z/c + untracked z/d
+#   Commit C: z/{b,d_1}
+#   Expected: Aborted Merge +
+#       ERROR_MSG(untracked working tree files would be overwritten by merge)
+
+test_expect_success '10a-setup: Overwrite untracked with normal rename/delete' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git rm z/c &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv z/c z/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '10a-check: Overwrite untracked with normal rename/delete' '
+	git checkout B^0 &&
+	echo very >z/c &&
+	echo important >z/d &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "The following untracked working tree files would be overwritten by merge" err &&
+
+	test 1 -eq $(git ls-files -s | wc -l) &&
+	test 4 -eq $(git ls-files -o | wc -l) &&
+
+	test "very" = "$(cat z/c)" &&
+	test "important" = "$(cat z/d)" &&
+	test $(git rev-parse HEAD:z/b) = $(git rev-parse A:z/b)
+'
+
+# Testcase 10b, Overwrite untracked: dir rename + delete
+#   Commit A: z/{b,c_1}
+#   Commit B: y/b + untracked y/{c,d,e}
+#   Commit C: z/{b,d_1,e}
+#   Expected: Failed Merge; y/b + untracked y/c + untracked y/d on disk +
+#             z/c_1 -> z/d_1 rename recorded at stage 3 for y/d +
+#       ERROR_MSG(refusing to lose untracked file at 'y/d')
+
+test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo b >z/b &&
+	echo c >z/c &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git rm z/c &&
+	git mv z/ y/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv z/c z/d &&
+	echo e >z/e &&
+	git add z/e &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+	git checkout B^0 &&
+	echo very >y/c &&
+	echo important >y/d &&
+	echo contents >y/e &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "CONFLICT (rename/delete).*Version C^0 of y/d left in tree at y/d~C^0" out &&
+	test_i18ngrep "Error: Refusing to lose untracked file at y/e; writing to y/e~C^0 instead" out &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 2 -eq $(git ls-files -u | wc -l) &&
+	test 5 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test "very" = "$(cat y/c)" &&
+
+	test "important" = "$(cat y/d)" &&
+	test "important" != "$(git rev-parse :3:y/d)" &&
+	test $(git rev-parse :3:y/d) = $(git rev-parse A:z/c) &&
+
+	test "contents" = "$(cat y/e)" &&
+	test "contents" != "$(git rev-parse :3:y/e)" &&
+	test $(git rev-parse :3:y/e) = $(git rev-parse C:z/e)
+'
+
+# Testcase 10c, Overwrite untracked: dir rename/rename(1to2)
+#   Commit A: z/{a,b}, x/{c,d}
+#   Commit B: y/{a,b}, w/c, x/d + different untracked y/c
+#   Commit C: z/{a,b,c}, x/d
+#   Expected: Failed Merge; y/{a,b} + x/d + untracked y/c +
+#             CONFLICT(rename/rename) x/c -> w/c vs y/c +
+#             y/c~C^0 +
+#             ERROR_MSG(Refusing to lose untracked file at y/c)
+
+test_expect_success '10c-setup: Overwrite untracked with dir rename/rename(1to2)' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z x &&
+	echo a >z/a &&
+	echo b >z/b &&
+	echo c >x/c &&
+	echo d >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	mkdir w &&
+	git mv x/c w/c &&
+	git mv z/ y/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/c z/ &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '10c-check: Overwrite untracked with dir rename/rename(1to2)' '
+	git checkout B^0 &&
+	echo important >y/c &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "CONFLICT (rename/rename)" out &&
+	test_i18ngrep "Refusing to lose untracked file at y/c; adding as y/c~C^0 instead" out &&
+
+	test 6 -eq $(git ls-files -s | wc -l) &&
+	test 3 -eq $(git ls-files -u | wc -l) &&
+	test 3 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/a) = $(git rev-parse A:z/a) &&
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:x/d) = $(git rev-parse A:x/d) &&
+
+	test "important" = "$(cat y/c)" &&
+	test "important" != "$(git rev-parse :3:y/c)" &&
+	test $(git rev-parse :1:x/c) = $(git rev-parse A:x/c) &&
+	test $(git rev-parse :2:w/c) = $(git rev-parse A:x/c) &&
+	test $(git rev-parse :3:y/c) = $(git rev-parse A:x/c) &&
+	test $(git hash-object y/c~C^0) = $(git rev-parse A:x/c)
+'
+
+# Testcase 10d, Delete untracked w/ dir rename/rename(2to1)
+#   Commit A: z/{a,b,c_1},        x/{d,e,f_2}
+#   Commit B: y/{a,b},            x/{d,e,f_2,wham_1} + untracked y/wham
+#   Commit C: z/{a,b,c_1,wham_2}, y/{d,e}
+#   Expected: Failed Merge; y/{a,b,d,e} + untracked y/{wham,wham~C^0,wham~HEAD}+
+#             CONFLICT(rename/rename) z/c_1 vs x/f_2 -> y/wham
+#             ERROR_MSG(Refusing to lose untracked file at y/wham)
+
+test_expect_success '10d-setup: Delete untracked with dir rename/rename(2to1)' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z x &&
+	echo a >z/a &&
+	echo b >z/b &&
+	echo c >z/c &&
+	echo d >x/d &&
+	echo e >x/e &&
+	echo f >x/f &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z/c x/wham &&
+	git mv z/ y/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/f z/wham &&
+	git mv x/ y/ &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '10d-check: Delete untracked with dir rename/rename(2to1)' '
+	git checkout B^0 &&
+	echo important >y/wham &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "CONFLICT (rename/rename)" out &&
+	test_i18ngrep "Refusing to lose untracked file at y/wham" out &&
+
+	test 6 -eq $(git ls-files -s | wc -l) &&
+	test 2 -eq $(git ls-files -u | wc -l) &&
+	test 4 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/a) = $(git rev-parse A:z/a) &&
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/d) = $(git rev-parse A:x/d) &&
+	test $(git rev-parse :0:y/e) = $(git rev-parse A:x/e) &&
+
+	test_must_fail git rev-parse :1:y/wham &&
+	test $(git rev-parse :2:y/wham) = $(git rev-parse A:z/c) &&
+	test $(git rev-parse :3:y/wham) = $(git rev-parse A:x/f) &&
+
+	test "important" = "$(cat y/wham)" &&
+	test $(git hash-object y/wham~C^0)  = $(git rev-parse A:x/f) &&
+	test $(git hash-object y/wham~HEAD) = $(git rev-parse A:z/c)
+'
+
+# Testcase 10e, Does git complain about untracked file that's not in the way?
+#   Commit A: z/{a,b}
+#   Commit B: y/{a,b} + untracked z/c
+#   Commit C: z/{a,b,c}
+#   Expected: y/{a,b,c} + untracked z/c
+
+test_expect_success '10e-setup: Does git complain about untracked file that is not really in the way?' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo a >z/a &&
+	echo b >z/b &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z/ y/ &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo c >z/c &&
+	git add z/c &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '10e-check: Does git complain about untracked file that is not really in the way?' '
+	git checkout B^0 &&
+	mkdir z &&
+	echo random >z/c &&
+
+	git merge -s recursive C^0 >out 2>err &&
+	! test_i18ngrep "following untracked working tree files would be overwritten by merge" err &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 3 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:y/a) = $(git rev-parse A:z/a) &&
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse C:z/c) &&
+
+	test "random" = "$(cat z/c)"
+'
+
 test_done
-- 
2.15.0.5.g9567be9905



* [PATCH 14/30] directory rename detection: tests for handling overwriting dirty files
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (12 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 13/30] directory rename detection: tests for handling overwriting untracked files Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [PATCH 15/30] merge-recursive: Move the get_renames() function Elijah Newren
                   ` (16 subsequent siblings)
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 401 ++++++++++++++++++++++++++++++++++++
 1 file changed, 401 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 7af8962512..4066b08767 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2873,4 +2873,405 @@ test_expect_failure '10e-check: Does git complain about untracked file that is n
 	test "random" = "$(cat z/c)"
 '
 
+###########################################################################
+# SECTION 11: Handling dirty (not up-to-date) files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling.  This was true even of normal renames, but there are additional
+# codepaths that need special handling with directory renames.  Add
+# testcases for both renamed-by-directory-rename-detection and standard
+# rename cases.
+###########################################################################
+
+# Testcase 11a, Avoid losing dirty contents with simple rename
+#   Commit A: z/{a,b_v1}
+#   Commit B: z/{a,c_v1}, and z/c_v1 has uncommitted mods
+#   Commit C: z/{a,b_v2}
+#   Expected: ERROR_MSG(Refusing to lose dirty file at z/c) +
+#             z/a, staged version of z/c has sha1sum matching C:z/b_v2,
+#             z/c~HEAD with contents of C:z/b_v2,
+#             z/c with uncommitted mods on top of B:z/c_v1
+
+test_expect_success '11a-setup: Avoid losing dirty contents with simple rename' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z &&
+	echo a >z/a &&
+	test_seq 1 10 >z/b &&
+	git add z &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z/b z/c &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	echo 11 >>z/b &&
+	git add z/b &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '11a-check: Avoid losing dirty contents with simple rename' '
+	git checkout B^0 &&
+	echo stuff >>z/c &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+	test_seq 1 10 >expected &&
+	echo stuff >>expected &&
+
+	test 2 -eq $(git ls-files -s | wc -l) &&
+	test 1 -eq $(git ls-files -u | wc -l) &&
+	test 4 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:z/a) = $(git rev-parse A:z/a) &&
+	test $(git rev-parse :2:z/c) = $(git rev-parse C:z/b) &&
+
+	test "$(git hash-object z/c~HEAD)" = $(git rev-parse C:z/b) &&
+	test_cmp expected z/c
+'
+
+# Testcase 11b, Avoid losing dirty file involved in directory rename
+#   Commit A: z/a,         x/{b,c_v1}
+#   Commit B: z/{a,c_v1},  x/b,       and z/c_v1 has uncommitted mods
+#   Commit C: y/a,         x/{b,c_v2}
+#   Expected: y/{a,c_v2}, x/b, z/c_v1 with uncommitted mods untracked,
+#             ERROR_MSG(Refusing to lose dirty file at z/c)
+
+
+test_expect_success '11b-setup: Avoid losing dirty file involved in directory rename' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z x &&
+	echo a >z/a &&
+	echo b >x/b &&
+	test_seq 1 10 >x/c &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv x/c z/c &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv z y &&
+	echo 11 >>x/c &&
+	git add x/c &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '11b-check: Avoid losing dirty file involved in directory rename' '
+	git checkout B^0 &&
+	echo stuff >>z/c &&
+
+	git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+	grep -q stuff */* &&
+	test_seq 1 10 >expected &&
+	echo stuff >>expected &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 0 -eq $(git ls-files -m | wc -l) &&
+	test 4 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:x/b) = $(git rev-parse A:x/b) &&
+	test $(git rev-parse :0:y/a) = $(git rev-parse A:z/a) &&
+	test $(git rev-parse :0:y/c) = $(git rev-parse C:x/c) &&
+
+	test "$(git hash-object y/c)" = $(git rev-parse C:x/c) &&
+	test_cmp expected z/c
+'
+
+# Testcase 11c, Avoid losing not-up-to-date with rename + D/F conflict
+#   Commit A: y/a,         x/{b,c_v1}
+#   Commit B: y/{a,c_v1},  x/b,       and y/c_v1 has uncommitted mods
+#   Commit C: y/{a,c/d},   x/{b,c_v2}
+#   Expected: Abort_msg("following files would be overwritten by merge") +
+#             y/c left untouched (still has uncommitted mods)
+
+test_expect_success '11c-setup: Avoid losing not-uptodate with rename + D/F conflict' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir y x &&
+	echo a >y/a &&
+	echo b >x/b &&
+	test_seq 1 10 >x/c &&
+	git add y x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv x/c y/c &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	mkdir y/c &&
+	echo d >y/c/d &&
+	echo 11 >>x/c &&
+	git add x/c y/c/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_success '11c-check: Avoid losing not-uptodate with rename + D/F conflict' '
+	git checkout B^0 &&
+	echo stuff >>y/c &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "following files would be overwritten by merge" err &&
+
+	grep -q stuff */* &&
+	test_seq 1 10 >expected &&
+	echo stuff >>expected &&
+
+	test 3 -eq $(git ls-files -s | wc -l) &&
+	test 0 -eq $(git ls-files -u | wc -l) &&
+	test 1 -eq $(git ls-files -m | wc -l) &&
+	test 3 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse HEAD) = $(git rev-parse B) &&
+	test_cmp expected y/c
+'
+
+# Testcase 11d, Avoid losing not-up-to-date with rename + D/F conflict
+#   Commit A: z/a,         x/{b,c_v1}
+#   Commit B: z/{a,c_v1},  x/b,       and z/c_v1 has uncommitted mods
+#   Commit C: y/{a,c/d},   x/{b,c_v2}
+#   Expected: D/F conflict (y/c_v2 vs y/c/d) +
+#             Warning_Msg("Refusing to lose dirty file at z/c") +
+#             y/{a,c~HEAD,c/d}, x/b, now-untracked z/c_v1 with uncommitted mods
+
+test_expect_success '11d-setup: Avoid losing not-uptodate with rename + D/F conflict' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z x &&
+	echo a >z/a &&
+	echo b >x/b &&
+	test_seq 1 10 >x/c &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv x/c z/c &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv z y &&
+	mkdir y/c &&
+	echo d >y/c/d &&
+	echo 11 >>x/c &&
+	git add x/c y/c/d &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '11d-check: Avoid losing not-uptodate with rename + D/F conflict' '
+	git checkout B^0 &&
+	echo stuff >>z/c &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+	grep -q stuff */* &&
+	test_seq 1 10 >expected &&
+	echo stuff >>expected &&
+
+	test 4 -eq $(git ls-files -s | wc -l) &&
+	test 1 -eq $(git ls-files -u | wc -l) &&
+	test 5 -eq $(git ls-files -o | wc -l) &&
+
+	test $(git rev-parse :0:x/b) = $(git rev-parse A:x/b) &&
+	test $(git rev-parse :0:y/a) = $(git rev-parse A:z/a) &&
+	test $(git rev-parse :0:y/c/d) = $(git rev-parse C:y/c/d) &&
+	test $(git rev-parse :3:y/c) = $(git rev-parse C:x/c) &&
+
+	test "$(git hash-object y/c~HEAD)" = $(git rev-parse C:x/c) &&
+	test_cmp expected z/c
+'
+
+# Testcase 11e, Avoid deleting not-up-to-date with dir rename/rename(1to2)/add
+#   Commit A: z/{a,b},      x/{c_1,d}
+#   Commit B: y/{a,b,c_2},  x/d, w/c_1, and y/c_2 has uncommitted mods
+#   Commit C: z/{a,b,c_1},  x/d
+#   Expected: Failed Merge; y/{a,b} + x/d +
+#             CONFLICT(rename/rename) x/c_1 -> w/c_1 vs y/c_1 +
+#             ERROR_MSG(Refusing to lose dirty file at y/c)
+#             y/c~C^0 has A:x/c_1 contents
+#             y/c~HEAD has B:y/c_2 contents
+#             y/c has dirty file from before merge
+
+test_expect_success '11e-setup: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z x &&
+	echo a >z/a &&
+	echo b >z/b &&
+	echo c >x/c &&
+	echo d >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z/ y/ &&
+	echo different >y/c &&
+	mkdir w &&
+	git mv x/c w/ &&
+	git add y/c &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/c z/ &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '11e-check: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
+	git checkout B^0 &&
+	echo mods >>y/c &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "CONFLICT (rename/rename)" out &&
+	test_i18ngrep "Refusing to lose dirty file at y/c" out &&
+
+	test 7 -eq $(git ls-files -s | wc -l) &&
+	test 4 -eq $(git ls-files -u | wc -l) &&
+	test 4 -eq $(git ls-files -o | wc -l) &&
+
+	echo different >expected &&
+	echo mods >>expected &&
+
+	test $(git rev-parse :0:y/a) = $(git rev-parse A:z/a) &&
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+	test $(git rev-parse :0:x/d) = $(git rev-parse A:x/d) &&
+
+	test $(git rev-parse :1:x/c) = $(git rev-parse A:x/c) &&
+	test $(git rev-parse :2:w/c) = $(git rev-parse A:x/c) &&
+	test $(git rev-parse :2:y/c) = $(git rev-parse B:y/c) &&
+	test $(git rev-parse :3:y/c) = $(git rev-parse A:x/c) &&
+
+	test "$(git hash-object y/c~C^0)" = $(git rev-parse A:x/c) &&
+	test "$(git hash-object y/c~HEAD)" = $(git rev-parse B:y/c) &&
+	test_cmp expected y/c
+'
+
+# Testcase 11f, Avoid deleting not-up-to-date w/ dir rename/rename(2to1)
+#   Commit A: z/{a,b},        x/{c_1,d_2}
+#   Commit B: y/{a,b,wham_1}, x/d_2, except y/wham has uncommitted mods
+#   Commit C: z/{a,b,wham_2}, x/c_1
+#   Expected: Failed Merge; y/{a,b} + untracked y/{wham~C^0,wham~HEAD} +
+#             y/wham with dirty changes from before merge +
+#             CONFLICT(rename/rename) x/c vs x/d -> y/wham
+#             ERROR_MSG(Refusing to lose dirty file at y/wham)
+
+test_expect_success '11f-setup: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
+	git rm -rf . &&
+	git clean -fdqx &&
+	rm -rf .git &&
+	git init &&
+
+	mkdir z x &&
+	echo a >z/a &&
+	echo b >z/b &&
+	test_seq 1 10 >x/c &&
+	echo d >x/d &&
+	git add z x &&
+	test_tick &&
+	git commit -m "A" &&
+
+	git branch A &&
+	git branch B &&
+	git branch C &&
+
+	git checkout B &&
+	git mv z/ y/ &&
+	git mv x/c y/wham &&
+	test_tick &&
+	git commit -m "B" &&
+
+	git checkout C &&
+	git mv x/d z/wham &&
+	test_tick &&
+	git commit -m "C"
+'
+
+test_expect_failure '11f-check: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
+	git checkout B^0 &&
+	echo important >>y/wham &&
+
+	test_must_fail git merge -s recursive C^0 >out 2>err &&
+	test_i18ngrep "CONFLICT (rename/rename)" out &&
+	test_i18ngrep "Refusing to lose dirty file at y/wham" out &&
+
+	test 4 -eq $(git ls-files -s | wc -l) &&
+	test 2 -eq $(git ls-files -u | wc -l) &&
+	test 4 -eq $(git ls-files -o | wc -l) &&
+
+	test_seq 1 10 >expected &&
+	echo important >>expected &&
+
+	test $(git rev-parse :0:y/a) = $(git rev-parse A:z/a) &&
+	test $(git rev-parse :0:y/b) = $(git rev-parse A:z/b) &&
+
+	test_must_fail git rev-parse :1:y/wham &&
+	test $(git rev-parse :2:y/wham) = $(git rev-parse A:x/c) &&
+	test $(git rev-parse :3:y/wham) = $(git rev-parse A:x/d) &&
+
+	test_cmp expected y/wham &&
+	test $(git hash-object y/wham~C^0)  = $(git rev-parse A:x/d) &&
+	test $(git hash-object y/wham~HEAD) = $(git rev-parse A:x/c)
+'
+
 test_done
-- 
2.15.0.5.g9567be9905


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH 15/30] merge-recursive: Move the get_renames() function
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (13 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 14/30] directory rename detection: tests for handling overwriting dirty files Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14  4:46   ` Junio C Hamano
  2017-11-10 19:05 ` [PATCH 16/30] merge-recursive: Introduce new functions to handle rename logic Elijah Newren
                   ` (15 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

I want to re-use some other functions in the file without moving those other
functions or dealing with a handful of annoying split function declarations
and definitions.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 138 +++++++++++++++++++++++++++---------------------------
 1 file changed, 69 insertions(+), 69 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 3526c8d0b8..382016508b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -540,75 +540,6 @@ struct rename {
 	unsigned processed:1;
 };
 
-/*
- * Get information of all renames which occurred between 'o_tree' and
- * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
- * 'b_tree') to be able to associate the correct cache entries with
- * the rename information. 'tree' is always equal to either a_tree or b_tree.
- */
-static struct string_list *get_renames(struct merge_options *o,
-				       struct tree *tree,
-				       struct tree *o_tree,
-				       struct tree *a_tree,
-				       struct tree *b_tree,
-				       struct string_list *entries)
-{
-	int i;
-	struct string_list *renames;
-	struct diff_options opts;
-
-	renames = xcalloc(1, sizeof(struct string_list));
-	if (!o->detect_rename)
-		return renames;
-
-	diff_setup(&opts);
-	DIFF_OPT_SET(&opts, RECURSIVE);
-	DIFF_OPT_CLR(&opts, RENAME_EMPTY);
-	opts.detect_rename = DIFF_DETECT_RENAME;
-	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
-			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
-			    1000;
-	opts.rename_score = o->rename_score;
-	opts.show_rename_progress = o->show_rename_progress;
-	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_setup_done(&opts);
-	diff_tree_oid(&o_tree->object.oid, &tree->object.oid, "", &opts);
-	diffcore_std(&opts);
-	if (opts.needed_rename_limit > o->needed_rename_limit)
-		o->needed_rename_limit = opts.needed_rename_limit;
-	for (i = 0; i < diff_queued_diff.nr; ++i) {
-		struct string_list_item *item;
-		struct rename *re;
-		struct diff_filepair *pair = diff_queued_diff.queue[i];
-		if (pair->status != 'R') {
-			diff_free_filepair(pair);
-			continue;
-		}
-		re = xmalloc(sizeof(*re));
-		re->processed = 0;
-		re->pair = pair;
-		item = string_list_lookup(entries, re->pair->one->path);
-		if (!item)
-			re->src_entry = insert_stage_data(re->pair->one->path,
-					o_tree, a_tree, b_tree, entries);
-		else
-			re->src_entry = item->util;
-
-		item = string_list_lookup(entries, re->pair->two->path);
-		if (!item)
-			re->dst_entry = insert_stage_data(re->pair->two->path,
-					o_tree, a_tree, b_tree, entries);
-		else
-			re->dst_entry = item->util;
-		item = string_list_insert(renames, pair->one->path);
-		item->util = re;
-	}
-	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_queued_diff.nr = 0;
-	diff_flush(&opts);
-	return renames;
-}
-
 static int update_stages(struct merge_options *opt, const char *path,
 			 const struct diff_filespec *o,
 			 const struct diff_filespec *a,
@@ -1383,6 +1314,75 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 	return ret;
 }
 
+/*
+ * Get information of all renames which occurred between 'o_tree' and
+ * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
+ * 'b_tree') to be able to associate the correct cache entries with
+ * the rename information. 'tree' is always equal to either a_tree or b_tree.
+ */
+static struct string_list *get_renames(struct merge_options *o,
+				       struct tree *tree,
+				       struct tree *o_tree,
+				       struct tree *a_tree,
+				       struct tree *b_tree,
+				       struct string_list *entries)
+{
+	int i;
+	struct string_list *renames;
+	struct diff_options opts;
+
+	renames = xcalloc(1, sizeof(struct string_list));
+	if (!o->detect_rename)
+		return renames;
+
+	diff_setup(&opts);
+	DIFF_OPT_SET(&opts, RECURSIVE);
+	DIFF_OPT_CLR(&opts, RENAME_EMPTY);
+	opts.detect_rename = DIFF_DETECT_RENAME;
+	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
+			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
+			    1000;
+	opts.rename_score = o->rename_score;
+	opts.show_rename_progress = o->show_rename_progress;
+	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_setup_done(&opts);
+	diff_tree_oid(&o_tree->object.oid, &tree->object.oid, "", &opts);
+	diffcore_std(&opts);
+	if (opts.needed_rename_limit > o->needed_rename_limit)
+		o->needed_rename_limit = opts.needed_rename_limit;
+	for (i = 0; i < diff_queued_diff.nr; ++i) {
+		struct string_list_item *item;
+		struct rename *re;
+		struct diff_filepair *pair = diff_queued_diff.queue[i];
+		if (pair->status != 'R') {
+			diff_free_filepair(pair);
+			continue;
+		}
+		re = xmalloc(sizeof(*re));
+		re->processed = 0;
+		re->pair = pair;
+		item = string_list_lookup(entries, re->pair->one->path);
+		if (!item)
+			re->src_entry = insert_stage_data(re->pair->one->path,
+					o_tree, a_tree, b_tree, entries);
+		else
+			re->src_entry = item->util;
+
+		item = string_list_lookup(entries, re->pair->two->path);
+		if (!item)
+			re->dst_entry = insert_stage_data(re->pair->two->path,
+					o_tree, a_tree, b_tree, entries);
+		else
+			re->dst_entry = item->util;
+		item = string_list_insert(renames, pair->one->path);
+		item->util = re;
+	}
+	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_queued_diff.nr = 0;
+	diff_flush(&opts);
+	return renames;
+}
+
 static int process_renames(struct merge_options *o,
 			   struct string_list *a_renames,
 			   struct string_list *b_renames)
-- 
2.15.0.5.g9567be9905


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH 16/30] merge-recursive: Introduce new functions to handle rename logic
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (14 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 15/30] merge-recursive: Move the get_renames() function Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14  4:56   ` Junio C Hamano
  2017-11-10 19:05 ` [PATCH 17/30] merge-recursive: Fix leaks of allocated renames and diff_filepairs Elijah Newren
                   ` (14 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

The amount of logic in merge_trees() related to renames was just a few
lines, but split it out into new handle_renames() and cleanup_renames()
functions to prepare for additional logic to be added to each.  No code
or logic changes, just a new place to put stuff for when the rename
detection gains additional checks.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 48 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 382016508b..49710c0964 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1638,6 +1638,38 @@ static int process_renames(struct merge_options *o,
 	return clean_merge;
 }
 
+struct rename_info {
+	struct string_list *head_renames;
+	struct string_list *merge_renames;
+};
+
+static struct rename_info *handle_renames(struct merge_options *o,
+					  struct tree *common,
+					  struct tree *head,
+					  struct tree *merge,
+					  struct string_list *entries,
+					  int *clean)
+{
+	struct rename_info *rei = xcalloc(1, sizeof(struct rename_info));
+
+	rei->head_renames  = get_renames(o, head, common, head, merge, entries);
+	rei->merge_renames = get_renames(o, merge, common, head, merge, entries);
+	*clean = process_renames(o, rei->head_renames, rei->merge_renames);
+
+	return rei;
+}
+
+static void cleanup_renames(struct rename_info *re_info)
+{
+	string_list_clear(re_info->head_renames, 0);
+	string_list_clear(re_info->merge_renames, 0);
+
+	free(re_info->head_renames);
+	free(re_info->merge_renames);
+
+	free(re_info);
+}
+
 static struct object_id *stage_oid(const struct object_id *oid, unsigned mode)
 {
 	return (is_null_oid(oid) || mode == 0) ? NULL: (struct object_id *)oid;
@@ -1989,7 +2021,8 @@ int merge_trees(struct merge_options *o,
 	}
 
 	if (unmerged_cache()) {
-		struct string_list *entries, *re_head, *re_merge;
+		struct string_list *entries;
+		struct rename_info *re_info;
 		int i;
 		/*
 		 * Only need the hashmap while processing entries, so
@@ -2003,9 +2036,7 @@ int merge_trees(struct merge_options *o,
 		get_files_dirs(o, merge);
 
 		entries = get_unmerged();
-		re_head  = get_renames(o, head, common, head, merge, entries);
-		re_merge = get_renames(o, merge, common, head, merge, entries);
-		clean = process_renames(o, re_head, re_merge);
+		re_info = handle_renames(o, common, head, merge, entries, &clean);
 		record_df_conflict_files(o, entries);
 		if (clean < 0)
 			goto cleanup;
@@ -2030,16 +2061,13 @@ int merge_trees(struct merge_options *o,
 		}
 
 cleanup:
-		string_list_clear(re_merge, 0);
-		string_list_clear(re_head, 0);
+		cleanup_renames(re_info);
+
 		string_list_clear(entries, 1);
+		free(entries);
 
 		hashmap_free(&o->current_file_dir_set, 1);
 
-		free(re_merge);
-		free(re_head);
-		free(entries);
-
 		if (clean < 0)
 			return clean;
 	}
-- 
2.15.0.5.g9567be9905


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH 17/30] merge-recursive: Fix leaks of allocated renames and diff_filepairs
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (15 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 16/30] merge-recursive: Introduce new functions to handle rename logic Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14  4:58   ` Junio C Hamano
  2017-11-10 19:05 ` [PATCH 18/30] merge-recursive: Make !o->detect_rename codepath more obvious Elijah Newren
                   ` (13 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

get_renames() has always zeroed out diff_queued_diff.nr while only
manually freeing diff_filepairs that did not correspond to renames.
Further, it allocated struct renames that were tucked away in the
returned string_list.  Make sure all of these are deallocated when we
are done with them.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 49710c0964..7a3402e50c 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1661,10 +1661,21 @@ static struct rename_info *handle_renames(struct merge_options *o,
 
 static void cleanup_renames(struct rename_info *re_info)
 {
-	string_list_clear(re_info->head_renames, 0);
-	string_list_clear(re_info->merge_renames, 0);
+	const struct rename *re;
+	int i;
 
+	for (i = 0; i < re_info->head_renames->nr; i++) {
+		re = re_info->head_renames->items[i].util;
+		diff_free_filepair(re->pair);
+	}
+	string_list_clear(re_info->head_renames, 1);
 	free(re_info->head_renames);
+
+	for (i = 0; i < re_info->merge_renames->nr; i++) {
+		re = re_info->merge_renames->items[i].util;
+		diff_free_filepair(re->pair);
+	}
+	string_list_clear(re_info->merge_renames, 1);
 	free(re_info->merge_renames);
 
 	free(re_info);
-- 
2.15.0.5.g9567be9905


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH 18/30] merge-recursive: Make !o->detect_rename codepath more obvious
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (16 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 17/30] merge-recursive: Fix leaks of allocated renames and diff_filepairs Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [PATCH 19/30] merge-recursive: Split out code for determining diff_filepairs Elijah Newren
                   ` (12 subsequent siblings)
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Previously, if !o->detect_rename then get_renames() would return an
empty string_list, and then process_renames() would have nothing to
iterate over.  It seems more straightforward to simply avoid calling
either function in that case.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 7a3402e50c..f40c70990c 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1332,8 +1332,6 @@ static struct string_list *get_renames(struct merge_options *o,
 	struct diff_options opts;
 
 	renames = xcalloc(1, sizeof(struct string_list));
-	if (!o->detect_rename)
-		return renames;
 
 	diff_setup(&opts);
 	DIFF_OPT_SET(&opts, RECURSIVE);
@@ -1652,6 +1650,10 @@ static struct rename_info *handle_renames(struct merge_options *o,
 {
 	struct rename_info *rei = xcalloc(1, sizeof(struct rename_info));
 
+	*clean = 1;
+	if (!o->detect_rename)
+		return NULL;
+
 	rei->head_renames  = get_renames(o, head, common, head, merge, entries);
 	rei->merge_renames = get_renames(o, merge, common, head, merge, entries);
 	*clean = process_renames(o, rei->head_renames, rei->merge_renames);
@@ -1664,6 +1666,9 @@ static void cleanup_renames(struct rename_info *re_info)
 	const struct rename *re;
 	int i;
 
+	if (!re_info)
+		return;
+
 	for (i = 0; i < re_info->head_renames->nr; i++) {
 		re = re_info->head_renames->items[i].util;
 		diff_free_filepair(re->pair);
-- 
2.15.0.5.g9567be9905


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [PATCH 19/30] merge-recursive: Split out code for determining diff_filepairs
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (17 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 18/30] merge-recursive: Make !o->detect_rename codepath more obvious Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14  5:20   ` Junio C Hamano
  2017-11-10 19:05 ` [PATCH 20/30] merge-recursive: Add a new hashmap for storing directory renames Elijah Newren
                   ` (11 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Create a new function, get_diffpairs(), to compute the diff_filepairs
between two trees.  While these are currently only used in
get_renames(), I want them to be available to some new functions.  No
actual logic changes yet.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 81 ++++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 60 insertions(+), 21 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index f40c70990c..8c9543d85c 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1315,24 +1315,15 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 }
 
 /*
- * Get information of all renames which occurred between 'o_tree' and
- * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
- * 'b_tree') to be able to associate the correct cache entries with
- * the rename information. 'tree' is always equal to either a_tree or b_tree.
+ * Get the diff_filepairs changed between o_tree and tree.
  */
-static struct string_list *get_renames(struct merge_options *o,
-				       struct tree *tree,
-				       struct tree *o_tree,
-				       struct tree *a_tree,
-				       struct tree *b_tree,
-				       struct string_list *entries)
+static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
+					       struct tree *o_tree,
+					       struct tree *tree)
 {
-	int i;
-	struct string_list *renames;
+	struct diff_queue_struct *ret;
 	struct diff_options opts;
 
-	renames = xcalloc(1, sizeof(struct string_list));
-
 	diff_setup(&opts);
 	DIFF_OPT_SET(&opts, RECURSIVE);
 	DIFF_OPT_CLR(&opts, RENAME_EMPTY);
@@ -1348,10 +1339,43 @@ static struct string_list *get_renames(struct merge_options *o,
 	diffcore_std(&opts);
 	if (opts.needed_rename_limit > o->needed_rename_limit)
 		o->needed_rename_limit = opts.needed_rename_limit;
-	for (i = 0; i < diff_queued_diff.nr; ++i) {
+
+	ret = xmalloc(sizeof(struct diff_queue_struct));
+	ret->queue = diff_queued_diff.queue;
+	ret->nr = diff_queued_diff.nr;
+	/* Ignore diff_queued_diff.alloc; we won't be changing the size at all */
+
+	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_queued_diff.nr = 0;
+	diff_queued_diff.queue = NULL;
+	diff_flush(&opts);
+	return ret;
+}
+
+/*
+ * Get information of all renames which occurred in 'pairs', making use of
+ * any implicit directory renames inferred from the other side of history.
+ * We need the three trees in the merge ('o_tree', 'a_tree' and 'b_tree')
+ * to be able to associate the correct cache entries with the rename
+ * information; tree is always equal to either a_tree or b_tree.
+ */
+static struct string_list *get_renames(struct merge_options *o,
+				       struct diff_queue_struct *pairs,
+				       struct tree *tree,
+				       struct tree *o_tree,
+				       struct tree *a_tree,
+				       struct tree *b_tree,
+				       struct string_list *entries)
+{
+	int i;
+	struct string_list *renames;
+
+	renames = xcalloc(1, sizeof(struct string_list));
+
+	for (i = 0; i < pairs->nr; ++i) {
 		struct string_list_item *item;
 		struct rename *re;
-		struct diff_filepair *pair = diff_queued_diff.queue[i];
+		struct diff_filepair *pair = pairs->queue[i];
 		if (pair->status != 'R') {
 			diff_free_filepair(pair);
 			continue;
@@ -1375,9 +1399,6 @@ static struct string_list *get_renames(struct merge_options *o,
 		item = string_list_insert(renames, pair->one->path);
 		item->util = re;
 	}
-	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_queued_diff.nr = 0;
-	diff_flush(&opts);
 	return renames;
 }
 
@@ -1649,15 +1670,33 @@ static struct rename_info *handle_renames(struct merge_options *o,
 					  int *clean)
 {
 	struct rename_info *rei = xcalloc(1, sizeof(struct rename_info));
+	struct diff_queue_struct *head_pairs, *merge_pairs;
 
 	*clean = 1;
 	if (!o->detect_rename)
 		return NULL;
 
-	rei->head_renames  = get_renames(o, head, common, head, merge, entries);
-	rei->merge_renames = get_renames(o, merge, common, head, merge, entries);
+	head_pairs = get_diffpairs(o, common, head);
+	merge_pairs = get_diffpairs(o, common, merge);
+
+	rei->head_renames  = get_renames(o, head_pairs, head,
+					 common, head, merge, entries);
+	rei->merge_renames = get_renames(o, merge_pairs, merge,
+					 common, head, merge, entries);
 	*clean = process_renames(o, rei->head_renames, rei->merge_renames);
 
+cleanup:
+	/*
+	 * Some cleanup is deferred until cleanup_renames() because the
+	 * data structures are still needed and referenced in
+	 * process_entry().  But there are a few things we can free now.
+	 */
+
+	free(head_pairs->queue);
+	free(head_pairs);
+	free(merge_pairs->queue);
+	free(merge_pairs);
+
 	return rei;
 }
 
-- 
2.15.0.5.g9567be9905


^ permalink raw reply related	[flat|nested] 81+ messages in thread
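
For readers following along, the hand-off that get_diffpairs() performs on the global diff_queued_diff (take over the array, then empty the global so a later flush no longer sees the pairs) can be sketched in isolation.  This is not code from the series: `struct queue`, `pending` and `steal_queue()` are made-up stand-ins for struct diff_queue_struct, diff_queued_diff and the tail of get_diffpairs(), and plain malloc() is used only to keep the sketch standalone.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct diff_queue_struct. */
struct queue {
	void **items;
	int nr;
};

/* Hypothetical stand-in for the global diff_queued_diff. */
struct queue pending;

/*
 * Move the contents of the global queue into a freshly allocated
 * struct and leave the global empty.  The caller owns the result and
 * must free both ret->items and ret.  Returns NULL if allocation fails.
 */
struct queue *steal_queue(void)
{
	struct queue *ret = malloc(sizeof(*ret));

	if (!ret)
		return NULL;
	ret->items = pending.items;
	ret->nr = pending.nr;

	/* The global no longer owns the array. */
	pending.items = NULL;
	pending.nr = 0;
	return ret;
}
```

After the call the array has exactly one owner, which is why handle_renames() in the patch above can free head_pairs->queue and head_pairs itself once the renames have been processed.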

* [PATCH 20/30] merge-recursive: Add a new hashmap for storing directory renames
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (18 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 19/30] merge-recursive: Split out code for determining diff_filepairs Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [PATCH 21/30] merge-recursive: Add get_directory_renames() Elijah Newren
                   ` (10 subsequent siblings)
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

This just adds dir_rename_entry and the associated functions; code using
these will be added in subsequent commits.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 24 ++++++++++++++++++++++++
 merge-recursive.h |  8 ++++++++
 2 files changed, 32 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 8c9543d85c..89a9b32635 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -49,6 +49,30 @@ static unsigned int path_hash(const char *path)
 	return ignore_case ? strihash(path) : strhash(path);
 }
 
+static struct dir_rename_entry *dir_rename_find_entry(struct hashmap *hashmap, char *dir)
+{
+	struct dir_rename_entry key;
+
+	if (dir == NULL)
+		return NULL;
+	hashmap_entry_init(&key, strhash(dir));
+	key.dir = dir;
+	return hashmap_get(hashmap, &key, NULL);
+}
+
+static int dir_rename_cmp(void *unused_cmp_data,
+			  const struct dir_rename_entry *e1,
+			  const struct dir_rename_entry *e2,
+			  const void *unused_keydata)
+{
+	return strcmp(e1->dir, e2->dir);
+}
+
+static void dir_rename_init(struct hashmap *map)
+{
+	hashmap_init(map, (hashmap_cmp_fn) dir_rename_cmp, NULL, 0);
+}
+
 static void flush_output(struct merge_options *o)
 {
 	if (o->buffer_output < 2 && o->obuf.len) {
diff --git a/merge-recursive.h b/merge-recursive.h
index 80d69d1401..a024949739 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -29,6 +29,14 @@ struct merge_options {
 	struct string_list df_conflict_file_set;
 };
 
+struct dir_rename_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	char *dir;
+	unsigned non_unique_new_dir:1;
+	char *new_dir;
+	struct string_list possible_new_dirs;
+};
+
 /* merge_trees() but with recursive ancestor consolidation */
 int merge_recursive(struct merge_options *o,
 		    struct commit *h1,
-- 
2.15.0.5.g9567be9905


^ permalink raw reply related	[flat|nested] 81+ messages in thread
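
The non_unique_new_dir, new_dir and possible_new_dirs fields of struct dir_rename_entry are filled in by PATCH 21/30, which counts, per old directory, how many files moved to each candidate directory and keeps the candidate with the strictly highest count.  A standalone sketch of that selection rule follows; `struct candidate` and `pick_new_dir()` are made-up names for illustration, and counts are assumed to be positive.

```c
#include <stddef.h>

/* Hypothetical stand-in for one possible_new_dirs item and its count. */
struct candidate {
	const char *new_dir;
	int count;
};

/*
 * Return the candidate directory with the strictly highest count, or
 * NULL if nr is 0 or if two or more candidates tie for the highest
 * count.
 */
const char *pick_new_dir(const struct candidate *cands, size_t nr)
{
	int max = 0;
	int bad_max = 0;
	const char *best = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (cands[i].count == max)
			bad_max = max;	/* remember a tie at this level */
		else if (cands[i].count > max) {
			max = cands[i].count;
			best = cands[i].new_dir;
		}
	}
	return bad_max == max ? NULL : best;
}
```

A tie only matters if it is at the final maximum: a tie between two candidates at count 1 is harmless once a third candidate reaches count 2, which is why bad_max is compared with max only after the loop.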

* [PATCH 21/30] merge-recursive: Add get_directory_renames()
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (19 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 20/30] merge-recursive: Add a new hashmap for storing directory renames Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-14  5:30   ` Junio C Hamano
  2017-11-10 19:05 ` [PATCH 22/30] merge-recursive: Check for directory level conflicts Elijah Newren
                   ` (9 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

This populates a list of directory renames for us.  The list of
directory renames is not yet used, but will be in subsequent commits.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 146 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 146 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 89a9b32635..b5770d3d7f 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1376,6 +1376,124 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
 	return ret;
 }
 
+static void get_renamed_dir_portion(const char *old_path, const char *new_path,
+				    char **old_dir, char **new_dir) {
+	*old_dir = NULL;
+	*new_dir = NULL;
+
+	/* For
+	 *    "a/b/c/d/foo.c" -> "a/b/something-else/d/foo.c"
+	 * the "d/foo.c" part is the same, we just want to know that
+	 *    "a/b/c" was renamed to "a/b/something-else"
+	 * so, for this example, this function returns "a/b/c" in
+	 * *old_dir and "a/b/something-else" in *new_dir.
+	 *
+	 * Also, if the basename of the file changed, we don't care.  We
+	 * want to know which portion of the directory, if any, changed.
+	 */
+	char *end_of_old = strrchr(old_path, '/');
+	char *end_of_new = strrchr(new_path, '/');
+	if (end_of_old == NULL || end_of_new == NULL)
+		return;
+	while (*--end_of_new == *--end_of_old &&
+	       end_of_old != old_path &&
+	       end_of_new != new_path)
+		; /* Do nothing; all in the while loop */
+	/*
+	 * We've found the first non-matching character in the directory
+	 * paths.  That means the current directory we were comparing
+	 * represents the rename.  Move end_of_old and end_of_new back
+	 * to the full directory name.
+	 */
+	if (*end_of_old == '/')
+		end_of_old++;
+	if (*end_of_old != '/')
+		end_of_new++;
+	end_of_old = strchr(end_of_old, '/');
+	end_of_new = strchr(end_of_new, '/');
+
+	/*
+	 * It may have been the case that old_path and new_path were the same
+	 * directory all along.  Don't claim a rename if they're the same.
+	 */
+	int old_len = end_of_old - old_path;
+	int new_len = end_of_new - new_path;
+
+	if (old_len != new_len || strncmp(old_path, new_path, old_len)) {
+		*old_dir = xstrndup(old_path, old_len);
+		*new_dir = xstrndup(new_path, new_len);
+	}
+}
+
+static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
+					     struct tree *tree) {
+	struct hashmap *dir_renames;
+	struct hashmap_iter iter;
+	struct dir_rename_entry *entry;
+	int i;
+
+	dir_renames = xmalloc(sizeof(struct hashmap));
+	dir_rename_init(dir_renames);
+	for (i = 0; i < pairs->nr; ++i) {
+		struct string_list_item *item;
+		int *count;
+		struct diff_filepair *pair = pairs->queue[i];
+
+		char *old_dir, *new_dir;
+		get_renamed_dir_portion(pair->one->path, pair->two->path,
+					&old_dir,        &new_dir);
+		if (!old_dir)
+			/* Directory didn't change at all; ignore this one. */
+			continue;
+
+		entry = dir_rename_find_entry(dir_renames, old_dir);
+		if (!entry) {
+			entry = xcalloc(1, sizeof(struct dir_rename_entry));
+			hashmap_entry_init(entry, strhash(old_dir));
+			hashmap_put(dir_renames, entry);
+			entry->dir = old_dir;
+		} else {
+			free(old_dir);
+		}
+		item = string_list_lookup(&entry->possible_new_dirs, new_dir);
+		if (!item) {
+			item = string_list_insert(&entry->possible_new_dirs, new_dir);
+			item->util = xcalloc(1, sizeof(int));
+		} else {
+			free(new_dir);
+		}
+		count = item->util;
+		*count += 1;
+	}
+
+	hashmap_iter_init(dir_renames, &iter);
+	while ((entry = hashmap_iter_next(&iter))) {
+		int max = 0;
+		int bad_max = 0;
+		char *best = NULL;
+		for (i = 0; i < entry->possible_new_dirs.nr; i++) {
+			int *count = entry->possible_new_dirs.items[i].util;
+			if (*count == max)
+				bad_max = max;
+			else if (*count > max) {
+				max = *count;
+				best = entry->possible_new_dirs.items[i].string;
+			}
+		}
+		if (bad_max == max)
+			entry->non_unique_new_dir = 1;
+		else
+			entry->new_dir = xstrdup(best);
+		/* Strings were strndup'ed before inserting into string-list,
+		 * so ask string_list to remove the entries for us.
+		 */
+		entry->possible_new_dirs.strdup_strings = 1;
+		string_list_clear(&entry->possible_new_dirs, 1);
+	}
+
+	return dir_renames;
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1695,6 +1813,9 @@ static struct rename_info *handle_renames(struct merge_options *o,
 {
 	struct rename_info *rei = xcalloc(1, sizeof(struct rename_info));
 	struct diff_queue_struct *head_pairs, *merge_pairs;
+	struct hashmap *dir_re_head, *dir_re_merge;
+	struct hashmap_iter iter;
+	struct dir_rename_entry *e;
 
 	*clean = 1;
 	if (!o->detect_rename)
@@ -1703,6 +1824,9 @@ static struct rename_info *handle_renames(struct merge_options *o,
 	head_pairs = get_diffpairs(o, common, head);
 	merge_pairs = get_diffpairs(o, common, merge);
 
+	dir_re_head = get_directory_renames(head_pairs, head);
+	dir_re_merge = get_directory_renames(merge_pairs, merge);
+
 	rei->head_renames  = get_renames(o, head_pairs, head,
 					 common, head, merge, entries);
 	rei->merge_renames = get_renames(o, merge_pairs, merge,
@@ -1716,6 +1840,28 @@ static struct rename_info *handle_renames(struct merge_options *o,
 	 * process_entry().  But there are a few things we can free now.
 	 */
 
+	hashmap_iter_init(dir_re_head, &iter);
+	while ((e = hashmap_iter_next(&iter))) {
+		free(e->dir);
+		if (e->new_dir)
+			free(e->new_dir);
+		/* possible_new_dirs already cleared in get_directory_renames */
+		//string_list_clear(&e->possible_new_dirs, 1);
+	}
+	hashmap_free(dir_re_head, 1);
+	free(dir_re_head);
+
+	hashmap_iter_init(dir_re_merge, &iter);
+	while ((e = hashmap_iter_next(&iter))) {
+		free(e->dir);
+		if (e->new_dir)
+			free(e->new_dir);
+		/* possible_new_dirs already cleared in get_directory_renames */
+		//string_list_clear(&e->possible_new_dirs, 1);
+	}
+	hashmap_free(dir_re_merge, 1);
+	free(dir_re_merge);
+
 	free(head_pairs->queue);
 	free(head_pairs);
 	free(merge_pairs->queue);
-- 
2.15.0.5.g9567be9905



* [PATCH 22/30] merge-recursive: Check for directory level conflicts
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (20 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 21/30] merge-recursive: Add get_directory_renames() Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [PATCH 23/30] merge-recursive: Add a new hashmap for storing file collisions Elijah Newren
                   ` (8 subsequent siblings)
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
directory level.  There will be additional checks at the individual
file level too, which will be added later.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
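
As a rough illustration of the three numbered cases in the comment
above handle_directory_level_conflicts(), the decision for one side's
directory rename can be sketched as below.  This is a simplified
reading, not the code in this patch: the enum and function names are
made up for the example, non_unique_new_dir is ignored, and the other
side's target is assumed to be passed as NULL when that side has no
usable rename for the directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome labels; not names used by merge-recursive.c. */
enum dir_verdict {
	DIR_KEEP,          /* rename survives and may be applied to paths */
	DIR_SAME_ON_BOTH,  /* case 1: both sides renamed identically */
	DIR_PARTIAL,       /* case 2: old directory still exists on this side */
	DIR_RENAME_1TO2    /* case 3: the two sides disagree on the target */
};

/*
 * ours_new:   where this side renamed the directory to
 * theirs_new: where the other side renamed the same directory to, or NULL
 * old_dir_still_exists: nonzero if the old directory is still in our tree
 */
static enum dir_verdict classify_dir_rename(const char *ours_new,
					    const char *theirs_new,
					    int old_dir_still_exists)
{
	if (theirs_new && !strcmp(ours_new, theirs_new))
		return DIR_SAME_ON_BOTH;
	if (old_dir_still_exists)
		return DIR_PARTIAL;
	if (theirs_new)
		return DIR_RENAME_1TO2;
	return DIR_KEEP;
}
```

Only DIR_KEEP leaves the entry in the map; the other three verdicts
correspond to the entry being queued for removal.
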
 merge-recursive.c | 112 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 112 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index b5770d3d7f..3633be0123 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1376,6 +1376,15 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
 	return ret;
 }
 
+static int tree_has_path(struct tree *tree, const char *path)
+{
+	unsigned char hashy[20];
+	unsigned mode_o;
+
+	return !get_tree_entry(tree->object.oid.hash, path,
+			       hashy, &mode_o);
+}
+
 static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 				    char **old_dir, char **new_dir) {
 	*old_dir = NULL;
@@ -1425,6 +1434,105 @@ static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 	}
 }
 
+/*
+ * There are a couple things we want to do at the directory level:
+ *   1. Check for both sides renaming to the same thing, in order to avoid
+ *      implicit renaming of files that should be left in place.  (See
+ *      testcase 6b in t6043 for details.)
+ *   2. Prune directory renames if there are still files left in the
+ *      original directory.  These represent a partial directory rename,
+ *      i.e. a rename where only some of the files within the directory
+ *      were renamed elsewhere.  (Technically, this could be done earlier
+ *      in get_directory_renames(), except that would prevent us from
+ *      doing the previous check and thus failing testcase 6b.)
+ *   3. Check for rename/rename(1to2) conflicts (at the directory level).
+ *      In the future, we could potentially record this info as well and
+ *      omit reporting rename/rename(1to2) conflicts for each path within
+ *      the affected directories, thus cleaning up the merge output.
+ *   NOTE: We do NOT check for rename/rename(2to1) conflicts at the
+ *         directory level, because merging directories is fine.  If it
+ *         causes conflicts for files within those merged directories, then
+ *         that should be detected at the individual path level.
+ */
+static void handle_directory_level_conflicts(struct merge_options *o,
+					     struct hashmap *dir_re_head,
+					     struct tree *head,
+					     struct hashmap *dir_re_merge,
+					     struct tree *merge)
+{
+	struct hashmap_iter iter;
+	struct dir_rename_entry *head_ent;
+	struct dir_rename_entry *merge_ent;
+	int i;
+
+	struct string_list remove_from_head = STRING_LIST_INIT_NODUP;
+	struct string_list remove_from_merge = STRING_LIST_INIT_NODUP;
+
+	hashmap_iter_init(dir_re_head, &iter);
+	while ((head_ent = hashmap_iter_next(&iter))) {
+		merge_ent = dir_rename_find_entry(dir_re_merge, head_ent->dir);
+		if (merge_ent &&
+		    !head_ent->non_unique_new_dir &&
+		    !merge_ent->non_unique_new_dir &&
+		    !strcmp(head_ent->new_dir, merge_ent->new_dir)) {
+			/* 1. Renamed identically; remove it from both sides */
+			string_list_append(&remove_from_head,
+					   head_ent->dir)->util = head_ent;
+			free(head_ent->new_dir);
+			string_list_append(&remove_from_merge,
+					   merge_ent->dir)->util = merge_ent;
+			free(merge_ent->new_dir);
+		} else if (tree_has_path(head, head_ent->dir)) {
+			/* 2. This wasn't a directory rename after all */
+			string_list_append(&remove_from_head,
+					   head_ent->dir)->util = head_ent;
+			free(head_ent->new_dir);
+		}
+	}
+
+	hashmap_iter_init(dir_re_merge, &iter);
+	while ((merge_ent = hashmap_iter_next(&iter))) {
+		head_ent = dir_rename_find_entry(dir_re_head, merge_ent->dir);
+		if (tree_has_path(merge, merge_ent->dir)) {
+			/* 2. This wasn't a directory rename after all */
+			string_list_append(&remove_from_merge,
+					   merge_ent->dir)->util = merge_ent;
+		} else if (head_ent &&
+			   !head_ent->non_unique_new_dir &&
+			   !merge_ent->non_unique_new_dir) {
+			/* 3. rename/rename(1to2) */
+			/* We can assume it's not rename/rename(1to1) because
+			 * that was case (1), already checked above.  But
+			 * quickly test that assertion, just because.
+			 */
+			assert(strcmp(head_ent->new_dir, merge_ent->new_dir));
+			output(o, 1, _("CONFLICT (rename/rename): "
+				       "Rename directory %s->%s in %s. "
+				       "Rename directory %s->%s in %s"),
+			       head_ent->dir, head_ent->new_dir, o->branch1,
+			       head_ent->dir, merge_ent->new_dir, o->branch2);
+			string_list_append(&remove_from_head,
+					   head_ent->dir)->util = head_ent;
+			free(head_ent->new_dir);
+			string_list_append(&remove_from_merge,
+					   merge_ent->dir)->util = merge_ent;
+			free(merge_ent->new_dir);
+		}
+	}
+
+	for (i = 0; i < remove_from_head.nr; i++) {
+		head_ent = remove_from_head.items[i].util;
+		hashmap_remove(dir_re_head, head_ent, NULL);
+	}
+	for (i = 0; i < remove_from_merge.nr; i++) {
+		merge_ent = remove_from_merge.items[i].util;
+		hashmap_remove(dir_re_merge, merge_ent, NULL);
+	}
+
+	string_list_clear(&remove_from_head, 0);
+	string_list_clear(&remove_from_merge, 0);
+}
+
 static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
 					     struct tree *tree) {
 	struct hashmap *dir_renames;
@@ -1827,6 +1935,10 @@ static struct rename_info *handle_renames(struct merge_options *o,
 	dir_re_head = get_directory_renames(head_pairs, head);
 	dir_re_merge = get_directory_renames(merge_pairs, merge);
 
+	handle_directory_level_conflicts(o,
+					 dir_re_head, head,
+					 dir_re_merge, merge);
+
 	rei->head_renames  = get_renames(o, head_pairs, head,
 					 common, head, merge, entries);
 	rei->merge_renames = get_renames(o, merge_pairs, merge,
-- 
2.15.0.5.g9567be9905



* [PATCH 23/30] merge-recursive: Add a new hashmap for storing file collisions
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (21 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 22/30] merge-recursive: Check for directory level conflicts Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging Elijah Newren
                   ` (7 subsequent siblings)
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Directory renames with the ability to merge directories open up the
possibility of add/add/add/.../add conflicts, if each of the N
directories being merged into one target directory all had a file with
the same name.  We need a way to check for and report on such
collisions; this hashmap will be used for this purpose.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
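
To make the intended use of the collision map concrete, here is a
minimal standalone sketch of the same idea: group source paths by the
target path they would be renamed to, so that any target with more
than one source can be reported.  It deliberately uses a small
fixed-size table with linear lookup instead of git's hashmap and
string_list; the type names, function names, and size limits are all
assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_TARGETS 16   /* arbitrary limits for this sketch */
#define MAX_SOURCES 8

/* Hypothetical stand-in for struct collision_entry. */
struct collision {
	const char *target;               /* path after the implicit rename */
	const char *sources[MAX_SOURCES]; /* paths that map to target */
	int nr_sources;
};

struct collision_table {
	struct collision items[MAX_TARGETS];
	int nr;
};

/* Return the entry for target, or NULL if none has been recorded. */
static struct collision *collision_lookup(struct collision_table *t,
					  const char *target)
{
	int i;

	for (i = 0; i < t->nr; i++)
		if (!strcmp(t->items[i].target, target))
			return &t->items[i];
	return NULL;
}

/* Record that source would land on target; 0 on success, -1 if full. */
static int collision_record(struct collision_table *t,
			    const char *target, const char *source)
{
	struct collision *c = collision_lookup(t, target);

	if (!c) {
		if (t->nr == MAX_TARGETS)
			return -1;
		c = &t->items[t->nr++];
		c->target = target;
		c->nr_sources = 0;
	}
	if (c->nr_sources == MAX_SOURCES)
		return -1;
	c->sources[c->nr_sources++] = source;
	return 0;
}
```

Unlike string_list_insert(), this sketch neither sorts nor
de-duplicates the sources; it only shows the grouping.
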
 merge-recursive.c | 23 +++++++++++++++++++++++
 merge-recursive.h |  7 +++++++
 2 files changed, 30 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 3633be0123..1858686c35 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -73,6 +73,29 @@ static void dir_rename_init(struct hashmap *map)
 	hashmap_init(map, (hashmap_cmp_fn) dir_rename_cmp, NULL, 0);
 }
 
+static struct collision_entry *collision_find_entry(struct hashmap *hashmap,
+						    char *target_file)
+{
+	struct collision_entry key;
+
+	hashmap_entry_init(&key, strhash(target_file));
+	key.target_file = target_file;
+	return hashmap_get(hashmap, &key, NULL);
+}
+
+static int collision_cmp(void *unused_cmp_data,
+			 const struct collision_entry *e1,
+			 const struct collision_entry *e2,
+			 const void *unused_keydata)
+{
+	return strcmp(e1->target_file, e2->target_file);
+}
+
+static void collision_init(struct hashmap *map)
+{
+	hashmap_init(map, (hashmap_cmp_fn) collision_cmp, NULL, 0);
+}
+
 static void flush_output(struct merge_options *o)
 {
 	if (o->buffer_output < 2 && o->obuf.len) {
diff --git a/merge-recursive.h b/merge-recursive.h
index a024949739..e02c1e1243 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -37,6 +37,13 @@ struct dir_rename_entry {
 	struct string_list possible_new_dirs;
 };
 
+struct collision_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	char *target_file;
+	struct string_list source_files;
+	unsigned reported_already:1;
+};
+
 /* merge_trees() but with recursive ancestor consolidation */
 int merge_recursive(struct merge_options *o,
 		    struct commit *h1,
-- 
2.15.0.5.g9567be9905



* [PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (22 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 23/30] merge-recursive: Add a new hashmap for storing file collisions Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2018-06-10 10:56   ` René Scharfe
  2017-11-10 19:05 ` [PATCH 25/30] merge-recursive: Check for file level conflicts then get new name Elijah Newren
                   ` (6 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Directory renaming and merging can cause one or more files to be moved to
where an existing file is, or cause several files to all be moved to
the same (otherwise vacant) location.  Add checking and reporting for such
cases, falling back to no-directory-rename handling for such paths.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
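
The two building blocks used here, finding the renamed parent
directory of a path and swapping that directory prefix for the new
one, can be sketched in isolation as follows.  This is an
illustration under assumptions rather than the patch's code: the
function names are hypothetical, a plain array stands in for the
dir_renames hashmap, and the helpers return NULL or -1 where the
patch asserts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Replace the leading old_dir component of path with new_dir.
 * Returns a malloc'd string the caller must free, or NULL if path is
 * not inside old_dir or allocation fails.
 */
static char *replace_dir_prefix(const char *old_dir, const char *new_dir,
				const char *path)
{
	size_t oldlen = strlen(old_dir);
	size_t newlen = strlen(new_dir);
	size_t restlen;
	char *result;

	if (strncmp(old_dir, path, oldlen) != 0 || path[oldlen] != '/')
		return NULL;

	restlen = strlen(path + oldlen);       /* includes the leading '/' */
	result = malloc(newlen + restlen + 1); /* +1 for the trailing NUL */
	if (!result)
		return NULL;
	memcpy(result, new_dir, newlen);
	memcpy(result + newlen, path + oldlen, restlen + 1);
	return result;
}

/*
 * Return the index in dirs[] of the deepest parent directory of path,
 * or -1 if none of the nr entries is a parent of path.
 */
static int find_renamed_parent(const char *const *dirs, int nr,
			       const char *path)
{
	size_t len = strlen(path);
	int i;

	while (len > 0) {
		/* Trim back to the previous '/', i.e. the next parent up. */
		while (len > 0 && path[len - 1] != '/')
			len--;
		if (len == 0)
			break;
		len--; /* drop the '/' itself */
		for (i = 0; i < nr; i++)
			if (strlen(dirs[i]) == len &&
			    !strncmp(dirs[i], path, len))
				return i;
	}
	return -1;
}
```

Walking from the deepest parent upward means a rename recorded for
a/b takes precedence over one recorded for a when the path is
a/b/c/file.
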
 merge-recursive.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 116 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1858686c35..251c4cc7fa 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1408,6 +1408,30 @@ static int tree_has_path(struct tree *tree, const char *path)
 			       hashy, &mode_o);
 }
 
+/*
+ * Return a new string that replaces the beginning portion (which matches
+ * entry->dir), with entry->new_dir.  In perl-speak:
+ *   new_path_name = (old_path =~ s/entry->dir/entry->new_dir/);
+ */
+static char *apply_dir_rename(struct dir_rename_entry *entry,
+			       const char *old_path) {
+	char *new_path;
+	int oldlen, newlen;
+
+	if (entry->non_unique_new_dir)
+		return NULL;
+
+	oldlen = strlen(entry->dir);
+	assert(strncmp(entry->dir, old_path, oldlen) == 0 &&
+	       old_path[oldlen] == '/');
+	newlen = strlen(entry->new_dir) + (strlen(old_path) - oldlen) + 1;
+	new_path = malloc(newlen);
+	strcpy(new_path, entry->new_dir);
+	strcpy(&new_path[strlen(new_path)], &old_path[oldlen]);
+
+	return new_path;
+}
+
 static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 				    char **old_dir, char **new_dir) {
 	*old_dir = NULL;
@@ -1625,6 +1649,82 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
 	return dir_renames;
 }
 
+static struct dir_rename_entry *check_dir_renamed(const char *path,
+						  struct hashmap *dir_renames) {
+	char temp[PATH_MAX];
+	char *end;
+	struct dir_rename_entry *entry;
+
+	strcpy(temp, path);
+	while ((end = strrchr(temp, '/'))) {
+		*end = '\0';
+		entry = dir_rename_find_entry(dir_renames, temp);
+		if (entry)
+			return entry;
+	}
+	return NULL;
+}
+
+static void compute_collisions(struct hashmap *collisions,
+			       struct hashmap *dir_renames,
+			       struct diff_queue_struct *pairs)
+{
+	int i;
+
+	/*
+	 * Multiple files can be mapped to the same path due to directory
+	 * renames done by the other side of history.  Since that other
+	 * side of history could have merged multiple directories into one,
+	 * if our side of history added the same file basename to each of
+	 * those directories, then all N of them would get implicitly
+	 * renamed by the directory rename detection into the same path,
+	 * and we'd get an add/add/.../add conflict, and all those adds
+	 * from *this* side of history.  This is not representable in the
+	 * index, and users aren't going to easily be able to make sense of
+	 * it.  So we need to provide a good warning about what's
+	 * happening, and fall back to no-directory-rename detection
+	 * behavior for those paths.
+	 *
+	 * See testcases 9e and all of section 5 from t6043 for examples.
+	 */
+	collision_init(collisions);
+
+	for (i = 0; i < pairs->nr; ++i) {
+		struct dir_rename_entry *dir_rename_ent;
+		struct collision_entry *collision_ent;
+		char *new_path;
+		struct diff_filepair *pair = pairs->queue[i];
+
+		if (pair->status == 'D')
+			continue;
+		dir_rename_ent = check_dir_renamed(pair->two->path, dir_renames);
+		if (!dir_rename_ent)
+			continue;
+
+		new_path = apply_dir_rename(dir_rename_ent, pair->two->path);
+		if (!new_path)
+			/*
+			 * dir_rename_ent->non_unique_new_dir is true, which
+			 * means there is no directory rename for us to use,
+			 * which means it won't cause us any additional
+			 * collisions.
+			 */
+			continue;
+		collision_ent = collision_find_entry(collisions, new_path);
+		if (!collision_ent) {
+			collision_ent = xcalloc(1,
+						sizeof(struct collision_entry));
+			hashmap_entry_init(collision_ent, strhash(new_path));
+			hashmap_put(collisions, collision_ent);
+			collision_ent->target_file = new_path;
+		} else {
+			free(new_path);
+		}
+		string_list_insert(&collision_ent->source_files,
+				   pair->two->path);
+	}
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1634,6 +1734,7 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
  */
 static struct string_list *get_renames(struct merge_options *o,
 				       struct diff_queue_struct *pairs,
+				       struct hashmap *dir_renames,
 				       struct tree *tree,
 				       struct tree *o_tree,
 				       struct tree *a_tree,
@@ -1641,8 +1742,12 @@ static struct string_list *get_renames(struct merge_options *o,
 				       struct string_list *entries)
 {
 	int i;
+	struct hashmap collisions;
+	struct hashmap_iter iter;
+	struct collision_entry *e;
 	struct string_list *renames;
 
+	compute_collisions(&collisions, dir_renames, pairs);
 	renames = xcalloc(1, sizeof(struct string_list));
 
 	for (i = 0; i < pairs->nr; ++i) {
@@ -1672,6 +1777,13 @@ static struct string_list *get_renames(struct merge_options *o,
 		item = string_list_insert(renames, pair->one->path);
 		item->util = re;
 	}
+
+	hashmap_iter_init(&collisions, &iter);
+	while ((e = hashmap_iter_next(&iter))) {
+		free(e->target_file);
+		string_list_clear(&e->source_files, 0);
+	}
+	hashmap_free(&collisions, 1);
 	return renames;
 }
 
@@ -1962,9 +2074,11 @@ static struct rename_info *handle_renames(struct merge_options *o,
 					 dir_re_head, head,
 					 dir_re_merge, merge);
 
-	rei->head_renames  = get_renames(o, head_pairs, head,
+	rei->head_renames  = get_renames(o, head_pairs,
+					 dir_re_merge, head,
 					 common, head, merge, entries);
-	rei->merge_renames = get_renames(o, merge_pairs, merge,
+	rei->merge_renames = get_renames(o, merge_pairs,
+					 dir_re_head, merge,
 					 common, head, merge, entries);
 	*clean = process_renames(o, rei->head_renames, rei->merge_renames);
 
-- 
2.15.0.5.g9567be9905



* [PATCH 25/30] merge-recursive: Check for file level conflicts then get new name
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (23 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [PATCH 26/30] merge-recursive: When comparing files, don't include trees Elijah Newren
                   ` (5 subsequent siblings)
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
file level either.  If there aren't any, then get the new name from
any directory renames.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
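
The order of the per-path checks can be summarized with a small
standalone sketch.  The names are hypothetical and the inputs are
simplified to plain values (in the patch they come from the
dir_rename_exclusions hashmap, tree_has_path() and the collisions
hashmap), so treat this as a reading aid rather than the
implementation.

```c
#include <assert.h>
#include <string.h>

enum path_verdict {
	PATH_RENAME_OK,       /* safe to move the path to its new location */
	PATH_SKIP_EXCLUDED,   /* target dir was itself renamed on our side */
	PATH_TARGET_OCCUPIED, /* something already exists at the new path */
	PATH_MANY_SOURCES     /* several paths would land on the same path */
};

/* Return 1 if dir appears in dirs[0..nr-1], else 0. */
static int is_listed(const char *const *dirs, int nr, const char *dir)
{
	int i;

	for (i = 0; i < nr; i++)
		if (!strcmp(dirs[i], dir))
			return 1;
	return 0;
}

/*
 * new_dir:          directory the other side's rename would move us into
 * our_renamed_dirs: directories that our own side renamed away
 * target_exists:    nonzero if the new path already exists in our tree
 * nr_sources:       how many of our paths map to the same new path
 */
static enum path_verdict check_implicit_rename(const char *new_dir,
					       const char *const *our_renamed_dirs,
					       int nr_our_renamed_dirs,
					       int target_exists,
					       int nr_sources)
{
	if (is_listed(our_renamed_dirs, nr_our_renamed_dirs, new_dir))
		return PATH_SKIP_EXCLUDED;
	if (target_exists)
		return PATH_TARGET_OCCUPIED;
	if (nr_sources > 1)
		return PATH_MANY_SOURCES;
	return PATH_RENAME_OK;
}
```

In the dumbdir/smrtdir example from the comment in
check_for_directory_rename(), Side 1 renamed dumbdir itself, so an
implicit rename of otherdir/bfile into dumbdir is skipped on that
side.
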
 merge-recursive.c                   | 184 ++++++++++++++++++++++++++++++++++--
 t/t6043-merge-rename-directories.sh |   2 +-
 2 files changed, 176 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 251c4cc7fa..7c2c29bb51 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1481,6 +1481,102 @@ static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 	}
 }
 
+/*
+ * Write:
+ *   element1, element2, element3, ..., elementN
+ * to str.  If only one element, just write "element1" to str.
+ */
+static void comma_separated_list(char *str, struct string_list *slist) {
+	int i;
+	for (i=0; i<slist->nr; i++) {
+		str += sprintf(str, "%s", slist->items[i].string);
+		if (i < slist->nr-1)
+			str += sprintf(str, ", ");
+	}
+}
+
+/*
+ * See if there is a directory rename for path, and if there are any file
+ * level conflicts for the renamed location.  If there is a rename and
+ * there are no conflicts, return the new name.  Otherwise, return NULL.
+ */
+static char* handle_path_level_conflicts(struct merge_options *o,
+					 const char *path,
+					 struct dir_rename_entry *entry,
+					 struct hashmap *collisions,
+					 struct tree *tree)
+{
+	char *new_path = NULL;
+	struct collision_entry *collision_ent;
+	int clean = 1;
+
+	/*
+	 * entry has the mapping of old directory name to new directory name
+	 * that we want to apply to path.
+	 */
+	new_path = apply_dir_rename(entry, path);
+
+	if (!new_path) {
+		/* This should only happen when entry->non_unique_new_dir set */
+		assert(entry->non_unique_new_dir);
+		output(o, 1, _("CONFLICT (directory rename split): "
+			       "Unclear where to place %s because directory "
+			       "%s was renamed to multiple other directories, "
+			       "with no destination getting a majority of the "
+			       "files."),
+		       path, entry->dir);
+		clean = 0;
+		return NULL;
+	}
+
+	/*
+	 * The caller needs to have ensured that it has pre-populated
+	 * collisions with all paths that map to new_path.  Do a quick check
+	 * to ensure that's the case.
+	  */
+	collision_ent = collision_find_entry(collisions, new_path);
+	assert(collision_ent != NULL);
+
+	/*
+	 * Check for one-sided add/add/.../add conflicts, i.e.
+	 * where implicit renames from the other side doing
+	 * directory rename(s) can affect this side of history
+	 * to put multiple paths into the same location.  Warn
+	 * and bail on directory renames for such paths.
+	 */
+	char collision_paths[(PATH_MAX+2) * collision_ent->source_files.nr];
+	if (collision_ent->reported_already) {
+		clean = 0;
+	} else if (tree_has_path(tree, new_path)) {
+		collision_ent->reported_already = 1;
+		comma_separated_list(collision_paths,
+				     &collision_ent->source_files);
+		output(o, 1, _("CONFLICT (implicit dir rename): Existing "
+			       "file/dir at %s in the way of implicit "
+			       "directory rename(s) putting the following "
+			       "path(s) there: %s."),
+		       new_path, collision_paths);
+		clean = 0;
+	} else if (collision_ent->source_files.nr > 1) {
+		collision_ent->reported_already = 1;
+		comma_separated_list(collision_paths,
+				     &collision_ent->source_files);
+		output(o, 1, _("CONFLICT (implicit dir rename): Cannot map "
+			       "more than one path to %s; implicit directory "
+			       "renames tried to put these paths there: %s"),
+		       new_path, collision_paths);
+		clean = 0;
+	}
+
+	/* Free memory we no longer need */
+	if (!clean && new_path) {
+		free(new_path);
+		return NULL;
+	}
+
+	return new_path;
+}
+
 /*
  * There are a couple things we want to do at the directory level:
  *   1. Check for both sides renaming to the same thing, in order to avoid
@@ -1725,6 +1821,58 @@ static void compute_collisions(struct hashmap *collisions,
 	}
 }
 
+static char *check_for_directory_rename(struct merge_options *o,
+					const char *path,
+					struct tree *tree,
+					struct hashmap *dir_renames,
+					struct hashmap *dir_rename_exclusions,
+					struct hashmap *collisions,
+					int *clean_merge)
+{
+	char *new_path = NULL;
+	struct dir_rename_entry *entry = check_dir_renamed(path, dir_renames);
+	if (!entry)
+		return new_path;
+
+	/*
+	 * This next part is a little weird.  We do not want to do an
+	 * implicit rename into a directory we renamed on our side, because
+	 * that will result in a spurious rename/rename(1to2) conflict.  An
+	 * example:
+	 *   Base commit: dumbdir/afile, otherdir/bfile
+	 *   Side 1:      smrtdir/afile, otherdir/bfile
+	 *   Side 2:      dumbdir/afile, dumbdir/bfile
+	 * Here, while working on Side 1, we could notice that otherdir was
+	 * renamed/merged to dumbdir, and change the diff_filepair for
+	 * otherdir/bfile into a rename into dumbdir/bfile.  However, Side
+	 * 2 will notice the rename from dumbdir to smrtdir, and do the
+	 * transitive rename to move it from dumbdir/bfile to
+	 * smrtdir/bfile.  That gives us bfile in dumbdir vs being in
+	 * smrtdir, a rename/rename(1to2) conflict.  We really just want
+	 * the file to end up in smrtdir.  And the way to achieve that is
+	 * to not let Side1 do the rename to dumbdir, since we know that is
+	 * the source of one of our directory renames.
+	 *
+	 * That's why oentry and dir_rename_exclusions are here.
+	 *
+	 * As it turns out, this also prevents N-way transitive rename
+	 * confusion; see testcases 9c and 9d of t6043.
+	 */
+	struct dir_rename_entry *oentry = NULL;
+	oentry = dir_rename_find_entry(dir_rename_exclusions, entry->new_dir);
+	if (oentry) {
+		output(o, 1, _("WARNING: Avoiding applying %s -> %s rename "
+			       "to %s, because %s itself was renamed."),
+		       entry->dir, entry->new_dir, path, entry->new_dir);
+	} else {
+		new_path = handle_path_level_conflicts(o, path, entry,
+						       collisions, tree);
+		*clean_merge &= (new_path != NULL);
+	}
+
+	return new_path;
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1735,11 +1883,13 @@ static void compute_collisions(struct hashmap *collisions,
 static struct string_list *get_renames(struct merge_options *o,
 				       struct diff_queue_struct *pairs,
 				       struct hashmap *dir_renames,
+				       struct hashmap *dir_rename_exclusions,
 				       struct tree *tree,
 				       struct tree *o_tree,
 				       struct tree *a_tree,
 				       struct tree *b_tree,
-				       struct string_list *entries)
+				       struct string_list *entries,
+				       int *clean_merge)
 {
 	int i;
 	struct hashmap collisions;
@@ -1754,10 +1904,22 @@ static struct string_list *get_renames(struct merge_options *o,
 		struct string_list_item *item;
 		struct rename *re;
 		struct diff_filepair *pair = pairs->queue[i];
-		if (pair->status != 'R') {
+		char *new_path; // non-NULL only with directory renames
+
+		if (pair->status == 'D') {
 			diff_free_filepair(pair);
 			continue;
 		}
+		new_path = check_for_directory_rename(o, pair->two->path, tree,
+						      dir_renames,
+						      dir_rename_exclusions,
+						      &collisions,
+						      clean_merge);
+		if (pair->status != 'R' && !new_path) {
+			diff_free_filepair(pair);
+			continue;
+		}
+
 		re = xmalloc(sizeof(*re));
 		re->processed = 0;
 		re->pair = pair;
@@ -2075,12 +2237,18 @@ static struct rename_info *handle_renames(struct merge_options *o,
 					 dir_re_merge, merge);
 
 	rei->head_renames  = get_renames(o, head_pairs,
-					 dir_re_merge, head,
-					 common, head, merge, entries);
+					 dir_re_merge, dir_re_head, head,
+					 common, head, merge, entries,
+					 clean);
+	if (*clean < 0)
+		goto cleanup;
 	rei->merge_renames = get_renames(o, merge_pairs,
-					 dir_re_head, merge,
-					 common, head, merge, entries);
-	*clean = process_renames(o, rei->head_renames, rei->merge_renames);
+					 dir_re_head, dir_re_merge, merge,
+					 common, head, merge, entries,
+					 clean);
+	if (*clean < 0)
+		goto cleanup;
+	*clean &= process_renames(o, rei->head_renames, rei->merge_renames);
 
 cleanup:
 	/*
@@ -2095,7 +2263,6 @@ static struct rename_info *handle_renames(struct merge_options *o,
 		if (e->new_dir)
 			free(e->new_dir);
 		/* possible_new_dirs already cleared in get_directory_renames */
-		//string_list_clear(&e->possible_new_dirs, 1);
 	}
 	hashmap_free(dir_re_head, 1);
 	free(dir_re_head);
@@ -2106,7 +2273,6 @@ static struct rename_info *handle_renames(struct merge_options *o,
 		if (e->new_dir)
 			free(e->new_dir);
 		/* possible_new_dirs already cleared in get_directory_renames */
-		//string_list_clear(&e->possible_new_dirs, 1);
 	}
 	hashmap_free(dir_re_merge, 1);
 	free(dir_re_merge);
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 4066b08767..858d83016a 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -438,7 +438,7 @@ test_expect_success '2a-setup: Directory split into two on one side, with equal
 	git commit -m "C"
 '
 
-test_expect_failure '2a-check: Directory split into two on one side, with equal numbers of paths' '
+test_expect_success '2a-check: Directory split into two on one side, with equal numbers of paths' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
-- 
2.15.0.5.g9567be9905



* [PATCH 26/30] merge-recursive: When comparing files, don't include trees
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (24 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 25/30] merge-recursive: Check for file level conflicts then get new name Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [PATCH 27/30] merge-recursive: Apply necessary modifications for directory renames Elijah Newren
                   ` (4 subsequent siblings)
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

get_renames() would look up stage data that already existed (populated
in get_unmerged(), taken from whatever unpack_trees() created), and if
it didn't exist, would call insert_stage_data() to create the necessary
entry for the given file.  The insert_stage_data() fallback becomes
much more important for directory rename detection, because that creates
a mechanism to have a file in the resulting merge that didn't exist on
either side of history.  However, insert_stage_data(), due to calling
get_tree_entry(), loaded up trees as readily as files.  We aren't
interested in comparing trees to files; the D/F conflict handling is
done elsewhere.  This code is just concerned with what entries existed
for a given path on the different sides of the merge, so create a
get_tree_entry_if_blob() helper function and use it.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 7c2c29bb51..2a7258f6bb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -407,6 +407,21 @@ static void get_files_dirs(struct merge_options *o, struct tree *tree)
 	read_tree_recursive(tree, "", 0, 0, &match_all, save_files_dirs, o);
 }
 
+static int get_tree_entry_if_blob(const unsigned char *tree,
+				  const char *path,
+				  unsigned char *hashy,
+				  unsigned *mode_o)
+{
+	int ret;
+
+	ret = get_tree_entry(tree, path, hashy, mode_o);
+	if (S_ISDIR(*mode_o)) {
+		hashcpy(hashy, null_sha1);
+		*mode_o = 0;
+	}
+	return ret;
+}
+
 /*
  * Returns an index_entry instance which doesn't have to correspond to
  * a real cache entry in Git's index.
@@ -417,12 +432,12 @@ static struct stage_data *insert_stage_data(const char *path,
 {
 	struct string_list_item *item;
 	struct stage_data *e = xcalloc(1, sizeof(struct stage_data));
-	get_tree_entry(o->object.oid.hash, path,
-			e->stages[1].oid.hash, &e->stages[1].mode);
-	get_tree_entry(a->object.oid.hash, path,
-			e->stages[2].oid.hash, &e->stages[2].mode);
-	get_tree_entry(b->object.oid.hash, path,
-			e->stages[3].oid.hash, &e->stages[3].mode);
+	get_tree_entry_if_blob(o->object.oid.hash, path,
+			       e->stages[1].oid.hash, &e->stages[1].mode);
+	get_tree_entry_if_blob(a->object.oid.hash, path,
+			       e->stages[2].oid.hash, &e->stages[2].mode);
+	get_tree_entry_if_blob(b->object.oid.hash, path,
+			       e->stages[3].oid.hash, &e->stages[3].mode);
 	item = string_list_insert(entries, path);
 	item->util = e;
 	return e;
-- 
2.15.0.5.g9567be9905



* [PATCH 27/30] merge-recursive: Apply necessary modifications for directory renames
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (25 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 26/30] merge-recursive: When comparing files, don't include trees Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-15 20:23   ` Stefan Beller
  2017-11-10 19:05 ` [PATCH 28/30] merge-recursive: Avoid clobbering untracked files with " Elijah Newren
                   ` (3 subsequent siblings)
  30 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

This commit hooks together all the directory rename logic by making the
necessary changes to the rename struct, its dst_entry, and the
diff_filepair under consideration.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   | 195 +++++++++++++++++++++++++++++++++++-
 t/t6043-merge-rename-directories.sh |  50 ++++-----
 2 files changed, 219 insertions(+), 26 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 2a7258f6bb..838bfd32ec 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -166,6 +166,7 @@ static int oid_eq(const struct object_id *a, const struct object_id *b)
 
 enum rename_type {
 	RENAME_NORMAL = 0,
+	RENAME_DIR,
 	RENAME_DELETE,
 	RENAME_ONE_FILE_TO_ONE,
 	RENAME_ONE_FILE_TO_TWO,
@@ -599,6 +600,7 @@ struct rename {
 	 */
 	struct stage_data *src_entry;
 	struct stage_data *dst_entry;
+	unsigned add_turned_into_rename:1;
 	unsigned processed:1;
 };
 
@@ -633,6 +635,26 @@ static int update_stages(struct merge_options *opt, const char *path,
 	return 0;
 }
 
+static int update_stages_for_stage_data(struct merge_options *opt,
+					const char *path,
+					const struct stage_data *stage_data)
+{
+	struct diff_filespec o, a, b;
+	o.mode = stage_data->stages[1].mode;
+	oidcpy(&o.oid, &stage_data->stages[1].oid);
+
+	a.mode = stage_data->stages[2].mode;
+	oidcpy(&a.oid, &stage_data->stages[2].oid);
+
+	b.mode = stage_data->stages[3].mode;
+	oidcpy(&b.oid, &stage_data->stages[3].oid);
+
+	return update_stages(opt, path,
+			     is_null_sha1(o.oid.hash) ? NULL : &o,
+			     is_null_sha1(a.oid.hash) ? NULL : &a,
+			     is_null_sha1(b.oid.hash) ? NULL : &b);
+}
+
 static void update_entry(struct stage_data *entry,
 			 struct diff_filespec *o,
 			 struct diff_filespec *a,
@@ -1100,6 +1122,18 @@ static int merge_file_one(struct merge_options *o,
 	return merge_file_1(o, &one, &a, &b, branch1, branch2, mfi);
 }
 
+static int conflict_rename_dir(struct merge_options *o,
+			       struct diff_filepair *pair,
+			       const char *rename_branch,
+			       const char *other_branch)
+{
+	const struct diff_filespec *dest = pair->two;
+
+	if (update_file(o, 1, &dest->oid, dest->mode, dest->path))
+		return -1;
+	return 0;
+}
+
 static int handle_change_delete(struct merge_options *o,
 				 const char *path, const char *old_path,
 				 const struct object_id *o_oid, int o_mode,
@@ -1369,6 +1403,24 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 		if (!ret)
 			ret = update_file(o, 0, &mfi_c2.oid, mfi_c2.mode,
 					  new_path2);
+		/*
+		 * unpack_trees() actually populates the index for us for
+		 * "normal" rename/rename(2to1) situations so that the
+		 * correct entries are at the higher stages, which would
+		 * make the call below to update_stages_for_stage_data
+		 * unnecessary.  However, if either of the renames came
+		 * from a directory rename, then unpack_trees() will not
+		 * have gotten the right data loaded into the index, so we
+		 * need to do so now.  (While it'd be tempting to move this
+		 * call to update_stages_for_stage_data() to
+		 * apply_directory_rename_modifications(), that would break
+		 * our intermediate calls to would_lose_untracked() since
+		 * those rely on the current in-memory index.  See also the
+		 * big "NOTE" in update_stages()).
+		 */
+		if (update_stages_for_stage_data(o, path, ci->dst_entry1))
+			ret = -1;
+
 		free(new_path2);
 		free(new_path1);
 	}
@@ -1888,6 +1940,120 @@ static char *check_for_directory_rename(struct merge_options *o,
 	return new_path;
 }
 
+static void apply_directory_rename_modifications(struct merge_options *o,
+						 struct diff_filepair *pair,
+						 char *new_path,
+						 struct rename *re,
+						 struct tree *tree,
+						 struct tree *o_tree,
+						 struct tree *a_tree,
+						 struct tree *b_tree,
+						 struct string_list *entries,
+						 int *clean)
+{
+	struct string_list_item *item;
+	int stage = (tree == a_tree ? 2 : 3);
+
+	/*
+	 * In all cases where we can do directory rename detection,
+	 * unpack_trees() will have read pair->two->path into the
+	 * index and the working copy.  We need to remove it so that
+	 * we can instead place it at new_path.  It is guaranteed to
+	 * not be untracked (unpack_trees() would have errored out
+	 * saying the file would have been overwritten), but it might
+	 * be dirty, though.
+	 */
+	remove_file(o, 1, pair->two->path, 0 /* no_wd */);
+
+	/* Find or create a new re->dst_entry */
+	item = string_list_lookup(entries, new_path);
+	if (item) {
+		/*
+		 * Since we're renaming on this side of history, and it's
+		 * due to a directory rename on the other side of history
+		 * (which we only allow when the directory in question no
+		 * longer exists on the other side of history), the
+		 * original entry for re->dst_entry is no longer
+		 * necessary...
+		 */
+		re->dst_entry->processed = 1;
+
+		/*
+		 * ...because we'll be using this new one.
+		 */
+		re->dst_entry = item->util;
+	} else {
+		/*
+		 * re->dst_entry is for the before-dir-rename path, and we
+		 * need it to hold information for the after-dir-rename
+		 * path.  Before creating a new entry, we need to mark the
+		 * old one as unnecessary (...unless it is shared by
+		 * src_entry, i.e. this didn't use to be a rename, in which
+		 * case we can just allow the normal processing to happen
+		 * for it).
+		 */
+		if (!strcmp(pair->one->path, pair->two->path)) {
+			/*
+			 * Paths should only match if this was initially a
+			 * non-rename that is being turned into one by
+			 * directory rename detection.
+			 */
+			assert(pair->status != 'R');
+		} else {
+			assert(pair->status == 'R');
+			re->dst_entry->processed = 1;
+			//string_list_remove(entries, pair->two->path, 0);
+		}
+
+		re->dst_entry = insert_stage_data(new_path,
+						  o_tree, a_tree, b_tree,
+						  entries);
+		item = string_list_insert(entries, new_path);
+		item->util = re->dst_entry;
+	}
+
+	/*
+	 * Update the stage_data with the information about the path we are
+	 * moving into place.  That slot will be empty and available for us
+	 * to write to because of the collision checks in
+	 * handle_path_level_conflicts().
+	 *
+	 * It may be tempting to actually update the index at this point as
+	 * well, using update_stages_for_stage_data(), but as per the big
+	 * "NOTE" in update_stages(), doing so will modify the current
+	 * in-memory index which will break calls to would_lose_untracked()
+	 * that we need to make.  Instead, we need to just make sure that
+	 * the various conflict_rename_*() functions update the index
+	 * explicitly rather than relying on unpack_trees() to have done it.
+	 */
+	assert(is_null_oid(&re->dst_entry->stages[stage].oid));
+	get_tree_entry(tree->object.oid.hash,
+		       pair->two->path,
+		       re->dst_entry->stages[stage].oid.hash,
+		       &re->dst_entry->stages[stage].mode);
+
+	/* Update pair status */
+	if (pair->status == 'A') {
+		/*
+		 * Recording rename information for this add makes it look
+		 * like a rename/delete conflict.  Make sure we can
+		 * correctly handle this as an add that was moved to a new
+		 * directory instead of reporting a rename/delete conflict.
+		 */
+		re->add_turned_into_rename = 1;
+	}
+	/*
+	 * We don't actually look at pair->status again, but it seems
+	 * pedagogically correct to adjust it.
+	 */
+	pair->status = 'R';
+
+	/*
+	 * Finally, record the new location.
+	 */
+	pair->two->path = new_path;
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1937,6 +2103,7 @@ static struct string_list *get_renames(struct merge_options *o,
 
 		re = xmalloc(sizeof(*re));
 		re->processed = 0;
+		re->add_turned_into_rename = 0;
 		re->pair = pair;
 		item = string_list_lookup(entries, re->pair->one->path);
 		if (!item)
@@ -1953,6 +2120,12 @@ static struct string_list *get_renames(struct merge_options *o,
 			re->dst_entry = item->util;
 		item = string_list_insert(renames, pair->one->path);
 		item->util = re;
+		if (new_path)
+			  apply_directory_rename_modifications(o, pair, new_path,
+							       re, tree, o_tree,
+							       a_tree, b_tree,
+							       entries,
+							       clean_merge);
 	}
 
 	hashmap_iter_init(&collisions, &iter);
@@ -2122,7 +2295,19 @@ static int process_renames(struct merge_options *o,
 			dst_other.mode = ren1->dst_entry->stages[other_stage].mode;
 			try_merge = 0;
 
-			if (oid_eq(&src_other.oid, &null_oid)) {
+			if (oid_eq(&src_other.oid, &null_oid) &&
+			    ren1->add_turned_into_rename) {
+				setup_rename_conflict_info(RENAME_DIR,
+							   ren1->pair,
+							   NULL,
+							   branch1,
+							   branch2,
+							   ren1->dst_entry,
+							   NULL,
+							   o,
+							   NULL,
+							   NULL);
+			} else if (oid_eq(&src_other.oid, &null_oid)) {
 				setup_rename_conflict_info(RENAME_DELETE,
 							   ren1->pair,
 							   NULL,
@@ -2546,6 +2731,14 @@ static int process_entry(struct merge_options *o,
 						    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
 						    conflict_info);
 			break;
+		case RENAME_DIR:
+			clean_merge = 1;
+			if (conflict_rename_dir(o,
+						conflict_info->pair1,
+						conflict_info->branch1,
+						conflict_info->branch2))
+				clean_merge = -1;
+			break;
 		case RENAME_DELETE:
 			clean_merge = 0;
 			if (conflict_rename_delete(o,
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 858d83016a..e737bad2c5 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -64,7 +64,7 @@ test_expect_success '1a-setup: Simple directory rename detection' '
 	git commit -m "C"
 '
 
-test_expect_failure '1a-check: Simple directory rename detection' '
+test_expect_success '1a-check: Simple directory rename detection' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
@@ -121,7 +121,7 @@ test_expect_success '1b-setup: Merge a directory with another' '
 	git commit -m "C"
 '
 
-test_expect_failure '1b-check: Merge a directory with another' '
+test_expect_success '1b-check: Merge a directory with another' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
@@ -173,7 +173,7 @@ test_expect_success '1c-setup: Transitive renaming' '
 	git commit -m "C"
 '
 
-test_expect_failure '1c-check: Transitive renaming' '
+test_expect_success '1c-check: Transitive renaming' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
@@ -235,7 +235,7 @@ test_expect_success '1d-setup: Directory renames cause a rename/rename(2to1) con
 	git commit -m "C"
 '
 
-test_expect_failure '1d-check: Directory renames cause a rename/rename(2to1) conflict' '
+test_expect_success '1d-check: Directory renames cause a rename/rename(2to1) conflict' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -302,7 +302,7 @@ test_expect_success '1e-setup: Renamed directory, with all files being renamed t
 	git commit -m "C"
 '
 
-test_expect_failure '1e-check: Renamed directory, with all files being renamed too' '
+test_expect_success '1e-check: Renamed directory, with all files being renamed too' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
@@ -361,7 +361,7 @@ test_expect_success '1f-setup: Split a directory into two other directories' '
 	git commit -m "C"
 '
 
-test_expect_failure '1f-check: Split a directory into two other directories' '
+test_expect_success '1f-check: Split a directory into two other directories' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
@@ -807,7 +807,7 @@ test_expect_success '5a-setup: Merge directories, other side adds files to origi
 	git commit -m "C"
 '
 
-test_expect_failure '5a-check: Merge directories, other side adds files to original and target' '
+test_expect_success '5a-check: Merge directories, other side adds files to original and target' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -878,7 +878,7 @@ test_expect_success '5b-setup: Rename/delete in order to get add/add/add conflic
 	git commit -m "C"
 '
 
-test_expect_failure '5b-check: Rename/delete in order to get add/add/add conflict' '
+test_expect_success '5b-check: Rename/delete in order to get add/add/add conflict' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -951,7 +951,7 @@ test_expect_success '5c-setup: Transitive rename would cause rename/rename/renam
 	git commit -m "C"
 '
 
-test_expect_failure '5c-check: Transitive rename would cause rename/rename/rename/add/add/add' '
+test_expect_success '5c-check: Transitive rename would cause rename/rename/rename/add/add/add' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -1025,7 +1025,7 @@ test_expect_success '5d-setup: Directory/file/file conflict due to directory ren
 	git commit -m "C"
 '
 
-test_expect_failure '5d-check: Directory/file/file conflict due to directory rename' '
+test_expect_success '5d-check: Directory/file/file conflict due to directory rename' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -1403,7 +1403,7 @@ test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) PLUS
 	git commit -m "C"
 '
 
-test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
+test_expect_success '7a-check: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -1471,7 +1471,7 @@ test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive r
 	git commit -m "C"
 '
 
-test_expect_failure '7b-check: rename/rename(2to1), but only due to transitive rename' '
+test_expect_success '7b-check: rename/rename(2to1), but only due to transitive rename' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -1537,7 +1537,7 @@ test_expect_success '7c-setup: rename/rename(1to...2or3); transitive rename may
 	git commit -m "C"
 '
 
-test_expect_failure '7c-check: rename/rename(1to...2or3); transitive rename may add complexity' '
+test_expect_success '7c-check: rename/rename(1to...2or3); transitive rename may add complexity' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -1595,7 +1595,7 @@ test_expect_success '7d-setup: transitive rename involved in rename/delete; how
 	git commit -m "C"
 '
 
-test_expect_failure '7d-check: transitive rename involved in rename/delete; how is it reported?' '
+test_expect_success '7d-check: transitive rename involved in rename/delete; how is it reported?' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -1664,7 +1664,7 @@ test_expect_success '7e-setup: transitive rename in rename/delete AND dirs in th
 	git commit -m "C"
 '
 
-test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in the way' '
+test_expect_success '7e-check: transitive rename in rename/delete AND dirs in the way' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -1748,7 +1748,7 @@ test_expect_success '8a-setup: Dual-directory rename, one into the others way' '
 	git commit -m "C"
 '
 
-test_expect_failure '8a-check: Dual-directory rename, one into the others way' '
+test_expect_success '8a-check: Dual-directory rename, one into the others way' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
@@ -1881,7 +1881,7 @@ test_expect_success '8c-setup: rename+modify/delete' '
 	git commit -m "C"
 '
 
-test_expect_failure '8c-check: rename+modify/delete' '
+test_expect_success '8c-check: rename+modify/delete' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -1960,7 +1960,7 @@ test_expect_success '8d-setup: rename/delete...or not?' '
 	git commit -m "C"
 '
 
-test_expect_failure '8d-check: rename/delete...or not?' '
+test_expect_success '8d-check: rename/delete...or not?' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
@@ -2026,7 +2026,7 @@ test_expect_success '8e-setup: Both sides rename, one side adds to original dire
 	git commit -m "C"
 '
 
-test_expect_failure '8e-check: Both sides rename, one side adds to original directory' '
+test_expect_success '8e-check: Both sides rename, one side adds to original directory' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out 2>err &&
@@ -2111,7 +2111,7 @@ test_expect_success '9a-setup: Inner renamed directory within outer renamed dire
 	git commit -m "C"
 '
 
-test_expect_failure '9a-check: Inner renamed directory within outer renamed directory' '
+test_expect_success '9a-check: Inner renamed directory within outer renamed directory' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
@@ -2171,7 +2171,7 @@ test_expect_success '9b-setup: Transitive rename with content merge' '
 	git commit -m "C"
 '
 
-test_expect_failure '9b-check: Transitive rename with content merge' '
+test_expect_success '9b-check: Transitive rename with content merge' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
@@ -2254,7 +2254,7 @@ test_expect_success '9c-setup: Doubly transitive rename?' '
 	git commit -m "C"
 '
 
-test_expect_failure '9c-check: Doubly transitive rename?' '
+test_expect_success '9c-check: Doubly transitive rename?' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 >out &&
@@ -2337,7 +2337,7 @@ test_expect_success '9d-setup: N-way transitive rename?' '
 	git commit -m "C"
 '
 
-test_expect_failure '9d-check: N-way transitive rename?' '
+test_expect_success '9d-check: N-way transitive rename?' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 >out &&
@@ -2411,7 +2411,7 @@ test_expect_success '9e-setup: N-to-1 whammo' '
 	git commit -m "C"
 '
 
-test_expect_failure '9e-check: N-to-1 whammo' '
+test_expect_success '9e-check: N-to-1 whammo' '
 	git checkout B^0 &&
 
 	test_must_fail git merge -s recursive C^0 >out &&
@@ -2482,7 +2482,7 @@ test_expect_success '9f-setup: Renamed directory that only contained immediate s
 	git commit -m "C"
 '
 
-test_expect_failure '9f-check: Renamed directory that only contained immediate subdirs' '
+test_expect_success '9f-check: Renamed directory that only contained immediate subdirs' '
 	git checkout B^0 &&
 
 	git merge -s recursive C^0 &&
-- 
2.15.0.5.g9567be9905



* [PATCH 28/30] merge-recursive: Avoid clobbering untracked files with directory renames
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (26 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 27/30] merge-recursive: Apply necessary modifications for directory renames Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [RFC PATCH 29/30] merge-recursive: Fix overwriting dirty files involved in renames Elijah Newren
                   ` (2 subsequent siblings)
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
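The common pattern in the hunks below is "if an untracked file sits at
the destination and we are not in a virtual (call_depth > 0) merge,
write somewhere else instead".  A minimal standalone sketch of that
decision follows.  It is not code from this patch: toy_choose_write_path
is a hypothetical helper, the untracked check is passed in as a flag
rather than calling would_lose_untracked(), and the "dest~branch" name
is only an assumed approximation of what unique_path() produces.

```c
#include <stdio.h>

/*
 * Decide where a file destined for `dest` should be written.
 * Returns 1 if the write was diverted to an alternate path, 0 if
 * `dest` itself is used, and -1 if `buf` is too small.
 *
 * `untracked_in_way` stands in for would_lose_untracked(dest).
 */
static int toy_choose_write_path(const char *dest, const char *branch,
				 int call_depth, int untracked_in_way,
				 char *buf, size_t buflen)
{
	/* Virtual merges never touch the working tree, so nothing to lose. */
	int divert = !call_depth && untracked_in_way;
	int n = divert ? snprintf(buf, buflen, "%s~%s", dest, branch)
		       : snprintf(buf, buflen, "%s", dest);

	if (n < 0 || (size_t)n >= buflen)
		return -1;
	return divert;
}
```

In the real code the diverted case also has to leave the index entry
for dest at a higher stage only, which this sketch does not model.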
 merge-recursive.c                   | 39 +++++++++++++++++++++++++++++++++++--
 t/t6043-merge-rename-directories.sh |  6 +++---
 2 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 838bfd32ec..1b3ee5b9fb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1129,6 +1129,25 @@ static int conflict_rename_dir(struct merge_options *o,
 {
 	const struct diff_filespec *dest = pair->two;
 
+	if (!o->call_depth && would_lose_untracked(dest->path)) {
+		char *alt_path = unique_path(o, dest->path, rename_branch);
+		output(o, 1, _("Error: Refusing to lose untracked file at %s; "
+			       "writing to %s instead."),
+		       dest->path, alt_path);
+		/*
+		 * Write the file in worktree at alt_path, but not in the
+		 * index.  Instead, write to dest->path for the index but
+		 * only at the higher appropriate stage.
+		 */
+		if (update_file(o, 0, &dest->oid, dest->mode, alt_path))
+			return -1;
+		free(alt_path);
+		return update_stages(o, dest->path, NULL,
+				     rename_branch == o->branch1 ? dest : NULL,
+				     rename_branch == o->branch1 ? NULL : dest);
+	}
+
+	/* Update dest->path both in index and in worktree */
 	if (update_file(o, 1, &dest->oid, dest->mode, dest->path))
 		return -1;
 	return 0;
@@ -1147,7 +1166,8 @@ static int handle_change_delete(struct merge_options *o,
 	const char *update_path = path;
 	int ret = 0;
 
-	if (dir_in_way(path, !o->call_depth, 0)) {
+	if (dir_in_way(path, !o->call_depth, 0) ||
+	    (!o->call_depth && would_lose_untracked(path))) {
 		update_path = alt_path = unique_path(o, path, change_branch);
 	}
 
@@ -1273,6 +1293,10 @@ static int handle_file(struct merge_options *o,
 			dst_name = unique_path(o, rename->path, cur_branch);
 			output(o, 1, _("%s is a directory in %s adding as %s instead"),
 			       rename->path, other_branch, dst_name);
+		} else if (!o->call_depth && would_lose_untracked(rename->path)) {
+			dst_name = unique_path(o, rename->path, cur_branch);
+			output(o, 1, _("Refusing to lose untracked file at %s; adding as %s instead"),
+			       rename->path, dst_name);
 		}
 	}
 	if ((ret = update_file(o, 0, &rename->oid, rename->mode, dst_name)))
@@ -1398,7 +1422,18 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 		char *new_path2 = unique_path(o, path, ci->branch2);
 		output(o, 1, _("Renaming %s to %s and %s to %s instead"),
 		       a->path, new_path1, b->path, new_path2);
-		remove_file(o, 0, path, 0);
+		if (would_lose_untracked(path))
+			/*
+			 * Only way we get here is if both renames were from
+			 * a directory rename AND user had an untracked file
+			 * at the location where both files end up after the
+			 * two directory renames.  See testcase 10d of t6043.
+			 */
+			output(o, 1, _("Refusing to lose untracked file at "
+				       "%s, even though it's in the way."),
+			       path);
+		else
+			remove_file(o, 0, path, 0);
 		ret = update_file(o, 0, &mfi_c1.oid, mfi_c1.mode, new_path1);
 		if (!ret)
 			ret = update_file(o, 0, &mfi_c2.oid, mfi_c2.mode,
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index e737bad2c5..6db764a1b6 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2660,7 +2660,7 @@ test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
 	git commit -m "C"
 '
 
-test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+test_expect_success '10b-check: Overwrite untracked with dir rename + delete' '
 	git checkout B^0 &&
 	echo very >y/c &&
 	echo important >y/d &&
@@ -2727,7 +2727,7 @@ test_expect_success '10c-setup: Overwrite untracked with dir rename/rename(1to2)
 	git commit -m "C"
 '
 
-test_expect_failure '10c-check: Overwrite untracked with dir rename/rename(1to2)' '
+test_expect_success '10c-check: Overwrite untracked with dir rename/rename(1to2)' '
 	git checkout B^0 &&
 	echo important >y/c &&
 
@@ -2793,7 +2793,7 @@ test_expect_success '10d-setup: Delete untracked with dir rename/rename(2to1)' '
 	git commit -m "C"
 '
 
-test_expect_failure '10d-check: Delete untracked with dir rename/rename(2to1)' '
+test_expect_success '10d-check: Delete untracked with dir rename/rename(2to1)' '
 	git checkout B^0 &&
 	echo important >y/wham &&
 
-- 
2.15.0.5.g9567be9905



* [RFC PATCH 29/30] merge-recursive: Fix overwriting dirty files involved in renames
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (27 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 28/30] merge-recursive: Avoid clobbering untracked files with " Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 19:05 ` [PATCH 30/30] merge-recursive: Fix remaining directory rename + dirty overwrite cases Elijah Newren
  2017-11-10 22:27 ` [PATCH 00/30] Add directory rename detection to git Philip Oakley
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

This fixes an issue that predates my directory rename detection
patches and affects both normal renames and renames implied by
directory rename detection.  Additional codepaths that only affect
overwriting of dirty files involved in directory rename detection
will be added in a subsequent commit.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
Seems kinda hacky, in multiple different ways.  It seems like there
should be a better way to handle lots of different things about this
patch, though I have a hard time seeing how without doing a bigger
rewrite of the whole interface between unpack_trees and
merge-recursive (possibly rewriting them so there isn't an interface
but some piece of code that does both functions).  Alternate simpler
suggestions?
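
To make the intended behavior easier to discuss, here is a standalone
sketch of the two decisions this patch adds.  It is not code from the
patch: struct toy_entry and the toy_* functions are hypothetical, and
the matches_index field stands in for verify_uptodate() returning 0.

```c
/* Hypothetical summary of what we know about a path; not a git struct. */
struct toy_entry {
	int tracked;        /* path was in the index before the merge */
	long mtime_sec;     /* cached stat mtime seconds; 0 if never stat'ed */
	int matches_index;  /* working tree file is up to date with the index */
};

/* Mirrors the shape of was_dirty(): only real merges of tracked paths count. */
static int toy_was_dirty(int call_depth, const struct toy_entry *e)
{
	if (call_depth || !e->tracked)
		return 0;
	return e->mtime_sec > 0 && !e->matches_index;
}

/*
 * Mirrors the tail of conflict_rename_normal(): a content merge that
 * would otherwise be clean is reported as unclean when a dirty file
 * was in the way.  Errors (negative values) pass through unchanged.
 */
static int toy_rename_normal_result(int content_clean, int dirty)
{
	if (content_clean > 0 && dirty)
		return 0;
	return content_clean;
}
```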


 merge-recursive.c                   | 81 ++++++++++++++++++++++++++++---------
 merge-recursive.h                   |  2 +
 t/t3501-revert-cherry-pick.sh       |  2 +-
 t/t6043-merge-rename-directories.sh |  2 +-
 t/t7607-merge-overwrite.sh          |  2 +-
 unpack-trees.c                      |  4 +-
 unpack-trees.h                      |  4 ++
 7 files changed, 73 insertions(+), 24 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1b3ee5b9fb..86ddb89727 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -323,32 +323,32 @@ static void init_tree_desc_from_tree(struct tree_desc *desc, struct tree *tree)
 	init_tree_desc(desc, tree->buffer, tree->size);
 }
 
-static int git_merge_trees(int index_only,
+static int git_merge_trees(struct merge_options *o,
 			   struct tree *common,
 			   struct tree *head,
 			   struct tree *merge)
 {
 	int rc;
 	struct tree_desc t[3];
-	struct unpack_trees_options opts;
 
-	memset(&opts, 0, sizeof(opts));
-	if (index_only)
-		opts.index_only = 1;
+	memset(&o->unpack_opts, 0, sizeof(o->unpack_opts));
+	if (o->call_depth)
+		o->unpack_opts.index_only = 1;
 	else
-		opts.update = 1;
-	opts.merge = 1;
-	opts.head_idx = 2;
-	opts.fn = threeway_merge;
-	opts.src_index = &the_index;
-	opts.dst_index = &the_index;
-	setup_unpack_trees_porcelain(&opts, "merge");
+		o->unpack_opts.update = 1;
+	o->unpack_opts.merge = 1;
+	o->unpack_opts.head_idx = 2;
+	o->unpack_opts.fn = threeway_merge;
+	o->unpack_opts.src_index = &the_index;
+	o->unpack_opts.dst_index = &the_index;
+	setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
 
 	init_tree_desc_from_tree(t+0, common);
 	init_tree_desc_from_tree(t+1, head);
 	init_tree_desc_from_tree(t+2, merge);
 
-	rc = unpack_trees(3, t, &opts);
+	rc = unpack_trees(3, t, &o->unpack_opts);
+	o->unpack_opts.src_index = &the_index; // unpack_trees NULLifies src_index, but it's used in verify_uptodate, so set it back
 	cache_tree_free(&active_cache_tree);
 	return rc;
 }
@@ -783,6 +783,20 @@ static int would_lose_untracked(const char *path)
 	return !was_tracked(path) && file_exists(path);
 }
 
+static int was_dirty(struct merge_options *o, const char *path)
+{
+	struct cache_entry *ce;
+
+	int dirty = 1;
+	if (o->call_depth || !was_tracked(path))
+	  return !dirty;
+
+	ce = cache_file_exists(path, strlen(path), ignore_case);
+	dirty = (ce->ce_stat_data.sd_mtime.sec > 0 &&
+		 verify_uptodate(ce, &o->unpack_opts) != 0);
+	return dirty;
+}
+
 static int make_room_for_path(struct merge_options *o, const char *path)
 {
 	int status, i;
@@ -2635,6 +2649,7 @@ static int handle_modify_delete(struct merge_options *o,
 
 static int merge_content(struct merge_options *o,
 			 const char *path,
+			 int file_in_way,
 			 struct object_id *o_oid, int o_mode,
 			 struct object_id *a_oid, int a_mode,
 			 struct object_id *b_oid, int b_mode,
@@ -2709,7 +2724,7 @@ static int merge_content(struct merge_options *o,
 				return -1;
 	}
 
-	if (df_conflict_remains) {
+	if (df_conflict_remains || file_in_way) {
 		char *new_path;
 		if (o->call_depth) {
 			remove_file_from_cache(path);
@@ -2743,6 +2758,31 @@ static int merge_content(struct merge_options *o,
 	return mfi.clean;
 }
 
+static int conflict_rename_normal(struct merge_options *o,
+				  const char *path,
+				  struct object_id *o_oid, unsigned o_mode,
+				  struct object_id *a_oid, unsigned a_mode,
+				  struct object_id *b_oid, unsigned b_mode,
+				  struct rename_conflict_info *ci)
+{
+	int clean_merge;
+	int file_in_the_way = 0;
+
+	if (was_dirty(o, path)) {
+			file_in_the_way = 1;
+			output(o, 1, _("Refusing to lose dirty file at %s"),
+			       path);
+	}
+
+	/* Merge the content and write it out */
+	clean_merge = merge_content(o, path, file_in_the_way,
+				    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
+				    ci);
+	if (clean_merge > 0 && file_in_the_way)
+		clean_merge = 0;
+	return clean_merge;
+}
+
 /* Per entry merge function */
 static int process_entry(struct merge_options *o,
 			 const char *path, struct stage_data *entry)
@@ -2762,9 +2802,12 @@ static int process_entry(struct merge_options *o,
 		switch (conflict_info->rename_type) {
 		case RENAME_NORMAL:
 		case RENAME_ONE_FILE_TO_ONE:
-			clean_merge = merge_content(o, path,
-						    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
-						    conflict_info);
+			clean_merge = conflict_rename_normal(o,
+							     path,
+							     o_oid, o_mode,
+							     a_oid, a_mode,
+							     b_oid, b_mode,
+							     conflict_info);
 			break;
 		case RENAME_DIR:
 			clean_merge = 1;
@@ -2859,7 +2902,7 @@ static int process_entry(struct merge_options *o,
 	} else if (a_oid && b_oid) {
 		/* Case C: Added in both (check for same permissions) and */
 		/* case D: Modified in both, but differently. */
-		clean_merge = merge_content(o, path,
+		clean_merge = merge_content(o, path, 0 /* file_in_way */,
 					    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
 					    NULL);
 	} else if (!o_oid && !a_oid && !b_oid) {
@@ -2893,7 +2936,7 @@ int merge_trees(struct merge_options *o,
 		return 1;
 	}
 
-	code = git_merge_trees(o->call_depth, common, head, merge);
+	code = git_merge_trees(o, common, head, merge);
 
 	if (code != 0) {
 		if (show(o, 4) || o->call_depth)
diff --git a/merge-recursive.h b/merge-recursive.h
index e02c1e1243..591b824a98 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_RECURSIVE_H
 #define MERGE_RECURSIVE_H
 
+#include "unpack-trees.h"
 #include "string-list.h"
 
 struct merge_options {
@@ -27,6 +28,7 @@ struct merge_options {
 	struct strbuf obuf;
 	struct hashmap current_file_dir_set;
 	struct string_list df_conflict_file_set;
+	struct unpack_trees_options unpack_opts;
 };
 
 struct dir_rename_entry {
diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
index 783bdbf59d..0d89f6d0f6 100755
--- a/t/t3501-revert-cherry-pick.sh
+++ b/t/t3501-revert-cherry-pick.sh
@@ -141,7 +141,7 @@ test_expect_success 'cherry-pick "-" works with arguments' '
 	test_cmp expect actual
 '
 
-test_expect_failure 'cherry-pick works with dirty renamed file' '
+test_expect_success 'cherry-pick works with dirty renamed file' '
 	test_commit to-rename &&
 	git checkout -b unrelated &&
 	test_commit unrelated &&
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 6db764a1b6..02c97c9823 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2925,7 +2925,7 @@ test_expect_success '11a-setup: Avoid losing dirty contents with simple rename'
 	git commit -m "C"
 '
 
-test_expect_failure '11a-check: Avoid losing dirty contents with simple rename' '
+test_expect_success '11a-check: Avoid losing dirty contents with simple rename' '
 	git checkout B^0 &&
 	echo stuff >>z/c &&
 
diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index 00617dadf8..e44fb50173 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -92,7 +92,7 @@ test_expect_success 'will not overwrite removed file with staged changes' '
 	test_cmp important c1.c
 '
 
-test_expect_failure 'will not overwrite unstaged changes in renamed file' '
+test_expect_success 'will not overwrite unstaged changes in renamed file' '
 	git reset --hard c1 &&
 	git mv c1.c other.c &&
 	git commit -m rename &&
diff --git a/unpack-trees.c b/unpack-trees.c
index 71b70ccb12..71826edfb6 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1484,8 +1484,8 @@ static int verify_uptodate_1(const struct cache_entry *ce,
 		add_rejected_path(o, error_type, ce->name);
 }
 
-static int verify_uptodate(const struct cache_entry *ce,
-			   struct unpack_trees_options *o)
+int verify_uptodate(const struct cache_entry *ce,
+		    struct unpack_trees_options *o)
 {
 	if (!o->skip_sparse_checkout && (ce->ce_flags & CE_NEW_SKIP_WORKTREE))
 		return 0;
diff --git a/unpack-trees.h b/unpack-trees.h
index 6c48117b84..41178ada94 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -1,6 +1,7 @@
 #ifndef UNPACK_TREES_H
 #define UNPACK_TREES_H
 
+#include "tree-walk.h"
 #include "string-list.h"
 
 #define MAX_UNPACK_TREES 8
@@ -78,6 +79,9 @@ struct unpack_trees_options {
 extern int unpack_trees(unsigned n, struct tree_desc *t,
 		struct unpack_trees_options *options);
 
+int verify_uptodate(const struct cache_entry *ce,
+		    struct unpack_trees_options *o);
+
 int threeway_merge(const struct cache_entry * const *stages,
 		   struct unpack_trees_options *o);
 int twoway_merge(const struct cache_entry * const *src,
-- 
2.15.0.5.g9567be9905



* [PATCH 30/30] merge-recursive: Fix remaining directory rename + dirty overwrite cases
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (28 preceding siblings ...)
  2017-11-10 19:05 ` [RFC PATCH 29/30] merge-recursive: Fix overwriting dirty files involved in renames Elijah Newren
@ 2017-11-10 19:05 ` Elijah Newren
  2017-11-10 22:27 ` [PATCH 00/30] Add directory rename detection to git Philip Oakley
  30 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 19:05 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   | 26 +++++++++++++++++++++++---
 t/t6043-merge-rename-directories.sh |  8 ++++----
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 86ddb89727..6ef1d52f0a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1296,11 +1296,23 @@ static int handle_file(struct merge_options *o,
 
 	add = filespec_from_entry(&other, dst_entry, stage ^ 1);
 	if (add) {
+		int ren_src_was_dirty = was_dirty(o, rename->path);
 		char *add_name = unique_path(o, rename->path, other_branch);
 		if (update_file(o, 0, &add->oid, add->mode, add_name))
 			return -1;
 
-		remove_file(o, 0, rename->path, 0);
+		if (ren_src_was_dirty) {
+			output(o, 1, _("Refusing to lose dirty file at %s"),
+			       rename->path);
+		}
+		/*
+		 * Stupid double negatives in remove_file; it somehow manages
+		 * to repeatedly mess me up.  So, just for myself:
+		 *    1) update_wd iff !ren_src_was_dirty.
+		 *    2) no_wd iff !update_wd
+		 *    3) so, no_wd == !!ren_src_was_dirty == ren_src_was_dirty
+		 */
+		remove_file(o, 0, rename->path, ren_src_was_dirty);
 		dst_name = unique_path(o, rename->path, cur_branch);
 	} else {
 		if (dir_in_way(rename->path, !o->call_depth, 0)) {
@@ -1436,7 +1448,10 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 		char *new_path2 = unique_path(o, path, ci->branch2);
 		output(o, 1, _("Renaming %s to %s and %s to %s instead"),
 		       a->path, new_path1, b->path, new_path2);
-		if (would_lose_untracked(path))
+		if (was_dirty(o, path))
+			output(o, 1, _("Refusing to lose dirty file at %s"),
+			       path);
+		else if (would_lose_untracked(path))
 			/*
 			 * Only way we get here is if both renames were from
 			 * a directory rename AND user had an untracked file
@@ -2002,6 +2017,7 @@ static void apply_directory_rename_modifications(struct merge_options *o,
 {
 	struct string_list_item *item;
 	int stage = (tree == a_tree ? 2 : 3);
+	int update_wd;
 
 	/*
 	 * In all cases where we can do directory rename detection,
@@ -2012,7 +2028,11 @@ static void apply_directory_rename_modifications(struct merge_options *o,
 	 * saying the file would have been overwritten), but it might
 	 * be dirty, though.
 	 */
-	remove_file(o, 1, pair->two->path, 0 /* no_wd */);
+	update_wd = !was_dirty(o, pair->two->path);
+	if (!update_wd)
+		output(o, 1, _("Refusing to lose dirty file at %s"),
+		       pair->two->path);
+	remove_file(o, 1, pair->two->path, !update_wd);
 
 	/* Find or create a new re->dst_entry */
 	item = string_list_lookup(entries, new_path);
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 02c97c9823..676e72e9e6 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2985,7 +2985,7 @@ test_expect_success '11b-setup: Avoid losing dirty file involved in directory re
 	git commit -m "C"
 '
 
-test_expect_failure '11b-check: Avoid losing dirty file involved in directory rename' '
+test_expect_success '11b-check: Avoid losing dirty file involved in directory rename' '
 	git checkout B^0 &&
 	echo stuff >>z/c &&
 
@@ -3109,7 +3109,7 @@ test_expect_success '11d-setup: Avoid losing not-uptodate with rename + D/F conf
 	git commit -m "C"
 '
 
-test_expect_failure '11d-check: Avoid losing not-uptodate with rename + D/F conflict' '
+test_expect_success '11d-check: Avoid losing not-uptodate with rename + D/F conflict' '
 	git checkout B^0 &&
 	echo stuff >>z/c &&
 
@@ -3178,7 +3178,7 @@ test_expect_success '11e-setup: Avoid deleting not-uptodate with dir rename/rena
 	git commit -m "C"
 '
 
-test_expect_failure '11e-check: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
+test_expect_success '11e-check: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
 	git checkout B^0 &&
 	echo mods >>y/c &&
 
@@ -3247,7 +3247,7 @@ test_expect_success '11f-setup: Avoid deleting not-uptodate with dir rename/rena
 	git commit -m "C"
 '
 
-test_expect_failure '11f-check: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
+test_expect_success '11f-check: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
 	git checkout B^0 &&
 	echo important >>y/wham &&
 
-- 
2.15.0.5.g9567be9905



* Re: [PATCH 00/30] Add directory rename detection to git
  2017-11-10 19:05 [PATCH 00/30] Add directory rename detection to git Elijah Newren
                   ` (29 preceding siblings ...)
  2017-11-10 19:05 ` [PATCH 30/30] merge-recursive: Fix remaining directory rename + dirty overwrite cases Elijah Newren
@ 2017-11-10 22:27 ` Philip Oakley
  2017-11-10 23:26   ` Elijah Newren
  30 siblings, 1 reply; 81+ messages in thread
From: Philip Oakley @ 2017-11-10 22:27 UTC (permalink / raw)
  To: Elijah Newren, git; +Cc: Elijah Newren

From: "Elijah Newren" <newren@gmail.com>
> [This series is entirely independent of my rename detection limits series.
> However, I have a separate rename detection performance series that 
> depends
> on both this series and the rename detection limits series.]
>
> In this patchset, I introduce directory rename detection to 
> merge-recursive,
> predominantly so that when files are added to directories on one side of
> history and those directories are renamed on the other side of history, 
> the
> files will end up in the proper location after a merge or cherry-pick.
>
> However, this isn't limited to that simplistic case.  More interesting
> possibilities exist, such as:
>
>  * a file being renamed into a directory which is renamed on the other
>    side of history, causing the need for a transitive rename.
>

How does this cope with case-insensitive, case-preserving file systems on
Mac and Windows, especially when core.ignorecase is true? If it's a bigger
problem than the series already covers, would the likely changes be
reasonably localised?

This came up recently on GfW for `git checkout` of a branch where the case
changed ("Test" <-> "test"), but git didn't notice that it needed to rename
the directories on such a file system.
https://github.com/git-for-windows/git/issues/1333

<snip>

--
Philip 



* Re: [PATCH 00/30] Add directory rename detection to git
  2017-11-10 22:27 ` [PATCH 00/30] Add directory rename detection to git Philip Oakley
@ 2017-11-10 23:26   ` Elijah Newren
  2017-11-13 15:04     ` Philip Oakley
  0 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-10 23:26 UTC (permalink / raw)
  To: Philip Oakley; +Cc: Git Mailing List

On Fri, Nov 10, 2017 at 2:27 PM, Philip Oakley <philipoakley@iee.org> wrote:
> From: "Elijah Newren" <newren@gmail.com>
>>
>> In this patchset, I introduce directory rename detection to
>> merge-recursive,
>> predominantly so that when files are added to directories on one side of
>> history and those directories are renamed on the other side of history,
>> the
>> files will end up in the proper location after a merge or cherry-pick.
>>
>> However, this isn't limited to that simplistic case.  More interesting
>> possibilities exist, such as:
>>
>>  * a file being renamed into a directory which is renamed on the other
>>    side of history, causing the need for a transitive rename.
>>
>
> How does this cope with case-insensitive, case-preserving file systems on
> Mac and Windows, especially when core.ignorecase is true? If it's a bigger
> problem than the series already covers, would the likely changes be
> reasonably localised?
>
> This came up recently on GfW for `git checkout` of a branch where the case
> changed ("Test" <-> "test"), but git didn't notice that it needed to rename
> the directories on such a file system.
> https://github.com/git-for-windows/git/issues/1333

I wasn't aware there were problems with git on case insensitive case
preserving filesystems; fixing them wasn't something I had in mind
when writing this series.  However, the particular bug you mention is
actually completely orthogonal to this series; it talks about
git-checkout without the -m/--merge option, which doesn't touch any
code path I modified in my series, so my series can't really fix or
worsen that particular issue.

But, if there are further issues with such filesystems that also
affect merges/cherry-picks/rebases, then I don't think my series will
help or hurt there either.  The recursive merge machinery
already has remove_file() and update_file() wrappers that it uses
whenever it needs to remove/add/update a file in the working directory
and/or index, and I have simply continued using those, so the number
of places you'd need to modify to fix issues would remain just as
localized as before.  Also, I continue to depend on the reading of the
index & trees that unpack_trees() does, which I haven't modified, so
again it'd be the same number of places that someone would need to
fix.  (However, the whole design to have unpack_trees() do the initial
work and then have recursive merge try to "fix it up" is really
starting to strain.  I'm starting to think, again, that merge
recursive needs a redesign, and have some arguments I wanted to float
out there...but I've dumped enough on the list for a day.)

It's possible that this series fixes one particular issue -- namely
when merging, if the merge-base contained a "Test" directory, one side
added a file to that directory, and the other side renamed "Test" to
"test", and if the presence of both "Test" and "test" directories in
the merge result is problematic, then at least with my fixes you
wouldn't end up with both directories and could thus avoid that
problem in a narrow set of cases.
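
To make the "both Test and test" situation concrete, here is a rough
sketch of spotting top-level directory names that differ only by case.
It is not part of this series and does not touch git at all;
case_collisions is a made-up helper name, and it only looks at the
leading path component.

```sh
# Hypothetical helper (not git code): read one path per line on stdin and
# print each lower-cased leading directory name that had more than one
# distinct spelling, e.g. both "Test" and "test".
case_collisions () {
	sed 's|/.*||' |                 # keep only the leading component
	sort -u |                       # one line per distinct spelling
	tr '[:upper:]' '[:lower:]' |    # fold case
	sort |
	uniq -d                         # folded names seen more than once
}

# A merge result holding both spellings would be flagged:
printf '%s\n' Test/a test/b other/c | case_collisions
```

On a case-insensitive filesystem, any name this prints is one where two
tree entries would land in the same on-disk directory.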

Sorry that I don't have any better news than that for you.

Elijah


* Re: [PATCH 00/30] Add directory rename detection to git
  2017-11-10 23:26   ` Elijah Newren
@ 2017-11-13 15:04     ` Philip Oakley
  0 siblings, 0 replies; 81+ messages in thread
From: Philip Oakley @ 2017-11-13 15:04 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

From: "Elijah Newren" <newren@gmail.com>
Sent: Friday, November 10, 2017 11:26 PM
> On Fri, Nov 10, 2017 at 2:27 PM, Philip Oakley <philipoakley@iee.org> 
> wrote:
>> From: "Elijah Newren" <newren@gmail.com>
>>>
>>> In this patchset, I introduce directory rename detection to
>>> merge-recursive,
>>> predominantly so that when files are added to directories on one side of
>>> history and those directories are renamed on the other side of history,
>>> the
>>> files will end up in the proper location after a merge or cherry-pick.
>>>
>>> However, this isn't limited to that simplistic case.  More interesting
>>> possibilities exist, such as:
>>>
>>>  * a file being renamed into a directory which is renamed on the other
>>>    side of history, causing the need for a transitive rename.
>>>
>>
>> How does this cope with case-insensitive, case-preserving file systems on
>> Mac and Windows, especially when core.ignorecase is true? If it's a bigger
>> problem than the series already covers, would the likely changes be
>> reasonably localised?
>>
>> This came up recently on GfW for `git checkout` of a branch where the case
>> changed ("Test" <-> "test"), but git didn't notice that it needed to rename
>> the directories on such a file system.
>> https://github.com/git-for-windows/git/issues/1333
>
> I wasn't aware there were problems with git on case insensitive case
> preserving filesystems; fixing them wasn't something I had in mind
> when writing this series.

I was mainly ensuring awareness of the potential issue, as it's not easy to 
solve.

> However, the particular bug you mention is
> actually completely orthogonal to this series; it talks about
> git-checkout without the -m/--merge option, which doesn't touch any
> code path I modified in my series, so my series can't really fix or
> worsen that particular issue.

That's good.
>
> But, if there are further issues with such filesystems that also
> affect merges/cherry-picks/rebases, then I don't think my series will
> help or hurt there either.  The recursive merge machinery
> already has remove_file() and update_file() wrappers that it uses
> whenever it needs to remove/add/update a file in the working directory
> and/or index, and I have simply continued using those, so the number
> of places you'd need to modify to fix issues would remain just as
> localized as before.

It's when the working directory path/filename has a case change that goes
undetected (one way or another) that issues can arise. I think that part of
the problem (after awareness) is not having a canonical expectation of
which way is 'right', and what options there may be. E.g. if a project is
wholly on a case-insensitive system then the filenames in the worktree never
matter, but aligning the path/filenames in the repository would still be a
problem.

>  Also, I continue to depend on the reading of the
> index & trees that unpack_trees() does, which I haven't modified, so
> again it'd be the same number of places that someone would need to
> fix.  (However, the whole design to have unpack_trees() do the initial
> work and then have recursive merge try to "fix it up" is really
> starting to strain.

Interesting point.

>  I'm starting to think, again, that merge
> recursive needs a redesign, and have some arguments I wanted to float
> out there...but I've dumped enough on the list for a day.)
>
> It's possible that this series fixes one particular issue -- namely
> when merging, if the merge-base contained a "Test" directory, one side
> added a file to that directory, and the other side renamed "Test" to
> "test", and if the presence of both "Test" and "test" directories in
> the merge result is problematic, then at least with my fixes you
> wouldn't end up with both directories and could thus avoid that
> problem in a narrow set of cases.

I'll think on that. It may provide extra clues as to what the right 
solutions could be!
>
> Sorry that I don't have any better news than that for you.
>
> Elijah

Thanks
--
Philip 



* Re: [PATCH 01/30] Tighten and correct a few testcases for merging and cherry-picking
  2017-11-10 19:05 ` [PATCH 01/30] Tighten and correct a few testcases for merging and cherry-picking Elijah Newren
@ 2017-11-13 19:32   ` Stefan Beller
  0 siblings, 0 replies; 81+ messages in thread
From: Stefan Beller @ 2017-11-13 19:32 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> t3501 had a testcase originally added

... goes and looks ...

  "in 05f2dfb965 (cherry-pick: demonstrate a segmentation fault, 2016-11-26)"

would have helped me here in the commit message.


> to ensure cherry-pick wouldn't
> segfault when working with a dirty file involved in a rename.  While
> the segfault was fixed, there was another problem this test demonstrated:
> namely, that git would overwrite a dirty file involved in a rename.
> Further, the test encoded a "successful merge" and overwriting of this
> file as correct behavior.  Modify the test so that it would still catch
> the segfault, but to require the correct behavior.

As the correct behavior is not yet implemented, mark it as
test_expect_failure, too. (This is probably implied.)


>
> t7607 had a test

... added in 30fd3a5425 (merge overwrites unstaged changes in renamed file,
2012-04-15) ...

> specific to looking for a merge overwriting a dirty file
> involved in a rename, but it too actually encoded what I would term
> incorrect behavior: it expected the merge to succeed.  Fix that, and add
> a few more checks to make sure that the merge really does produce the
> expected results.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  t/t3501-revert-cherry-pick.sh | 7 +++++--
>  t/t7607-merge-overwrite.sh    | 5 ++++-
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
> index 4f2a263b63..783bdbf59d 100755
> --- a/t/t3501-revert-cherry-pick.sh
> +++ b/t/t3501-revert-cherry-pick.sh
> @@ -141,7 +141,7 @@ test_expect_success 'cherry-pick "-" works with arguments' '
>         test_cmp expect actual
>  '
>
> -test_expect_success 'cherry-pick works with dirty renamed file' '
> +test_expect_failure 'cherry-pick works with dirty renamed file' '
>         test_commit to-rename &&
>         git checkout -b unrelated &&
>         test_commit unrelated &&
> @@ -150,7 +150,10 @@ test_expect_success 'cherry-pick works with dirty renamed file' '
>         test_tick &&
>         git commit -m renamed &&
>         echo modified >renamed &&
> -       git cherry-pick refs/heads/unrelated
> +       test_must_fail git cherry-pick refs/heads/unrelated >out &&
> +       test_i18ngrep "Refusing to lose dirty file at renamed" out &&
> +       test $(git rev-parse :0:renamed) = $(git rev-parse HEAD^:to-rename.t) &&
> +       grep -q "^modified$" renamed
>  '
>
>  test_done
> diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
> index 9444d6a9b9..00617dadf8 100755
> --- a/t/t7607-merge-overwrite.sh
> +++ b/t/t7607-merge-overwrite.sh
> @@ -97,7 +97,10 @@ test_expect_failure 'will not overwrite unstaged changes in renamed file' '
>         git mv c1.c other.c &&
>         git commit -m rename &&
>         cp important other.c &&
> -       git merge c1a &&
> +       test_must_fail git merge c1a >out &&
> +       test_i18ngrep "Refusing to lose dirty file at other.c" out &&
> +       test -f other.c~HEAD &&
> +       test $(git hash-object other.c~HEAD) = $(git rev-parse c1a:c1.c) &&
>         test_cmp important other.c

Code looks good,

Thanks,
Stefan


* Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue
  2017-11-10 19:05 ` [PATCH 02/30] merge-recursive: Fix logic ordering issue Elijah Newren
@ 2017-11-13 19:48   ` Stefan Beller
  2017-11-13 22:04     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-13 19:48 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> merge_trees() did a variety of work, including:
>   * Calling get_unmerged() to get unmerged entries
>   * Calling record_df_conflict_files() with all unmerged entries to
>     do some work to ensure we could handle D/F conflicts correctly
>   * Calling get_renames() to check for renames.
>
> An easily overlooked issue is that get_renames() can create more
> unmerged entries and add them to the list, which have the possibility of
> being involved in D/F conflicts.

I presume these are created via insert_stage_data called in
get_renames, when the path entry is not found?

> So the call to
> record_df_conflict_files() should really be moved after all the rename
> detection.  I didn't come up with any testcases demonstrating any bugs
> with the old ordering, but I suspect there were some for both normal
> renames and for directory renames.  Fix the ordering.

It is hard to trace this down, though looking at
3af244caa8 (Cumulative update of merge-recursive in C, 2006-07-27)
may help us reason about it.

What would such a bug look like?

>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-recursive.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 1d3f8f0d22..52521faf09 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1981,10 +1981,10 @@ int merge_trees(struct merge_options *o,
>                 get_files_dirs(o, merge);
>
>                 entries = get_unmerged();
> -               record_df_conflict_files(o, entries);
>                 re_head  = get_renames(o, head, common, head, merge, entries);
>                 re_merge = get_renames(o, merge, common, head, merge, entries);
>                 clean = process_renames(o, re_head, re_merge);
> +               record_df_conflict_files(o, entries);
>                 if (clean < 0)
>                         goto cleanup;
>                 for (i = entries->nr-1; 0 <= i; i--) {
> --
> 2.15.0.5.g9567be9905
>


* Re: [PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry
  2017-11-10 19:05 ` [PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry Elijah Newren
@ 2017-11-13 21:06   ` Stefan Beller
  2017-11-13 22:57     ` Elijah Newren
  2017-11-14  1:26   ` Junio C Hamano
  1 sibling, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-13 21:06 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> If I have to walk through the debugger and inspect the values found in
> here in order to figure out their meaning, despite having known these
> things inside and out some years back, then they probably need a comment
> for the casual reader to explain their purpose.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-recursive.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 52521faf09..3526c8d0b8 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -513,6 +513,28 @@ static void record_df_conflict_files(struct merge_options *o,
>
>  struct rename {
>         struct diff_filepair *pair;
> +       /*
> +        * Because I keep forgetting every few years what src_entry and
> +        * dst_entry are and have to walk through a debugger and puzzle
> +        * through it to remind myself...

This repeats the commit message and doesn't help me understand the
{src/dst}_entry. (Maybe drop the first part here?) I'll read on.

> +        *
> +        * If 'before' is renamed to 'after' then src_entry will contain
> +        * the versions of 'before' from the merge_base, HEAD, and MERGE in
> +        * stages 1, 2, and 3; dst_entry will contain the versions of
> +        * 'after' from the merge_base, HEAD, and MERGE in stages 1, 2, and
> +        * 3.

So src == before and dst == after, with no trickery in the stages: the
stage numbers are the same before and after, and only the order needs to
be conveyed: base, HEAD (ours?), MERGE (theirs?).

I can understand that, so I wonder if we can phrase it to mention (base,
HEAD, MERGE) just once.

>     Thus, we have a total of six modes and oids, though some
> +        * will be null.  (Stage 0 is ignored; we're interested in handling

s/will be/may be/ or /can be/?

> +        * conflicts.)
> +        *
> +        * Since we don't turn on break-rewrites by default, neither
> +        * src_entry nor dst_entry can have all three of their stages have
> +        * non-null oids, meaning at most four of the six will be non-null.

Oh. That explains the choice of /will be/ above. Thanks!

> +        * Also, since this is a rename, both src_entry and dst_entry will
> +        * have at least one non-null oid, meaning at least two will be
> +        * non-null.  Of the six oids, a typical rename will have three be
> +        * non-null.  Only two implies a rename/delete, and four implies a
> +        * rename/add.

That makes sense.
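
To check my reading of the counting argument, here is a small sketch. It
is not git code: classify_rename and the "aaaa" placeholder oid are made
up for illustration, and an all-zero oid stands in for "no entry at this
stage".

```sh
# All-zero oid stands in for "no entry at this stage".
null_oid=0000000000000000000000000000000000000000

# Hypothetical helper: given six oids (src_entry stages 1-3, then
# dst_entry stages 1-3), count the non-null ones and print the case the
# comment associates with that count.
classify_rename () {
	n=0
	for oid in "$@"
	do
		test "$oid" = "$null_oid" || n=$((n + 1))
	done
	case "$n" in
	2) echo "rename/delete" ;;
	3) echo "typical rename" ;;
	4) echo "rename/add" ;;
	*) echo "unexpected ($n non-null)" ;;
	esac
}

# 'before' renamed to 'after' on HEAD and left alone on MERGE:
# src_entry has base and MERGE, dst_entry has only HEAD.
classify_rename aaaa "$null_oid" aaaa  "$null_oid" aaaa "$null_oid"
```

Dropping the MERGE copy of 'before' gives two non-null oids
(rename/delete), and adding a MERGE copy of 'after' gives four
(rename/add), which matches the comment.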

Thanks,
Stefan

> +        */
>         struct stage_data *src_entry;
>         struct stage_data *dst_entry;
>         unsigned processed:1;
> --
> 2.15.0.5.g9567be9905
>


* Re: [PATCH 04/30] directory rename detection: basic testcases
  2017-11-10 19:05 ` [PATCH 04/30] directory rename detection: basic testcases Elijah Newren
@ 2017-11-13 22:04   ` Stefan Beller
  2017-11-14  0:57     ` Elijah Newren
  2017-11-14  2:03     ` Junio C Hamano
  0 siblings, 2 replies; 81+ messages in thread
From: Stefan Beller @ 2017-11-13 22:04 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  t/t6043-merge-rename-directories.sh | 391 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 391 insertions(+)
>  create mode 100755 t/t6043-merge-rename-directories.sh
>
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> new file mode 100755
> index 0000000000..b737b0a105
> --- /dev/null
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -0,0 +1,391 @@
> +#!/bin/sh
> +
> +test_description="recursive merge with directory renames"
> +# includes checking of many corner cases, with a similar methodology to:
> +#   t6042: corner cases with renames but not criss-cross merges
> +#   t6036: corner cases with both renames and criss-cross merges
> +#
> +# The setup for all of them, pictorially, is:
> +#
> +#      B
> +#      o
> +#     / \
> +#  A o   ?
> +#     \ /
> +#      o
> +#      C
> +#
> +# To help make it easier to follow the flow of tests, they have been
> +# divided into sections and each test will start with a quick explanation
> +# of what commits A, B, and C contain.
> +#
> +# Notation:
> +#    z/{b,c}   means  files z/b and z/c both exist
> +#    x/d_1     means  file x/d exists with content d1.  (Purpose of the
> +#                     underscore notation is to differentiate different
> +#                     files that might be renamed into each other's paths.)
> +
> +. ./test-lib.sh
> +
> +
> +###########################################################################
> +# SECTION 1: Basic cases we should be able to handle
> +###########################################################################
> +
> +# Testcase 1a, Basic directory rename.
> +#   Commit A: z/{b,c}
> +#   Commit B: y/{b,c}
> +#   Commit C: z/{b,c,d,e/f}

(minor thought:)
After rereading the docs above this is clear; I wonder if instead of A, B, C
a notation of Base, ours, theirs would be easier to understand?


> +test_expect_success '1a-setup: Simple directory rename detection' '
> +test_expect_failure '1a-check: Simple directory rename detection' '

Thanks for splitting the setup and the check into two different test cases!


> +       git checkout B^0 &&

Any reason for ^0 ? (to make clear it is a branch?)

> +# Testcase 1b, Merge a directory with another
> +#   Commit A: z/{b,c},   y/d
> +#   Commit B: z/{b,c,e}, y/d
> +#   Commit C: y/{b,c,d}
> +#   Expected: y/{b,c,d,e}
> +
> +test_expect_success '1b-setup: Merge a directory with another' '
> +       git rm -rf . &&
> +       git clean -fdqx &&
> +       rm -rf .git &&
> +       git init &&

This is quite a strong statement to start a test with.
Nowadays we rather do

    test_when_finished "some command" &&

than polluting the follow up tests. But as you split up the previous test
into 2 tests, it is not clear if this would bring any good.

Also these are four cleanup commands, I'd have expected fewer.
Maybe just "rm -rf ." ? Or as we make a new repo for each test case,

    test_create_repo 1a &&
    cd 1a

in the first setup, and here we do
    test_create_repo 1b &&
    cd 1b

relying on test_done to cleanup everything afterwards?


> +# Testcase 1c, Transitive renaming
> +#   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
> +#   (Related to testcases 9c and 9d -- can transitivity repeat?)
> +#   Commit A: z/{b,c},   x/d
> +#   Commit B: y/{b,c},   x/d
> +#   Commit C: z/{b,c,d}

So B is "Rename z to y" and C is "Move x/d to z/d"

> +#   Expected: y/{b,c,d}  (because x/d -> z/d -> y/d)

This is an excellent expectation for a clean case like this.
I have not reached 3, 9 yet, so I'll remember these questions.
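
For reference, the x/d -> z/d -> y/d chain can be sketched as a plain
path rewrite. This is only my mental model, not how the series
implements it; apply_dir_rename is a hypothetical helper.

```sh
# Hypothetical helper: if path $1 lives under directory $2, print the
# same path relocated under directory $3; otherwise print $1 unchanged.
apply_dir_rename () {
	case "$1" in
	"$2"/*)
		rest=${1#"$2"/}     # strip the old directory prefix
		echo "$3/$rest"
		;;
	*)
		echo "$1"
		;;
	esac
}

# Commit C moved x/d to z/d; commit B renamed directory z to y.
# Applying B's directory rename to C's new path gives the expected result.
apply_dir_rename z/d z y
```

A path outside z/, such as x/d before C's move, passes through unchanged,
which is why the file rename on C has to be considered first.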

> +test_expect_success '1c-setup: Transitive renaming' '
> +       git rm -rf . &&
> +       git clean -fdqx &&
> +       rm -rf .git &&
> +       git init &&
> +
> +       mkdir z &&
> +       echo b >z/b &&
> +       echo c >z/c &&
> +       mkdir x &&
> +       echo d >x/d &&
> +       git add z x &&
> +       test_tick &&
> +       git commit -m "A" &&
> +
> +       git branch A &&
> +       git branch B &&
> +       git branch C &&
> +
> +       git checkout B &&
> +       git mv z y &&
> +       test_tick &&
> +       git commit -m "B" &&
> +
> +       git checkout C &&
> +       git mv x/d z/d &&
> +       test_tick &&
> +       git commit -m "C"
> +'
> +
> +test_expect_failure '1c-check: Transitive renaming' '
> +       git checkout B^0 &&
> +
> +       git merge -s recursive C^0 &&
> +
> +       test 3 -eq $(git ls-files -s | wc -l) &&

    git ls-files >out &&
    test_line_count = 3 out &&

maybe? Piping output of git commands somewhere is an
anti-pattern as we cannot examine the return code of ls-files in this case.
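
A quick illustration of why the pipe hides failures, using a stand-in
function instead of a real git command (failing_cmd is made up for this
sketch, and it assumes a plain POSIX sh without pipefail):

```sh
# Stand-in for a git command that prints something and then fails.
failing_cmd () {
	echo "one line"
	return 1
}

# Piped: the pipeline's exit status is that of wc, so the failure of the
# first command is lost.
failing_cmd | wc -l >/dev/null
piped_status=$?

# Redirected to a file: the command's own exit status is what we see.
direct_status=0
failing_cmd >out || direct_status=$?

echo "piped=$piped_status direct=$direct_status"
rm -f out
```

With the output in a file, the test can chain on the command's exit
status with && and still count lines afterwards.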

> +       test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
> +       test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
> +       test $(git rev-parse HEAD:y/d) = $(git rev-parse A:x/d) &&

Speaking of these, there are quite a lot of invocations of rev-parse,
though it looks clean; recently Junio had some good ideas regarding a
similar test[1].
So maybe

  git rev-parse >actual \
    HEAD:y/b  HEAD:y/c HEAD:y/d &&
  git rev-parse >expect \
    A:z/b    A:z/c    A:x/d &&
  test_cmp expect actual

Not sure if that is any nicer, but has fewer calls.

[1] https://public-inbox.org/git/xmqqa807ztx4.fsf@gitster.mtv.corp.google.com/


> +       test_must_fail git rev-parse HEAD:x/d &&
> +       test_must_fail git rev-parse HEAD:z/d &&
> +       test ! -f z/d
> +'
> +
> +# Testcase 1d, Directory renames (merging two directories into one new one)
> +#              cause a rename/rename(2to1) conflict
> +#   (Related to testcases 1c and 7b)
> +#   Commit A. z/{b,c},        y/{d,e}
> +#   Commit B. x/{b,c},        y/{d,e,m,wham}
> +#   Commit C. z/{b,c,n,wham}, x/{d,e}
> +#   Expected: x/{b,c,d,e,m,n}, CONFLICT:(y/wham & z/wham -> x/wham)
> +#   Note: y/m & z/n should definitely move into x.  By the same token, both
> +#         y/wham & z/wham should too...giving us a conflict.

If the wham files are equal (in oid), shouldn't this merge cleanly, and
conflict only when z/wham and x/wham differ in oid but have the same
sub-path?

> +
> +# Testcase 1e, Renamed directory, with all filenames being renamed too
> +#   Commit A: z/{oldb,oldc}
> +#   Commit B: y/{newb,newc}
> +#   Commit C: z/{oldb,oldc,d}

What about oldd ?
(expecting a pattern across many files of s/old/new/ isn't unreasonable,
but maybe too much for now?)
By having no "old" prefix there is only one thing to do, which is y/d

> +#   Expected: y/{newb,newc,d}

ok.

> +# Testcase 1f, Split a directory into two other directories
> +#   (Related to testcases 3a, all of section 2, and all of section 4)
> +#   Commit A: z/{b,c,d,e,f}
> +#   Commit B: z/{b,c,d,e,f,g}
> +#   Commit C: y/{b,c}, x/{d,e,f}
> +#   Expected: y/{b,c}, x/{d,e,f,g}

Why not y/g ? Because there are more files in x?
I can come up with a reasonable counter example:

A: src/{main.c, foo.c, bar.c, magic.py}
B: src/{main.c, foo.c, bar.c, magic.py, magic-helper.py}
C: src/{main.c, foo.c, bar.c} py/{magic.py}
Expect: src/{main.c, foo.c, bar.c} py/{magic.py, magic-helper.py}


> +
> +###########################################################################
> +# Rules suggested by testcases in section 1:
> +#
> +#   We should still detect the directory rename even if it wasn't just
> +#   the directory renamed, but the files within it. (see 1b)
> +#
> +#   If renames split a directory into two or more others, the directory
> +#   with the most renames, "wins" (see 1c).  However, see the testcases
> +#   in section 2, plus testcases 3a and 4a.
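
The "most renames wins" rule quoted above can be sketched in plain shell.
This is only an illustration under stated assumptions, not the logic in
merge-recursive.c: pick_rename_target is a hypothetical helper, the input
format ("old new" pairs, one per line) is made up for the example, and a
tie is simply reported as SPLIT.

```shell
#!/bin/sh
# Hypothetical helper (not from merge-recursive.c): read "old new" rename
# pairs on stdin and, for source directory $1, print the target directory
# that received the most renames.  Prints SPLIT when the top count is tied
# and NONE when nothing was renamed out of $1.
pick_rename_target () {
    awk -v src="$1" '
    {
        odir = $1; ndir = $2
        # Strip the last path component to get each directory.
        sub("/[^/]*$", "", odir)
        sub("/[^/]*$", "", ndir)
        if (odir == src && ndir != src)
            count[ndir]++
    }
    END {
        best = ""; max = 0; tied = 0
        for (dir in count) {
            if (count[dir] > max) {
                max = count[dir]; best = dir; tied = 0
            } else if (count[dir] == max) {
                tied = 1
            }
        }
        if (max == 0) print "NONE"
        else if (tied) print "SPLIT"
        else print best
    }'
}

# Testcase 1f: z/{b,c} moved to y/, z/{d,e,f} moved to x/.
printf '%s\n' 'z/b y/b' 'z/c y/c' 'z/d x/d' 'z/e x/e' 'z/f x/f' |
pick_rename_target z
```

For testcase 1f this prints x, which is why the new file z/g is expected
to end up as x/g.  Feeding it the two renames of testcase 2a (z/b -> y/b,
z/c -> w/c) prints SPLIT instead.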

oh, ok. I presume section 2 follows in a later patch.
I'll go looking.

Thanks,
Stefan


* Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue
  2017-11-13 19:48   ` Stefan Beller
@ 2017-11-13 22:04     ` Elijah Newren
  2017-11-13 22:12       ` Stefan Beller
  0 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-13 22:04 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Thanks for the reviews!

On Mon, Nov 13, 2017 at 11:48 AM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
>> merge_trees() did a variety of work, including:
>>   * Calling get_unmerged() to get unmerged entries
>>   * Calling record_df_conflict_files() with all unmerged entries to
>>     do some work to ensure we could handle D/F conflicts correctly
>>   * Calling get_renames() to check for renames.
>>
>> An easily overlooked issue is that get_renames() can create more
>> unmerged entries and add them to the list, which have the possibility of
>> being involved in D/F conflicts.
>
> I presume these are created via insert_stage_data called in
> get_renames, when the path entry is not found?

Yes.

>> So the call to
>> record_df_conflict_files() should really be moved after all the rename
>> detection.  I didn't come up with any testcases demonstrating any bugs
>> with the old ordering, but I suspect there were some for both normal
>> renames and for directory renames.  Fix the ordering.
>
> It is hard to trace this down, though looking at
> 3af244caa8 (Cumulative update of merge-recursive in C, 2006-07-27)
> may help us reason about it.

It doesn't really go back that far.  I added the
record_df_conflict_files() function (originally named
make_room_for_directories_of_df_conflicts()) in commit ef02b31721
(merge-recursive: Make room for directories in D/F conflicts
2010-09-20); the rename happened in commit 70cc3d36eb
(merge-recursive: Save D/F conflict filenames instead of unlinking
them 2011-08-11).

> What would a bug look like?

Some of these corner cases sometimes get confusing to try to reason
about and duplicate, so I was trying to avoid that....oh, well.  :-)
I mostly wanted to use the simple logic that:
record_df_conflict_files() exists to take an inventory of all unmerged
files to make sure that D/F conflicts can be handled appropriately.
get_renames() has the potential for adding more unmerged files, thus I
should have placed record_df_conflict_files() after get_renames() when
I introduced it.

But since you asked...

A bug here would essentially mean that a git merge fails to handle
files in directories under a D/F conflict; when trying to process such
files and write out their conflict state to disk, it would fail to
create the necessary directory because a file is in the way.

In order to trigger it, you'd need to have a D/F conflict where the
file involved in the D/F conflict wasn't unmerged after unpack_trees()
but only "shows up" due to the rename detection (i.e. added by the
insert_stage_data() call as you mention above).  I think reading
through Documentation/technical/trivial-merge.txt, that this actually
isn't possible with what I'm calling "normal" renames; it's actually
something newly possible only due to directory rename detection.  But
you may have to get the merge direction just right, you might have to
worry about files that sort between a file with the same name as a
directory and the files within the directory (e.g. "path.txt" in the
list "path", then "path.txt", then "path/foo").
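
The ordering mentioned in that last sentence can be seen with a plain
byte-order sort.  A minimal sketch, on the assumption that the entries are
compared byte-wise on their full path names; sorted_paths is a hypothetical
helper, and the three names are the ones from the example above:

```shell
#!/bin/sh
# Hypothetical helper: sort the three example paths in byte order.
# "." (0x2e) sorts before "/" (0x2f), so "path.txt" lands between the
# file "path" and the entry "path/foo" inside the directory "path".
sorted_paths () {
    printf '%s\n' 'path/foo' 'path.txt' 'path' | LC_ALL=C sort
}

sorted_paths
# prints:
#   path
#   path.txt
#   path/foo
```

So code that walks the list looking for "path" followed directly by
"path/..." entries would have to skip over "path.txt" first.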

Do you feel it's important that I come up with a demonstration case
here?  If so, I'll see if I can generate one.


* Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue
  2017-11-13 22:04     ` Elijah Newren
@ 2017-11-13 22:12       ` Stefan Beller
  2017-11-13 23:39         ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-13 22:12 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Mon, Nov 13, 2017 at 2:04 PM, Elijah Newren <newren@gmail.com> wrote:

> Do you feel it's important that I come up with a demonstration case
> here?  If so, I'll see if I can generate one.

I was mostly "just curious" on how you'd construct it theoretically.

> it's actually something newly possible only due to directory rename detection.

So something like {rename/delete} on a directory in the merge, but
also an addition instead of the delete of another file?

I wanted to debug a very similar issue today just after reviewing this
series, see
https://public-inbox.org/git/743acc29-85bb-3773-b6a0-68d4a0b8fd63@ispras.ru/

Thanks,
Stefan


* Re: [PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry
  2017-11-13 21:06   ` Stefan Beller
@ 2017-11-13 22:57     ` Elijah Newren
  2017-11-13 23:11       ` Stefan Beller
  0 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-13 22:57 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 13, 2017 at 1:06 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
>> +       /*
>> +        * Because I keep forgetting every few years what src_entry and
>> +        * dst_entry are and have to walk through a debugger and puzzle
>> +        * through it to remind myself...
>
> This repeats the commit message; and doesn't help me understanding the
> {src/dst}_entry. (Maybe drop the first part here?) I'll read on.

Yep, I'll toss it.

>> +        *
>> +        * If 'before' is renamed to 'after' then src_entry will contain
>> +        * the versions of 'before' from the merge_base, HEAD, and MERGE in
>> +        * stages 1, 2, and 3; dst_entry will contain the versions of
>> +        * 'after' from the merge_base, HEAD, and MERGE in stages 1, 2, and
>> +        * 3.
>
> So src == before, dst == after; no trickery with the stages (the same
> stage number before and after); only the order needs to be conveyed:
> base, HEAD (ours?), MERGE (theirs?).
>
> I can understand that, so I wonder if we can phrase it to mention (base,
> HEAD, MERGE) just once.

Perhaps:

  If 'before' is renamed to 'after' then src_entry will contain
  the versions of 'before' from the merge_base, HEAD, and MERGE in
  stages 1, 2, and 3; and dst_entry will contain the respective versions of
  'after' in corresponding locations.  Thus, we have a total of six modes
  and oids, though some will be null.  (Stage 0 is ignored; we're interested
  in handling conflicts.)

?


* Re: [PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry
  2017-11-13 22:57     ` Elijah Newren
@ 2017-11-13 23:11       ` Stefan Beller
  0 siblings, 0 replies; 81+ messages in thread
From: Stefan Beller @ 2017-11-13 23:11 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Mon, Nov 13, 2017 at 2:57 PM, Elijah Newren <newren@gmail.com> wrote:
>
> Perhaps:
>
>   If 'before' is renamed to 'after' then src_entry will contain
>   the versions of 'before' from the merge_base, HEAD, and MERGE in
>   stages 1, 2, and 3; and dst_entry will contain the respective versions of
>   'after' in corresponding locations.  Thus, we have a total of six modes
>   and oids, though some will be null.  (Stage 0 is ignored; we're interested
>   in handling conflicts.)

I find that much easier to read, though I am biased with prior knowledge now. ;)


* Re: [PATCH 05/30] directory rename detection: directory splitting testcases
  2017-11-10 19:05 ` [PATCH 05/30] directory rename detection: directory splitting testcases Elijah Newren
@ 2017-11-13 23:20   ` Stefan Beller
  0 siblings, 0 replies; 81+ messages in thread
From: Stefan Beller @ 2017-11-13 23:20 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  t/t6043-merge-rename-directories.sh | 125 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 125 insertions(+)
>
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> index b737b0a105..00811f512a 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -388,4 +388,129 @@ test_expect_failure '1f-check: Split a directory into two other directories' '
>  #   in section 2, plus testcases 3a and 4a.
>  ###########################################################################
>
> +
> +###########################################################################
> +# SECTION 2: Split into multiple directories, with equal number of paths
> +#
> +# Explore the splitting-a-directory rules a bit; what happens in the
> +# edge cases?
> +#
> +# Note that there is a closely related case of a directory not being
> +# split on either side of history, but being renamed differently on
> +# each side.  See testcase 8e for that.
> +###########################################################################
> +
> +# Testcase 2a, Directory split into two on one side, with equal numbers of paths
> +#   Commit A: z/{b,c}
> +#   Commit B: y/b, w/c
> +#   Commit C: z/{b,c,d}
> +#   Expected: y/b, w/c, z/d, with warning about z/ -> (y/ vs. w/) conflict

> +       test_i18ngrep "CONFLICT.*directory rename split" out



> +# Testcase 2b, Directory split into two on one side, with equal numbers of paths
> +#   Commit A: z/{b,c}
> +#   Commit B: y/b, w/c
> +#   Commit C: z/{b,c}, x/d
> +#   Expected: y/b, w/c, x/d; No warning about z/ -> (y/ vs. w/) conflict

This makes sense.

> +
> +###########################################################################
> +# Rules suggested by section 2:
> +#
> +#   None; the rule was already covered in section 1.  These testcases are
> +#   here just to make sure the conflict resolution and necessary warning
> +#   messages are handled correctly.
> +###########################################################################

okay, then I'll go back to 1. and discuss "the number of files as a
hint where to rename it to" there


* Re: [PATCH 06/30] directory rename detection: testcases to avoid taking detection too far
  2017-11-10 19:05 ` [PATCH 06/30] directory rename detection: testcases to avoid taking detection too far Elijah Newren
@ 2017-11-13 23:25   ` Stefan Beller
  2017-11-14  1:02     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-13 23:25 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  t/t6043-merge-rename-directories.sh | 137 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 137 insertions(+)
>
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> index 00811f512a..021513ec00 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -513,4 +513,141 @@ test_expect_success '2b-check: Directory split into two on one side, with equal
>  #   messages are handled correctly.
>  ###########################################################################
>
> +
> +###########################################################################
> +# SECTION 3: Path in question is the source path for some rename already
> +#
> +# Combining cases from Section 1 and trying to handle them could lead to
> +# directory renaming detection being over-applied.  So, this section
> +# provides some good testcases to check that the implementation doesn't go
> +# too far.
> +###########################################################################
> +
> +# Testcase 3a, Avoid implicit rename if involved as source on other side
> +#   (Related to testcases 1c and 1f)
> +#   Commit A: z/{b,c,d}
> +#   Commit B: z/{b,c,d} (no change)

This could also be an unrelated change such as adding w/e?

> +#   Commit C: y/{b,c}, x/d
> +#   Expected: y/{b,c}, x/d


> +
> +# Testcase 3b, Avoid implicit rename if involved as source on other side
> +#   (Related to testcases 5c and 7c, also kind of 1e and 1f)
> +#   Commit A: z/{b,c,d}
> +#   Commit B: y/{b,c}, x/d
> +#   Commit C: z/{b,c}, w/d
> +#   Expected: y/{b,c}, CONFLICT:(z/d -> x/d vs. w/d)
> +#   NOTE: We're particularly checking that since z/d is already involved as
> +#         a source in a file rename on the same side of history, that we don't
> +#         get it involved in directory rename detection.  If it were, we might
> +#         end up with CONFLICT:(z/d -> y/d vs. x/d vs. w/d), i.e. a
> +#         rename/rename/rename(1to3) conflict, which is just weird.

Makes sense.

>


* Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue
  2017-11-13 22:12       ` Stefan Beller
@ 2017-11-13 23:39         ` Elijah Newren
  2017-11-13 23:46           ` Stefan Beller
  0 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-13 23:39 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 13, 2017 at 2:12 PM, Stefan Beller <sbeller@google.com> wrote:
> I wanted to debug a very similar issue today just after reviewing this
> series, see
> https://public-inbox.org/git/743acc29-85bb-3773-b6a0-68d4a0b8fd63@ispras.ru/

Oh, bleh.  That's not a D/F conflict at all, it's the code assuming
there's a D/F conflict because the entry it is processing ("sub") is a
submodule rather than a file, and it panics when it sees "a directory
in the way" -- a directory that just so happens to be named "sub" and
which is in fact the desired submodule, meaning that the working
directory is already good and needs no changes.

In this case, the relevant code from merge-recursive.c is the following:

        /* Case B: Added in one. */
        /* [nothing|directory] -> ([nothing|directory], file) */
<snip>
        if (dir_in_way(path, !o->call_depth,
                   S_ISGITLINK(a_mode))) {
            char *new_path = unique_path(o, path, add_branch);
            clean_merge = 0;
            output(o, 1, _("CONFLICT (%s): There is a directory with name %s in %s. "
                   "Adding %s as %s"),
                   conf, path, other_branch, path, new_path);

Note that the comment even explicitly assumes the newly added entry is
a file.  We should expect there to be a directory present (the
submodule being added), but the code doesn't have a check for that.
The S_ISGITLINK(a_mode) makes you think it has special handling for
the submodule case, but that's for the reverse situation (the
submodule isn't yet present in the working copy, it came from the
other side of history, but there is an empty directory present).


* Re: [PATCH 02/30] merge-recursive: Fix logic ordering issue
  2017-11-13 23:39         ` Elijah Newren
@ 2017-11-13 23:46           ` Stefan Beller
  0 siblings, 0 replies; 81+ messages in thread
From: Stefan Beller @ 2017-11-13 23:46 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Mon, Nov 13, 2017 at 3:39 PM, Elijah Newren <newren@gmail.com> wrote:
> On Mon, Nov 13, 2017 at 2:12 PM, Stefan Beller <sbeller@google.com> wrote:
>> I wanted to debug a very similar issue today just after reviewing this
>> series, see
>> https://public-inbox.org/git/743acc29-85bb-3773-b6a0-68d4a0b8fd63@ispras.ru/
>
> Oh, bleh.  That's not a D/F conflict at all, it's the code assuming
> there's a D/F conflict because the entry it is processing ("sub") is a
> submodule rather than a file, and it panics when it sees "a directory
> in the way" -- a directory that just so happens to be named "sub" and
> which is in fact the desired submodule, meaning that the working
> directory is already good and needs no changes.

yup, I came to find the same snippet of code to be the offender,
I just haven't figured out how to fix this bug.

Thanks for taking a look!

As you have a lot of fresh knowledge in the merge-recursive case
currently, how would we approach the fix here?

(there is a test available at
remotes/origin/sb/test-cherry-pick-submodule-getting-in-a-way)

> In this case, the relevant code from merge-recursive.c is the following:
>
>         /* Case B: Added in one. */
>         /* [nothing|directory] -> ([nothing|directory], file) */
> <snip>
>         if (dir_in_way(path, !o->call_depth,
>                    S_ISGITLINK(a_mode))) {
>             char *new_path = unique_path(o, path, add_branch);
>             clean_merge = 0;
>             output(o, 1, _("CONFLICT (%s): There is a directory with name %s in %s. "
>                    "Adding %s as %s"),
>                    conf, path, other_branch, path, new_path);
>
> Note that the comment even explicitly assumes the newly added entry is
> a file.  We should expect there to be a directory present (the
> submodule being added), but the code doesn't have a check for that.
> The S_ISGITLINK(a_mode) makes you think it has special handling for
> the submodule case, but that's for the reverse situation (the
> submodule isn't yet present in the working copy, it came from the
> other side of history, but there is an empty directory present).

oh :/


* Re: [PATCH 07/30] directory rename detection: partially renamed directory testcase/discussion
  2017-11-10 19:05 ` [PATCH 07/30] directory rename detection: partially renamed directory testcase/discussion Elijah Newren
@ 2017-11-14  0:07   ` Stefan Beller
  0 siblings, 0 replies; 81+ messages in thread
From: Stefan Beller @ 2017-11-14  0:07 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  t/t6043-merge-rename-directories.sh | 100 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 100 insertions(+)
>
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> index 021513ec00..ec054b210a 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -650,4 +650,104 @@ test_expect_success '3b-check: Avoid implicit rename if involved as source on cu
>  #   of a rename on either side of a merge.
>  ###########################################################################
>
> +
> +###########################################################################
> +# SECTION 4: Partially renamed directory; still exists on both sides of merge
> +#
> +# What if we were to attempt to do directory rename detection when someone
> +# "mostly" moved a directory but still left some files around, or,
> +# equivalently, fully renamed a directory in one commit and then recreated
> +# that directory in a later commit adding some new files and then tried to
> +# merge?
> +#
> +# It's hard to divine user intent in these cases, because you can make an
> +# argument that, depending on the intermediate history of the side being
> +# merged, that some users will want files in that directory to
> +# automatically be detected and renamed, while users with a different
> +# intermediate history wouldn't want that rename to happen.
> +#
> +# I think that it is best to simply not have directory rename detection
> +# apply to such cases.  My reasoning for this is four-fold: (1) it's
> +# easiest for users in general to figure out what happened if we don't
> +# apply directory rename detection in any such case, (2) it's an easy rule
> +# to explain ["We don't do directory rename detection if the directory
> +# still exists on both sides of the merge"], (3) we can get some hairy
> +# edge/corner cases that would be really confusing and possibly not even
> +# representable in the index if we were to even try, and [related to 3] (4)
> +# attempting to resolve this issue of divining user intent by examining
> +# intermediate history goes against the spirit of three-way merges and is a
> +# path towards crazy corner cases that are far more complex than what we're
> +# already dealing with.

This last paragraph ("I think") sounds like part of a commit message
rather than a note inside a testing script. Not sure if I recommend
moving this text into the commit message.

> +# This section contains a test for this partially-renamed-directory case.
> +###########################################################################
> +
> +# Testcase 4a, Directory split, with original directory still present
> +#   (Related to testcase 1f)
> +#   Commit A: z/{b,c,d,e}
> +#   Commit B: y/{b,c,d}, z/e
> +#   Commit C: z/{b,c,d,e,f}
> +#   Expected: y/{b,c,d}, z/{e,f}
> +#   NOTE: Even though most files from z moved to y, we don't want f to follow.

Makes sense.

> +###########################################################################
> +# Rules suggested by section 4:
> +#
> +#   Directory-rename-detection should be turned off for any directories (as
> +#   a source for renames) that exist on both sides of the merge.  (The "as
> +#   a source for renames" clarification is due to cases like 1c where
> +#   the target directory exists on both sides and we do want the rename
> +#   detection.)  But, sadly, see testcase 8b.
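
The section 4 rule can be sketched as a simple check in shell.  This is an
illustration only: dir_has_paths and may_detect_dir_rename are hypothetical
helpers, and each side of the merge is modelled as a space-separated list
of paths, which is an assumption made for the example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any path in the space-separated list $2
# lives under directory $1.
dir_has_paths () {
    for p in $2
    do
        case "$p" in
        "$1"/*) return 0 ;;
        esac
    done
    return 1
}

# Hypothetical helper: succeed only if directory $1 may be used as the
# source of a directory rename, i.e. it does not still exist on both
# sides of the merge ($2 and $3 are the path lists of the two sides).
may_detect_dir_rename () {
    if dir_has_paths "$1" "$2" && dir_has_paths "$1" "$3"
    then
        return 1
    fi
    return 0
}

# Testcase 4a: z/ still exists on both sides.
side_b='y/b y/c y/d z/e'
side_c='z/b z/c z/d z/e z/f'
if may_detect_dir_rename z "$side_b" "$side_c"
then
    echo "z/: candidate for directory rename detection"
else
    echo "z/: still on both sides, leave new files in place"
fi
```

With the paths of testcase 4a this takes the second branch, matching the
expectation that z/f stays where it is.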

Looking forward to that test case.


* Re: [PATCH 08/30] directory rename detection: files/directories in the way of some renames
  2017-11-10 19:05 ` [PATCH 08/30] directory rename detection: files/directories in the way of some renames Elijah Newren
@ 2017-11-14  0:15   ` Stefan Beller
  2017-11-14  1:19     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-14  0:15 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  t/t6043-merge-rename-directories.sh | 303 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 303 insertions(+)
>
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> index ec054b210a..d15153c652 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -750,4 +750,307 @@ test_expect_success '4a-check: Directory split, with original directory still pr
>  #   detection.)  But, sadly, see testcase 8b.
>  ###########################################################################
>
> +
> +###########################################################################
> +# SECTION 5: Files/directories in the way of subset of to-be-renamed paths
> +#
> +# Implicitly renaming files due to a detected directory rename could run
> +# into problems if there are files or directories in the way of the paths
> +# we want to rename.  Explore such cases in this section.
> +###########################################################################
> +
> +# Testcase 5a, Merge directories, other side adds files to original and target
> +#   Commit A: z/{b,c},       y/d
> +#   Commit B: z/{b,c,e_1,f}, y/{d,e_2}
> +#   Commit C: y/{b,c,d}
> +#   Expected: z/e_1, y/{b,c,d,e_2,f} + CONFLICT warning
> +#   NOTE: While directory rename detection is active here causing z/f to
> +#         become y/f, we did not apply this for z/e_1 because that would
> +#         give us an add/add conflict for y/e_1 vs y/e_2.  The problem with
> +#         this add/add is that both versions of y/e are from the same side
> +#         of history, giving us no way to represent this conflict in the
> +#         index.

Makes sense.
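
The per-path fallback in testcase 5a can be sketched in shell as follows.
This is a rough illustration under stated assumptions, not the actual
implementation: apply_dir_rename is a hypothetical helper, paths are given
as space-separated lists, and the _1/_2 content markers from the testcase
are dropped so that e_1 and e_2 are both just "e".

```shell
#!/bin/sh
# Hypothetical helper: for each path in list $3 that lives under $1/,
# print its new location under $2/, unless list $4 already has something
# at that location (the same file name, or a directory of that name).
apply_dir_rename () {
    for p in $3
    do
        case "$p" in
        "$1"/*) rest=${p#"$1"/} ;;
        *) continue ;;
        esac
        target="$2/$rest"
        in_way=
        for q in $4
        do
            case "$q" in
            "$target"|"$target"/*) in_way=yes ;;
            esac
        done
        if test -n "$in_way"
        then
            echo "$p stays (something in the way at $target)"
        else
            echo "$p -> $target"
        fi
    done
}

# Testcase 5a: the other side renamed z/ -> y/; this side added z/e and
# z/f and also has y/d and y/e.
apply_dir_rename z y 'z/e z/f' 'y/d y/e'
# prints:
#   z/e stays (something in the way at y/e)
#   z/f -> y/f
```

The same check covers a directory in the way: with y/d/e present, as in
testcase 5d, a path z/d would also be left alone.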

> +# Testcase 5b, Rename/delete in order to get add/add/add conflict
> +#   (Related to testcase 8d; these may appear slightly inconsistent to users;
> +#    Also related to testcases 7d and 7e)
> +#   Commit A: z/{b,c,d_1}
> +#   Commit B: y/{b,c,d_2}
> +#   Commit C: z/{b,c,d_1,e}, y/d_3
> +#   Expected: y/{b,c,e}, CONFLICT(add/add: y/d_2 vs. y/d_3)
> +#   NOTE: If z/d_1 in commit C were to be involved in dir rename detection, as
> +#         we normally would since z/ is being renamed to y/, then this would be
> +#         a rename/delete (z/d_1 -> y/d_1 vs. deleted) AND an add/add/add
> +#         conflict of y/d_1 vs. y/d_2 vs. y/d_3.  Add/add/add is not
> +#         representable in the index, so the existence of y/d_3 needs to
> +#         cause us to bail on directory rename detection for that path, falling
> +#         back to git behavior without the directory rename detection.


> +
> +# Testcase 5c, Transitive rename would cause rename/rename/rename/add/add/add
> +#   (Directory rename detection would result in transitive rename vs.
> +#    rename/rename(1to2) and turn it into a rename/rename(1to3).  Further,
> +#    rename paths conflict with separate adds on the other side)
> +#   (Related to testcases 3b and 7c)
> +#   Commit A: z/{b,c}, x/d_1
> +#   Commit B: y/{b,c,d_2}, w/d_1
> +#   Commit C: z/{b,c,d_1,e}, w/d_3, y/d_4
> +#   Expected: A mess, but only a rename/rename(1to2)/add/add mess.  Use the
> +#             presence of y/d_4 in C to avoid doing transitive rename of
> +#             x/d_1 -> z/d_1 -> y/d_1, so that the only paths we have at
> +#             y/d are y/d_2 and y/d_4.  We still do the move from z/e to y/e,
> +#             though, because it doesn't have anything in the way.

Missing the expected state, only the explanation is given.


> +# Testcase 5d, Directory/file/file conflict due to directory rename
> +#   Commit A: z/{b,c}
> +#   Commit B: y/{b,c,d_1}
> +#   Commit C: z/{b,c,d_2,f}, y/d/e
> +#   Expected: y/{b,c,d/e,f}, z/d_2, CONFLICT(file/directory), y/d_1~HEAD
> +#   Note: The fact that y/d/ exists in C makes us bail on directory rename
> +#         detection for z/d_2, but that doesn't prevent us from applying the
> +#         directory rename detection for z/f -> y/f.

Makes sense.

> +
> +###########################################################################
> +# Rules suggested by section 5:
> +#
> +#   If a subset of to-be-renamed files have a file or directory in the way,
> +#   "turn off" the directory rename for those specific sub-paths,

Makes sense.

>  falling
> +#   back to old handling.  But, sadly, see testcases 8a and 8b.

You seem to be hinting at these all the time.


* Re: [PATCH 09/30] directory rename detection: testcases checking which side did the rename
  2017-11-10 19:05 ` [PATCH 09/30] directory rename detection: testcases checking which side did the rename Elijah Newren
@ 2017-11-14  0:25   ` Stefan Beller
  2017-11-14  1:30     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-14  0:25 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  t/t6043-merge-rename-directories.sh | 283 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 283 insertions(+)
>
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> index d15153c652..157299105f 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -1053,4 +1053,287 @@ test_expect_failure '5d-check: Directory/file/file conflict due to directory ren
>  #   back to old handling.  But, sadly, see testcases 8a and 8b.
>  ###########################################################################
>
> +
> +###########################################################################
> +# SECTION 6: Same side of the merge was the one that did the rename
> +#
> +# It may sound obvious that you only want to apply implicit directory
> +# renames to directories if the _other_ side of history did the renaming.
> +# If you did make an implementation that didn't explicitly enforce this
> +# rule, the majority of cases that would fall under this section would
> +# also be solved by following the rules from the above sections.  But
> +# there are still a few that stick out, so this section covers them just
> +# to make sure we also get them right.
> +###########################################################################
> +
> +# Testcase 6a, Tricky rename/delete
> +#   Commit A: z/{b,c,d}
> +#   Commit B: z/b
> +#   Commit C: y/{b,c}, z/d
> +#   Expected: y/b, CONFLICT(rename/delete, z/c -> y/c vs. NULL)
> +#   Note: We're just checking here that the rename of z/b and z/c to put
> +#         them under y/ doesn't accidentally catch z/d and make it look like
> +#         it is also involved in a rename/delete conflict.
> +

> +
> +# Testcase 6b, Same rename done on both sides
> +#   (Related to testcases 6c and 8e)
> +#   Commit A: z/{b,c}
> +#   Commit B: y/{b,c}
> +#   Commit C: y/{b,c}, z/d

Missing expected state

> +#   Note: If we did directory rename detection here, we'd move z/d into y/,
> +#         but C did that rename and still decided to put the file into z/,
> +#         so we probably shouldn't apply directory rename detection for it.

correct. Also we don't want to see a rename/rename conflict (obviously).

If we have

    Commit A: z/{b_1,c}
    Commit B: y/{b_2,c}
    Commit C: y/{b_3,c}, z/d

then we'd produce a standard file merge (which may or may not result
in conflict, depending on touched lines) for y/b_{try-resolve}

> +
> +# Testcase 6c, Rename only done on same side
> +#   (Related to testcases 6b and 8e)
> +#   Commit A: z/{b,c}
> +#   Commit B: z/{b,c} (no change)
> +#   Commit C: y/{b,c}, z/d
> +#   Expected: y/{b,c}, z/d
> +#   NOTE: Seems obvious, but just checking that the implementation doesn't
> +#         "accidentally detect a rename" and give us y/{b,c,d}.

makes sense.

> +
> +# Testcase 6d, We don't always want transitive renaming
> +#   (Related to testcase 1c)
> +#   Commit A: z/{b,c}, x/d
> +#   Commit B: z/{b,c}, x/d (no change)
> +#   Commit C: y/{b,c}, z/d
> +#   Expected: y/{b,c}, z/d
> +#   NOTE: Again, this seems obvious but just checking that the implementation
> +#         doesn't "accidentally detect a rename" and give us y/{b,c,d}.

makes sense, too.

> +# Testcase 6e, Add/add from one-side
> +#   Commit A: z/{b,c}
> +#   Commit B: z/{b,c} (no change)
> +#   Commit C: y/{b,c,d_1}, z/d_2
> +#   Expected: y/{b,c,d_1}, z/d_2
> +#   NOTE: Again, this seems obvious but just checking that the implementation
> +#         doesn't "accidentally detect a rename" and give us y/{b,c} +
> +#         add/add conflict on y/d_1 vs y/d_2.

What is less obvious in all these cases is the "(no change)" part to me.
I would think that at least *something* changes in B in all cases above, maybe
add file u/r (un-related) to have the tree ids changed?
("Less obvious" as in: we don't rely on the "no changes" part to make
the decision, which sounds tempting so far)

>  test_done

No conclusion box here, so my (misguided) suggestion:

  If "No change" occurs, just take the other side. ;)


* Re: [PATCH 10/30] directory rename detection: more involved edge/corner testcases
  2017-11-10 19:05 ` [PATCH 10/30] directory rename detection: more involved edge/corner testcases Elijah Newren
@ 2017-11-14  0:42   ` Stefan Beller
  2017-11-14 21:11     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-14  0:42 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:

> +# In my opinion, testcases that are difficult to understand from this
> +# section is due to difficulty in the testcase rather than the directory
> +# renaming (similar to how t6042 and t6036 have difficult resolutions due
> +# to the problem setup itself being complex).  And I don't think the
> +# error messages are a problem.

"In my opinion" ... sounds like commit message?

> +# On the other hand, the testcases in section 8 worry me slightly more...

The aforementioned section 8... :)

> +# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
> +#   Commit A: z/{b,c}
> +#   Commit B: y/{b,c}
> +#   Commit C: w/b, x/c, z/d
> +#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
> +#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.

But the creator of C intended to have z/d, not {w,x}/d, and as {w,x} == y,
I am not sure I like this result. (I have no concrete counter example, just
messy logic)


> +# Testcase 7b, rename/rename(2to1), but only due to transitive rename
> +#   (Related to testcase 1d)
> +#   Commit A: z/{b,c},     x/d_1, w/d_2
> +#   Commit B: y/{b,c,d_2}, x/d_1
> +#   Commit C: z/{b,c,d_1},        w/d_2
> +#   Expected: y/{b,c}, CONFLICT(rename/rename(2to1): x/d_1, w/d_2 -> y_d)

Makes sense.

> +# Testcase 7c, rename/rename(1to...2or3); transitive rename may add complexity
> +#   (Related to testcases 3b and 5c)
> +#   Commit A: z/{b,c}, x/d
> +#   Commit B: y/{b,c}, w/d
> +#   Commit C: z/{b,c,d}
> +#   Expected: y/{b,c}, CONFLICT(x/d -> w/d vs. y/d)

CONFLICT(x/d -> y/d vs w/d) ?

> +#   NOTE: z/ was renamed to y/ so we do not want to report
> +#         either CONFLICT(x/d -> w/d vs. z/d)
> +#         or CONFLiCT x/d -> w/d vs. y/d vs. z/d)

"neither ... nor" instead of "not either or"?

> +# Testcase 7d, transitive rename involved in rename/delete; how is it reported?
> +#   (Related somewhat to testcases 5b and 8d)
> +#   Commit A: z/{b,c}, x/d
> +#   Commit B: y/{b,c}
> +#   Commit C: z/{b,c,d}
> +#   Expected: y/{b,c}, CONFLICT(delete x/d vs rename to y/d)
> +#   NOTE: z->y so NOT CONFLICT(delete x/d vs rename to z/d)


> +# Testcase 7e, transitive rename in rename/delete AND dirs in the way
> +#   (Very similar to 'both rename source and destination involved in D/F conflict' from t6022-merge-rename.sh)
> +#   (Also related to testcases 9c and 9d)
> +#   Commit A: z/{b,c},     x/d_1
> +#   Commit B: y/{b,c,d/g}, x/d/f
> +#   Commit C: z/{b,c,d_1}
> +#   Expected: rename/delete(x/d_1->y/d_1 vs. None) + D/F conflict on y/d
> +#             y/{b,c,d/g}, y/d_1~C^0, x/d/f
> +#   NOTE: x/d/f may be slightly confusing here.  x/d_1 -> z/d_1 implies
> +#         there is a directory rename from x/ -> z/, performed by commit C.
> +#         However, on the side of commit B, it renamed z/ -> y/, thus
> +#         making a rename from x/ -> z/ when it was getting rid of z/ seems
> +#         non-sensical.  Further, putting x/d/f into y/d/f also doesn't
> +#         make a lot of sense because commit B did the renaming of z to y
> +#         and it created x/d/f, and it clearly made these things separate,
> +#         so it doesn't make much sense to push these together.

This is confusing.


* Re: [PATCH 04/30] directory rename detection: basic testcases
  2017-11-13 22:04   ` Stefan Beller
@ 2017-11-14  0:57     ` Elijah Newren
  2017-11-14  1:21       ` Stefan Beller
  2017-11-14  2:03     ` Junio C Hamano
  1 sibling, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-14  0:57 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 13, 2017 at 2:04 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> (minor thought:)
> After rereading the docs above this is clear; I wonder if instead of A, B, C
> a notation of Base, ours, theirs would be easier to understand?

Sure, that'd be an easy change.

>> +test_expect_success '1a-setup: Simple directory rename detection' '
>> +test_expect_failure '1a-check: Simple directory rename detection' '
>
> Thanks for splitting the setup and the check into two different test cases!
>
>
>> +       git checkout B^0 &&
>
> Any reason for ^0 ? (to make clear it is a branch?)

Partially force-of-habit (did the same with t6036 and t6042), but it
seemed to have the nice feature of making debugging easier through
improved reproducibility.  In particular, if I had checked out B
rather than B^0 in the testcase and a merge succeeded when I didn't
expect it, then attempting to run the same commands gives me a
different starting point for the merge.  By instead explicitly
checking out B^0, then even if the merge succeeds, someone who again
runs checkout B^0 will have the same starting point.
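
To make the B^0 point concrete, here is a minimal sketch in a throwaway
repository.  The branch names, file name and identity settings are made
up for illustration and are not taken from t6043; it only shows that a
merge on a detached HEAD leaves branch B where it was, so a rerun starts
from the same commit.

```shell
# Throwaway repository; all names here are illustrative assumptions.
repo=$(mktemp -d) &&
cd "$repo" &&
git init -q . &&
git config user.name "A U Thor" &&
git config user.email author@example.com &&

# Commit A, with branch B pointing at it; then commit C on top.
echo base >file && git add file && git commit -q -m A &&
git branch B &&
echo more >>file && git commit -q -a -m C &&
git branch C &&

before=$(git rev-parse B) &&
git checkout -q B^0 &&    # detached HEAD at the commit B points to
git merge -q C &&         # succeeds, but moves only the detached HEAD
after=$(git rev-parse B)  # B itself has not moved
```

Had the checkout been of B rather than B^0, the successful merge would
have advanced B, and repeating the commands would start from a
different commit.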

>> +test_expect_success '1b-setup: Merge a directory with another' '
>> +       git rm -rf . &&
>> +       git clean -fdqx &&
>> +       rm -rf .git &&
>> +       git init &&
>
> This is quite a strong statement to start a test with.

It was actually copy-paste from t6036, and achieved two purposes:
  1) Even if one test fails, subsequent ones continue running.  (Had
lots of problems with this in t6036 years ago and just ended up with
those four steps)
  2) Someone who runs into a failing testcase has a _much_ easier time
figuring out what is going on with the testcase.  I find it takes a
fair amount of time to figure out what's going on with other tests in
git's testsuite because of the presence of so many files completely
unrelated to the given test, which have simply accumulated from
previous tests.  For many tests, that may be fine, but in particular
for t6036, t6042, and now t6043, since these are mostly about corner
cases that are hard enough to reason about, I didn't want the extra
distractions.

but...

> Nowadays we rather do
>
>     test_when_finished "some command" &&
>
> than polluting the follow up tests. But as you split up the previous test
> into 2 tests, it is not clear if this would bring any good.

test_when_finished looks pretty cool; I didn't know about it.  Thanks
for the pointer.  Not sure if we want to use it here (if we do, we'd
only do so in the check tests rather than the setup ones).

> Also these are four cleanup commands, I'd have expected fewer.
> Maybe just "rm -rf ." ? Or as we make a new repo for each test case,
>
>     test_create_repo 1a &&
>     cd 1a
>
> in the first setup, and here we do
>     test_create_repo 1b &&
>     cd 1b
>
> relying on test_done to cleanup everything afterwards?

rm aborts for me with 'rm -rf .', but I could possibly make it 'rm -rf
* .* && git init .'

The test_create_repo might not be so bad as long as every test used it
and put all files under its own little repo.  It does create some
clutter, but it's at least somewhat managed.  I'm still a bit partial
to just totally cleaning it out, but if others would prefer, I can
switch.

>> +       test 3 -eq $(git ls-files -s | wc -l) &&
>
>     git ls-files >out &&
>     test_line_count = 3 out &&
>
> maybe? Piping output of git commands somewhere is an
> anti-pattern as we cannot examine the return code of ls-files in this case.

Um...I guess that makes sense, if you assumed that I cared about the
return code of ls-files.  But it seems to make the tests with multiple
calls to ls-files in a row (of which there are many) considerably
uglier, so I'd personally prefer to avoid that.  Also, why would I
care about the return code of ls-files?

Your suggestion made me curious, so I went looking.  As far as I can
tell, the return code of ls-files is always 0 unless you specify
both --error-unmatch and one or more paths, neither of which I did.
Am I missing something?

If you feel the return code of ls-files is important, perhaps I could
just have a separate
   git ls-files -s >/dev/null &&
call before the others?

>> +       test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
>> +       test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
>> +       test $(git rev-parse HEAD:y/d) = $(git rev-parse A:x/d) &&
>
> Speaking of these, there are quite a lot of invocations of rev-parse,
> though it looks clean; recently Junio had some good ideas regarding a
> similar test[1].
> So maybe
>
>   git rev-parse >actual \
>     HEAD:y/b  HEAD:y/c HEAD:y/d &&
>   git rev-parse >expect \
>     A:z/b    A:z/c    A:x/d &&
>   test_cmp expect actual
>
> Not sure if that is any nicer, but has fewer calls.

Sure, I can switch it over.

>> +       test_must_fail git rev-parse HEAD:x/d &&
>> +       test_must_fail git rev-parse HEAD:z/d &&
>> +       test ! -f z/d
>> +'
>> +
>> +# Testcase 1d, Directory renames (merging two directories into one new one)
>> +#              cause a rename/rename(2to1) conflict
>> +#   (Related to testcases 1c and 7b)
>> +#   Commit A. z/{b,c},        y/{d,e}
>> +#   Commit B. x/{b,c},        y/{d,e,m,wham}
>> +#   Commit C. z/{b,c,n,wham}, x/{d,e}
>> +#   Expected: x/{b,c,d,e,m,n}, CONFLICT:(y/wham & z/wham -> x/wham)
>> +#   Note: y/m & z/n should definitely move into x.  By the same token, both
>> +#         y/wham & z/wham should too...giving us a conflict.
>
> If wham are equal (in oid), shouldn't this not conflict and only conflict
> when z/wham and x/wham differ in oid, but have the same sub-path?

Good eyes, I should have labelled these as wham_1 and wham_2, since
the testcase did explicitly make them different (having contents of
"wham1\n" and "wham2\n"), but it could make sense to add a test to the
testsuite where two such colliding files are identical.

I thought about that, figured it wasn't worth it, and didn't add the
code to handle it.  I could add a testcase for it, though I'd be very
tempted to just leave it as test_expect_failure and let someone else
come along and implement it if someone feels so inclined.  Note,
though, that the situation is made slightly more complex due to the
fact that it might be N colliding files rather than just 2 (due to the
possibility of multiple directories being merged into one on either
side of history).

>> +
>> +# Testcase 1e, Renamed directory, with all filenames being renamed too
>> +#   Commit A: z/{oldb,oldc}
>> +#   Commit B: y/{newb,newc}
>> +#   Commit C: z/{oldb,oldc,d}
>
> What about oldd ?
> (expecting a pattern across many files of s/old/new/ isn't unreasonable,
> but maybe too much for now?)
> By having no "old" prefix there is only one thing to do, which is y/d

I'm not following.  The "old" and "new" in the filenames were just
there so a human reading the testcase could easily tell which
filenames were related and involved in renames.  There is no rename
associated with d, so why would I have called it "old" or "new"?

>> +# Testcase 1f, Split a directory into two other directories
>> +#   (Related to testcases 3a, all of section 2, and all of section 4)
>> +#   Commit A: z/{b,c,d,e,f}
>> +#   Commit B: z/{b,c,d,e,f,g}
>> +#   Commit C: y/{b,c}, x/{d,e,f}
>> +#   Expected: y/{b,c}, x/{d,e,f,g}
>
> Why not y/g ? Because there are more files in x?

Yes.

> I can come up with a reasonable counter example:

Oh sure, of course.  You should also be able to come up with
counter-examples to the "correctness" of git's content merging of
files when it fails to report any conflicts (e.g. one side added
another call to a function, the other side changed the number of
parameters the function took).  In short, this is just a case of where
we need to come up with simple predictable rules since we can't read
minds.  Here's the situation we're faced with, and my line of thinking
in coming up with this simple, predictable, but definitely coarse
rule:

  * Users sometimes rename directories on one side of history and add
files to the original directory on the other side.
  * We would like to "detect" the directory rename, and move the new
files into the correct directory.
  * We don't really have any hints for detecting directory renames
other than by comparing content, i.e. basing it on the file rename
detection we already have.
  * There is a possibility that not all files in a certain directory
went to the same location.  It's possible that a few may have gone
elsewhere.
  * Only in weird corner cases would I expect renamed-file location to
be split nearly 50-50 (as in this contrived testcase) -- although for
such cases, as you point out, there is a much higher chance of us
getting the merge "wrong".  Thus, our rule should be simple so people
can understand and expect what we did and have an easy time fixing it.

So, what to do?  There are a few options:
  1) Don't do directory rename detection at all.  Just declare it to
be an anti-feature.
  2) Try to guess why the user moved different files to different
directories and try to mimic that somehow.
  3) Only allow directory rename detection to happen when ALL renamed
files in the directory went to the same directory.
  4) Use a simple predictable rule like majority wins.

I think 2 is insanity.  Choices 1 and 3 don't have much appeal to me;
I'm strongly against them.  I'm unaware of any remaining choices other
than 4, but 4 seems pretty reasonable to me.  It won't always be
right, but neither is simple content merging.
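
The "majority wins" rule (option 4) can be sketched as a small shell
helper.  This is only an illustration, not the code in
merge-recursive.c: majority_dir is a hypothetical name, the input
format (one "old-path new-path" pair per line, no spaces in paths) is
invented for the example, and treating a tie as "no directory rename"
is an assumption that the discussion above does not settle.

```shell
# Hypothetical helper: read "old-path new-path" rename pairs on stdin and
# print the directory that received the most files renamed out of $1.
# Prints nothing if no file left $1 or if the top count is tied (assumed).
majority_dir () {
	awk -v src="$1" '
		{
			old = $1; new = $2
			sub("/[^/]*$", "", old)   # directory part of the old path
			sub("/[^/]*$", "", new)   # directory part of the new path
			if (old == src && new != src)
				count[new]++
		}
		END {
			best = ""; max = 0; tie = 0
			for (d in count) {
				if (count[d] > max) {
					best = d; max = count[d]; tie = 0
				} else if (count[d] == max) {
					tie = 1
				}
			}
			if (max > 0 && !tie)
				print best
		}'
}

# Renames made by commit C in testcase 1f: two files to y/, three to x/.
target=$(printf '%s\n' 'z/b y/b' 'z/c y/c' 'z/d x/d' 'z/e x/e' 'z/f x/f' |
	majority_dir z)
```

With those five pairs, target ends up as x, which is why the z/g added
on the other side is expected to land at x/g.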


* Re: [PATCH 06/30] directory rename detection: testcases to avoid taking detection too far
  2017-11-13 23:25   ` Stefan Beller
@ 2017-11-14  1:02     ` Elijah Newren
  0 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-14  1:02 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 13, 2017 at 3:25 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:

>> +# Testcase 3a, Avoid implicit rename if involved as source on other side
>> +#   (Related to testcases 1c and 1f)
>> +#   Commit A: z/{b,c,d}
>> +#   Commit B: z/{b,c,d} (no change)
>
> This could also be an unrelated change such as adding w/e?

Yes, precisely.  Whenever I have a "no change" commit, the intent is
that there may be unrelated changes involved, they've just been
excluded from the testcase in order to make it minimal.

>> +#   Commit C: y/{b,c}, x/d
>> +#   Expected: y/{b,c}, x/d


* Re: [PATCH 08/30] directory rename detection: files/directories in the way of some renames
  2017-11-14  0:15   ` Stefan Beller
@ 2017-11-14  1:19     ` Elijah Newren
  0 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-14  1:19 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 13, 2017 at 4:15 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
>> +# Testcase 5c, Transitive rename would cause rename/rename/rename/add/add/add
>> +#   (Directory rename detection would result in transitive rename vs.
>> +#    rename/rename(1to2) and turn it into a rename/rename(1to3).  Further,
>> +#    rename paths conflict with separate adds on the other side)
>> +#   (Related to testcases 3b and 7c)
>> +#   Commit A: z/{b,c}, x/d_1
>> +#   Commit B: y/{b,c,d_2}, w/d_1
>> +#   Commit C: z/{b,c,d_1,e}, w/d_3, y/d_4
>> +#   Expected: A mess, but only a rename/rename(1to2)/add/add mess.  Use the
>> +#             presence of y/d_4 in C to avoid doing transitive rename of
>> +#             x/d_1 -> z/d_1 -> y/d_1, so that the only paths we have at
>> +#             y/d are y/d_2 and y/d_4.  We still do the move from z/e to y/e,
>> +#             though, because it doesn't have anything in the way.
>
> Missing the expected state, only the explanation is given.

Yeah...it seemed so ugly to try to write down.  As a possible
sidenote, this testcase was actually guided by the final test of
t6042, which is messy enough, but directory rename detection provides
a little extra freedom to get a higher order conflict and make things
a bit messier.  It felt like it was a case where just leaving the
expectation in code in the 5c-check was just easier and maybe even
clearer.  Should I add a comment to that effect, or would you really
just prefer to see it spelled out?

>>  falling
>> +#   back to old handling.  But, sadly, see testcases 8a and 8b.
>
> You seem to be hinting at these all the time.

I think there were just multiple angles at which to approach those
testcases.  *shrug*


* Re: [PATCH 04/30] directory rename detection: basic testcases
  2017-11-14  0:57     ` Elijah Newren
@ 2017-11-14  1:21       ` Stefan Beller
  2017-11-14  1:40         ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-14  1:21 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Mon, Nov 13, 2017 at 4:57 PM, Elijah Newren <newren@gmail.com> wrote:
> On Mon, Nov 13, 2017 at 2:04 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
>> (minor thought:)
>> After rereading the docs above this is clear; I wonder if instead of A, B, C
>> a notation of Base, ours, theirs would be easier to understand?
>
> Sure, that'd be an easy change.
>
>>> +test_expect_success '1a-setup: Simple directory rename detection' '
>>> +test_expect_failure '1a-check: Simple directory rename detection' '
>>
>> Thanks for splitting the setup and the check into two different test cases!
>>
>>
>>> +       git checkout B^0 &&
>>
>> Any reason for ^0 ? (to make clear it is a branch?)
>
> Partially force-of-habit (did the same with t6036 and t6042), but it
> seemed to have the nice feature of making debugging easier through
> improved reproducibility.  In particular, if I had checked out B
> rather than B^0 in the testcase and a merge succeeded when I didn't
> expect it, then attempting to run the same commands gives me a
> different starting point for the merge.  By instead explicitly
> checking out B^0, then even if the merge succeeds, someone who again
> runs checkout B^0 will have the same starting point.

Thanks for enlightening me. Makes sense.

>
>>> +test_expect_success '1b-setup: Merge a directory with another' '
>>> +       git rm -rf . &&
>>> +       git clean -fdqx &&
>>> +       rm -rf .git &&
>>> +       git init &&
>>
>> This is quite a strong statement to start a test with.
>
> It was actually copy-paste from t6036, and achieved two purposes:
>   1) Even if one test fails, subsequent ones continue running.  (Had
> lots of problems with this in t6036 years ago and just ended up with
> those four steps)
>   2) Someone who runs into a failing testcase has a _much_ easier time
> figuring out what is going on with the testcase.  I find it takes a
> fair amount of time to figure out what's going on with other tests in
> git's testsuite because of the presence of so many files completely
> unrelated to the given test, which have simply accumulated from
> previous tests.  For many tests, that may be fine, but in particular
> for t6036, t6042, and now t6043, since these are mostly about corner
> cases that are hard enough to reason about, I didn't want the extra
> distractions.

I agree with these reasons to strongly want a clean slate.

>> Nowadays we rather do
>>
>>     test_when_finished "some command" &&
>>
>> than polluting the follow up tests. But as you split up the previous test
>> into 2 tests, it is not clear if this would bring any good.
>
> test_when_finished looks pretty cool; I didn't know about it.  Thanks
> for the pointer.  Not sure if we want to use it here (if we do, we'd
> only do so in the check tests rather than the setup ones).
>
>> Also these are four cleanup commands, I'd have expected fewer.
>> Maybe just "rm -rf ." ? Or as we make a new repo for each test case,
>>
>>     test_create_repo 1a &&
>>     cd 1a
>>
>> in the first setup, and here we do
>>     test_create_repo 1b &&
>>     cd 1b
>>
>> relying on test_done to cleanup everything afterwards?
>
> rm aborts for me with 'rm -rf .', but I could possibly make it 'rm -rf
> * .* && git init .'
>
> The test_create_repo might not be so bad as long as every test used it
> and put all files under its own little repo.

That is my current favorite, I would think.


>  It does create some
> clutter, but it's at least somewhat managed.

But the clutter is outside the repository under test, which
may help with the situation.

> I'm still a bit partial
> to just totally cleaning it out, but if others would prefer, I can
> switch.

(slightly dreaming:)
I wonder if we could teach our test suite to accept multiple test_done
or restart_tests or such to resurrect the clean slate.

>
>>> +       test 3 -eq $(git ls-files -s | wc -l) &&
>>
>>     git ls-files >out &&
>>     test_line_count = 3 out &&
>>
>> maybe? Piping output of git commands somewhere is an
>> anti-pattern as we cannot examine the return code of ls-files in this case.
>
> Um...I guess that makes sense, if you assumed that I cared about the
> return code of ls-files.

As this is the test suite, we care about the return code of any git
command, for current git as well as future-gits.

>  But it seems to make the tests with multiple
> calls to ls-files in a row (of which there are many) considerably
> uglier, so I'd personally prefer to avoid that.  Also, why would I
> care about the return code of ls-files?

While I understand the rationale here and your examination of ls-files
seems to indicate that we are ok doing it here, a lot of (test) code
is taken for inspiration (copied, adapted) to other test cases.
And most of the time we actually care if the return code is correct
in addition to the actions performed, so I was shooting from the hip
here.
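
The general concern can be shown without git at all.  In this sketch,
failing_cmd is a made-up stand-in for a git command that prints its
output but exits non-zero; it assumes a plain POSIX shell without
"set -o pipefail".

```shell
# A stand-in for a git command that prints output but exits non-zero.
failing_cmd () {
	printf 'a\nb\nc\n'
	return 1
}

# Pipeline form: the status seen is that of wc, so the failure is hidden.
count=$(failing_cmd | wc -l)
pipe_status=$?

# Redirect form: the command's own exit status is visible to the caller.
tmp=$(mktemp)
if failing_cmd >"$tmp"
then
	direct_status=0
else
	direct_status=$?
fi
lines=$(wc -l <"$tmp")
rm -f "$tmp"
```

Both forms count three lines, but only the redirect form notices that
the command failed, which is what the "git ls-files >out &&
test_line_count = 3 out" suggestion relies on.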

> Your suggestion made me curious, so I went looking.  As far as I can
> tell, the return code of ls-files is always 0 unless you specify
> both --error-unmatch and one or more paths, neither of which I did.
> Am I missing something?

I am not saying it was a cargo-culting reaction, but rather relaying
a well discussed style issue to you. ;)


> If you feel the return code of ls-files is important, perhaps I could
> just have a separate
>    git ls-files -s >/dev/null &&
> call before the others?

I would prefer to not add any further calls; also (speaking generically)
this would bring in potential racing issues (what if the second ls-files
behaves differently than the first?)

>>> +       test $(git rev-parse HEAD:y/b) = $(git rev-parse A:z/b) &&
>>> +       test $(git rev-parse HEAD:y/c) = $(git rev-parse A:z/c) &&
>>> +       test $(git rev-parse HEAD:y/d) = $(git rev-parse A:x/d) &&
>>
>> Speaking of these, there are quite a lot of invocations of rev-parse,
>> though it looks clean; recently Junio had some good ideas regarding a
>> similar test[1].
>> So maybe
>>
>>   git rev-parse >actual \
>>     HEAD:y/b  HEAD:y/c HEAD:y/d &&
>>   git rev-parse >expect \
>>     A:z/b    A:z/c    A:x/d &&
>>   test_cmp expect actual
>>
>> Not sure if that is any nicer, but has fewer calls.
>
> Sure, I can switch it over.

Well, that was also just a quick suggestion, maybe we'll find an
even nicer way that has also very few invocations.

>>> +
>>> +# Testcase 1e, Renamed directory, with all filenames being renamed too
>>> +#   Commit A: z/{oldb,oldc}
>>> +#   Commit B: y/{newb,newc}
>>> +#   Commit C: z/{oldb,oldc,d}
>>
>> What about oldd ?
>> (expecting a pattern across many files of s/old/new/ isn't unreasonable,
>> but maybe too much for now?)
>> By having no "old" prefix there is only one thing to do, which is y/d
>
> I'm not following.  The "old" and "new" in the filenames were just
> there so a human reading the testcase could easily tell which
> filenames were related and involved in renames.  There is no rename
> associated with d, so why would I have called it "old" or "new"?

because a user may be impressed by git's pattern matching in the
rename detection:

 A: z/{oldb,oldc}
 B: z/{newb,newc}
 C: z/{oldb, oldc, oldd}

Obviously A->B is z/{old->new}-* (not a directory rename,
but just patterns), so one might be tempted to expect newd
as the expectation. But that is nonsense(!?)

>
>   * Users sometimes rename directories on one side of history and add
> files to the original directory on the other side.
>   * We would like to "detect" the directory rename, and move the new
> files into the correct directory.
>   * We don't really have any hints for detecting directory renames
> other than by comparing content, i.e. basing it on the file rename
> detection we already have.
>   * There is a possibility that not all files in a certain directory
> went to the same location.  It's possible that a few may have gone
> elsewhere.
>   * Only in weird corner cases would I expect renamed-file location to
> be split nearly 50-50 (as in this contrived testcase) -- although for
> such cases, as you point out, there is a much higher chance of us
> getting the merge "wrong".  Thus, our rule should be simple so people
> can understand and expect what we did and have an easy time fixing it.

Makes sense

>
> So, what to do?  There are a few options:
>   1) Don't do directory rename detection at all.  Just declare it to
> be an anti-feature.
>   2) Try to guess why the user moved different files to different
> directories and try to mimic that somehow.
>   3) Only allow directory rename detection to happen when ALL renamed
> files in the directory went to the same directory.
>   4) Use a simple predictable rule like majority wins.

I wonder if eventually (down the road, not now) we would want to
also teach custom merge/diff drivers about potential dir renaming.
(Well not these drivers themselves, but rather a method by which a custom diff
driver can tell us to use another custom rule for these rename detections)

>
> I think 2 is insanity.

or the place where hooks/custom code shines. :)

> Choices 1 and 3 don't have much appeal to me;
> I'm strongly against them.  I'm unaware of any remaining choices other
> than 4, but 4 seems pretty reasonable to me.  It won't always be
> right, but neither is simple content merging.

That makes sense, thanks!
Stefan


* Re: [PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry
  2017-11-10 19:05 ` [PATCH 03/30] merge-recursive: Add explanation for src_entry and dst_entry Elijah Newren
  2017-11-13 21:06   ` Stefan Beller
@ 2017-11-14  1:26   ` Junio C Hamano
  1 sibling, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2017-11-14  1:26 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

Elijah Newren <newren@gmail.com> writes:

> If I have to walk through the debugger and inspect the values found in
> here in order to figure out their meaning, despite having known these
> things inside and out some years back, then they probably need a comment
> for the casual reader to explain their purpose.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-recursive.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 52521faf09..3526c8d0b8 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -513,6 +513,28 @@ static void record_df_conflict_files(struct merge_options *o,
>  
>  struct rename {
>  	struct diff_filepair *pair;
> +	/*
> +	 * Because I keep forgetting every few years what src_entry and
> +	 * dst_entry are and have to walk through a debugger and puzzle
> +	 * through it to remind myself...

Very much appreciated.  I recall having trouble reasoning about
them myself, too (even though I admit I wasn't involved in the
implementation of this part very much and know this code a lot less
intimately than you do in the first place).

> +	 *
> +	 * If 'before' is renamed to 'after' then src_entry will contain
> +	 * the versions of 'before' from the merge_base, HEAD, and MERGE in
> +	 * stages 1, 2, and 3; dst_entry will contain the versions of
> +	 * 'after' from the merge_base, HEAD, and MERGE in stages 1, 2, and
> +	 * 3.  Thus, we have a total of six modes and oids, though some
> +	 * will be null.  (Stage 0 is ignored; we're interested in handling
> +	 * conflicts.)
> +	 *
> +	 * Since we don't turn on break-rewrites by default, neither
> +	 * src_entry nor dst_entry can have all three of their stages have
> +	 * non-null oids, meaning at most four of the six will be non-null.
> +	 * Also, since this is a rename, both src_entry and dst_entry will
> +	 * have at least one non-null oid, meaning at least two will be
> +	 * non-null.  Of the six oids, a typical rename will have three be
> +	 * non-null.  Only two implies a rename/delete, and four implies a
> +	 * rename/add.
> +	 */
>  	struct stage_data *src_entry;
>  	struct stage_data *dst_entry;
>  	unsigned processed:1;
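
The counting argument in that comment can be written down as a small
sketch.  classify_rename and NULL_OID are hypothetical names, the short
"aaaa"/"bbbb" strings are placeholders rather than real object ids, and
the argument order (src_entry stages 1-3, then dst_entry stages 1-3) is
just a convention chosen for the example.

```shell
# The all-zero object id stands for "no entry at this stage".
NULL_OID=$(printf '%040d' 0)

# Hypothetical helper: takes six oids (src stages 1-3, dst stages 1-3)
# and names the case implied by how many of them are non-null.
classify_rename () {
	n=0
	for oid in "$@"
	do
		[ "$oid" = "$NULL_OID" ] || n=$((n + 1))
	done
	case $n in
	2) echo rename/delete ;;
	3) echo rename ;;
	4) echo rename/add ;;
	*) echo unexpected ;;
	esac
}

# 'before' in base and HEAD, 'after' only in MERGE: a typical rename.
typical=$(classify_rename aaaa aaaa "$NULL_OID" "$NULL_OID" "$NULL_OID" aaaa)
# HEAD deleted 'before' while MERGE renamed it.
deleted=$(classify_rename aaaa "$NULL_OID" "$NULL_OID" "$NULL_OID" "$NULL_OID" aaaa)
# HEAD also added an unrelated file at 'after'.
added=$(classify_rename aaaa aaaa "$NULL_OID" "$NULL_OID" bbbb aaaa)
```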


* Re: [PATCH 09/30] directory rename detection: testcases checking which side did the rename
  2017-11-14  0:25   ` Stefan Beller
@ 2017-11-14  1:30     ` Elijah Newren
  0 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-14  1:30 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 13, 2017 at 4:25 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
>> +# Testcase 6b, Same rename done on both sides
>> +#   (Related to testcases 6c and 8e)
>> +#   Commit A: z/{b,c}
>> +#   Commit B: y/{b,c}
>> +#   Commit C: y/{b,c}, z/d
>
> Missing expected state

Oops!

>> +#   Note: If we did directory rename detection here, we'd move z/d into y/,
>> +#         but C did that rename and still decided to put the file into z/,
>> +#         so we probably shouldn't apply directory rename detection for it.
>
> correct. Also we don't want to see a rename/rename conflict (obviously).

Interestingly, (and this is unrelated to directory rename detection)
merge-recursive.c has a RENAME_ONE_FILE_TO_ONE value exactly for the
case where one file was renamed on both sides of history, but was
renamed to the exact same thing on both sides.  And in those cases, it
turns what would otherwise be a rename/rename(1to2) conflict into
essentially a RENAME_NORMAL case.  So, we wouldn't have to worry about
a rename/rename conflict anyway.

>> +# Testcase 6e, Add/add from one-side
>> +#   Commit A: z/{b,c}
>> +#   Commit B: z/{b,c} (no change)
>> +#   Commit C: y/{b,c,d_1}, z/d_2
>> +#   Expected: y/{b,c,d_1}, z/d_2
>> +#   NOTE: Again, this seems obvious but just checking that the implementation
>> +#         doesn't "accidentally detect a rename" and give us y/{b,c} +
>> +#         add/add conflict on y/d_1 vs y/d_2.
>
> What is less obvious in all these cases is the "(no change)" part to me.
> I would think that at least *something* changes in B in all cases above, maybe
> add file u/r (un-related) to have the tree ids changed?
> ("Less obvious" as in: we don't rely on the "no changes" part to make the
> decision, which sounds tempting so far)

Yes, I could have introduced unrelated changes into the test, and my
assumption is that the real world testcase would have such, it's just
that in paring down to a minimal testcase I end up with a "no change"
commit.

>>  test_done
>
> No conclusion box here, so my (misguided) suggestion:

Yeah, the conclusion was actually in the summary.  I could potentially
restate it here:

"Only apply implicit directory renames to directories if the _other_
side of history is the one doing the renaming"

I can add that.


* Re: [PATCH 04/30] directory rename detection: basic testcases
  2017-11-14  1:21       ` Stefan Beller
@ 2017-11-14  1:40         ` Elijah Newren
  0 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-14  1:40 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 13, 2017 at 5:21 PM, Stefan Beller <sbeller@google.com> wrote:
> On Mon, Nov 13, 2017 at 4:57 PM, Elijah Newren <newren@gmail.com> wrote:

> (slightly dreaming:)
> I wonder if we could teach our test suite to accept multiple test_done
> or restart_tests or such to resurrect the clean slate.

I'm dreaming now too; I would like that a lot more, although the
separate test_create_repo for each case isn't too bad and should be a
pretty mechanical switch.

>>>> +       test 3 -eq $(git ls-files -s | wc -l) &&
>>>
>>>     git ls-files >out &&
>>>     test_line_count = 3 out &&
>>>

> I am not saying it was a cargo-culting reaction, but rather relaying
> a well discussed style issue to you. ;)

Ah, gotcha.

>> If you feel the return code of ls-files is important, perhaps I could
>> just have a separate
>>    git ls-files -s >/dev/null &&
>> call before the others?
>
> I would prefer to not add any further calls; also (speaking generically)
> this would bring in potential racing issues (what if the second ls-files
> behaves differently than the first?)

Makes sense.
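
For readers who have not seen the style discussion before, here is a small sketch of why the two forms differ.  It assumes a stand-in command, `flaky_list`, that prints three lines and then fails, and a simplified local imitation of test_line_count called `line_count_is`; neither name comes from the patch or from test-lib.

```shell
# Stand-in for a command that produces output and then fails
# (imagine "git ls-files -s" dying part-way through).
flaky_list () {
	printf '%s\n' one two three
	return 1
}

# Style 1: the exit status of the pipeline is that of "wc", so the
# failure of flaky_list goes unnoticed.
if test 3 -eq $(flaky_list | wc -l)
then
	piped=passed
else
	piped=failed
fi

# Simplified local imitation of test_line_count: $1 is the expected
# number of lines, $2 is the file to count.
line_count_is () {
	test $(wc -l <"$2") -eq "$1"
}

# Style 2: redirect to a file first, so the command's own exit status
# is part of the && chain.
out=$(mktemp)
if flaky_list >"$out" && line_count_is 3 "$out"
then
	redirected=passed
else
	redirected=failed
fi
rm -f "$out"

echo "pipeline style: $piped; redirect style: $redirected"
```

The first check passes even though the command failed, while the second one notices the failure, and it does so without running the command a second time.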

>> I'm not following.  The "old" and "new" in the filenames were just
>> there so a human reading the testcase could easily tell which
>> filenames were related and involved in renames.  There is no rename
>> associated with d, so why would I have called it "old" or "new"?
>
> because a user may be impressed by git's pattern matching in the
> rename detection:
>
>  A: z/{oldb,oldc}
>  B: z/{newb,newc}
>  C: z/{oldb, oldc, oldd}
>
> Obviously A->B is z/{old->new}-* (not a directory rename,
> but just patterns), so one might be tempted to expect newd
> as the expectation. But that is nonsense(!?)

Ah, now I see where you were going.  Thanks for explaining.

>> I think 2 is insanity.
>
> or the place where hooks/custom code shines. :)

:)


* Re: [PATCH 04/30] directory rename detection: basic testcases
  2017-11-13 22:04   ` Stefan Beller
  2017-11-14  0:57     ` Elijah Newren
@ 2017-11-14  2:03     ` Junio C Hamano
  1 sibling, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2017-11-14  2:03 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Elijah Newren, git

Stefan Beller <sbeller@google.com> writes:

> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
>> Signed-off-by: Elijah Newren <newren@gmail.com>
>> ...
>> +#      B
>> +#      o
>> +#     / \
>> +#  A o   ?
>> +#     \ /
>> +#      o
>> +#      C
>> + ...
>> +# Testcase 1a, Basic directory rename.
>> +#   Commit A: z/{b,c}
>> +#   Commit B: y/{b,c}
>> +#   Commit C: z/{b,c,d,e/f}
>
> (minor thought:)
> After rereading the docs above this is clear; I wonder if instead of A, B, C
> a notation of Base, ours, theirs would be easier to understand?

I had a similar thought, but as long as everything in this file is
consistent, as we have that picture upfront, I am OK with it.  FWIW,
t1000 uses O (original--common ancestor) A and B, which was the
notation commonly used in our codebase since the early days when we
needed to call them with single letters.

>> +test_expect_success '1a-setup: Simple directory rename detection' '
>> +test_expect_failure '1a-check: Simple directory rename detection' '
>
> Thanks for splitting the setup and the check into two different test cases!
>
>
>> +       git checkout B^0 &&
>
> Any reason for ^0 ? (to make clear it is a branch?)

I think it is to make it clear that no matter what this test does
(or fails to do), the branch B is *not* affected by it because we'd
be playing on a detached head.
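
A quick way to see this outside the test suite is sketched below.  It is a minimal sketch, assuming a scratch repository under mktemp and a made-up committer identity.

```shell
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "A U Thor"
git -C "$repo" config user.email "author@example.com"

echo one >"$repo/file"
git -C "$repo" add file
git -C "$repo" commit -q -m A
git -C "$repo" branch B
before=$(git -C "$repo" rev-parse B)

# B^0 names the commit that B points at, so this detaches HEAD
# instead of checking out the branch itself.
git -C "$repo" checkout -q B^0
echo two >"$repo/file"
git -C "$repo" commit -q -a -m "scratch work on detached HEAD"

after=$(git -C "$repo" rev-parse B)
tip=$(git -C "$repo" rev-parse HEAD)
echo "B before: $before"
echo "B after:  $after"
echo "HEAD:     $tip"
rm -rf "$repo"
```

The two values printed for B are the same, while HEAD has moved on to the new commit.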

>> +test_expect_success '1b-setup: Merge a directory with another' '
>> +       git rm -rf . &&
>> +       git clean -fdqx &&
>> +       rm -rf .git &&
>> +       git init &&
>
> This is quite a strong statement to start a test with.

Yes.  If a test before this one did cd ../.. and forgot to come
back, we'd be in trouble.  If we want a fresh repository perhaps
test_create_repo inside the trash repository may be a less evil
option.
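
Outside of test-lib, the same idea might look roughly like the sketch below.  It assumes plain `git init <dir>` as a stand-in for test_create_repo (which is only available inside the test library), a hypothetical `setup_case` helper, and a made-up identity.  Each case gets its own repository, and every cd happens in a subshell so it cannot leak into later cases.

```shell
trash=$(mktemp -d)
start=$PWD

# Hypothetical helper: build one testcase in its own repository below
# $trash, doing all directory changes inside a subshell.
setup_case () {
	git init -q "$trash/$1" &&
	(
		cd "$trash/$1" &&
		git config user.name "A U Thor" &&
		git config user.email "author@example.com" &&
		mkdir z &&
		echo b >z/b &&
		git add z &&
		git commit -q -m "A ($1)"
	)
}

setup_case 1a
setup_case 1b

commits_1a=$(git -C "$trash/1a" rev-list --count HEAD)
commits_1b=$(git -C "$trash/1b" rev-list --count HEAD)
end=$PWD
echo "1a: $commits_1a commit(s), 1b: $commits_1b commit(s)"
rm -rf "$trash"
```

Nothing has to be deleted between cases, and the working directory after the two calls is the one we started in.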


* Re: [PATCH 15/30] merge-recursive: Move the get_renames() function
  2017-11-10 19:05 ` [PATCH 15/30] merge-recursive: Move the get_renames() function Elijah Newren
@ 2017-11-14  4:46   ` Junio C Hamano
  2017-11-14 17:41     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Junio C Hamano @ 2017-11-14  4:46 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

Elijah Newren <newren@gmail.com> writes:

> I want to re-use some other functions in the file without moving those other
> functions or dealing with a handful of annoying split function declarations
> and definitions.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---

It took me a while to figure out that you are basing this on top of
a slightly older tip of 'master'.  When rebasing on, or merging to,
a newer codebase, these two lines

> -	diff_setup(&opts);
> -	DIFF_OPT_SET(&opts, RECURSIVE);
> -	DIFF_OPT_CLR(&opts, RENAME_EMPTY);
> -	opts.detect_rename = DIFF_DETECT_RENAME;
> ...
> +	diff_setup(&opts);
> +	DIFF_OPT_SET(&opts, RECURSIVE);
> +	DIFF_OPT_CLR(&opts, RENAME_EMPTY);
> +	opts.detect_rename = DIFF_DETECT_RENAME;

would need a bit of adjustment.

By the way, checkpatch.pl complains about // C99 comments and binary
operators missing SP on both ends, etc., on the entire series [*1*].
These look like small issues, but they are distracting enough to
break concentration while reading the changes to spot places with
real issues and places that can be improved, so cleaning them up
early would help the final result to get better reviews.

I won't reproduce all of them here, but here are a representative
few.

ERROR: spaces required around that '=' (ctx:VxV)
#30: FILE: merge-recursive.c:1491:
+       for (i=0; i<slist->nr; i++) {
              ^

ERROR: spaces required around that '<' (ctx:VxV)
#30: FILE: merge-recursive.c:1491:
+       for (i=0; i<slist->nr; i++) {
                   ^

ERROR: "foo* bar" should be "foo *bar"
#42: FILE: merge-recursive.c:1503:
+static char* handle_path_level_conflicts(struct merge_options *o,

WARNING: suspect code indent for conditional statements (8, 10)
#80: FILE: merge-recursive.c:791:
+       if (o->call_depth || !was_tracked(path))
+         return !dirty;

Thanks.

[Footnote]

Just FYI, checkpatch.pl also notices these but it seems that our
existing codebase already violates them in a major way, so I usually
do not pay attention to these classes of complaints:

ERROR: spaces required around that ':' (ctx:VxV)
#30: FILE: merge-recursive.c:603:
+       unsigned add_turned_into_rename:1;
                                       ^

WARNING: quoted string split across lines
#74: FILE: merge-recursive.c:1433:
+                       output(o, 1, _("Refusing to lose untracked file at "
+                                      "%s, even though it's in the way."),


* Re: [PATCH 16/30] merge-recursive: Introduce new functions to handle rename logic
  2017-11-10 19:05 ` [PATCH 16/30] merge-recursive: Introduce new functions to handle rename logic Elijah Newren
@ 2017-11-14  4:56   ` Junio C Hamano
  2017-11-14  5:14     ` Junio C Hamano
  0 siblings, 1 reply; 81+ messages in thread
From: Junio C Hamano @ 2017-11-14  4:56 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

Elijah Newren <newren@gmail.com> writes:

> +struct rename_info {
> +	struct string_list *head_renames;
> +	struct string_list *merge_renames;
> +};

This type is added in order to allow the caller and the helper to
communicate the findings in a single logical structure, instead of
having to pass them as separate parameters, etc.  If we anticipate
that the information that needs to be passed will grow richer in
later steps (or a follow-up series), such encapsulation makes a lot
of sense.

> +static struct rename_info *handle_renames(struct merge_options *o,
> +					  struct tree *common,
> +					  struct tree *head,
> +					  struct tree *merge,
> +					  struct string_list *entries,
> +					  int *clean)
> +{
> +	struct rename_info *rei = xcalloc(1, sizeof(struct rename_info));

I however notice that there is only one caller of this helper at
this step, and also at the end of this series.  I suspect that it
would probably be a better design to make "clean" the return value
of this helper, and instead have the caller pass an uninitialised
rename_info structure on its stack by address to be filled by the
helper.

> +	rei->head_renames  = get_renames(o, head, common, head, merge, entries);
> +	rei->merge_renames = get_renames(o, merge, common, head, merge, entries);
> +	*clean = process_renames(o, rei->head_renames, rei->merge_renames);
> +
> +	return rei;
> +}
> +
> +static void cleanup_renames(struct rename_info *re_info)
> +{
> +	string_list_clear(re_info->head_renames, 0);
> +	string_list_clear(re_info->merge_renames, 0);
> +
> +	free(re_info->head_renames);
> +	free(re_info->merge_renames);
> +
> +	free(re_info);
> +}
>  static struct object_id *stage_oid(const struct object_id *oid, unsigned mode)
>  {
>  	return (is_null_oid(oid) || mode == 0) ? NULL: (struct object_id *)oid;
> @@ -1989,7 +2021,8 @@ int merge_trees(struct merge_options *o,
>  	}
>  
>  	if (unmerged_cache()) {
> -		struct string_list *entries, *re_head, *re_merge;
> +		struct string_list *entries;
> +		struct rename_info *re_info;
>  		int i;
>  		/*
>  		 * Only need the hashmap while processing entries, so
> @@ -2003,9 +2036,7 @@ int merge_trees(struct merge_options *o,
>  		get_files_dirs(o, merge);
>  
>  		entries = get_unmerged();
> -		re_head  = get_renames(o, head, common, head, merge, entries);
> -		re_merge = get_renames(o, merge, common, head, merge, entries);
> -		clean = process_renames(o, re_head, re_merge);
> +		re_info = handle_renames(o, common, head, merge, entries, &clean);
>  		record_df_conflict_files(o, entries);
>  		if (clean < 0)
>  			goto cleanup;
> @@ -2030,16 +2061,13 @@ int merge_trees(struct merge_options *o,
>  		}
>  
>  cleanup:
> -		string_list_clear(re_merge, 0);
> -		string_list_clear(re_head, 0);
> +		cleanup_renames(re_info);
> +
>  		string_list_clear(entries, 1);
> +		free(entries);
>  
>  		hashmap_free(&o->current_file_dir_set, 1);
>  
> -		free(re_merge);
> -		free(re_head);
> -		free(entries);
> -
>  		if (clean < 0)
>  			return clean;
>  	}


* Re: [PATCH 17/30] merge-recursive: Fix leaks of allocated renames and diff_filepairs
  2017-11-10 19:05 ` [PATCH 17/30] merge-recursive: Fix leaks of allocated renames and diff_filepairs Elijah Newren
@ 2017-11-14  4:58   ` Junio C Hamano
  0 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2017-11-14  4:58 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

Elijah Newren <newren@gmail.com> writes:

> get_renames() has always zero'ed out diff_queued_diff.nr while only
> manually free'ing diff_filepairs that did not correspond to renames.
> Further, it allocated struct renames that were tucked away in the
> return string_list.  Make sure all of these are deallocated when we
> are done with them.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-recursive.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 49710c0964..7a3402e50c 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1661,10 +1661,21 @@ static struct rename_info *handle_renames(struct merge_options *o,
>  
>  static void cleanup_renames(struct rename_info *re_info)
>  {
> -	string_list_clear(re_info->head_renames, 0);
> -	string_list_clear(re_info->merge_renames, 0);
> +	const struct rename *re;
> +	int i;
>  
> +	for (i = 0; i < re_info->head_renames->nr; i++) {
> +		re = re_info->head_renames->items[i].util;
> +		diff_free_filepair(re->pair);
> +	}
> +	string_list_clear(re_info->head_renames, 1);
>  	free(re_info->head_renames);
> +
> +	for (i = 0; i < re_info->merge_renames->nr; i++) {
> +		re = re_info->merge_renames->items[i].util;
> +		diff_free_filepair(re->pair);
> +	}
> +	string_list_clear(re_info->merge_renames, 1);

And this obviously will be helped by having another helper "cleanup_rename()"
that does one of them, and calling it twice from here.


* Re: [PATCH 16/30] merge-recursive: Introduce new functions to handle rename logic
  2017-11-14  4:56   ` Junio C Hamano
@ 2017-11-14  5:14     ` Junio C Hamano
  2017-11-14 18:24       ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Junio C Hamano @ 2017-11-14  5:14 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Elijah Newren <newren@gmail.com> writes:
>
>> +struct rename_info {
>> +	struct string_list *head_renames;
>> +	struct string_list *merge_renames;
>> +};
>
> This type is added in order to allow the caller and the helper to
> communicate the findings in a single logical structure, instead of
> having to pass them as separate parameters, etc.  If we anticipate
> that the information that needs to be passed will grow richer in
> later steps (or a follow-up series), such encapsulation makes a lot

Hmph, I actually am quite confused with the existing code.

The caller (originally in merge_trees(), now in handle_renames())
calls get_renames() twice and has the list of renamed paths in
these two string lists.  get_renames() mostly works with the
elements in the "entries" list and adds the "struct rename" to the
string list that is to be returned.  And the caller uses these two
string lists get_renames() returns when calling process_renames(),
but once process_renames() is done with them, these two string lists
are never looked at by anybody.

So do we really need to pass this structure around in the first
place?  I am wondering if we can do the cleanup_rename() on both of
these lists after handle_renames() calls process_renames() before
returning, which will eliminate the need for this structure and a
separate cleanup_renames() helper that clears the structure and the
two string lists in it.



* Re: [PATCH 19/30] merge-recursive: Split out code for determining diff_filepairs
  2017-11-10 19:05 ` [PATCH 19/30] merge-recursive: Split out code for determining diff_filepairs Elijah Newren
@ 2017-11-14  5:20   ` Junio C Hamano
  0 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2017-11-14  5:20 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

Elijah Newren <newren@gmail.com> writes:

> Create a new function, get_diffpairs() to compute the diff_filepairs
> between two trees.  While these are currently only used in
> get_renames(), I want them to be available to some new functions.  No
> actual logic changes yet.

OK.  

This refactors an easy-to-use (in the context of merge-recursive
code) wrapper to diff-tree out of the existing code, which makes
sense.



* Re: [PATCH 21/30] merge-recursive: Add get_directory_renames()
  2017-11-10 19:05 ` [PATCH 21/30] merge-recursive: Add get_directory_renames() Elijah Newren
@ 2017-11-14  5:30   ` Junio C Hamano
  2017-11-14 18:38     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Junio C Hamano @ 2017-11-14  5:30 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

Elijah Newren <newren@gmail.com> writes:

> +		entry = dir_rename_find_entry(dir_renames, old_dir);
> +		if (!entry) {
> +			entry = xcalloc(1, sizeof(struct dir_rename_entry));
> +			hashmap_entry_init(entry, strhash(old_dir));

Please make these two lines into its own dir_rename_entry_init()
helper.  Because the structure is defined as

+struct dir_rename_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	char *dir;
+	unsigned non_unique_new_dir:1;
+	char *new_dir;
+	struct string_list possible_new_dirs;
+};
+

in the previous patch, we'd want to see its string_list member to be
initialised explicitly (we do not want to depend on "filling with
NUL happens to make it a NODUP kind of string_list, which suits our
purpose").  The definition of _init() function may belong to the
previous step.



* Re: [PATCH 15/30] merge-recursive: Move the get_renames() function
  2017-11-14  4:46   ` Junio C Hamano
@ 2017-11-14 17:41     ` Elijah Newren
  2017-11-15  1:20       ` Junio C Hamano
  0 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-14 17:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Mon, Nov 13, 2017 at 8:46 PM, Junio C Hamano <gitster@pobox.com> wrote:

> It took me a while to figure out that you are basing this on top of
> a slightly older tip of 'master'.  When rebasing on, or merging to,
> a newer codebase

Sorry about that.  Yes, I worked on the series over time and rebased a
couple times up to v2.15.0.  I assumed that was new enough, but
clearly I was wrong.

> By the way, checkpatch.pl complains about // C99 comments and binary
> operators missing SP on both ends, etc., on the entire series [*1*].
> These look like small issues, but they are distracting enough to
> break concentration while reading the changes to spot places with
> real issues and places that can be improved, so cleaning them up
> early would help the final result to get better reviews.
>
> I won't reproduce all of them here, but here are a representative
> few.

Eek!  My apologies.  I will go through and fix them up.  I see no
reference to checkpatch.pl in git, but a google search shows there's
one in the linux source tree.  Is that where I get it from, or is there
a different one?

Also, would you like me to make a separate commit that cleans up
pre-existing issues in merge-recursive.c so that it runs clean, or
just remove the problems I added?


Thanks for all the reviews!


* Re: [PATCH 16/30] merge-recursive: Introduce new functions to handle rename logic
  2017-11-14  5:14     ` Junio C Hamano
@ 2017-11-14 18:24       ` Elijah Newren
  0 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-14 18:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Mon, Nov 13, 2017 at 9:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Elijah Newren <newren@gmail.com> writes:
>>
>>> +struct rename_info {
>>> +    struct string_list *head_renames;
>>> +    struct string_list *merge_renames;
>>> +};
>>
>> This type is added in order to allow the caller and the helper to
>> communicate the findings in a single logical structure, instead of
>> having to pass them as separate parameters, etc.  If we anticipate
>> that the information that needs to be passed will grow richer in
>> later steps (or a follow-up series), such encapsulation makes a lot
>
> Hmph, I actually am quite confused with the existing code.
>
> The caller (originally in merge_trees(), now in handle_renames())
> calls get_renames() twice and has the list of renamed paths in
> these two string lists.  get_renames() mostly works with the
> elements in the "entries" list and adds the "struct rename" to the
> string list that is to be returned.  And the caller uses these two
> string lists get_renames() returns when calling process_renames(),
> but once process_renames() is done with them, these two string lists
> are never looked at by anybody.

Actually, if I remember correctly, my first stab was to do all the
cleanup at the end of handle_renames(), but then I ran into
use-after-free errors.  I'm not sure if I remember all the details,
but I'll try to lay out the path:

process_renames() can't handle conflicts immediately because of D/F
concerns (if all entries in the competing directory resolve away, then
there's no more D/F conflict, but we have to wait until each of those
entries is processed to find out if that happens or if a D/F conflict
remains).  Because of that, process_renames() needs to store
information into a rename_conflict_info struct for process_entry() to
look at later.  Included in rename_conflict_info are things like
diff_filepair and stage_data entries, both taken from the rename
lists.  If the rename lists are freed at the end of handle_renames(),
then this information is freed before process_entry() runs and thus we
get a use-after-free error.

Since both you and I thought to push this cleanup to the end of
handle_renames(), though, I should probably add that explanation to
the commit message.  Granted, it isn't actually needed for this
particular commit, because up to this point all the information used
in rename_conflict_info was leaked anyway.  But it becomes an issue
with patch 17 when we start freeing that info.


* Re: [PATCH 21/30] merge-recursive: Add get_directory_renames()
  2017-11-14  5:30   ` Junio C Hamano
@ 2017-11-14 18:38     ` Elijah Newren
  0 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-14 18:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Mon, Nov 13, 2017 at 9:30 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Elijah Newren <newren@gmail.com> writes:
>
>> +             entry = dir_rename_find_entry(dir_renames, old_dir);
>> +             if (!entry) {
>> +                     entry = xcalloc(1, sizeof(struct dir_rename_entry));
>> +                     hashmap_entry_init(entry, strhash(old_dir));
>
> Please make these two lines into its own dir_rename_entry_init()
> helper.
<snip>
> we'd want to see its string_list member to be
> initialised explicitly

Will do.


* Re: [PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges
  2017-11-10 19:05 ` [PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges Elijah Newren
@ 2017-11-14 20:33   ` Stefan Beller
  2017-11-14 21:42     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-14 20:33 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  t/t6043-merge-rename-directories.sh | 371 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 371 insertions(+)
>
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> index 115d0d2622..bdfd943c88 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -1683,4 +1683,375 @@ test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in th
>         test $(git hash-object y/d~C^0) = $(git rev-parse A:x/d)
>  '
>
> +
> +###########################################################################
> +# SECTION 8: Suboptimal merges
> +#
> +# As alluded to in the last section, the ruleset we have built up for
> +# detecting directory renames unfortunately has some special cases where it
> +# results in slightly suboptimal or non-intuitive behavior.  This section
> +# explores these cases.
> +#
> +# To be fair, we already had non-intuitive or suboptimal behavior for most
> +# of these cases in git before introducing implicit directory rename
> +# detection, but it'd be nice if there was a modified ruleset out there
> +# that handled these cases a bit better.
> +###########################################################################
> +
> +# Testcase 8a, Dual-directory rename, one into the others' way
> +#   Commit A. x/{a,b},   y/{c,d}
> +#   Commit B. x/{a,b,e}, y/{c,d,f}
> +#   Commit C. y/{a,b},   z/{c,d}
> +#
> +# Possible Resolutions:
> +#   Previous git: y/{a,b,f},   z/{c,d},   x/e
> +#   Expected:     y/{a,b,e,f}, z/{c,d}
> +#   Preferred:    y/{a,b,e},   z/{c,d,f}

It might be tricky in the future to know what "Previous git" is;
"Previous git" means git without directory rename detection.

"Expected" means the output we expect the algorithm presented in this
series to produce; "Preferred" is the result we would ideally want.


> +#
> +# Note: Both x and y got renamed and it'd be nice to detect both, and we do
> +# better with directory rename detection than git did previously, but the
> +# simple rule from section 5 prevents me from handling this as optimally as
> +# we potentially could.

which was:

   If a subset of to-be-renamed files have a file or directory in the way,
   "turn off" the directory rename for those specific sub-paths, falling
   back to old handling.  But, sadly, see testcases 8a and 8b.

The tricky part is y in this example as x,y "swapped" its content in C,
and moved 'old y content' to the new z/.

Makes sense, but I agree it might be painful to debug such a case
from a user's point of view.

> +
> +# Testcase 8b, Dual-directory rename, one into the others' way, with conflicting filenames
> +#   Commit A. x/{a_1,b_1},     y/{a_2,b_2}
> +#   Commit B. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
> +#   Commit C. y/{a_1,b_1},     z/{a_2,b_2}
> +#
> +# Possible Resolutions:
> +#   Previous git: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
> +#   Scary:        y/{a_1,b_1},     z/{a_2,b_2}, CONFLICT(add/add, e_1 vs. e_2)
> +#   Preferred:    y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}

It may be common to have subdirectories that contain identically named files
with different contents, e.g. one subdirectory per hardware configuration.
Renaming then becomes a pain when they overlap.

> +# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
> +# y, both are named 'e'.  Without directory rename detection, neither file
> +# moves directories.  Implment directory rename detection suboptimally, and

Implement

> +# you get an add/add conflict, but both files were added in commit B, so this
> +# is an add/add conflict where one side of history added both files --
> +# something we can't represent in the index.  Obviously, we'd prefer the last
> +# resolution, but our previous rules are too coarse to allow it.  Using both
> +# the rules from section 4 and section 5 save us from the Scary resolution,
> +# making us fall back to pre-directory-rename-detection behavior for both
> +# e_1 and e_2.

ok, so add "Expected" as well? (repeating "Previous git", or so?)

> +
> +# Testcase 8c, rename+modify/delete
> +#   (Related to testcases 5b and 8d)
> +#   Commit A: z/{b,c,d}
> +#   Commit B: y/{b,c}
> +#   Commit C: z/{b,c,d_modified,e}
> +#   Expected: y/{b,c,e}, CONFLICT(rename+modify/delete: x/d -> y/d or deleted)
> +#
> +#   Note: This testcase doesn't present any concerns for me...until you
> +#         compare it with testcases 5b and 8d.  See notes in 8d for more
> +#         details.

Makes sense.

> +# Testcase 8d, rename/delete...or not?
> +#   (Related to testcase 5b; these may appear slightly inconsistent to users;
> +#    Also related to testcases 7d and 7e)

> +#   Commit A: z/{b,c,d}
> +#   Commit B: y/{b,c}
> +#   Commit C: z/{b,c,d,e}
> +#   Expected: y/{b,c,e}

Why this?
* d is deleted in B and not found in the result
* the rename detection also worked well in z->y  for adding e

I do not see the confusion, yet.

> +#   Note: It would also be somewhat reasonable to resolve this as
> +#             y/{b,c,e}, CONFLICT(rename/delete: x/d -> y/d or deleted)
> +#   The logic being that the only difference between this testcase and 8c
> +#   is that there is no modification to d.  That suggests that instead of a
> +#   rename/modify vs. delete conflict, we should just have a rename/delete
> +#   conflict, otherwise we are being inconsistent.
> +#
> +#   However...as far as consistency goes, we didn't report a conflict for
> +#   path d_1 in testcase 5b due to a different file being in the way.  So,
> +#   we seem to be forced to have cases where users can change things
> +#   slightly and get what they may perceive as inconsistent results.  It
> +#   would be nice to avoid that, but I'm not sure I see how.
> +#
> +#   In this case, I'm leaning towards: commit B was the one that deleted z/d
> +#   and it did the rename of z to y, so the two "conflicts" (rename vs.
> +#   delete) are both coming from commit B, which is non-sensical.  Conflicts
> +#   during merging are supposed to be about opposite sides doing things
> +#   differently.

  "Sensical has not yet become an "official" word in the English language, which
  would be why you can't use it. Nonsense is a word, therefore nonsensical can
  be used to describe something of nonsense. However, sense has different meanings
  and doesn't have an adjective for something of sense"

from https://english.stackexchange.com/questions/38582/antonym-of-nonsensical
I don't mind it, the spell checker just made me go on a detour. Maybe illogical?

> +# Testcase 8e, Both sides rename, one side adds to original directory
> +#   Commit A: z/{b,c}
> +#   Commit B: y/{b,c}
> +#   Commit C: w/{b,c}, z/d
> +#
> +# Possible Resolutions:
> +#   Previous git: z/d, CONFLICT(z/b -> y/b vs. w/b), CONFLICT(z/c -> y/c vs. w/c)
> +#   Expected:     y/d, CONFLICT(z/b -> y/b vs. w/b), CONFLICT(z/c -> y/c vs. w/c)
> +#   Preferred:    ??
> +#
> +# Notes: In commit B, directory z got renamed to y.  In commit C, directory z
> +#        did NOT get renamed; the directory is still present; instead it is
> +#        considered to have just renamed a subset of paths in directory z
> +#        elsewhere.  Therefore, the directory rename done in commit B to z/
> +#        applies to z/d and maps it to y/d.
> +#
> +#        It's possible that users would get confused about this, but what
> +#        should we do instead?   Silently leaving at z/d seems just as bad or
> +#        maybe even worse.  Perhaps we could print a big warning about z/d
> +#        and how we're moving to y/d in this case, but when I started thinking
> +#        abouty the ramifications of doing that, I didn't know how to rule out
> +#        that opening other weird edge and corner cases so I just punted.

s/abouty/about/

It sort of makes sense from a user's POV.


* Re: [PATCH 10/30] directory rename detection: more involved edge/corner testcases
  2017-11-14  0:42   ` Stefan Beller
@ 2017-11-14 21:11     ` Elijah Newren
  2017-11-14 22:47       ` Stefan Beller
  0 siblings, 1 reply; 81+ messages in thread
From: Elijah Newren @ 2017-11-14 21:11 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Mon, Nov 13, 2017 at 4:42 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:

> "In my opinion" ... sounds like commit message?

Sure, I can move it there.


>> +# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
>> +#   Commit A: z/{b,c}
>> +#   Commit B: y/{b,c}
>> +#   Commit C: w/b, x/c, z/d
>> +#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
>> +#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.
>
> But the creator of C intended to have z/d, not {w,x}/d, and as {w,x} == y,
> I am not sure I like this result. (I have no concrete counter example, just
> messy logic)

I'm open to alternative interpretations here.  The biggest issue for
me -- going back to our discussion at the end of
https://public-inbox.org/git/CABPp-BFKiam6AK-Gg_RzaLuLur-jz0kvv3TqsHNHg5+HTv_uzA@mail.gmail.com/
-- is "simple, predictable rule", which is consistent with the other
rules and limits the number of nasty corner cases as much as possible.
Perhaps you think this is one of those nasty corner cases, and that's
fair, but I think it'd be hard to do much better.

After spending quite a while trying to think of any other alternative
rules or ways of looking at this, I could only come up with two
points:

  1) One could view this as a case where commit C didn't in fact do
any directory rename -- note that directory z/ still exists in that
commit.  Thus, only B did a rename, it renamed z/ -> y/, thus C's z/d
should be moved to y/d.  So, this choice is consistent with the other
rules we've got.

  2) An alternate (or maybe additional?) rule: We could decide that if
a source path is renamed on both sides of history, then we'll just
ignore both renames for consideration of directory rename detection.

The new rule idea would "fix" this testcase to your liking, although
now we'd be somewhat inconsistent with the "directory still exists
implies no directory rename occurred" rule.  But what other weirdness
could that entail?  Here are a few I've thought of:

Commit O: z/{b,c,d}
Commit A: y/{b,c}
Commit B: z/{newb, newc, e}

Here, A renamed z/ -> y/.  Except B renamed z/b and z/c differently,
so all paths used to detect the z/ -> y/ rename are ignored, so there
isn't a rename after all.  I'm not so sure I like that decision.
Let's keep looking though, and change it up a bit more:

Commit O: z/{b,c,d}
Commit A: y/{b,c}, x/d
Commit B: z/{newb, newc, d, e}

Here, A has a split rename.  Since B renames z/b and z/c differently,
we have to ignore the z/ -> y/ rename, and thus the only rename left
implies z/ -> x/.  Thus we'd end up with z/e getting moved into x/e.
Seems weird to me, and less likely that a user would understand this
rule than the "majority wins" one.
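
To make the comparison concrete, here is a minimal sketch of a "majority
wins" vote, not the code from this series.  The names rename_pair,
dir_of() and majority_rename_target() are hypothetical, and treating a
tie as "no directory rename" is an assumption of the sketch.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DIRS 16
#define MAX_PATH_LEN 256

/* One detected file rename: old path -> new path (hypothetical type). */
struct rename_pair {
	const char *src;
	const char *dst;
};

/* Copy the directory part of path, including the trailing '/', into out. */
static void dir_of(const char *path, char *out, size_t outlen)
{
	const char *slash = strrchr(path, '/');
	size_t len = slash ? (size_t)(slash - path) + 1 : 0;

	if (len >= outlen)
		len = outlen - 1;
	memcpy(out, path, len);
	out[len] = '\0';
}

/*
 * Find the directory that received the most files renamed out of src_dir.
 * Returns 1 and fills winner on success; returns 0 if nothing left src_dir
 * or if two destinations tie (assumption: a tie means no directory rename).
 */
int majority_rename_target(const struct rename_pair *pairs, int nr,
			   const char *src_dir,
			   char *winner, size_t winner_len)
{
	char dirs[MAX_DIRS][MAX_PATH_LEN];
	int counts[MAX_DIRS];
	int nr_dirs = 0, best = -1, tie = 0;
	int i, j;

	for (i = 0; i < nr; i++) {
		char from[MAX_PATH_LEN], to[MAX_PATH_LEN];

		dir_of(pairs[i].src, from, sizeof(from));
		dir_of(pairs[i].dst, to, sizeof(to));
		/* Only renames that leave src_dir count as votes. */
		if (strcmp(from, src_dir) || !strcmp(to, src_dir))
			continue;

		for (j = 0; j < nr_dirs; j++)
			if (!strcmp(dirs[j], to))
				break;
		if (j == nr_dirs) {
			if (nr_dirs == MAX_DIRS)
				continue; /* sketch only: drop extra targets */
			strcpy(dirs[nr_dirs], to);
			counts[nr_dirs++] = 0;
		}
		counts[j]++;
	}

	for (j = 0; j < nr_dirs; j++) {
		if (best < 0 || counts[j] > counts[best]) {
			best = j;
			tie = 0;
		} else if (counts[j] == counts[best]) {
			tie = 1;
		}
	}
	if (best < 0 || tie)
		return 0;

	snprintf(winner, winner_len, "%s", dirs[best]);
	return 1;
}
```

Fed all three of A's renames (z/b -> y/b, z/c -> y/c, z/d -> x/d), the
vote picks y/.  Under the alternate rule only z/d -> x/d is left after
filtering, so the same vote picks x/, which is the surprising z/e -> x/e
outcome described above.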


>> +# Testcase 7c, rename/rename(1to...2or3); transitive rename may add complexity
>> +#   (Related to testcases 3b and 5c)
>> +#   Commit A: z/{b,c}, x/d
>> +#   Commit B: y/{b,c}, w/d
>> +#   Commit C: z/{b,c,d}
>> +#   Expected: y/{b,c}, CONFLICT(x/d -> w/d vs. y/d)
>
> CONFLICT(x/d -> y/d vs w/d) ?

I'm afraid I'm not following the question.

>
>> +#   NOTE: z/ was renamed to y/ so we do not want to report
>> +#         either CONFLICT(x/d -> w/d vs. z/d)
>> +#         or CONFLICT(x/d -> w/d vs. y/d vs. z/d)
>
> "neither ... nor" instead of "not either or"?

Yes, thanks.

>> +# Testcase 7e, transitive rename in rename/delete AND dirs in the way
>> +#   (Very similar to 'both rename source and destination involved in D/F conflict' from t6022-merge-rename.sh)
>> +#   (Also related to testcases 9c and 9d)
>> +#   Commit A: z/{b,c},     x/d_1
>> +#   Commit B: y/{b,c,d/g}, x/d/f
>> +#   Commit C: z/{b,c,d_1}
>> +#   Expected: rename/delete(x/d_1->y/d_1 vs. None) + D/F conflict on y/d
>> +#             y/{b,c,d/g}, y/d_1~C^0, x/d/f
>> +#   NOTE: x/d/f may be slightly confusing here.  x/d_1 -> z/d_1 implies
>> +#         there is a directory rename from x/ -> z/, performed by commit C.
>> +#         However, on the side of commit B, it renamed z/ -> y/, thus
>> +#         making a rename from x/ -> z/ when it was getting rid of z/ seems
>> +#         non-sensical.  Further, putting x/d/f into y/d/f also doesn't
>> +#         make a lot of sense because commit B did the renaming of z to y
>> +#         and it created x/d/f, and it clearly made these things separate,
>> +#         so it doesn't make much sense to push these together.
>
> This is confusing.

Indeed it is.  When I first wrote this testcase, I didn't realize that
I actually had two potential directory renames involved and a
doubly-transitive rename from it, on top of the D/F conflict.  I can
see two ways to resolve this.

1) Leave the testcase alone, just try to make the NOTE more clear:

NOTE: The main path of interest here is d_1 and where it ends up, but
this is actually a case that has two potential directory renames
involved and D/F conflict(s), so it makes sense to walk through each
step.  Commit B renames z/ -> y/.  Thus everything that C adds to z/
should be instead moved to y/.  This gives us the D/F conflict on y/d
because x/d_1 -> z/d_1 -> y/d_1 conflicts with y/d/g.  Further, commit
C renames x/ -> z/, thus everything B adds to x/ should instead be
moved to z/...BUT we removed z/ and renamed it to y/, so maybe
everything should move not from x/ to z/, but from x/ to z/ to y/.
Doing so might make sense from the logic so far, but note that commit
B had both an x/ and a y/; it did the renaming of z/ to y/ and created
x/d/f and it clearly made these things separate, so it doesn't make
much sense to push these together.  Doing so is what I'd call a doubly
transitive rename; see testcases 9c and 9d for further discussion of
this issue and how it's resolved.

2) Modify the testcase so it doesn't have two potential directory
renames involved.  Just add another unrelated file under x/ that
doesn't change on either side, thus removing the x/ -> z/ rename from
the mix.  That wouldn't actually change the expected result (other
than the new file should remain around), but it would change the
reasoning and simplify it:

NOTE: Commit B renames z/ -> y/.  Thus everything that C adds to z/
should be instead moved to y/.  This gives us the D/F conflict on y/d
because x/d_1 -> z/d_1 -> y/d_1 conflicts with y/d/g.  As a side note,
one could imagine an alternative implementation trying to resolve D/F
conflicts caused by renames by just undoing the rename, but in this
case that would end up with us needing to write an x/d_1, which would
still be a D/F conflict with x/d/f.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 11/30] directory rename detection: testcases exploring possibly suboptimal merges
  2017-11-14 20:33   ` Stefan Beller
@ 2017-11-14 21:42     ` Elijah Newren
  0 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-14 21:42 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Tue, Nov 14, 2017 at 12:33 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:

>> +# Possible Resolutions:
>> +#   Previous git: y/{a,b,f},   z/{c,d},   x/e
>> +#   Expected:     y/{a,b,e,f}, z/{c,d}
>> +#   Preferred:    y/{a,b,e},   z/{c,d,f}
>
> it might be tricky in the future to know what "previous git" is;
> "Previous git" means without directory renames enabled;
>
> "expected" means we expect the algorithm presented in this series to produce
> this output, preferred is what we actually expect.

Yes, how about using:
  "Without dir rename detection:"
  "Currently expected:"
and
  "Optimal:"
?

>> +# Testcase 8b, Dual-directory rename, one into the others' way, with conflicting filenames
>> +#   Commit A. x/{a_1,b_1},     y/{a_2,b_2}
>> +#   Commit B. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
>> +#   Commit C. y/{a_1,b_1},     z/{a_2,b_2}
>> +#
>> +# Possible Resolutions:
>> +#   Previous git: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
>> +#   Scary:        y/{a_1,b_1},     z/{a_2,b_2}, CONFLICT(add/add, e_1 vs. e_2)
>> +#   Preferred:    y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
>
> It may be common to have sub directories with the same path having different
> blobs, e.g. when having say multiple hardware configurations in different sub
> directories configured. Then renaming becomes a pain when they overlap.

Sure, agreed.  Although, the one nice thing about this particular
testcase is that despite showing suboptimal merge behavior, it's at
least the exact same suboptimal behavior as before when we didn't have
directory rename detection.

>> +# moves directories.  Implment directory rename detection suboptimally, and
>
> Implement

Thanks.

> ok, so add "Expected" as well? (repeating "Previous git", or so?)

Yeah, I should make that more explicit.

>> +# Testcase 8d, rename/delete...or not?
>> +#   (Related to testcase 5b; these may appear slightly inconsistent to users;
>> +#    Also related to testcases 7d and 7e)
>
>> +#   Commit A: z/{b,c,d}
>> +#   Commit B: y/{b,c}
>> +#   Commit C: z/{b,c,d,e}
>> +#   Expected: y/{b,c,e}
>
> Why this?
> * d is deleted in B and not found in the result
> * the rename detection also worked well in z->y  for adding e
>
> I do not see the confusion, yet.

Um...yaay?  If you don't see it as confusing, then maybe others don't?
 I was wondering if folks would expect a rename/delete conflict (z/d
either deleted or renamed to y/d via directory rename detection), and
be annoyed if the merge succeeded and didn't even give so much as a
warning about what happened to 'd'.

>> +#   In this case, I'm leaning towards: commit B was the one that deleted z/d
>> +#   and it did the rename of z to y, so the two "conflicts" (rename vs.
>> +#   delete) are both coming from commit B, which is non-sensical.  Conflicts
>> +#   during merging are supposed to be about opposite sides doing things
>> +#   differently.
>
>   "Sensical has not yet become an "official" word in the English language, which
>   would be why you can't use it. Nonsense is a word, therefore nonsensical can
>   used to describe something of nonsense. However, sense has different meanings
>   and doesn't have an adjective for something of sense"
>
> from https://english.stackexchange.com/questions/38582/antonym-of-nonsensical
> I don't mind it, the spell checker just made me go on a detour. Maybe illogical?

Illogical works for me.

>> +# Testcase 8e, Both sides rename, one side adds to original directory
>> +#   Commit A: z/{b,c}
>> +#   Commit B: y/{b,c}
>> +#   Commit C: w/{b,c}, z/d
>> +#
>> +# Possible Resolutions:
>> +#   Previous git: z/d, CONFLICT(z/b -> y/b vs. w/b), CONFLICT(z/c -> y/c vs. w/c)
>> +#   Expected:     y/d, CONFLICT(z/b -> y/b vs. w/b), CONFLICT(z/c -> y/c vs. w/c)
>> +#   Preferred:    ??
>> +#
>> +# Notes: In commit B, directory z got renamed to y.  In commit C, directory z
>> +#        did NOT get renamed; the directory is still present; instead it is
>> +#        considered to have just renamed a subset of paths in directory z
>> +#        elsewhere.  Therefore, the directory rename done in commit B to z/
>> +#        applies to z/d and maps it to y/d.
>> +#
>> +#        It's possible that users would get confused about this, but what
>> +#        should we do instead?   Silently leaving at z/d seems just as bad or
>> +#        maybe even worse.  Perhaps we could print a big warning about z/d
>> +#        and how we're moving to y/d in this case, but when I started thinking
>> +#        abouty the ramifications of doing that, I didn't know how to rule out
>> +#        that opening other weird edge and corner cases so I just punted.
>
> s/about/abouty

I think you mean the other direction?  Thanks for catching, I'll fix that up.

> It sort of makes sense from a user's POV.

I'm afraid I'm unsure what the antecedent of "It" is here.  (Are you
just saying that my rationale for what I listed as "Expected" makes
sense, or something else?)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 10/30] directory rename detection: more involved edge/corner testcases
  2017-11-14 21:11     ` Elijah Newren
@ 2017-11-14 22:47       ` Stefan Beller
  0 siblings, 0 replies; 81+ messages in thread
From: Stefan Beller @ 2017-11-14 22:47 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Tue, Nov 14, 2017 at 1:11 PM, Elijah Newren <newren@gmail.com> wrote:
> On Mon, Nov 13, 2017 at 4:42 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
>
>> "In my opinion" ... sounds like commit message?
>
> Sure, I can move it there.
>
>
>>> +# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
>>> +#   Commit A: z/{b,c}
>>> +#   Commit B: y/{b,c}
>>> +#   Commit C: w/b, x/c, z/d
>>> +#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
>>> +#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.
>>
>> But the creator of C intended to have z/d, not {w,x}/d, and as {w,x} == y,
>> I am not sure I like this result. (I have no concrete counter example, just
>> messy logic)
>
> I'm open to alternative interpretations here.  The biggest issue for
> me -- going back to our discussion at the end of
> https://public-inbox.org/git/CABPp-BFKiam6AK-Gg_RzaLuLur-jz0kvv3TqsHNHg5+HTv_uzA@mail.gmail.com/
> -- is "simple, predictable rule", which is consistent with the other
> rules and limits the number of nasty corner cases as much as possible.
> Perhaps you think this is one of those nasty corner cases, and that's
> fair, but I think it'd be hard to do much better.

yup, I agree that a simple, predictable rule is better than optimizing
for the corner cases we can come up with now. (I sent this email before
reading your reply, so thanks for re-iterating the answer.)

> After spending quite a while trying to think of any other alternative
> rules or ways of looking at this, I could only come up with two
> points:
>
>   1) One could view this as a case where commit C didn't in fact do
> any directory rename -- note that directory z/ still exists in that
> commit.  Thus, only B did a rename, it renamed z/ -> y/, thus C's z/d
> should be moved to y/d.  So, this choice is consistent with the other
> rules we've got.

I wonder if we can do a data driven approach, i.e. mine some history
(linux, git, other large projects), that would tell us which of these cases
happens very often, and which of these corner cases can be "safely
ignored because it never happens". My gut feeling tells me that
splitting up a directory into two or three (potential sub-)directories
is a common thing, whereas double renames are not as common.
(But that's just my view, I have no data to back it up; the selection
of the data would open the next debate as it will be very specific to
a given community. But as linux has had such a huge impact on git,
I'd be tempted to claim any study on linux.git is fruitful for git's defaults)

>   2) An alternate (or maybe additional?) rule: We could decide that if
> a source path is renamed on both sides of history, then we'll just
> ignore both renames for consideration of directory rename detection.
>
> The new rule idea would "fix" this testcase to your liking, although
> now we'd be somewhat inconsistent with the "directory still exists
> implies no directory rename occurred" rule.  But what other weirdness
> could that entail?  Here are a few I've thought of:
>
> Commit O: z/{b,c,d}
> Commit A: y/{b,c}
> Commit B: z/{newb, newc, e}
>
> Here, A renamed z/ -> y/.

.. while deleting d ...


>  Except B renamed z/b and z/c differently,

... and z/d -> z/e, maybe(?).

> so all paths used to detect the z/ -> y/ rename are ignored, so there
> isn't a rename after all.  I'm not so sure I like that decision.
> Let's keep looking though, and change it up a bit more:
>
> Commit O: z/{b,c,d}
> Commit A: y/{b,c}, x/d
> Commit B: z/{newb, newc, d, e}
>
> Here, A has a split rename.  Since B renames z/b and z/c differently,
> we have to ignore the z/ -> y/ rename, and thus the only rename left
> implies z/ -> x/.  Thus we'd end up with z/e getting moved into x/e.
> Seems weird to me, and less likely that a user would understand this
> rule than the "majority wins" one.

It still is "majority wins" except the set of inspected files is filtered first.

In an *ideal, but expensive* algorithm, we might give different
weights to files, e.g.
* large files have more weight than smaller files,
* files with interesting names have more weight (c.f. Makefile vs. xstrbuf.c)
* similar files have more weight than files that are rewritten, or rather
  the more rewrite is done the less impact one file has.
* how unique file content is (LICENSE.txt that exists 23 times in the
  tree has less weight than the sekret-algorithm.c)

and depending on these weights we have a "majority of content" moved
to y/ or x/.

>>> +# Testcase 7c, rename/rename(1to...2or3); transitive rename may add complexity
>>> +#   (Related to testcases 3b and 5c)
>>> +#   Commit A: z/{b,c}, x/d
>>> +#   Commit B: y/{b,c}, w/d
>>> +#   Commit C: z/{b,c,d}
>>> +#   Expected: y/{b,c}, CONFLICT(x/d -> w/d vs. y/d)
>>
>> CONFLICT(x/d -> y/d vs w/d) ?
>
> I'm afraid I'm not following the question.

Yesterday I had the impression the renaming perspective changed,
note the difference in order of y/ and w/ inside the CONFLICT.
I might have been confused already, though.

>>> +# Testcase 7e, transitive rename in rename/delete AND dirs in the way
>>> +#   (Very similar to 'both rename source and destination involved in D/F conflict' from t6022-merge-rename.sh)
>>> +#   (Also related to testcases 9c and 9d)
>>> +#   Commit A: z/{b,c},     x/d_1
>>> +#   Commit B: y/{b,c,d/g}, x/d/f
>>> +#   Commit C: z/{b,c,d_1}
>>> +#   Expected: rename/delete(x/d_1->y/d_1 vs. None) + D/F conflict on y/d
>>> +#             y/{b,c,d/g}, y/d_1~C^0, x/d/f
>>> +#   NOTE: x/d/f may be slightly confusing here.  x/d_1 -> z/d_1 implies
>>> +#         there is a directory rename from x/ -> z/, performed by commit C.
>>> +#         However, on the side of commit B, it renamed z/ -> y/, thus
>>> +#         making a rename from x/ -> z/ when it was getting rid of z/ seems
>>> +#         non-sensical.  Further, putting x/d/f into y/d/f also doesn't
>>> +#         make a lot of sense because commit B did the renaming of z to y
>>> +#         and it created x/d/f, and it clearly made these things separate,
>>> +#         so it doesn't make much sense to push these together.
>>
>> This is confusing.
>
> Indeed it is.  When I first wrote this testcase, I didn't realize that
> I actually had two potential directory renames involved and a
> doubly-transitive rename from it, on top of the D/F conflict.  I can
> see two ways to resolve this.
>
> 1) Leave the testcase alone, just try to make the NOTE more clear:
>
> NOTE: The main path of interest here is d_1 and where it ends up, but
> this is actually a case that has two potential directory renames
> involved and D/F conflict(s), so it makes sense to walk through each
> step.  Commit B renames z/ -> y/.  Thus everything that C adds to z/
> should be instead moved to y/.  This gives us the D/F conflict on y/d
> because x/d_1 -> z/d_1 -> y/d_1 conflicts with y/d/g.  Further, commit
> C renames x/ -> z/, thus everything B adds to x/ should instead be
> moved to z/...BUT we removed z/ and renamed it to y/, so maybe
> everything should move not from x/ to z/, but from x/ to z/ to y/.
> Doing so might make sense from the logic so far, but note that commit
> B had both an x/ and a y/; it did the renaming of z/ to y/ and created
> x/d/f and it clearly made these things separate, so it doesn't make
> much sense to push these together.  Doing so is what I'd call a doubly
> transitive rename; see testcases 9c and 9d for further discussion of
> this issue and how it's resolved.
>
> 2) Modify the testcase so it doesn't have two potential directory
> renames involved.  Just add another unrelated file under x/ that
> doesn't change on either side, thus removing the x/ -> z/ rename from
> the mix.  That wouldn't actually change the expected result (other
> than the new file should remain around), but it would change the
> reasoning and simplify it:
>
> NOTE: Commit B renames z/ -> y/.  Thus everything that C adds to z/
> should be instead moved to y/.  This gives us the D/F conflict on y/d
> because x/d_1 -> z/d_1 -> y/d_1 conflicts with y/d/g.  As a side note,
> one could imagine an alternative implementation trying to resolve D/F
> conflicts caused by renames by just undoing the rename, but in this
> case that would end up with us needing to write an x/d_1, which would
> still be a D/F conflict with x/d/f.

What do you want to test in 7e? AFAICT section 7 is about
"More involved Edge/Corner cases", so keeping it edge sounds fine.
(hence I'd vote for (1), just adjusting the notes)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 15/30] merge-recursive: Move the get_renames() function
  2017-11-14 17:41     ` Elijah Newren
@ 2017-11-15  1:20       ` Junio C Hamano
  0 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2017-11-15  1:20 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

Elijah Newren <newren@gmail.com> writes:

> Eek!  My apologies.  I will go through and fix them up.  I see no
> reference to checkpatch.pl in git, but a google search shows there's
> one in the linux source tree.  Is that where I get it from, or is there
> a different one?



> Also, would you like me to make a separate commit that cleans up
> pre-existing issues in merge-recursive.c so that it runs clean, or
> just remove the problems I added?

That's optional.  These three patch series are already sufficiently
large, so I do not mind a clean-up after dust settles down, instead
of preliminary clean-up.


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 12/30] directory rename detection: miscellaneous testcases to complete coverage
  2017-11-10 19:05 ` [PATCH 12/30] directory rename detection: miscellaneous testcases to complete coverage Elijah Newren
@ 2017-11-15 20:03   ` Stefan Beller
  2017-11-16 21:17     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-15 20:03 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:

> +###########################################################################
> +# SECTION 9: Other testcases
> +#
> +# I came up with the testcases in the first eight sections before coding up
> +# the implementation.  The testcases in this section were mostly ones I
> +# thought of while coding/debugging, and which I was too lazy to insert
> +# into the previous sections because I didn't want to re-label with all the
> +# testcase references.  :-)

This might also be commit message material, as it describes the workflow,
not the 'misc' aspect of these test cases.

> +###########################################################################
> +
> +# Testcase 9a, Inner renamed directory within outer renamed directory
> +#   (Related to testcase 1f)
> +#   Commit A: z/{b,c,d/{e,f,g}}
> +#   Commit B: y/{b,c}, x/w/{e,f,g}
> +#   Commit C: z/{b,c,d/{e,f,g,h},i}
> +#   Expected: y/{b,c,i}, x/w/{e,f,g,h}
> +#   NOTE: The only reason this one is interesting is because when a directory
> +#         is split into multiple other directories, we determine by the weight
> +#         of which one had the most paths going to it.  A naive implementation
> +#         of that could take the new file in commit C at z/i to x/w/i or x/i.

Makes sense.

> +# Testcase 9b, Transitive rename with content merge
> +#   (Related to testcase 1c)
> +#   Commit A: z/{b,c},   x/d_1
> +#   Commit B: y/{b,c},   x/d_2
> +#   Commit C: z/{b,c,d_3}
> +#   Expected: y/{b,c,d_merged}

Makes sense.

> +# Testcase 9c, Doubly transitive rename?
> +#   (Related to testcase 1c, 7e, and 9d)
> +#   Commit A: z/{b,c},     x/{d,e},    w/f
> +#   Commit B: y/{b,c},     x/{d,e,f,g}
> +#   Commit C: z/{b,c,d,e},             w/f
> +#   Expected: y/{b,c,d,e}, x/{f,g}
> +#
> +#   NOTE: x/f and x/g may be slightly confusing here.  The rename from w/f to
> +#         x/f is clear.  Let's look beyond that.  Here's the logic:
> +#            Commit C renamed x/ -> z/
> +#            Commit B renamed z/ -> y/
> +#         So, we could possibly further rename x/f to z/f to y/f, a doubly
> +#         transitive rename.  However, where does it end?  We can chain these
> +#         indefinitely (see testcase 9d).  What if there is a D/F conflict
> +#         at z/f/ or y/f/?  Or just another file conflict at one of those
> +#         paths?  In the case of an N-long chain of transitive renamings,
> +#         where do we "abort" the rename at?  Can the user make sense of
> +#         the resulting conflict and resolve it?
> +#
> +#         To avoid this confusion I use the simple rule that if the other side
> +#         of history did a directory rename to a path that your side renamed
> +#         away, then ignore that particular rename from the other side of
> +#         history for any implicit directory renames.

This is repeated in the rule of section 9 below.
Makes sense.

> +# Testcase 9d, N-fold transitive rename?
> +#   (Related to testcase 9c...and 1c and 7e)
> +#   Commit A: z/a, y/b, x/c, w/d, v/e, u/f
> +#   Commit B:  y/{a,b},  w/{c,d},  u/{e,f}
> +#   Commit C: z/{a,t}, x/{b,c}, v/{d,e}, u/f
> +#   Expected: <see NOTE first>
> +#
> +#   NOTE: z/ -> y/ (in commit B)
> +#         y/ -> x/ (in commit C)
> +#         x/ -> w/ (in commit B)
> +#         w/ -> v/ (in commit C)
> +#         v/ -> u/ (in commit B)
> +#         So, if we add a file to z, say z/t, where should it end up?  In u?
> +#         What if there's another file or directory named 't' in one of the
> +#         intervening directories and/or in u itself?  Also, shouldn't the
> +#         same logic that places 't' in u/ also move ALL other files to u/?
> +#         What if there are file or directory conflicts in any of them?  If
> +#         we attempted to do N-way (N-fold? N-ary? N-uple?) transitive renames
> +#         like this, would the user have any hope of understanding any
> +#         conflicts or how their working tree ended up?  I think not, so I'm
> +#         ruling out N-ary transitive renames for N>1.
> +#
> +#   Therefore our expected result is:
> +#     z/t, y/a, x/b, w/c, u/d, u/e, u/f
> +#   The reason that v/d DOES get transitively renamed to u/d is that u/ isn't
> +#   renamed somewhere.  A slightly sub-optimal result, but it uses fairly
> +#   simple rules that are consistent with what we need for all the other
> +#   testcases and simplifies things for the user.

Does the merge order matter here?
If B and C were swapped, applying the same logic presented in the NOTE,
one could argue that we expect:

    z/t y/a x/b w/c v/d v/e u/f

I can make a strong point for y/a here, but the v/{d,e} also seem to deviate.

> +# Testcase 9e, N-to-1 whammo
> +#   (Related to testcase 9c...and 1c and 7e)
> +#   Commit A: dir1/{a,b}, dir2/{d,e}, dir3/{g,h}, dirN/{j,k}
> +#   Commit B: dir1/{a,b,c,yo}, dir2/{d,e,f,yo}, dir3/{g,h,i,yo}, dirN/{j,k,l,yo}
> +#   Commit C: combined/{a,b,d,e,g,h,j,k}
> +#   Expected: combined/{a,b,c,d,e,f,g,h,i,j,k,l}, CONFLICT(Nto1) warnings,
> +#             dir1/yo, dir2/yo, dir3/yo, dirN/yo

Very neat!

> +# Testcase 9f, Renamed directory that only contained immediate subdirs
> +#   (Related to testcases 1e & 9g)
> +#   Commit A: goal/{a,b}/$more_files
> +#   Commit B: priority/{a,b}/$more_files
> +#   Commit C: goal/{a,b}/$more_files, goal/c
> +#   Expected: priority/{a,b}/$more_files, priority/c

> +# Testcase 9g, Renamed directory that only contained immediate subdirs, immediate subdirs renamed
> +#   (Related to testcases 1e & 9f)
> +#   Commit A: goal/{a,b}/$more_files
> +#   Commit B: priority/{alpha,bravo}/$more_files
> +#   Commit C: goal/{a,b}/$more_files, goal/c
> +#   Expected: priority/{alpha,bravo}/$more_files, priority/c

and if C also added goal/a/another_file, we'd expect it to
become priority/alpha/another_file.

What happens in moving dir hierarchies?

A: root/node1/{leaf1, leaf2}, root/node2/{leaf3, leaf4}
B: "Move node2 one layer down into node1"
    root/node1/{leaf1, leaf2, node2/{leaf3, leaf4}}
C: "Add more leaves"
    root/node1/{leaf1, leaf2, leaf5}, root/node2/{leaf3, leaf4, leaf6}

Or chaining putting things in one another:
(Same A)
B: "Move node2 one layer down into node1"
    root/node1/{leaf1, leaf2, node2/{leaf3, leaf4}}
C: "Move node1 one layer down into node2"
    root/node2/{leaf3, leaf4, node1/{leaf1, leaf2}}

Just food for thought.

> +# Rules suggested by section 9:
> +#
> +#   If the other side of history did a directory rename to a path that your
> +#   side renamed away, then ignore that particular rename from the other
> +#   side of history for any implicit directory renames.
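
A rough sketch of how that rule could play out on testcase 9d, under
stated assumptions: the directory renames for each side are already
known, and dir_rename, renames_away() and apply_dir_rename() are
hypothetical names, not functions from the patch.

```c
#include <stdio.h>
#include <string.h>

/* A directory rename detected on one side of history (hypothetical type). */
struct dir_rename {
	const char *from;	/* e.g. "z/" */
	const char *to;		/* e.g. "y/" */
};

/* Did this side rename directory dir away to somewhere else? */
static int renames_away(const struct dir_rename *side, int nr, const char *dir)
{
	int i;

	for (i = 0; i < nr; i++)
		if (!strcmp(side[i].from, dir))
			return 1;
	return 0;
}

/*
 * Map a path present on "our" side through the directory renames done on
 * the other side.  At most one rename is applied, and a rename whose
 * target directory our side renamed away is ignored (the section 9 rule),
 * so chains such as z/ -> y/ -> x/ are never followed.
 */
void apply_dir_rename(const char *path,
		      const struct dir_rename *ours, int nr_ours,
		      const struct dir_rename *theirs, int nr_theirs,
		      char *out, size_t outlen)
{
	int i;

	for (i = 0; i < nr_theirs; i++) {
		size_t len = strlen(theirs[i].from);

		if (strncmp(path, theirs[i].from, len))
			continue;
		/* Target was renamed away on our side: leave the path alone. */
		if (renames_away(ours, nr_ours, theirs[i].to))
			break;
		snprintf(out, outlen, "%s%s", theirs[i].to, path + len);
		return;
	}
	snprintf(out, outlen, "%s", path);
}
```

With B's renames {z/ -> y/, x/ -> w/, v/ -> u/} and C's renames
{y/ -> x/, w/ -> v/}, C's z/t stays at z/t because C renamed y/ away,
while C's v/d moves to u/d because nothing renamed u/ away.  That matches
the expected result listed for 9d.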

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 27/30] merge-recursive: Apply necessary modifications for directory renames
  2017-11-10 19:05 ` [PATCH 27/30] merge-recursive: Apply necessary modifications for directory renames Elijah Newren
@ 2017-11-15 20:23   ` Stefan Beller
  2017-11-16  3:54     ` Elijah Newren
  0 siblings, 1 reply; 81+ messages in thread
From: Stefan Beller @ 2017-11-15 20:23 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

> +               if (!strcmp(pair->one->path, pair->two->path)) {
> +                       /*
> +                        * Paths should only match if this was initially a
> +                        * non-rename that is being turned into one by
> +                        * directory rename detection.
> +                        */
> +                       assert(pair->status != 'R');
> +               } else {
> +                       assert(pair->status == 'R');

assert() is compiled conditionally depending on whether
NDEBUG is set, which makes potential error reports more interesting and
head-scratching. But we'd rather prefer easy bug reports, therefore
we'd want to (a) either have the condition checked always, when
you know this could occur, e.g. via

  if (<condition>)
    BUG("Git is broken, because...");

or (b) you could omit the asserts if they are more of a developer guideline?

I wonder if we want to introduce a BUG_ON(<condition>, <msg>) macro
that contains (a).


> +                       re->dst_entry->processed = 1;
> +                       //string_list_remove(entries, pair->two->path, 0);

commented code?

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 27/30] merge-recursive: Apply necessary modifications for directory renames
  2017-11-15 20:23   ` Stefan Beller
@ 2017-11-16  3:54     ` Elijah Newren
  0 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-16  3:54 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Wed, Nov 15, 2017 at 12:23 PM, Stefan Beller <sbeller@google.com> wrote:
>> +               if (!strcmp(pair->one->path, pair->two->path)) {
>> +                       /*
>> +                        * Paths should only match if this was initially a
>> +                        * non-rename that is being turned into one by
>> +                        * directory rename detection.
>> +                        */
>> +                       assert(pair->status != 'R');
>> +               } else {
>> +                       assert(pair->status == 'R');
>
> assert() is compiled conditionally depending on whether
> NDEBUG is set, which makes potential error reports more interesting and
> head-scratching. But we'd rather prefer easy bug reports, therefore
> we'd want to (a) either have the condition checked always, when
> you know this could occur, e.g. via
>
>   if (<condition>)
>     BUG("Git is broken, because...");
>
> or (b) you could omit the asserts if they are more of a developer guideline?
>
> I wonder if we want to introduce a BUG_ON(<condition>, <msg>) macro
> that contains (a).

Yeah, I added a few other asserts in other commits too.  None of these
were written with the expectation that they should or could ever occur
for a user; it was just a developer guideline to make sure I (and
future others) didn't break certain invariants during the
implementation or while making modifications to it.

So that makes it more like (b), but I feel that there is something to
be said for having a convenient syntax for expressing pre-conditions
that others shouldn't violate when changing the code, and which will
be given more weight than a comment.  For that, something that is
compiled out on many users' systems seemed just fine.

But, I have certainly seen abuses of asserts in my time as well (e.g.
function calls with important side-effects being placed inside
asserts), so if folks have decided it's against git's style, then I
understand.  I'll remove some, and switch the cheaper checks over to
BUG().
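
For illustration, the difference between the two styles can be sketched
roughly as below.  CHECK_INVARIANT() is a hypothetical stand-in written
for this example and is not git's BUG(); rename_status_consistent() and
check_pair() are likewise made-up names that mirror the invariant from
the quoted hunk.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical always-on check (an assumption of this sketch, not git's
 * BUG()): report the location and abort in every build.
 */
#define CHECK_INVARIANT(cond, msg) \
	do { \
		if (!(cond)) { \
			fprintf(stderr, "BUG: %s:%d: %s\n", \
				__FILE__, __LINE__, (msg)); \
			abort(); \
		} \
	} while (0)

/*
 * The invariant from the hunk: identical paths must not be marked 'R',
 * differing paths must be.  Returns 1 when the status is consistent.
 */
int rename_status_consistent(const char *src, const char *dst, char status)
{
	if (!strcmp(src, dst))
		return status != 'R';
	return status == 'R';
}

void check_pair(const char *src, const char *dst, char status)
{
	/* Disappears entirely when compiled with -DNDEBUG. */
	assert(rename_status_consistent(src, dst, status));

	/* Evaluated regardless of NDEBUG. */
	CHECK_INVARIANT(rename_status_consistent(src, dst, status),
			"rename status does not match paths");
}
```

Keeping the condition in a side-effect-free helper avoids the abuse
mentioned above, where code with important side effects ends up inside
an assert() and vanishes under NDEBUG.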

>> +                       re->dst_entry->processed = 1;
>> +                       //string_list_remove(entries, pair->two->path, 0);
>
> commented code?

Ugh, that's embarrassing.  I'll clean that out.


* Re: [PATCH 12/30] directory rename detection: miscellaneous testcases to complete coverage
  2017-11-15 20:03   ` Stefan Beller
@ 2017-11-16 21:17     ` Elijah Newren
  0 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2017-11-16 21:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Wed, Nov 15, 2017 at 12:03 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 10, 2017 at 11:05 AM, Elijah Newren <newren@gmail.com> wrote:
>
>> +# Testcase 9d, N-fold transitive rename?
>> +#   (Related to testcase 9c...and 1c and 7e)
>> +#   Commit A: z/a, y/b, x/c, w/d, v/e, u/f
>> +#   Commit B:  y/{a,b},  w/{c,d},  u/{e,f}
>> +#   Commit C: z/{a,t}, x/{b,c}, v/{d,e}, u/f
>> +#   Expected: <see NOTE first>
>> +#
>> +#   NOTE: z/ -> y/ (in commit B)
>> +#         y/ -> x/ (in commit C)
>> +#         x/ -> w/ (in commit B)
>> +#         w/ -> v/ (in commit C)
>> +#         v/ -> u/ (in commit B)
>> +#         So, if we add a file to z, say z/t, where should it end up?  In u?
>> +#         What if there's another file or directory named 't' in one of the
>> +#         intervening directories and/or in u itself?  Also, shouldn't the
>> +#         same logic that places 't' in u/ also move ALL other files to u/?
>> +#         What if there are file or directory conflicts in any of them?  If
>> +#         we attempted to do N-way (N-fold? N-ary? N-uple?) transitive renames
>> +#         like this, would the user have any hope of understanding any
>> +#         conflicts or how their working tree ended up?  I think not, so I'm
>> +#         ruling out N-ary transitive renames for N>1.
>> +#
>> +#   Therefore our expected result is:
>> +#     z/t, y/a, x/b, w/c, u/d, u/e, u/f
>> +#   The reason that v/d DOES get transitively renamed to u/d is that u/ isn't
>> +#   renamed somewhere.  A slightly sub-optimal result, but it uses fairly
>> +#   simple rules that are consistent with what we need for all the other
>> +#   testcases and simplifies things for the user.
>
> Does the merge order matter here?

No.

> If B and C were swapped, applying the same logic presented in the NOTE,
> one could argue that we expect:
>
>     z/t y/a x/b w/c v/d v/e u/f
>
> I can make a strong point for y/a here, but the v/{d,e} also seem to deviate.

I don't understand; I thought my argument as presented was agnostic of
direction.  Perhaps I have an unstated assumption I'm not realizing or
something; could you explain how my logic above could lead to this
conclusion?

Also, let me try a different tack to see if it's clearer than the
above argument I made.  Looking at each path:

* z/t from commit C does not get renamed to y/t despite B's rename of
z/ -> y/ because C renamed y/ elsewhere.
* z/a from commit A was renamed to y/a in commit B.  We do not
transitively rename further from y/a to x/a (despite C's rename of y/
to x/) because B renamed x/ elsewhere.
* y/b from commit A was renamed to x/b in commit C.  We do not
transitively rename further from x/b to w/b (despite B's rename of x/
to w/) because C renamed w/ elsewhere.
* x/c from commit A was renamed to w/c in commit B.  We do not
transitively rename further from w/c to v/c (despite C's rename from
w/ to v/) because B renamed v/ elsewhere.
* w/d from commit A was renamed to v/d in commit C.  We DO
transitively rename from v/d to u/d because of B's rename of v/ to u/
and because C did not rename u/ to somewhere else.

(And, to complete the list, e and f are simple: v/e is renamed to u/e
in commit B, and there's no directory rename of u/ on either side, so
there's no special logic needed at all.  u/f is even simpler; there
are no renames or directory renames or anything affecting it.)
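
The per-path reasoning above boils down to one rule, which can be
sketched as a toy model.  This is not the merge-recursive code; the
struct, the function names, and the flat arrays are all assumptions made
for illustration.  The rule: a path sitting in directory D on one side
follows the other side's rename of D to E, unless its own side renamed E
somewhere else.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative only: one directory rename detected on one side. */
struct dir_rename {
	const char *from;
	const char *to;
};

/* Directory renames in testcase 9d, per side. */
const struct dir_rename renames_b[] = { {"z", "y"}, {"x", "w"}, {"v", "u"} };
const struct dir_rename renames_c[] = { {"y", "x"}, {"w", "v"} };

/* Return the rename target of dir in the table, or NULL if none. */
static const char *lookup(const struct dir_rename *r, size_t n,
			  const char *dir)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!strcmp(r[i].from, dir))
			return r[i].to;
	return NULL;
}

/*
 * Where does a path that lives in 'dir' on the "own" side end up?
 * Apply the other side's rename of 'dir' only if the own side did not
 * rename the target directory elsewhere (no N-fold chaining).
 */
const char *final_dir(const char *dir,
		      const struct dir_rename *own, size_t n_own,
		      const struct dir_rename *other, size_t n_other)
{
	const char *target = lookup(other, n_other, dir);

	if (!target)
		return dir;	/* other side did not rename this directory */
	if (lookup(own, n_own, target))
		return dir;	/* own side renamed the target elsewhere */
	return target;
}
```

For example, final_dir("v", renames_c, 2, renames_b, 3) models v/d from
commit C and yields "u", while final_dir("z", renames_c, 2, renames_b, 3)
models z/t and yields "z" because C renamed y/ elsewhere.  Swapping which
side is "own" and which is "other" is all that changes between commits,
which is why the merge order does not matter.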


>> +# Testcase 9e, N-to-1 whammo
>> +#   (Related to testcase 9c...and 1c and 7e)
>> +#   Commit A: dir1/{a,b}, dir2/{d,e}, dir3/{g,h}, dirN/{j,k}
>> +#   Commit B: dir1/{a,b,c,yo}, dir2/{d,e,f,yo}, dir3/{g,h,i,yo}, dirN/{j,k,l,yo}
>> +#   Commit C: combined/{a,b,d,e,g,h,j,k}
>> +#   Expected: combined/{a,b,c,d,e,f,g,h,i,j,k,l}, CONFLICT(Nto1) warnings,
>> +#             dir1/yo, dir2/yo, dir3/yo, dirN/yo
>
> Very neat!

:-)

>> +# Testcase 9f, Renamed directory that only contained immediate subdirs
>> +#   (Related to testcases 1e & 9g)
>> +#   Commit A: goal/{a,b}/$more_files
>> +#   Commit B: priority/{a,b}/$more_files
>> +#   Commit C: goal/{a,b}/$more_files, goal/c
>> +#   Expected: priority/{a,b}/$more_files, priority/c
>
>> +# Testcase 9g, Renamed directory that only contained immediate subdirs, immediate subdirs renamed
>> +#   (Related to testcases 1e & 9f)
>> +#   Commit A: goal/{a,b}/$more_files
>> +#   Commit B: priority/{alpha,bravo}/$more_files
>> +#   Commit C: goal/{a,b}/$more_files, goal/c
>> +#   Expected: priority/{alpha,bravo}/$more_files, priority/c
>
> and if C also added goal/a/another_file, we'd expect it to
> become priority/alpha/another_file.

Yep!  I thought that was covered enough by other tests, but do you
feel I should add that to this testcase?

> What happens in moving dir hierarchies?
>
> A: root/node1/{leaf1, leaf2}, root/node2/{leaf3, leaf4}
> B: "Move node2 one layer down into node1"
>     root/node1/{leaf1, leaf2, node2/{leaf3, leaf4}}
> C: "Add more leaves"
>     root/node1/{leaf1, leaf2, leaf5}, root/node2/{leaf3, leaf4, leaf6}

Works just fine; similar to testcase 9a.  Do you feel this one is
different enough to add to the testsuite?  I'm happy to do so.

> Or chaining putting things in one another:
> (Same A)
> B: "Move node2 one layer down into node1"
>     root/node1/{leaf1, leaf2, node2/{leaf3, leaf4}}
> C: "Move node1 one layer down into node2"
>     root/node2/{leaf3, leaf4, node1/{leaf1, leaf2}}
>
> Just food for thought.

That's evil.  I mean, it's a brilliant testcase designed to really
mess things up.  I'm not entirely sure what the right answer should
be, but I am confident saying my current implementation handles it
wrong.  I'm digging into why.


* Re: [PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging
  2017-11-10 19:05 ` [PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging Elijah Newren
@ 2018-06-10 10:56   ` René Scharfe
  2018-06-10 11:03     ` René Scharfe
                       ` (3 more replies)
  0 siblings, 4 replies; 81+ messages in thread
From: René Scharfe @ 2018-06-10 10:56 UTC (permalink / raw)
  To: Elijah Newren, git; +Cc: Junio C Hamano, Jeff King

Am 10.11.2017 um 20:05 schrieb Elijah Newren:
> +static struct dir_rename_entry *check_dir_renamed(const char *path,
> +						  struct hashmap *dir_renames) {
> +	char temp[PATH_MAX];
> +	char *end;
> +	struct dir_rename_entry *entry;
> +
> +	strcpy(temp, path);
> +	while ((end = strrchr(temp, '/'))) {
> +		*end = '\0';
> +		entry = dir_rename_find_entry(dir_renames, temp);
> +		if (entry)
> +			return entry;
> +	}
> +	return NULL;
> +}

The value of PATH_MAX is platform-dependent, so it's easy to exceed when
doing cross-platform development.  It's also not a hard limit on most
operating systems, not even on Windows.  Further reading:

   https://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html

So using a fixed buffer is not a good idea, and writing to it without
checking is dangerous.  Here's a fix:

-- >8 --
Subject: [PATCH] merge-recursive: use xstrdup() instead of fixed buffer

Paths can be longer than PATH_MAX.  Avoid a buffer overrun in
check_dir_renamed() by using xstrdup() to make a private copy safely.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 merge-recursive.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index ac27abbd4c..db708176c5 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2211,18 +2211,18 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
 static struct dir_rename_entry *check_dir_renamed(const char *path,
 						  struct hashmap *dir_renames)
 {
-	char temp[PATH_MAX];
+	char *temp = xstrdup(path);
 	char *end;
-	struct dir_rename_entry *entry;
+	struct dir_rename_entry *entry = NULL;
 
-	strcpy(temp, path);
 	while ((end = strrchr(temp, '/'))) {
 		*end = '\0';
 		entry = dir_rename_find_entry(dir_renames, temp);
 		if (entry)
-			return entry;
+			break;
 	}
-	return NULL;
+	free(temp);
+	return entry;
 }
 
 static void compute_collisions(struct hashmap *collisions,
-- 
2.17.1


* Re: [PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging
  2018-06-10 10:56   ` René Scharfe
@ 2018-06-10 11:03     ` René Scharfe
  2018-06-10 20:44     ` Jeff King
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 81+ messages in thread
From: René Scharfe @ 2018-06-10 11:03 UTC (permalink / raw)
  To: Elijah Newren, git; +Cc: Junio C Hamano, Jeff King

Am 10.06.2018 um 12:56 schrieb René Scharfe:
> Am 10.11.2017 um 20:05 schrieb Elijah Newren:
>> +static struct dir_rename_entry *check_dir_renamed(const char *path,
>> +						  struct hashmap *dir_renames) {
>> +	char temp[PATH_MAX];
>> +	char *end;
>> +	struct dir_rename_entry *entry;
>> +
>> +	strcpy(temp, path);
>> +	while ((end = strrchr(temp, '/'))) {
>> +		*end = '\0';
>> +		entry = dir_rename_find_entry(dir_renames, temp);
>> +		if (entry)
>> +			return entry;
>> +	}
>> +	return NULL;
>> +}
> 
> The value of PATH_MAX is platform-dependent, so it's easy to exceed when
> doing cross-platform development.  It's also not a hard limit on most
> operating systems, not even on Windows.  Further reading:
> 
>     https://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html
> 
> So using a fixed buffer is not a good idea, and writing to it without
> checking is dangerous.  Here's a fix:

Argh, I meant to reply to v10 of that patch, i.e. this:

   https://public-inbox.org/git/20180419175823.7946-21-newren@gmail.com/

The cited code wasn't changed and is in current master, though, so both
that part and my patch are still relevant.

René


* Re: [PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging
  2018-06-10 10:56   ` René Scharfe
  2018-06-10 11:03     ` René Scharfe
@ 2018-06-10 20:44     ` Jeff King
  2018-06-11 15:03     ` Elijah Newren
  2018-06-14 17:36     ` Junio C Hamano
  3 siblings, 0 replies; 81+ messages in thread
From: Jeff King @ 2018-06-10 20:44 UTC (permalink / raw)
  To: René Scharfe; +Cc: Elijah Newren, git, Junio C Hamano

On Sun, Jun 10, 2018 at 12:56:31PM +0200, René Scharfe wrote:

> The value of PATH_MAX is platform-dependent, so it's easy to exceed when
> doing cross-platform development.  It's also not a hard limit on most
> operating systems, not even on Windows.  Further reading:
> 
>    https://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html
> 
> So using a fixed buffer is not a good idea, and writing to it without
> checking is dangerous.  Here's a fix:

Even on platforms where it _is_ a hard-limit, we are quite often dealing
with paths that come from tree objects (so even if the OS would
eventually complain about our path, it is small consolation when we
smash the stack before we get there).
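
As a standalone illustration of why the heap copy matters, here is a
rough sketch of the same walk-up-the-parents lookup outside of git.
find_renamed_dir() is a hypothetical stand-in for
dir_rename_find_entry() that knows a single renamed directory, and
dup_string() stands in for xstrdup(); neither is git code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the hashmap lookup: one known rename. */
static const char *find_renamed_dir(const char *dir)
{
	return strcmp(dir, "old/dir") ? NULL : "new/dir";
}

/* Heap copy sized to the input; aborts on allocation failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/*
 * Strip one trailing path component at a time and look up each parent
 * directory.  Returns the new name of the nearest renamed ancestor, or
 * NULL if no ancestor was renamed.  Works for any path length because
 * the scratch copy is as large as the path itself.
 */
const char *nearest_renamed_ancestor(const char *path)
{
	char *temp = dup_string(path);
	char *end;
	const char *found = NULL;

	while ((end = strrchr(temp, '/'))) {
		*end = '\0';
		found = find_renamed_dir(temp);
		if (found)
			break;
	}
	free(temp);
	return found;
}
```

A path of several thousand bytes under old/dir/ is handled the same way
as a short one, whereas the fixed char temp[PATH_MAX] version would
write past the end of the buffer for such input.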

Your patch looks good to me, and we definitely should address this
before v2.18-final.

> -	char temp[PATH_MAX];
> +	char *temp = xstrdup(path);
>  	char *end;
> -	struct dir_rename_entry *entry;
> +	struct dir_rename_entry *entry = NULL;;
>  
> -	strcpy(temp, path);

I'm sad that this strcpy() wasn't caught in review. IMHO we should avoid
that function altogether, even when we _think_ it can't trigger an
overflow. That's easier to reason about (and makes auditing easier).
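
For readers not familiar with the idiom, a checked copy in the spirit
of xsnprintf() might look roughly like the sketch below.  The name
checked_snprintf() and the error text are made up here; this is not the
implementation in git's wrapper code, just the general shape: format
into the buffer, and treat truncation as a programming error instead of
silently continuing.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: like snprintf(), but aborts if the output did
 * not fit (or formatting failed), so callers never see a truncated
 * string.  Returns the number of characters written, excluding the
 * terminating NUL.
 */
int checked_snprintf(char *dst, size_t max, const char *fmt, ...)
{
	va_list ap;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(dst, max, fmt, ap);
	va_end(ap);

	if (len < 0 || (size_t)len >= max) {
		fprintf(stderr, "BUG: output does not fit in %lu-byte buffer\n",
			(unsigned long)max);
		abort();
	}
	return len;
}
```

Unlike strcpy(), the call site carries the destination size, so an
auditor can see at a glance that the copy is bounded.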

It looks like another one has crept in recently, too.

-- >8 --
Subject: [PATCH] blame: prefer xsnprintf to strcpy for colors

Our color buffers are all COLOR_MAXLEN, which fits the
largest possible color. So we can never overflow the buffer
by copying an existing color. However, using strcpy() makes
it harder to audit the code-base for calls that _are_
problems. We should use something like xsnprintf(), which
shows the reader that we expect this never to fail (and
provides a run-time assertion if it does, just in case).

Signed-off-by: Jeff King <peff@peff.net>
---
Another option would just be color_parse(repeated_meta_color, "cyan").
The run-time cost is slightly higher, but it probably doesn't matter
here, and perhaps it's more readable.

This one is less critical for v2.18.

 builtin/blame.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 4202584f97..45770c5a8c 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1068,7 +1068,9 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
 		find_alignment(&sb, &output_option);
 		if (!*repeated_meta_color &&
 		    (output_option & OUTPUT_COLOR_LINE))
-			strcpy(repeated_meta_color, GIT_COLOR_CYAN);
+			xsnprintf(repeated_meta_color,
+				  sizeof(repeated_meta_color),
+				  "%s", GIT_COLOR_CYAN);
 	}
 	if (output_option & OUTPUT_ANNOTATE_COMPAT)
 		output_option &= ~(OUTPUT_COLOR_LINE | OUTPUT_SHOW_AGE_WITH_COLOR);
-- 
2.18.0.rc1.446.g4486251e51



* Re: [PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging
  2018-06-10 10:56   ` René Scharfe
  2018-06-10 11:03     ` René Scharfe
  2018-06-10 20:44     ` Jeff King
@ 2018-06-11 15:03     ` Elijah Newren
  2018-06-14 17:36     ` Junio C Hamano
  3 siblings, 0 replies; 81+ messages in thread
From: Elijah Newren @ 2018-06-11 15:03 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git Mailing List, Junio C Hamano, Jeff King

On Sun, Jun 10, 2018 at 3:56 AM, René Scharfe <l.s.r@web.de> wrote:
> Am 10.11.2017 um 20:05 schrieb Elijah Newren:
>> +static struct dir_rename_entry *check_dir_renamed(const char *path,
>> +                                               struct hashmap *dir_renames) {
>> +     char temp[PATH_MAX];
>> +     char *end;
>> +     struct dir_rename_entry *entry;
>> +
>> +     strcpy(temp, path);
>> +     while ((end = strrchr(temp, '/'))) {
>> +             *end = '\0';
>> +             entry = dir_rename_find_entry(dir_renames, temp);
>> +             if (entry)
>> +                     return entry;
>> +     }
>> +     return NULL;
>> +}
>
> The value of PATH_MAX is platform-dependent, so it's easy to exceed when
> doing cross-platform development.  It's also not a hard limit on most
> operating systems, not even on Windows.  Further reading:
>
>    https://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html
>
> So using a fixed buffer is not a good idea, and writing to it without
> checking is dangerous.  Here's a fix:

Thanks for the pointers, and for providing a fix.

> -- >8 --
> Subject: [PATCH] merge-recursive: use xstrdup() instead of fixed buffer
>
> Paths can be longer than PATH_MAX.  Avoid a buffer overrun in
> check_dir_renamed() by using xstrdup() to make a private copy safely.
>
> Signed-off-by: Rene Scharfe <l.s.r@web.de>
> ---
>  merge-recursive.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index ac27abbd4c..db708176c5 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -2211,18 +2211,18 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
>  static struct dir_rename_entry *check_dir_renamed(const char *path,
>                                                   struct hashmap *dir_renames)
>  {
> -       char temp[PATH_MAX];
> +       char *temp = xstrdup(path);
>         char *end;
> -       struct dir_rename_entry *entry;
> +       struct dir_rename_entry *entry = NULL;;
>
> -       strcpy(temp, path);
>         while ((end = strrchr(temp, '/'))) {
>                 *end = '\0';
>                 entry = dir_rename_find_entry(dir_renames, temp);
>                 if (entry)
> -                       return entry;
> +                       break;
>         }
> -       return NULL;
> +       free(temp);
> +       return entry;
>  }
>
>  static void compute_collisions(struct hashmap *collisions,
> --
> 2.17.1

Reviewed-by: Elijah Newren <newren@gmail.com>


* Re: [PATCH 24/30] merge-recursive: Add computation of collisions due to dir rename & merging
  2018-06-10 10:56   ` René Scharfe
                       ` (2 preceding siblings ...)
  2018-06-11 15:03     ` Elijah Newren
@ 2018-06-14 17:36     ` Junio C Hamano
  3 siblings, 0 replies; 81+ messages in thread
From: Junio C Hamano @ 2018-06-14 17:36 UTC (permalink / raw)
  To: René Scharfe; +Cc: Elijah Newren, git, Jeff King

René Scharfe <l.s.r@web.de> writes:

> The value of PATH_MAX is platform-dependent, so it's easy to exceed when
> doing cross-platform development.  It's also not a hard limit on most
> operating systems, not even on Windows.  Further reading:
>
>    https://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html
>
> So using a fixed buffer is not a good idea, and writing to it without
> checking is dangerous.  Here's a fix:
>
> -- >8 --
> Subject: [PATCH] merge-recursive: use xstrdup() instead of fixed buffer
>
> Paths can be longer than PATH_MAX.  Avoid a buffer overrun in
> check_dir_renamed() by using xstrdup() to make a private copy safely.
>
> Signed-off-by: Rene Scharfe <l.s.r@web.de>
> ---

Thanks.  Makes sense.

>  merge-recursive.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index ac27abbd4c..db708176c5 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -2211,18 +2211,18 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
>  static struct dir_rename_entry *check_dir_renamed(const char *path,
>  						  struct hashmap *dir_renames)
>  {
> -	char temp[PATH_MAX];
> +	char *temp = xstrdup(path);
>  	char *end;
> -	struct dir_rename_entry *entry;
> +	struct dir_rename_entry *entry = NULL;;
>  
> -	strcpy(temp, path);
>  	while ((end = strrchr(temp, '/'))) {
>  		*end = '\0';
>  		entry = dir_rename_find_entry(dir_renames, temp);
>  		if (entry)
> -			return entry;
> +			break;
>  	}
> -	return NULL;
> +	free(temp);
> +	return entry;
>  }
>  
>  static void compute_collisions(struct hashmap *collisions,
