git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / Atom feed
* BUG() during criss-cross merge with directory rename and deleted file
@ 2019-07-26 22:09 Emily Shaffer
  2019-08-05 22:33 ` [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case Elijah Newren
  0 siblings, 1 reply; 11+ messages in thread
From: Emily Shaffer @ 2019-07-26 22:09 UTC (permalink / raw)
  To: newren; +Cc: git, jrn

Hi all,

As mentioned during the standup in #git-devel today, we encountered a
BUG() during certain circumstances of a criss-cross merge. Timing has
prevented me from getting the repro case together to send just yet, but
it appears that the issue is as follows:

 - During a merge in a repo with fairly complicated, merge-y history,
   the following BUG() is seen:
   "BUG: There are unmerged index entries:
    BUG: 2 baz/bar.txt
    BUG: merge-recursive.c:429: unmerged index entries in
    merge-recursive.c
    Aborted"
 - But when the user then runs `git status` (after clearing
   .git/index.lock) the directory is clean.
 - The repo does not contain a 'baz/bar.txt' (although 'baz/' exists).
 - In the repo's history a 'foo/bar.txt' can be found (that is the only
   'bar.txt' to ever exist in the project).

Digging further shows:

 - The merge had  multiple closest shared ancestors (criss-cross
   merge)
 - Directory 'foo/' was renamed on one side to 'baz/'...
 - ...and 'foo/bar.txt' was deleted on the other side.
 - When the virtual ancestor is generated, the directory rename can't be
   resolved and leaves a conflict.
 - The virtual ancestor being written to disk is in a conflicted state.
 - This causes the top-level merge to fail, printing the BUG() above
   which references a 'baz/bar.txt' that never really lived there.

Furthermore...

 - If merge.directoryrenames is set to any value besides "conflict", the
   merge succeeds (no BUG() generated).

This seems to have been introduced in 8c8e5bd6eb331d055aa7fa6345f6dcdadd658979
although I haven't bisected it yet.

It seems like the solution is to watch for whether we're making a
virtual ancestor during a recursive merge, and if so, to treat
merge.directoryrenames = "conflict" as "true" or "false" instead.

I plan to modify the tests added in 8c8e5bd6 to highlight this issue
(just haven't had time, and probably won't til next week), and send a
patch doing the special treatment on merge.directoryrenames.

Happy to hear other ideas folks have.

 - Emily

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-07-26 22:09 BUG() during criss-cross merge with directory rename and deleted file Emily Shaffer
@ 2019-08-05 22:33 ` Elijah Newren
  2019-08-06 16:57   ` Junio C Hamano
                     ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Elijah Newren @ 2019-08-05 22:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Emily Shaffer, Elijah Newren

Ever since commit 8c8e5bd6eb33 ("merge-recursive: switch directory
rename detection default", 2019-04-05), the default handling with
directory rename detection was to report a conflict and leave unstaged
entries in the index.  However, when creating a virtual merge base in
the recursive case, we absolutely need a tree, and the only way a tree
can be written is if we have no unstaged entries -- otherwise we hit a
BUG().

There are a few fixes possible here which at least fix the BUG(), but
none of them seem optimal for other reasons; see the comments with the
new testcase 13e in t6043 for details (which testcase triggered a BUG()
prior to this patch).  As such, just opt for a very conservative and
simple choice that is still relatively reasonable: have the recursive
case treat 'conflict' as 'false' for opt->detect_directory_renames.

Reported-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---

I really should introduce constants like
  DETECT_DIRECTORY_RENAMES_NEVER    = 0
  DETECT_DIRECTORY_RENAMES_CONFLICT = 1
  DETECT_DIRECTORY_RENAMES_YES      = 2
and then use them in the code to make it clearer, but I wanted to make
the code change as simple and contained as possible given that this is
built on maint, fixes a BUG() and we're already in -rc1.

I know this bug doesn't satisfy the normal criteria for making it into
2.23 (it's a bug that was present in 2.22 rather than a regression in
2.23), but given that it's a BUG() condition, I was hoping it is
important and safe enough to include anyway.

(This fix does merge down cleanly to master, next, and pu.)

 merge-recursive.c                   |   3 +-
 t/t6043-merge-rename-directories.sh | 111 ++++++++++++++++++++++++++++
 2 files changed, 113 insertions(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index d2e380b7ed..c7691d9b54 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2856,7 +2856,8 @@ static int detect_and_process_renames(struct merge_options *opt,
 	head_pairs = get_diffpairs(opt, common, head);
 	merge_pairs = get_diffpairs(opt, common, merge);
 
-	if (opt->detect_directory_renames) {
+	if ((opt->detect_directory_renames == 2) ||
+	    (opt->detect_directory_renames == 1 && !opt->call_depth)) {
 		dir_re_head = get_directory_renames(head_pairs);
 		dir_re_merge = get_directory_renames(merge_pairs);
 
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 50b7543483..c966147d5d 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -4403,4 +4403,115 @@ test_expect_success '13d-check(info): messages for rename/rename(1to1) via dual
 	)
 '
 
+# Testcase 13e, directory rename in virtual merge base
+#
+# This testcase has a slightly different setup than all the above cases, in
+# order to include a recursive case:
+#
+#      A   C
+#      o - o
+#     / \ / \
+#  O o   X   ?
+#     \ / \ /
+#      o   o
+#      B   D
+#
+#   Commit O: a/{z,y}
+#   Commit A: b/{z,y}
+#   Commit B: a/{z,y,x}
+#   Commit C: b/{z,y,x}
+#   Commit D: b/{z,y}, a/x
+#   Expected: b/{z,y,x}  (sort of; see below for why this might not be expected)
+#
+#   NOTES: 'X' represents a virtual merge base.  With the default of
+#          directory rename detection yielding conflicts, merging A and B
+#          results in a conflict complaining about whether 'x' should be
+#          under 'a/' or 'b/'.  However, when creating the virtual merge
+#          base 'X', since virtual merge bases need to be written out as a
+#          tree, we cannot have a conflict, so some resolution has to be
+#          picked.
+#
+#          In choosing the right resolution, it's worth noting here that
+#          commits C & D are merges of A & B that choose different
+#          locations for 'x' (i.e. they resolve the conflict differently),
+#          and so it would be nice when merging C & D if git could detect
+#          this difference of opinion and report a conflict.  But the only
+#          way to do so that I can think of would be to have the virtual
+#          merge base place 'x' in some directory other than either 'a/' or
+#          'b/', which seems a little weird -- especially since it'd result
+#          in a rename/rename(1to2) conflict with a source path that never
+#          existed in any version.
+#
+#          So, for now, when directory rename detection is set to
+#          'conflict' just avoid doing directory rename detection at all in
+#          the recursive case.  This will not allow us to detect a conflict
+#          in the outer merge for this special kind of setup, but it at
+#          least avoids hitting a BUG().
+#
+test_expect_success '13e-setup: directory rename detection in recursive case' '
+	test_create_repo 13e &&
+	(
+		cd 13e &&
+
+		mkdir a &&
+		echo z >a/z &&
+		echo y >a/y &&
+		git add a &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv a/ b/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo x >a/x &&
+		git add a &&
+		test_tick &&
+		git commit -m "B" &&
+
+		git branch C A &&
+		git branch D B &&
+
+		git checkout C &&
+		test_must_fail git -c merge.directoryRenames=conflict merge B &&
+		git add b/x &&
+		test_tick &&
+		git commit -m "C" &&
+
+
+		git checkout D &&
+		test_must_fail git -c merge.directoryRenames=conflict merge A &&
+		git add b/x &&
+		mkdir a &&
+		git mv b/x a/x &&
+		test_tick &&
+		git commit -m "D"
+	)
+'
+
+test_expect_success '13e-check: directory rename detection in recursive case' '
+	(
+		cd 13e &&
+
+		git checkout --quiet D^0 &&
+
+		git -c merge.directoryRenames=conflict merge -s recursive C^0 >out 2>err &&
+
+		test_i18ngrep ! CONFLICT out &&
+		test_i18ngrep ! BUG: err &&
+		test_i18ngrep ! core.dumped err &&
+		test_must_be_empty err &&
+
+		git ls-files >paths &&
+		! grep a/x paths &&
+		grep b/x paths
+	)
+'
+
 test_done
-- 
2.22.0.246.g5ddf3d502a


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-08-05 22:33 ` [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case Elijah Newren
@ 2019-08-06 16:57   ` Junio C Hamano
  2019-08-06 17:26     ` Elijah Newren
  2019-08-06 17:49     ` Junio C Hamano
  2019-08-06 17:26   ` Junio C Hamano
  2019-08-06 20:28   ` Emily Shaffer
  2 siblings, 2 replies; 11+ messages in thread
From: Junio C Hamano @ 2019-08-06 16:57 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Emily Shaffer

Elijah Newren <newren@gmail.com> writes:

> Ever since commit 8c8e5bd6eb33 ("merge-recursive: switch directory
> rename detection default", 2019-04-05), the default handling with
> directory rename detection was to report a conflict and leave unstaged
> entries in the index.  However, when creating a virtual merge base in
> the recursive case, we absolutely need a tree, and the only way a tree
> can be written is if we have no unstaged entries -- otherwise we hit a
> BUG().
>
> There are a few fixes possible here which at least fix the BUG(), but
> none of them seem optimal for other reasons; see the comments with the
> new testcase 13e in t6043 for details (which testcase triggered a BUG()
> prior to this patch).  As such, just opt for a very conservative and
> simple choice that is still relatively reasonable: have the recursive
> case treat 'conflict' as 'false' for opt->detect_directory_renames.
>
> Reported-by: Emily Shaffer <emilyshaffer@google.com>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>
> I really should introduce constants like
>   DETECT_DIRECTORY_RENAMES_NEVER    = 0
>   DETECT_DIRECTORY_RENAMES_CONFLICT = 1
>   DETECT_DIRECTORY_RENAMES_YES      = 2

How would they compare with MERGE_DIRECTORY_RENAMES_* macros
I see at the tip of 'pu'?  init_merge_options() seems to read
one of those values from the "repo settings" and copies it to
the detect_directory_renames field, so I am reading that they
must be identical.

> and then use them in the code to make it clearer, but I wanted to make
> the code change as simple and contained as possible given that this is
> built on maint, fixes a BUG() and we're already in -rc1.
>
> I know this bug doesn't satisfy the normal criteria for making it into
> 2.23 (it's a bug that was present in 2.22 rather than a regression in
> 2.23), but given that it's a BUG() condition, I was hoping it is
> important and safe enough to include anyway.

I do agree that it is sensible to avoid doing any funky thing during
the virtual base merges, whose result is much less observable (hence
harder to form the right mental model in end user's head) than the
outermost merge.  Do we want to allow this for inner merges when the
setting is 2?  Wouldn't that hit the same BUG()?

> (This fix does merge down cleanly to master, next, and pu.)
>
>  merge-recursive.c                   |   3 +-
>  t/t6043-merge-rename-directories.sh | 111 ++++++++++++++++++++++++++++
>  2 files changed, 113 insertions(+), 1 deletion(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index d2e380b7ed..c7691d9b54 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -2856,7 +2856,8 @@ static int detect_and_process_renames(struct merge_options *opt,
>  	head_pairs = get_diffpairs(opt, common, head);
>  	merge_pairs = get_diffpairs(opt, common, merge);
>  
> -	if (opt->detect_directory_renames) {
> +	if ((opt->detect_directory_renames == 2) ||
> +	    (opt->detect_directory_renames == 1 && !opt->call_depth)) {
>  		dir_re_head = get_directory_renames(head_pairs);
>  		dir_re_merge = get_directory_renames(merge_pairs);
>  
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> index 50b7543483..c966147d5d 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -4403,4 +4403,115 @@ test_expect_success '13d-check(info): messages for rename/rename(1to1) via dual
>  	)
>  '
>  
> +# Testcase 13e, directory rename in virtual merge base
> +#
> +# This testcase has a slightly different setup than all the above cases, in
> +# order to include a recursive case:
> +#
> +#      A   C
> +#      o - o
> +#     / \ / \
> +#  O o   X   ?
> +#     \ / \ /
> +#      o   o
> +#      B   D
> +#
> +#   Commit O: a/{z,y}
> +#   Commit A: b/{z,y}
> +#   Commit B: a/{z,y,x}
> +#   Commit C: b/{z,y,x}
> +#   Commit D: b/{z,y}, a/x
> +#   Expected: b/{z,y,x}  (sort of; see below for why this might not be expected)
> +#
> +#   NOTES: 'X' represents a virtual merge base.  With the default of
> +#          directory rename detection yielding conflicts, merging A and B
> +#          results in a conflict complaining about whether 'x' should be
> +#          under 'a/' or 'b/'.  However, when creating the virtual merge
> +#          base 'X', since virtual merge bases need to be written out as a
> +#          tree, we cannot have a conflict, so some resolution has to be
> +#          picked.
> +#
> +#          In choosing the right resolution, it's worth noting here that
> +#          commits C & D are merges of A & B that choose different
> +#          locations for 'x' (i.e. they resolve the conflict differently),
> +#          and so it would be nice when merging C & D if git could detect
> +#          this difference of opinion and report a conflict.  But the only
> +#          way to do so that I can think of would be to have the virtual
> +#          merge base place 'x' in some directory other than either 'a/' or
> +#          'b/', which seems a little weird -- especially since it'd result
> +#          in a rename/rename(1to2) conflict with a source path that never
> +#          existed in any version.
> +#
> +#          So, for now, when directory rename detection is set to
> +#          'conflict' just avoid doing directory rename detection at all in
> +#          the recursive case.  This will not allow us to detect a conflict
> +#          in the outer merge for this special kind of setup, but it at
> +#          least avoids hitting a BUG().
> +#
> +test_expect_success '13e-setup: directory rename detection in recursive case' '
> +	test_create_repo 13e &&
> +	(
> +		cd 13e &&
> +
> +		mkdir a &&
> +		echo z >a/z &&
> +		echo y >a/y &&
> +		git add a &&
> +		test_tick &&
> +		git commit -m "O" &&
> +
> +		git branch O &&
> +		git branch A &&
> +		git branch B &&
> +
> +		git checkout A &&
> +		git mv a/ b/ &&
> +		test_tick &&
> +		git commit -m "A" &&
> +
> +		git checkout B &&
> +		echo x >a/x &&
> +		git add a &&
> +		test_tick &&
> +		git commit -m "B" &&
> +
> +		git branch C A &&
> +		git branch D B &&
> +
> +		git checkout C &&
> +		test_must_fail git -c merge.directoryRenames=conflict merge B &&
> +		git add b/x &&
> +		test_tick &&
> +		git commit -m "C" &&
> +
> +
> +		git checkout D &&
> +		test_must_fail git -c merge.directoryRenames=conflict merge A &&
> +		git add b/x &&
> +		mkdir a &&
> +		git mv b/x a/x &&
> +		test_tick &&
> +		git commit -m "D"
> +	)
> +'
> +
> +test_expect_success '13e-check: directory rename detection in recursive case' '
> +	(
> +		cd 13e &&
> +
> +		git checkout --quiet D^0 &&
> +
> +		git -c merge.directoryRenames=conflict merge -s recursive C^0 >out 2>err &&
> +
> +		test_i18ngrep ! CONFLICT out &&
> +		test_i18ngrep ! BUG: err &&
> +		test_i18ngrep ! core.dumped err &&
> +		test_must_be_empty err &&
> +
> +		git ls-files >paths &&
> +		! grep a/x paths &&
> +		grep b/x paths
> +	)
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-08-05 22:33 ` [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case Elijah Newren
  2019-08-06 16:57   ` Junio C Hamano
@ 2019-08-06 17:26   ` Junio C Hamano
  2019-08-06 17:29     ` Elijah Newren
  2019-08-06 20:28   ` Emily Shaffer
  2 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2019-08-06 17:26 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Emily Shaffer

Elijah Newren <newren@gmail.com> writes:

> I know this bug doesn't satisfy the normal criteria for making it into
> 2.23 (it's a bug that was present in 2.22 rather than a regression in
> 2.23), but given that it's a BUG() condition, I was hoping it is
> important and safe enough to include anyway.

For maintenance and upcoming release, a safer "fix" to do might be
to also (in addition to this patch) flip the default to no to revert
to the stable state before "directory renames" was introduced, while
still allowing those who want to help can explore the right fix to
this codepath.

> (This fix does merge down cleanly to master, next, and pu.)
>
>  merge-recursive.c                   |   3 +-
>  t/t6043-merge-rename-directories.sh | 111 ++++++++++++++++++++++++++++
>  2 files changed, 113 insertions(+), 1 deletion(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index d2e380b7ed..c7691d9b54 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -2856,7 +2856,8 @@ static int detect_and_process_renames(struct merge_options *opt,
>  	head_pairs = get_diffpairs(opt, common, head);
>  	merge_pairs = get_diffpairs(opt, common, merge);
>  
> -	if (opt->detect_directory_renames) {
> +	if ((opt->detect_directory_renames == 2) ||
> +	    (opt->detect_directory_renames == 1 && !opt->call_depth)) {



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-08-06 16:57   ` Junio C Hamano
@ 2019-08-06 17:26     ` Elijah Newren
  2019-08-06 17:49     ` Junio C Hamano
  1 sibling, 0 replies; 11+ messages in thread
From: Elijah Newren @ 2019-08-06 17:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Emily Shaffer, Derrick Stolee

On Tue, Aug 6, 2019 at 9:57 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > Ever since commit 8c8e5bd6eb33 ("merge-recursive: switch directory
> > rename detection default", 2019-04-05), the default handling with
> > directory rename detection was to report a conflict and leave unstaged
> > entries in the index.  However, when creating a virtual merge base in
> > the recursive case, we absolutely need a tree, and the only way a tree
> > can be written is if we have no unstaged entries -- otherwise we hit a
> > BUG().
> >
> > There are a few fixes possible here which at least fix the BUG(), but
> > none of them seem optimal for other reasons; see the comments with the
> > new testcase 13e in t6043 for details (which testcase triggered a BUG()
> > prior to this patch).  As such, just opt for a very conservative and
> > simple choice that is still relatively reasonable: have the recursive
> > case treat 'conflict' as 'false' for opt->detect_directory_renames.
> >
> > Reported-by: Emily Shaffer <emilyshaffer@google.com>
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >
> > I really should introduce constants like
> >   DETECT_DIRECTORY_RENAMES_NEVER    = 0
> >   DETECT_DIRECTORY_RENAMES_CONFLICT = 1
> >   DETECT_DIRECTORY_RENAMES_YES      = 2
>
> How would they compare with MERGE_DIRECTORY_RENAMES_* macros
> I see at the tip of 'pu'?  init_merge_options() seems to read
> one of those values from the "repo settings" and copies it to
> the detect_directory_renames field, so I am reading that they
> must be identical.

Indeed, it looks like Stolee has done my work for me -- though I would
suspect that his change won't be merged down to maint, and I am
assuming that my patch might be.  If you don't want to merge it down
to maint, I can rebase on Stolee's patch; just let me know.  (And if
you do want to merge it down to maint, I can send a subsequent patch
to modify and adopt Stolee's naming.)

> > and then use them in the code to make it clearer, but I wanted to make
> > the code change as simple and contained as possible given that this is
> > built on maint, fixes a BUG() and we're already in -rc1.
> >
> > I know this bug doesn't satisfy the normal criteria for making it into
> > 2.23 (it's a bug that was present in 2.22 rather than a regression in
> > 2.23), but given that it's a BUG() condition, I was hoping it is
> > important and safe enough to include anyway.
>
> I do agree that it is sensible to avoid doing any funky thing during
> the virtual base merges, whose result is much less observable (hence
> harder to form the right mental model in end user's head) than the
> outermost merge.  Do we want to allow this for inner merges when the
> setting is 2?  Wouldn't that hit the same BUG()?

No, it doesn't hit the same BUG().  Hitting the BUG() only came from
having unstaged entries at the end of an inner merge, which only came
from the 'conflict' setting for directory rename detection.  Having
directory rename detection completely on (detect the rename and accept
it as the resolution), or completely off (don't detect directory
renames, i.e. leave old pathname as the resolution) both have well
defined resolutions.

But that doesn't completely answer your question about whether we want
to have directory rename detection for the inner merges, it just
suggests we have to avoid 'conflict' as a setting in that case.  Let
me look at the same testcase:

> > +# Testcase 13e, directory rename in virtual merge base
> > +#
> > +# This testcase has a slightly different setup than all the above cases, in
> > +# order to include a recursive case:
> > +#
> > +#      A   C
> > +#      o - o
> > +#     / \ / \
> > +#  O o   X   ?
> > +#     \ / \ /
> > +#      o   o
> > +#      B   D
> > +#
> > +#   Commit O: a/{z,y}
> > +#   Commit A: b/{z,y}
> > +#   Commit B: a/{z,y,x}
> > +#   Commit C: b/{z,y,x}
> > +#   Commit D: b/{z,y}, a/x

Here, if the user has directory rename detection fully on
(opt->detect_directory_renames == 2), then the fact that commits C and
D resolved the merge differently is perhaps more surprising, because
the default would be to just accept the rename.  That means one side
(commit C) is likely to have just taken the default resolution, but
someone (whoever created commit D) made a manual effort to undo the
directory rename.  Now, if the virtual merge base doesn't do directory
rename detection, then it'll match commit D which undid the directory
rename detection, and result in commit C winning and using the
directory rename.  That seems like the wrong choice to me.

So, I think that if opt->detect_directory_renames == 2, then we really
do want to use directory rename detection on the virtual merge base so
that we don't silently miss someone manually undoing the directory
rename detection on one side of history.

The trickier choice comes when opt->detect_directory_renames == 1 (or
'conflict') because then we don't have a sane
avoid-accidentally-matching-one-side solution.  I'm not sure there's a
"best" choice for this case, but we should certainly at least avoid a
BUG().  The simple ways to do that were to translate 'conflict' to
either 'false' or 'true'.  I can't give a good rationale for why I
picked 'false' over 'true' in that case with this patch; either seem
equally good, but 'false' has a slight performance advantage over
'true' so I went with it.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-08-06 17:26   ` Junio C Hamano
@ 2019-08-06 17:29     ` Elijah Newren
  0 siblings, 0 replies; 11+ messages in thread
From: Elijah Newren @ 2019-08-06 17:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Emily Shaffer

On Tue, Aug 6, 2019 at 10:26 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > I know this bug doesn't satisfy the normal criteria for making it into
> > 2.23 (it's a bug that was present in 2.22 rather than a regression in
> > 2.23), but given that it's a BUG() condition, I was hoping it is
> > important and safe enough to include anyway.
>
> For maintenance and upcoming release, a safer "fix" to do might be
> to also (in addition to this patch) flip the default to no to revert
> to the stable state before "directory renames" was introduced, while
> still allowing those who want to help can explore the right fix to
> this codepath.

That might be a bit more jarring.  Directory rename detection did not
have this bug in git 2.18, 2.19, 2.20, or 2.21.  It was new to 2.22
with the 'conflict' setting

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-08-06 16:57   ` Junio C Hamano
  2019-08-06 17:26     ` Elijah Newren
@ 2019-08-06 17:49     ` Junio C Hamano
  1 sibling, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2019-08-06 17:49 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Emily Shaffer

Junio C Hamano <gitster@pobox.com> writes:

> I do agree that it is sensible to avoid doing any funky thing during
> the virtual base merges, whose result is much less observable (hence
> harder to form the right mental model in end user's head) than the
> outermost merge.  Do we want to allow this for inner merges when the
> setting is 2?  Wouldn't that hit the same BUG()?

Ah, actually no.  2 would make a choice of committing to one
possible merge result and does not leave options to be resolved as
conflicted, so we should be able to write out its decision as a
tree.  If we cannot form a tree object when operating with 2, we do
have a bug and we do want to know about it.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-08-05 22:33 ` [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case Elijah Newren
  2019-08-06 16:57   ` Junio C Hamano
  2019-08-06 17:26   ` Junio C Hamano
@ 2019-08-06 20:28   ` Emily Shaffer
  2019-08-06 21:16     ` Elijah Newren
  2 siblings, 1 reply; 11+ messages in thread
From: Emily Shaffer @ 2019-08-06 20:28 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Junio C Hamano, git

On Mon, Aug 05, 2019 at 03:33:50PM -0700, Elijah Newren wrote:
> Ever since commit 8c8e5bd6eb33 ("merge-recursive: switch directory
> rename detection default", 2019-04-05), the default handling with
> directory rename detection was to report a conflict and leave unstaged
> entries in the index.  However, when creating a virtual merge base in
> the recursive case, we absolutely need a tree, and the only way a tree
> can be written is if we have no unstaged entries -- otherwise we hit a
> BUG().
> 
> There are a few fixes possible here which at least fix the BUG(), but
> none of them seem optimal for other reasons; see the comments with the
> new testcase 13e in t6043 for details (which testcase triggered a BUG()
> prior to this patch).  As such, just opt for a very conservative and
> simple choice that is still relatively reasonable: have the recursive
> case treat 'conflict' as 'false' for opt->detect_directory_renames.
> 
> Reported-by: Emily Shaffer <emilyshaffer@google.com>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
> 
> I really should introduce constants like
>   DETECT_DIRECTORY_RENAMES_NEVER    = 0
>   DETECT_DIRECTORY_RENAMES_CONFLICT = 1
>   DETECT_DIRECTORY_RENAMES_YES      = 2
> and then use them in the code to make it clearer, but I wanted to make
> the code change as simple and contained as possible given that this is
> built on maint, fixes a BUG() and we're already in -rc1.
> 
> I know this bug doesn't satisfy the normal criteria for making it into
> 2.23 (it's a bug that was present in 2.22 rather than a regression in
> 2.23), but given that it's a BUG() condition, I was hoping it is
> important and safe enough to include anyway.
> 
> (This fix does merge down cleanly to master, next, and pu.)

Thanks for picking this up and sorry I didn't end up sending anything -
priority shifts on this end. :)

> 
>  merge-recursive.c                   |   3 +-
>  t/t6043-merge-rename-directories.sh | 111 ++++++++++++++++++++++++++++
>  2 files changed, 113 insertions(+), 1 deletion(-)
> 
> diff --git a/merge-recursive.c b/merge-recursive.c
> index d2e380b7ed..c7691d9b54 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -2856,7 +2856,8 @@ static int detect_and_process_renames(struct merge_options *opt,
>  	head_pairs = get_diffpairs(opt, common, head);
>  	merge_pairs = get_diffpairs(opt, common, merge);
>  
> -	if (opt->detect_directory_renames) {
So we used to say, "If you want to apply directory renames, or mark
directory renames as merge conflicts, we will go look for renames no
matter what."

> +	if ((opt->detect_directory_renames == 2) ||
> +	    (opt->detect_directory_renames == 1 && !opt->call_depth)) {
But now we say, "If you want to apply directory renames always, or if
you want to mark them as conflicts AND you aren't below the first layer
of a recursive merge, then we will go look for renames."

>  		dir_re_head = get_directory_renames(head_pairs);
>  		dir_re_merge = get_directory_renames(merge_pairs);
That means that when we usually prefer to mark directory renames as
conflicts and we are putting together a virtual ancestor, we
don't try to detect renames at all, which I imagine leaves it to the top
level merge to mark as conflicts for the user to resolve much later.
>  
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> index 50b7543483..c966147d5d 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -4403,4 +4403,115 @@ test_expect_success '13d-check(info): messages for rename/rename(1to1) via dual
>  	)
>  '
>  
> +# Testcase 13e, directory rename in virtual merge base
> +#
> +# This testcase has a slightly different setup than all the above cases, in
> +# order to include a recursive case:
> +#
> +#      A   C
> +#      o - o
> +#     / \ / \
> +#  O o   X   ?
> +#     \ / \ /
> +#      o   o
> +#      B   D
> +#
> +#   Commit O: a/{z,y}
> +#   Commit A: b/{z,y}
> +#   Commit B: a/{z,y,x}
> +#   Commit C: b/{z,y,x}
> +#   Commit D: b/{z,y}, a/x
> +#   Expected: b/{z,y,x}  (sort of; see below for why this might not be expected)
> +#
> +#   NOTES: 'X' represents a virtual merge base.  With the default of
> +#          directory rename detection yielding conflicts, merging A and B
> +#          results in a conflict complaining about whether 'x' should be
> +#          under 'a/' or 'b/'.  However, when creating the virtual merge
> +#          base 'X', since virtual merge bases need to be written out as a
> +#          tree, we cannot have a conflict, so some resolution has to be
> +#          picked.
> +#
> +#          In choosing the right resolution, it's worth noting here that
> +#          commits C & D are merges of A & B that choose different
> +#          locations for 'x' (i.e. they resolve the conflict differently),
> +#          and so it would be nice when merging C & D if git could detect
> +#          this difference of opinion and report a conflict.  But the only
> +#          way to do so that I can think of would be to have the virtual
> +#          merge base place 'x' in some directory other than either 'a/' or
> +#          'b/', which seems a little weird -- especially since it'd result
> +#          in a rename/rename(1to2) conflict with a source path that never
> +#          existed in any version.
> +#
> +#          So, for now, when directory rename detection is set to
> +#          'conflict' just avoid doing directory rename detection at all in
> +#          the recursive case.  This will not allow us to detect a conflict
> +#          in the outer merge for this special kind of setup, but it at
> +#          least avoids hitting a BUG().
> +#
> +test_expect_success '13e-setup: directory rename detection in recursive case' '
> +	test_create_repo 13e &&
> +	(
> +		cd 13e &&
> +
> +		mkdir a &&
> +		echo z >a/z &&
> +		echo y >a/y &&
> +		git add a &&
> +		test_tick &&
> +		git commit -m "O" &&
> +
> +		git branch O &&
> +		git branch A &&
> +		git branch B &&
> +
> +		git checkout A &&
> +		git mv a/ b/ &&
> +		test_tick &&
> +		git commit -m "A" &&
So we do a directory rename in branch A...

> +
> +		git checkout B &&
> +		echo x >a/x &&
> +		git add a &&
> +		test_tick &&
> +		git commit -m "B" &&
... And we add a new file to the directory in question in branch B...

So A and B will be the ones combined to make a virtual ancestor.
> +
> +		git branch C A &&
> +		git branch D B &&
> +
> +		git checkout C &&
> +		test_must_fail git -c merge.directoryRenames=conflict merge B &&
> +		git add b/x &&
> +		test_tick &&
> +		git commit -m "C" &&

Then we merge C with B which places B as a mutual ancestor of D as well
as C.

> +
> +
> +		git checkout D &&
> +		test_must_fail git -c merge.directoryRenames=conflict merge A &&

Now we do the same thing merging A with D, which means that D has
ancestors B from branching and A from merge, and C has ancestors A from
branching and B from merge. So D and C have two closest ancestors
(criss-cross merge).

> +		git add b/x &&
> +		mkdir a &&
> +		git mv b/x a/x &&

Now D adds contention over a/x and b/x (which were both mentioned in the
ancestry too) to induce a conflict... or is this adding a resolution
which can be decided on automatically? I guess later you are looking to
make sure no CONFLICT still exists in the output, so you must be
resolving the conflict here?

> +		test_tick &&
> +		git commit -m "D"
> +	)
> +'
> +
> +test_expect_success '13e-check: directory rename detection in recursive case' '
> +	(
> +		cd 13e &&
> +
> +		git checkout --quiet D^0 &&
> +
> +		git -c merge.directoryRenames=conflict merge -s recursive C^0 >out 2>err &&

Now we finally do the recursive merge - C and D have equally likely
ancestors A and B, and A and B have a rename conflict.

> +
> +		test_i18ngrep ! CONFLICT out &&
> +		test_i18ngrep ! BUG: err &&

The BUG is gone. But should it not use i18ngrep? BUG() isn't localized.

> +		test_i18ngrep ! core.dumped err &&
> +		test_must_be_empty err &&
> +
> +		git ls-files >paths &&
> +		! grep a/x paths &&
Finally, make sure that a/x has been truly disappeared...

> +		grep b/x paths
...and b/x is the only x left standing.
> +	)
> +'
> +
>  test_done
> -- 
> 2.22.0.246.g5ddf3d502a
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-08-06 20:28   ` Emily Shaffer
@ 2019-08-06 21:16     ` Elijah Newren
  2019-08-06 21:54       ` Emily Shaffer
  2019-08-08 11:00       ` Jeff King
  0 siblings, 2 replies; 11+ messages in thread
From: Elijah Newren @ 2019-08-06 21:16 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: Junio C Hamano, Git Mailing List

On Tue, Aug 6, 2019 at 1:28 PM Emily Shaffer <emilyshaffer@google.com> wrote:
>
> On Mon, Aug 05, 2019 at 03:33:50PM -0700, Elijah Newren wrote:
> > Ever since commit 8c8e5bd6eb33 ("merge-recursive: switch directory
> > rename detection default", 2019-04-05), the default handling with
> > directory rename detection was to report a conflict and leave unstaged
> > entries in the index.  However, when creating a virtual merge base in
> > the recursive case, we absolutely need a tree, and the only way a tree
> > can be written is if we have no unstaged entries -- otherwise we hit a
> > BUG().
...
> Thanks for picking this up and sorry I didn't end up sending anything -
> priority shifts on this end. :)

I totally understand.

...
> > +             git checkout C &&
> > +             test_must_fail git -c merge.directoryRenames=conflict merge B &&
> > +             git add b/x &&
> > +             test_tick &&
> > +             git commit -m "C" &&
>
> Then we merge C with B which places B as a mutual ancestor of D as well
> as C.
>
> > +
> > +
> > +             git checkout D &&
> > +             test_must_fail git -c merge.directoryRenames=conflict merge A &&
>
> Now we do the same thing merging A with D, which means that D has
> ancestors B from branching and A from merge, and C has ancestors A from
> branching and B from merge. So D and C have two closest ancestors
> (criss-cross merge).
>
> > +             git add b/x &&
> > +             mkdir a &&
> > +             git mv b/x a/x &&
>
> Now D adds contention over a/x and b/x (which were both mentioned in the
> ancestry too) to induce a conflict... or is this adding a resolution
> which can be decided on automatically? I guess later you are looking to
> make sure no CONFLICT still exists in the output, so you must be
> resolving the conflict here?

Yes, we are resolving the conflict for D by choosing to reject the
directory rename, placing 'x' in a/x instead of b/x.

When merge.directoryRenames=conflict, it'll make only b/x be present
in the working directory and mark it as conflicted (i.e. assume the
directory rename is probably the right resolution, but print a warning
and mark the index as needing to be updated to verify -- this allows
e.g. "git add -u" to do the "right thing").  For commit C, we just did
a "git add b/x" to accept the directory rename.  For commit D, we
wanted to say we didn't want the directory rename which you'd first
guess would be "git mv b/x a/x" BUT: (1) a/x has unstaged entries
which will cause git-mv to fail, and (2) directory a/ didn't exist --
both of these issues had to be corrected before running git-mv.

> > +             test_tick &&
> > +             git commit -m "D"
> > +     )
> > +'
> > +
> > +test_expect_success '13e-check: directory rename detection in recursive case' '
> > +     (
> > +             cd 13e &&
> > +
> > +             git checkout --quiet D^0 &&
> > +
> > +             git -c merge.directoryRenames=conflict merge -s recursive C^0 >out 2>err &&
>
> Now we finally do the recursive merge - C and D have equally likely
> ancestors A and B, and A and B have a rename conflict.
>
> > +
> > +             test_i18ngrep ! CONFLICT out &&
> > +             test_i18ngrep ! BUG: err &&
>
> The BUG is gone. But should it not use i18ngrep? BUG() isn't localized.

Technically, yes, you're right. However, this line's purpose isn't
correctness of the test but documentation for the person reading the
testcase about what it's real original purpose was; my real test was
the "test_must_be_empty err" check I have below it, but I added this
line just to document the intent better.  I kind of like the
"CONFLICT" and "BUG" lines looking similar just so the reader of the
testcase doesn't have to try to reason through why they are different,
although I guess it does present the problem that more careful readers
like yourself might do a double take.

If folks find it more readable to use regular grep instead of
test_i18ngrep, I can change this line as well as the "core dumped"
check immediately below over to regular grep.

> > +             test_i18ngrep ! core.dumped err &&
> > +             test_must_be_empty err &&
> > +
> > +             git ls-files >paths &&
> > +             ! grep a/x paths &&
> Finally, make sure that a/x has been truly disappeared...
>
> > +             grep b/x paths
> ...and b/x is the only x left standing.


Thanks for taking a look.  :-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-08-06 21:16     ` Elijah Newren
@ 2019-08-06 21:54       ` Emily Shaffer
  2019-08-08 11:00       ` Jeff King
  1 sibling, 0 replies; 11+ messages in thread
From: Emily Shaffer @ 2019-08-06 21:54 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Junio C Hamano, Git Mailing List

> > > +
> > > +             test_i18ngrep ! CONFLICT out &&
> > > +             test_i18ngrep ! BUG: err &&
> >
> > The BUG is gone. But should it not use i18ngrep? BUG() isn't localized.
> 
> Technically, yes, you're right. However, this line's purpose isn't
> correctness of the test but documentation for the person reading the
> testcase about what it's real original purpose was; my real test was
> the "test_must_be_empty err" check I have below it, but I added this
> line just to document the intent better.  I kind of like the
> "CONFLICT" and "BUG" lines looking similar just so the reader of the
> testcase doesn't have to try to reason through why they are different,
> although I guess it does present the problem that more careful readers
> like yourself might do a double take.
> 
> If folks find it more readable to use regular grep instead of
> test_i18ngrep, I can change this line as well as the "core dumped"
> check immediately below over to regular grep.
> 

It's fine by me, I see your point.

> Thanks for taking a look.  :-)

Reviewed-by: Emily Shaffer <emilyshaffer@google.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case
  2019-08-06 21:16     ` Elijah Newren
  2019-08-06 21:54       ` Emily Shaffer
@ 2019-08-08 11:00       ` Jeff King
  1 sibling, 0 replies; 11+ messages in thread
From: Jeff King @ 2019-08-08 11:00 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Emily Shaffer, Junio C Hamano, Git Mailing List

On Tue, Aug 06, 2019 at 02:16:25PM -0700, Elijah Newren wrote:

> > > +             test_i18ngrep ! CONFLICT out &&
> > > +             test_i18ngrep ! BUG: err &&
> >
> > The BUG is gone. But should it not use i18ngrep? BUG() isn't localized.
> 
> Technically, yes, you're right. However, this line's purpose isn't
> correctness of the test but documentation for the person reading the
> testcase about what it's real original purpose was; my real test was
> the "test_must_be_empty err" check I have below it, but I added this
> line just to document the intent better.  I kind of like the
> "CONFLICT" and "BUG" lines looking similar just so the reader of the
> testcase doesn't have to try to reason through why they are different,
> although I guess it does present the problem that more careful readers
> like yourself might do a double take.

I think it would be better to drop the grep for BUG entirely.

Not BUG()-ing should be something we implicitly assume for all commands,
and checking the exit code already covers that[1]. I don't think we
should be cluttering up every test, even ones that are in response to a
BUG(), with redundant checks. If you really want to document it further,
a comment can do that without incurring extra run-time overhead. But I
think in this case that your existing comments and commit message cover
it quite well.

-Peff

[1] There are cases where there's a crash in a sub-process, but in that
    case the failure should be surfaced in the way the test is written.
    It is here, and I'd argue that any case where it isn't probably
    ought to be rewritten (because you're missing not just BUG()s, but
    probably die()).

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-08-08 11:00 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-26 22:09 BUG() during criss-cross merge with directory rename and deleted file Emily Shaffer
2019-08-05 22:33 ` [PATCH 1/1] merge-recursive: avoid directory rename detection in recursive case Elijah Newren
2019-08-06 16:57   ` Junio C Hamano
2019-08-06 17:26     ` Elijah Newren
2019-08-06 17:49     ` Junio C Hamano
2019-08-06 17:26   ` Junio C Hamano
2019-08-06 17:29     ` Elijah Newren
2019-08-06 20:28   ` Emily Shaffer
2019-08-06 21:16     ` Elijah Newren
2019-08-06 21:54       ` Emily Shaffer
2019-08-08 11:00       ` Jeff King

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for the project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git