git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, jgfouca@sandia.gov
Subject: Re: [PATCH 37/48] merge-recursive: Fix modify/delete resolution in the recursive case
Date: Mon, 8 Aug 2011 16:09:08 -0600	[thread overview]
Message-ID: <CABPp-BFHo0NK9Y+QGfDw-=zBBbNUQtfO9Fr4qUmdxyZEy0_7Mw@mail.gmail.com> (raw)
In-Reply-To: <7voc0n1nl8.fsf@alter.siamese.dyndns.org>

On Thu, Jul 21, 2011 at 12:43 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Elijah Newren <newren@gmail.com> writes:
>
>> When o->call_depth>0 and we have conflicts, we try to find "middle ground"
>> when creating the virtual merge base.  In the case of content conflicts,
>> this can be done by doing a three-way content merge and using the result.
>> In all parts where the three-way content merge is clean, it is the correct
>> middle ground, and in parts where it conflicts there is no middle ground
>> but the conflict markers provide a good compromise since they are unlikely
>> to accidentally match any further changes.
>>
>> In the case of a modify/delete conflict, we cannot do the same thing.
>> Accepting either endpoint as the resolution for the virtual merge base
>> runs the risk that when handling the non-recursive case we will silently
>> accept one person's resolution over another without flagging a conflict.
>> In this case, the closest "middle ground" we have is actually the merge
>> base of the candidate merge bases.  (We could alternatively attempt a
>> three way content merge using an empty file in place of the deleted file,
>> but that seems to be more work than necessary.)
>
> What did we use before this patch as the "middle ground"?

We used precisely what you seem to suggest at the end of your email.
It's a trade-off, and I agree with you about the cases where my new
choice is suboptimal, but I personally think the original choice has
worse trade-offs.

> Doesn't the "middle ground" also need to contain information from both
> sides?  At the content level, The half-merge result with conflict marker
> contains all the necessary information on what both sides agreed to do
> outside the marker area, while showing what each side wished to do inside,
> so as you describe, it is a good "middle ground".
>
> Using the merge-base would mean that at the tree-structural level of this
> merge you ignore the wish of one side that wanted to delete the path, and
> then at the content level you also ignore the wish of both sides (the side
> that wanted to delete the path wanted to leave zero content, while the
> other side wanted to modify from the version in merge-base, all of which
> is ignored by the above argument).

I think it is optimal if the "middle ground" can contain information
from both sides, but I simply do not see how it is possible in either
a modify/delete or rename/delete case.  That is, at least not without
some kind of significant change to the index data structures or
allowing a "virtual merge base" to be an index with higher stage merge
entries instead of just a tree (though that would introduce several
new complications).

And yes, I am essentially ignoring the intents of both sides, and
intentionally so.  I thought I addressed that in the arguments above,
but perhaps I could have been more clear and explicit about that.
Your explanation below provides a great framework for clarifying why I
made this change, so let me respond below using the framework you have
beautifully illustrated and explained.

> When you want to make a criss-corss merge between E and F that looks like
> this:
>
>      A---C---E---?
>     / \ /       /
>    O   .       /
>     \ / \     /
>      B---D---F
>
> if there is no deletion, we run a content level merge between C and D by
> using X that is a potentially conflicting merge between A and B as a
> "virtual" ancestor.
>
>      A---C---E---?
>     / \ /       /
>    O   X       /
>     \ / \     /
>      B---D---F
>
> X would contain both what both A and B agreed to do on top of O, and what
> A and B wanted to do that they do not agree with. The differing opinion of
> A and B are recorded inside the conflict markers.  The change that turns X
> into E contains C's opinion (i.e. it would likely agree to take what both
> A and B agreed to do, and it may agree with A and the result in E would
> resemble what A wanted to bring to the result despite B's objection). If C
> and D both agree how to resolve the conflict, then B's "objection" will
> cancel out from the three-way merge between E and F that uses X as the
> ancestor.
>
> In a delete/modify situation (e.g. A modified while B deleted), there are
> three possibilities:
>
>  * C and D both agree to delete the path;
>
>  * C and D both agree to keep the path, with modifications that may or may
>   not conflict (which can be handled by the usual content-level merge);
>
>  * C decides to delete, while D decides to modify (or vice versa).

This is a really nicely written explanation.  I particularly like this
list of three cases.  Let me augment this list of situations (which
I'll number (1)-(3)) with a list of four choices for virtual merge
base X:

(J) Make X not have any content at the path (i.e. accept the deletion in B)

(K) Make X have the content from O at path (i.e. what my patch does)

(L) Make X have the content from A at path (which we did before)

(M) Make X have some other content at path

I'll get back to these 12 situations (3 cases each with how any of 4
choices could affect them) in a minute...

> And in the last case, the outer merge ? needs to decide if it wants to
> keep or delete the path anyway, so a simplest solution is to punt the
> whole recursive business and make it the responsibility of the user to
> resolve it as a merge between E and F using O as the common ancestor.
> This patch does so in all three cases.
>
> I however wonder if we can do better in the second case (I do not think
> the first case would come into the picture, as we would not see such a
> path when merging E and F as it would have been long gone from both
> branches). We wouldn't know which commits C and D are exactly, but we do
> have E and F. If A modified the path and B deleted it, and C and D both
> decided to keep the path and E and F both inherited that path, wouldn't it
> be fair to say that what both branches wanted to do is closer to what A
> did than what B did? In other words, instead of using O, wouldn't it give
> us a better result if we used A (the side that did not delete) as the
> common ancestor for the content level merge when both E and F has the
> path?

Yes, absolutely.  In the second case, using the content of the path
from A for the virtual merge ancestor would be a slight improvement
over using the content for path from O.  But let's look at all three
cases you specified and how different choices for the virtual merge
base affect it.  Here are my goals for any merge, in priority order:

1st goal: detect real conflicts and notify the user
2nd goal: avoid conflicts in cases where we should be able to
reasonably confidently merge cleanly
3rd goal: make it easy for the user to figure out how to resolve problems

Obviously, we can't always meet these goals, particularly due to
"semantic" conflicts that may be present.  But they're useful to keep
in mind.  Here is how I broke down the three cases:



Case 1: C and D both agree to delete the path

In this case, regardless of the choice (J)-(M), the merge will be
resolved correctly.


Case 2: C and D both agree to keep the path, with modifications that
may or may not conflict (which can be handled by the usual
content-level merge);

In this case, choice (J) would cause content conflicts whenever C & D
had different content at path and would treat it as a two-way content
merge.  That makes choice (J) particularly problematic as it breaks
the first goal.  In order to get the merge right, we really need to do
a three-way content merge, using content that can serve as a
reasonable merge-base for the contents chosen at C & D.  That makes
choice (M) for X a poor one.  Both choices (K) and (L) are reasonable,
with (L) being a better choice (as you also point out above) to
optimize the 2nd goal.


Case 3: C decides to delete, while D decides to modify (or vice versa)

It is worth noting here that the most likely choice for the content at
path for D is exactly the content that is at A.  git is designed for
trees of slowly changing data after all.

This is a case where git should detect the modify/delete conflict and
alert the user.  In this case, choice (J) is particularly bad.
Regardless of the content that D has for path, the merge conflict
won't even be detected and git would silently resolve this by
recording the content from D at path.  That's a fail for the first
goal.

Another choice that is likely to fail the first goal is (L).  If the
content at path D is the same as at A (the most likely case), then git
would not detect the conflict and would simply silently delete the
path.

In fact, any choice here for the merge base could cause the conflict
to go silently undetected (whenever the content at D matched our
choice for X), meaning that no choice for the merge base will be
perfect.  That would mean of the four choices, (M) would be best as it
would be least likely to match the choice made at D, but (K) would be
pretty reasonable.


So, as I view it: choice (J) is horrible in two of the three cases;
choice (M) is really bad in the second case for goal 2; and choice (L)
is likely to be bad in the third case for goal 1.  Choice (K) is
pretty reasonable in all three cases, and that choice is what this
patch implements.

  reply	other threads:[~2011-08-08 22:09 UTC|newest]

Thread overview: 97+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-08  7:30 [PATCH 00/48] Handling more corner cases in merge-recursive.c Elijah Newren
2011-06-08  7:30 ` [PATCH 01/48] t6039: Add a testcase where git deletes an untracked file Elijah Newren
2011-06-08  7:30 ` [PATCH 02/48] t6039: Add failing testcase for rename/modify/add-source conflict Elijah Newren
2011-06-08  7:30 ` [PATCH 03/48] t6039: Add a pair of cases where undetected renames cause issues Elijah Newren
2011-06-08  7:30 ` [PATCH 04/48] t6039: Add a testcase where undetected rename causes silent file deletion Elijah Newren
2011-06-08  7:30 ` [PATCH 05/48] t6039: Add tests for content issues with modify/rename/directory conflicts Elijah Newren
2011-07-18 23:37   ` Junio C Hamano
2011-08-08 15:49     ` Elijah Newren
2011-06-08  7:30 ` [PATCH 06/48] t6039: Add failing testcases for rename/rename/add-{source,dest} conflicts Elijah Newren
2011-07-18 23:38   ` Junio C Hamano
2011-06-08  7:30 ` [PATCH 07/48] t6039: Ensure rename/rename conflicts leave index and workdir in sane state Elijah Newren
2011-07-18 23:40   ` Junio C Hamano
2011-08-08 17:59     ` Elijah Newren
2011-06-08  7:30 ` [PATCH 08/48] t6036: Add differently resolved modify/delete conflict in criss-cross test Elijah Newren
2011-07-18 23:38   ` Junio C Hamano
2011-06-08  7:30 ` [PATCH 09/48] t6036: criss-cross with weird content can fool git into clean merge Elijah Newren
2011-07-18 23:38   ` Junio C Hamano
2011-08-08 18:02     ` Elijah Newren
2011-06-08  7:30 ` [PATCH 10/48] t6036: tests for criss-cross merges with various directory/file conflicts Elijah Newren
2011-07-18 23:40   ` Junio C Hamano
2011-08-08 19:07     ` Elijah Newren
2011-06-08  7:30 ` [PATCH 11/48] t6036: criss-cross w/ rename/rename(1to2)/modify+rename/rename(2to1)/modify Elijah Newren
2011-07-18 23:38   ` Junio C Hamano
2011-06-08  7:30 ` [PATCH 12/48] t6036: criss-cross + rename/rename(1to2)/add-source + modify/modify Elijah Newren
2011-07-18 23:38   ` Junio C Hamano
2011-07-20 23:15     ` Phil Hord
2011-06-08  7:30 ` [PATCH 13/48] t6022: Remove unnecessary untracked files to make test cleaner Elijah Newren
2011-06-08  7:30 ` [PATCH 14/48] t6022: New tests checking for unnecessary updates of files Elijah Newren
2011-06-08  7:30 ` [PATCH 15/48] t6022: Add testcase for merging a renamed file with a simple change Elijah Newren
2011-06-08  7:30 ` [PATCH 16/48] merge-recursive: Make BUG message more legible by adding a newline Elijah Newren
2011-06-08  7:30 ` [PATCH 17/48] merge-recursive: Correct a comment Elijah Newren
2011-06-08  7:30 ` [PATCH 18/48] merge-recursive: Mark some diff_filespec struct arguments const Elijah Newren
2011-07-18 23:40   ` Junio C Hamano
2011-06-08  7:30 ` [PATCH 19/48] merge-recursive: Remember to free generated unique path names Elijah Newren
2011-07-18 23:39   ` Junio C Hamano
2011-06-08  7:30 ` [PATCH 20/48] merge-recursive: Avoid working directory changes during recursive case Elijah Newren
2011-06-08  7:30 ` [PATCH 21/48] merge-recursive: Fix recursive case with D/F conflict via add/add conflict Elijah Newren
2011-07-18 23:40   ` Junio C Hamano
2011-06-08  7:30 ` [PATCH 22/48] merge-recursive: Fix sorting order and directory change assumptions Elijah Newren
2011-07-11  7:04   ` Johannes Sixt
2011-07-12  7:27     ` Johannes Sixt
2011-07-13  7:24       ` Johannes Sixt
2011-07-13 20:34         ` Junio C Hamano
2011-07-18 23:39   ` Junio C Hamano
2011-08-08 19:21     ` Elijah Newren
2011-06-08  7:30 ` [PATCH 23/48] merge-recursive: Fix code checking for D/F conflicts still being present Elijah Newren
2011-06-08  7:30 ` [PATCH 24/48] merge-recursive: Save D/F conflict filenames instead of unlinking them Elijah Newren
2011-06-08  7:30 ` [PATCH 25/48] merge-recursive: Split was_tracked() out of would_lose_untracked() Elijah Newren
2011-06-08  7:30 ` [PATCH 26/48] merge-recursive: Allow make_room_for_path() to remove D/F entries Elijah Newren
2011-07-11  7:14   ` Johannes Sixt
2011-07-13  7:17   ` Johannes Sixt
2011-08-08 20:56     ` Elijah Newren
2011-08-09  7:01       ` Johannes Sixt
2011-07-18 23:39   ` Junio C Hamano
2011-06-08  7:30 ` [PATCH 27/48] merge-recursive: Consolidate different update_stages functions Elijah Newren
2011-07-18 23:39   ` Junio C Hamano
2011-06-08  7:30 ` [PATCH 28/48] merge-recursive: Split update_stages_and_entry; only update stages at end Elijah Newren
2011-07-18 23:39   ` Junio C Hamano
2011-06-08  7:30 ` [PATCH 29/48] merge-recursive: When we detect we can skip an update, actually skip it Elijah Newren
2011-07-18 23:39   ` Junio C Hamano
2011-06-08  7:31 ` [PATCH 30/48] merge-recursive: Fix deletion of untracked file in rename/delete conflicts Elijah Newren
2011-07-21 18:43   ` Junio C Hamano
2011-06-08  7:31 ` [PATCH 31/48] merge-recursive: Make dead code for rename/rename(2to1) conflicts undead Elijah Newren
2011-06-08  7:31 ` [PATCH 32/48] merge-recursive: Add comments about handling rename/add-source cases Elijah Newren
2011-06-08  7:31 ` [PATCH 33/48] merge-recursive: Improve handling of rename target vs. directory addition Elijah Newren
2011-06-08  7:31 ` [PATCH 34/48] merge-recursive: Consolidate process_entry() and process_df_entry() Elijah Newren
2011-07-21 18:43   ` Junio C Hamano
2011-06-08  7:31 ` [PATCH 35/48] merge-recursive: Cleanup and consolidation of rename_conflict_info Elijah Newren
2011-06-08  7:31 ` [PATCH 36/48] merge-recursive: Provide more info in conflict markers with file renames Elijah Newren
2011-07-21 18:43   ` Junio C Hamano
2011-06-08  7:31 ` [PATCH 37/48] merge-recursive: Fix modify/delete resolution in the recursive case Elijah Newren
2011-07-21 18:43   ` Junio C Hamano
2011-08-08 22:09     ` Elijah Newren [this message]
2011-06-08  7:31 ` [PATCH 38/48] merge-recursive: Introduce a merge_file convenience function Elijah Newren
2011-06-08  7:31 ` [PATCH 39/48] merge-recursive: Fix rename/rename(1to2) resolution for virtual merge base Elijah Newren
2011-07-25 20:55   ` Junio C Hamano
2011-08-08 22:58     ` Elijah Newren
2011-06-08  7:31 ` [PATCH 40/48] merge-recursive: Small cleanups for conflict_rename_rename_1to2 Elijah Newren
2011-06-08  7:31 ` [PATCH 41/48] merge-recursive: Defer rename/rename(2to1) handling until process_entry Elijah Newren
2011-06-08  7:31 ` [PATCH 42/48] merge-recursive: Record more data needed for merging with dual renames Elijah Newren
2011-06-08  7:31 ` [PATCH 43/48] merge-recursive: Create function for merging with branchname:file markers Elijah Newren
2011-06-08  7:31 ` [PATCH 44/48] merge-recursive: Consider modifications in rename/rename(2to1) conflicts Elijah Newren
2011-06-08  7:31 ` [PATCH 45/48] merge-recursive: Make modify/delete handling code reusable Elijah Newren
2011-06-08  7:31 ` [PATCH 46/48] merge-recursive: Have conflict_rename_delete reuse modify/delete code Elijah Newren
2011-06-08  7:31 ` [PATCH 47/48] merge-recursive: add handling for rename/rename/add-dest/add-dest Elijah Newren
2011-06-08  7:31 ` [PATCH 48/48] merge-recursive: Fix working copy handling for rename/rename/add/add Elijah Newren
2011-06-11 18:12 ` [PATCH 00/48] Handling more corner cases in merge-recursive.c Junio C Hamano
     [not found]   ` <BANLkTimd0O70e7KhT-G5quxQhF_Nwc30Hg@mail.gmail.com>
2011-06-12  6:18     ` Junio C Hamano
2011-06-12  6:28       ` Junio C Hamano
2011-08-04  0:20 ` Junio C Hamano
2011-08-04  1:48   ` Junio C Hamano
2011-08-04  2:12     ` Elijah Newren
2011-08-04 17:26   ` Elijah Newren
2011-08-04 19:03     ` Junio C Hamano
2011-08-04 19:16       ` Elijah Newren
2011-08-06  5:22         ` Junio C Hamano
2011-08-06 20:31           ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABPp-BFHo0NK9Y+QGfDw-=zBBbNUQtfO9Fr4qUmdxyZEy0_7Mw@mail.gmail.com' \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jgfouca@sandia.gov \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).