git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: "Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Derrick Stolee" <stolee@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>
Subject: Re: [PATCH v4 08/24] Ensure index matches head before invoking merge machinery, round N
Date: Tue, 3 Sep 2019 11:17:04 -0700	[thread overview]
Message-ID: <CABPp-BF4yjZrD7-bXLa9tHgEws2zvN2zJxqnMqgtY6zm2Lvk3A@mail.gmail.com> (raw)
In-Reply-To: <nycvar.QRO.7.76.6.1909031532160.46@tvgsbejvaqbjf.bet>

Hi Dscho,

On Tue, Sep 3, 2019 at 6:34 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
> On Tue, 3 Sep 2019, Johannes Schindelin wrote:
>
> > On Sat, 17 Aug 2019, Elijah Newren wrote:
> >
> > >   * t3030-merge-recursive.h: this test has always been broken in that it
> > >     didn't make sure to make index match head before running.  But, it
> > >     didn't care about the index or even the merge result, just the
> > >     verbose output while running.  While commit eddd1a411d93
> > >     ("merge-recursive: enforce rule that index matches head before
> > >     merging", 2018-06-30) should have uncovered this broken test, it
> > >     used a test_must_fail wrapper around the merge-recursive call
> > >     because it was known that the merge resulted in a rename/rename
> > >     conflict.  Thus, that fix only made this test fail for a different
> > >     reason, and since the index == head check didn't happen until after
> > >     coming all the way back out of the recursion, the testcase had
> > >     enough information to pass the one check that it did perform.
> >
> > I fear that this test is still broken, it is a Schrödinger bug. Where
> > `qsort()` is the cat, and the property "is it stable?" instead of death.
> >
> > In particular, at some stage in the recursive merge, a diff is generated
> > with rename detection for the target file `a` that contains two lines `hello`
> > but has two equally valid source files: `e` and `a~Temporary merge
> > branch 2_0`, both containing just the line `hello`. And since their file
> > contents are identical, the solution to the problem "which is the
> > correct source file?" is ambiguous.
> >
> > If the `qsort()` in use is stable, the file `e` comes first, and wins.
> > If the `qsort()` in use is not stable, all bets are off, and the file
> > `a~Temporary merge branch 2_0` might be sorted first (which is the case,
> > for example, when using the `qsort()` implementation of MS Visual C's
> > runtime).
> >
> > Now, the _real_ problem is that t3030.35 expects the recursive merge to
> > fail, which it does when `qsort()` is stable. However, when the order of
> > `e` and `a~Temporary merge branch 2_0` is reversed, then that particular
> > merge does _not_ result in a `rename/rename` conflict. And the exit code
> > of the recursive merge is 0 for some reason!
> >
> > I don't quite understand why: clearly, there are conflicts (otherwise we
> > would not have that funny suffix `~Temporary merge branch 2_0`.

So, there are conflicts in the inner merge, but depending on the
tie-breaker for rename handling when two equal matches exist (the
tie-breaker being the order of the filenames after qsort()), there may
or may not be conflicts in the outer merge.  Ouch.

I suspect such cases are pretty rare in "real world repositories"
because (1) exactly equal filename similarities are rare, (2) "slowly
changing trees of content" implies that most files will only be
modified on (at most) one side of history, (3) when files are changed
on both sides of history odds of conflicting changes rapidly go up
making conflicts likely.  You essentially have to thread a needle to
have the end result ambiguously conflict like this.

> > The real problem, though, is that even if it would fail, the outcome of
> > that recursive merge is ambiguous, and that test case should not try to
> > verify the precise order of the generated intermediate trees.

Yes, it is ambiguous -- and the problem is a little deeper too. It's
not just "does-this-merge-conflict?" depending upon the qsort order,
it is also about whether file contents after the merge depend upon the
qsort order.  Whenever there are two filenames that are equally
similar to a rename source, picking one of the two equally similar
filenames for rename pairing means we are basically choosing at random
where to merge the changes from the other side of history to.

Unfortunately, changing this might be difficult to enforce with the
current merge-recursive structure.  For example, what if there are two
equally similar filenames for us to choose from, but the other side of
history didn't modify the rename source file at all?  (e.g. on one
side of history, user leaves A alone, on other side of history, A is
copied to B and C and then A is deleted.  B and C are identical.)  In
such cases, the choice of which of B and C we pair A up with happens
to be irrelevant because we'll get the same result either way and
there should be no merge conflict.  But if we error out early or throw
warnings and conflict notices because the intermediate internal choice
was ambiguous, then we've created useless conflicts for the user.  I'm
worried we have more cases of this kind of thing happening than we do
with ambiguous pairings that change the end result.

I think I might be able to do something here with my alternative merge
strategy, but I haven't gotten back to that for quite a while, so...

> It might not be obvious from my mail, but it took me about 7 hours to
> figure all of this out, hence I was a bit grumpy when I wrote that. My
> apologies.
>
> After having slept (and written a long review about the
> `--update-branches` patch), it occurred to me that we might be better
> off enforcing the use of `git_qsort()` in `diffcore-rename.c`, so that
> we can at least guarantee stable rename detection in Git (which would
> incidentally fix the test suite for the MSVC build that I maintain in
> Git for Windows).
>
> What do you think?

Ooh, absolutely, we should do that in the short term.  Not just
because it fixes the testsuite, but because it increases the
likelihood that folks can reproduce each others' merge problems.  I
want to be able to help users on Windows who report problems and
provide testcases, and making this fix reduces one hurdle toward doing
so.

  reply	other threads:[~2019-09-03 18:17 UTC|newest]

Thread overview: 157+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-25 17:45 [PATCH 00/19] Cleanup merge API Elijah Newren
2019-07-25 17:45 ` [PATCH 01/19] merge-recursive: fix minor memory leak in error condition Elijah Newren
2019-07-25 17:45 ` [PATCH 02/19] merge-recursive: remove another implicit dependency on the_repository Elijah Newren
2019-07-25 17:45 ` [PATCH 03/19] Ensure index matches head before invoking merge machinery, round N Elijah Newren
2019-07-25 19:41   ` Johannes Schindelin
2019-07-25 19:58     ` Elijah Newren
2019-07-26 11:23       ` Johannes Schindelin
2019-07-25 17:45 ` [PATCH 04/19] merge-recursive: exit early if index != head Elijah Newren
2019-07-25 19:51   ` Johannes Schindelin
2019-07-25 20:12     ` Elijah Newren
2019-07-25 17:45 ` [PATCH 05/19] merge-recursive: don't force external callers to do our logging Elijah Newren
2019-07-25 17:45 ` [PATCH 06/19] Change call signature of write_tree_from_memory() Elijah Newren
2019-07-25 19:55   ` Johannes Schindelin
2019-07-26  4:58     ` Elijah Newren
2019-07-25 17:45 ` [PATCH 07/19] Move write_tree_from_memory() from merge-recursive to cache-tree Elijah Newren
2019-07-25 17:46 ` [PATCH 08/19] merge-recursive: fix some overly long lines Elijah Newren
2019-07-25 17:46 ` [PATCH 09/19] merge-recursive: use common name for ancestors/common/base_list Elijah Newren
2019-07-25 17:46 ` [PATCH 10/19] merge-recursive: rename 'mrtree' to 'result_tree', for clarity Elijah Newren
2019-07-25 20:02   ` Johannes Schindelin
2019-07-25 17:46 ` [PATCH 11/19] merge-recursive: rename merge_options argument to opt in header Elijah Newren
2019-07-25 17:46 ` [PATCH 12/19] merge-recursive: move some definitions around to clean up the header Elijah Newren
2019-07-25 17:46 ` [PATCH 13/19] merge-recursive: consolidate unnecessary fields in merge_options Elijah Newren
2019-07-25 17:46 ` [PATCH 14/19] merge-recursive: comment and reorder the merge_options fields Elijah Newren
2019-07-25 17:46 ` [PATCH 15/19] merge-recursive: split internal fields into a separate struct Elijah Newren
2019-07-25 20:12   ` Johannes Schindelin
2019-07-25 20:27     ` Elijah Newren
2019-07-26 11:25       ` Johannes Schindelin
2019-07-26 15:30         ` Elijah Newren
2019-07-25 17:46 ` [PATCH 16/19] merge-recursive: alphabetize include list Elijah Newren
2019-07-25 17:46 ` [PATCH 17/19] merge-recursive: rename MERGE_RECURSIVE_* to MERGE_VARIANT_* Elijah Newren
2019-07-25 17:46 ` [PATCH 18/19] merge-recursive: be consistent with assert Elijah Newren
2019-07-25 17:46 ` [PATCH 19/19] merge-recursive: provide a better label for diff3 common ancestor Elijah Newren
2019-07-25 20:28   ` Johannes Schindelin
2019-07-25 18:12 ` [PATCH 00/19] Cleanup merge API Junio C Hamano
2019-07-25 19:06   ` Elijah Newren
2019-07-25 22:50     ` Junio C Hamano
2019-07-26 14:07       ` Johannes Schindelin
2019-07-25 19:15 ` Johannes Schindelin
2019-07-26 15:15   ` Elijah Newren
2019-07-26 15:52 ` [PATCH v2 00/20] " Elijah Newren
2019-07-26 15:52   ` [PATCH v2 01/20] merge-recursive: fix minor memory leak in error condition Elijah Newren
2019-07-26 18:31     ` Junio C Hamano
2019-07-26 23:19       ` Elijah Newren
2019-07-29 14:13       ` Derrick Stolee
2019-07-26 15:52   ` [PATCH v2 02/20] merge-recursive: remove another implicit dependency on the_repository Elijah Newren
2019-07-26 18:42     ` Junio C Hamano
2019-07-26 15:52   ` [PATCH v2 03/20] Ensure index matches head before invoking merge machinery, round N Elijah Newren
2019-07-26 19:14     ` Junio C Hamano
2019-07-26 23:22       ` Elijah Newren
2019-07-26 15:52   ` [PATCH v2 04/20] merge-recursive: exit early if index != head Elijah Newren
2019-07-26 19:32     ` Junio C Hamano
2019-07-26 23:26       ` Elijah Newren
2019-07-29 15:56         ` Junio C Hamano
2019-07-26 15:52   ` [PATCH v2 05/20] merge-recursive: remove useless parameter in merge_trees() Elijah Newren
2019-07-26 19:37     ` Junio C Hamano
2019-07-26 15:52   ` [PATCH v2 06/20] merge-recursive: don't force external callers to do our logging Elijah Newren
2019-07-26 19:57     ` Junio C Hamano
2019-07-26 23:28       ` Elijah Newren
2019-07-26 15:52   ` [PATCH v2 07/20] Use write_index_as_tree() in lieu of write_tree_from_memory() Elijah Newren
2019-07-26 20:11     ` Junio C Hamano
2019-07-26 23:39       ` Elijah Newren
2019-07-29  4:46         ` Junio C Hamano
2019-07-26 15:52   ` [PATCH v2 08/20] merge-recursive: fix some overly long lines Elijah Newren
2019-07-26 15:52   ` [PATCH v2 09/20] merge-recursive: use common name for ancestors/common/base_list Elijah Newren
2019-07-26 15:52   ` [PATCH v2 10/20] merge-recursive: rename 'mrtree' to 'result_tree', for clarity Elijah Newren
2019-07-26 15:52   ` [PATCH v2 11/20] merge-recursive: rename merge_options argument to opt in header Elijah Newren
2019-07-26 15:52   ` [PATCH v2 12/20] merge-recursive: move some definitions around to clean up the header Elijah Newren
2019-07-26 15:52   ` [PATCH v2 13/20] merge-recursive: consolidate unnecessary fields in merge_options Elijah Newren
2019-07-26 15:52   ` [PATCH v2 14/20] merge-recursive: comment and reorder the merge_options fields Elijah Newren
2019-07-26 15:52   ` [PATCH v2 15/20] merge-recursive: avoid losing output and leaking memory holding that output Elijah Newren
2019-07-26 15:52   ` [PATCH v2 16/20] merge-recursive: split internal fields into a separate struct Elijah Newren
2019-07-26 15:52   ` [PATCH v2 17/20] merge-recursive: alphabetize include list Elijah Newren
2019-07-26 15:52   ` [PATCH v2 18/20] merge-recursive: rename MERGE_RECURSIVE_* to MERGE_VARIANT_* Elijah Newren
2019-07-26 15:52   ` [PATCH v2 19/20] merge-recursive: be consistent with assert Elijah Newren
2019-07-26 15:52   ` [PATCH v2 20/20] merge-recursive: provide a better label for diff3 common ancestor Elijah Newren
2019-08-15 21:40   ` [PATCH v3 00/24] Clean up merge API Elijah Newren
2019-08-15 21:40     ` [PATCH v3 01/24] merge-recursive: be consistent with assert Elijah Newren
2019-08-15 21:40     ` [PATCH v3 02/24] checkout: provide better conflict hunk description with detached HEAD Elijah Newren
2019-08-15 21:40     ` [PATCH v3 03/24] merge-recursive: enforce opt->ancestor != NULL when calling merge_trees() Elijah Newren
2019-08-16 21:05       ` Junio C Hamano
2019-08-15 21:40     ` [PATCH v3 04/24] merge-recursive: provide a better label for diff3 common ancestor Elijah Newren
2019-08-16 21:33       ` Junio C Hamano
2019-08-16 22:39         ` Elijah Newren
2019-08-15 21:40     ` [PATCH v3 05/24] merge-recursive: introduce an enum for detect_directory_renames values Elijah Newren
2019-08-15 21:40     ` [PATCH v3 06/24] merge-recursive: future-proof update_file_flags() against memory leaks Elijah Newren
2019-08-15 21:40     ` [PATCH v3 07/24] merge-recursive: remove another implicit dependency on the_repository Elijah Newren
2019-08-15 21:40     ` [PATCH v3 08/24] Ensure index matches head before invoking merge machinery, round N Elijah Newren
2019-08-15 21:40     ` [PATCH v3 09/24] merge-recursive: exit early if index != head Elijah Newren
2019-08-15 21:40     ` [PATCH v3 10/24] merge-recursive: remove useless parameter in merge_trees() Elijah Newren
2019-08-15 21:40     ` [PATCH v3 11/24] merge-recursive: don't force external callers to do our logging Elijah Newren
2019-08-15 21:40     ` [PATCH v3 12/24] cache-tree: share code between functions writing an index as a tree Elijah Newren
2019-08-16 22:01       ` Junio C Hamano
2019-08-16 22:39         ` Elijah Newren
2019-08-15 21:40     ` [PATCH v3 13/24] merge-recursive: fix some overly long lines Elijah Newren
2019-08-15 21:40     ` [PATCH v3 14/24] merge-recursive: use common name for ancestors/common/base_list Elijah Newren
2019-08-15 21:40     ` [PATCH v3 15/24] merge-recursive: rename 'mrtree' to 'result_tree', for clarity Elijah Newren
2019-08-15 21:40     ` [PATCH v3 16/24] merge-recursive: rename merge_options argument to opt in header Elijah Newren
2019-08-15 21:40     ` [PATCH v3 17/24] merge-recursive: move some definitions around to clean up the header Elijah Newren
2019-08-15 21:40     ` [PATCH v3 18/24] merge-recursive: consolidate unnecessary fields in merge_options Elijah Newren
2019-08-16 22:14       ` Junio C Hamano
2019-08-16 22:59         ` Elijah Newren
2019-08-16 23:24           ` Junio C Hamano
2019-08-15 21:40     ` [PATCH v3 19/24] merge-recursive: comment and reorder the merge_options fields Elijah Newren
2019-08-15 21:40     ` [PATCH v3 20/24] merge-recursive: avoid losing output and leaking memory holding that output Elijah Newren
2019-08-15 21:40     ` [PATCH v3 21/24] merge-recursive: split internal fields into a separate struct Elijah Newren
2019-08-16 21:19       ` SZEDER Gábor
2019-08-16 23:00         ` Elijah Newren
2019-08-16 22:17       ` Junio C Hamano
2019-08-15 21:40     ` [PATCH v3 22/24] merge-recursive: rename MERGE_RECURSIVE_* to MERGE_VARIANT_* Elijah Newren
2019-08-15 21:40     ` [PATCH v3 23/24] merge-recursive: add sanity checks for relevant merge_options Elijah Newren
2019-08-16 19:52       ` Junio C Hamano
2019-08-16 22:08         ` Elijah Newren
2019-08-16 23:15           ` Junio C Hamano
2019-08-16 19:59       ` Junio C Hamano
2019-08-16 22:09         ` Elijah Newren
2019-08-15 21:40     ` [PATCH v3 24/24] merge-recursive: alphabetize include list Elijah Newren
2019-08-17 18:41     ` [PATCH v4 00/24] Clean up merge API Elijah Newren
2019-08-17 18:41       ` [PATCH v4 01/24] merge-recursive: be consistent with assert Elijah Newren
2019-08-17 18:41       ` [PATCH v4 02/24] checkout: provide better conflict hunk description with detached HEAD Elijah Newren
2019-08-17 18:41       ` [PATCH v4 03/24] merge-recursive: enforce opt->ancestor != NULL when calling merge_trees() Elijah Newren
2019-08-17 18:41       ` [PATCH v4 04/24] merge-recursive: provide a better label for diff3 common ancestor Elijah Newren
2019-09-30 21:14         ` Jeff King
2019-09-30 21:19           ` Jeff King
2019-09-30 22:54           ` [PATCH] merge-recursive: fix the diff3 common ancestor label for virtual commits Elijah Newren
2019-10-01  6:56             ` Elijah Newren
2019-10-01  6:58             ` [PATCH v2] " Elijah Newren
2019-10-01 14:49               ` Jeff King
2019-10-01 18:17                 ` [PATCH v3] " Elijah Newren
2019-10-07  2:51                   ` Junio C Hamano
2019-10-07 15:52                     ` [PATCH] merge-recursive: fix the fix to the diff3 common ancestor label Elijah Newren
2019-10-08  2:32                       ` Junio C Hamano
2019-10-08  2:36                       ` Junio C Hamano
2019-10-08 16:16                         ` Elijah Newren
2019-08-17 18:41       ` [PATCH v4 05/24] merge-recursive: introduce an enum for detect_directory_renames values Elijah Newren
2019-08-17 18:41       ` [PATCH v4 06/24] merge-recursive: future-proof update_file_flags() against memory leaks Elijah Newren
2019-08-17 18:41       ` [PATCH v4 07/24] merge-recursive: remove another implicit dependency on the_repository Elijah Newren
2019-08-17 18:41       ` [PATCH v4 08/24] Ensure index matches head before invoking merge machinery, round N Elijah Newren
2019-09-02 23:01         ` Johannes Schindelin
2019-09-03 13:34           ` Johannes Schindelin
2019-09-03 18:17             ` Elijah Newren [this message]
2019-08-17 18:41       ` [PATCH v4 09/24] merge-recursive: exit early if index != head Elijah Newren
2019-08-17 18:41       ` [PATCH v4 10/24] merge-recursive: remove useless parameter in merge_trees() Elijah Newren
2019-08-17 18:41       ` [PATCH v4 11/24] merge-recursive: don't force external callers to do our logging Elijah Newren
2019-08-17 18:41       ` [PATCH v4 12/24] cache-tree: share code between functions writing an index as a tree Elijah Newren
2019-08-17 18:41       ` [PATCH v4 13/24] merge-recursive: fix some overly long lines Elijah Newren
2019-08-17 18:41       ` [PATCH v4 14/24] merge-recursive: use common name for ancestors/common/base_list Elijah Newren
2019-08-17 18:41       ` [PATCH v4 15/24] merge-recursive: rename 'mrtree' to 'result_tree', for clarity Elijah Newren
2019-08-17 18:41       ` [PATCH v4 16/24] merge-recursive: rename merge_options argument to opt in header Elijah Newren
2019-08-17 18:41       ` [PATCH v4 17/24] merge-recursive: move some definitions around to clean up the header Elijah Newren
2019-08-17 18:41       ` [PATCH v4 18/24] merge-recursive: consolidate unnecessary fields in merge_options Elijah Newren
2019-08-17 18:41       ` [PATCH v4 19/24] merge-recursive: comment and reorder the merge_options fields Elijah Newren
2019-08-17 18:41       ` [PATCH v4 20/24] merge-recursive: avoid losing output and leaking memory holding that output Elijah Newren
2019-08-17 18:41       ` [PATCH v4 21/24] merge-recursive: split internal fields into a separate struct Elijah Newren
2019-08-19 17:17         ` Junio C Hamano
2019-08-17 18:41       ` [PATCH v4 22/24] merge-recursive: rename MERGE_RECURSIVE_* to MERGE_VARIANT_* Elijah Newren
2019-08-17 18:41       ` [PATCH v4 23/24] merge-recursive: add sanity checks for relevant merge_options Elijah Newren
2019-08-17 18:41       ` [PATCH v4 24/24] merge-recursive: alphabetize include list Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BF4yjZrD7-bXLa9tHgEws2zvN2zJxqnMqgtY6zm2Lvk3A@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=stolee@gmail.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).