git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Glen Choo <chooglen@google.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Bug in merge-ort (rename detection can have collisions?)
Date: Tue, 21 Jun 2022 21:30:31 -0700	[thread overview]
Message-ID: <CABPp-BGN0DoSr3bcjTmGZkcoj_dSVzOgFUQ++R=_z8v=nAJsTg@mail.gmail.com> (raw)
In-Reply-To: <kl6l35g87bga.fsf@chooglen-macbookpro.roam.corp.google.com>

Sorry for the long delay.  I haven't gotten much Git time lately...

On Mon, Jun 13, 2022 at 9:52 AM Glen Choo <chooglen@google.com> wrote:
>
[...]
> Elijah Newren <newren@gmail.com> writes:
>
> > On Fri, Jun 10, 2022 at 9:53 AM Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >> Elijah Newren <newren@gmail.com> writes:
> >>
> >> > On Tue, Jun 7, 2022 at 5:11 PM Glen Choo <chooglen@google.com> wrote:
> >> >>
> >> >> (I'm not 100% what the bug _is_, only that there is one.)
> >> >>
> >> >> = Report
> >> >>
> >> >> At $DAYJOB, there was a report that "git merge" was failing on certain
> >> >> branches. Fortunately, the repo is publicly accessible, so I can share
> >> >> the full reproduction recipe:
> >> >> ...
> >> > Thanks for the detailed report; very cool.  Interestingly, if you
> >> > reverse the direction of the merge (checkout origin/upstream-master
> >> > and merge origin/master) then you get a different error:
> >> > ...
> >> > Anyway, long story short...I don't have a fix yet, but just thought
> >> > I'd mention I saw the email and spent some hours digging in.
> >>
> >> Thanks for continued support for the ort strategy.  From the very
> >> beginning, I was hesitant to make our tools try to be too clever
> >> with excessive heuristics, but at least we are not making a silent
> >> mismerge in this case, so it is probably OK, especially with "-s
> >> recursive" still left as an escape hatch.
> >
> > In fact, the more general problem area here appears to affect the
> > recursive strategy as well.  I'm glad the specific testcase reported
> > works under recursive and gave Glen (or his user) a workaround, but
> > that feels like luck rather than design because my minimal
> > reproduction testcase not only triggers the same issue he saw with the
> > ort strategy, but also triggers a previously unknown fatal bug in the
> > recursive strategy too.
>
> Yeah, hm. I was surprised that we encountered this bug at all, but it's
> not so surprising after seeing how many edge conditions this bug
> contains.

To be fair, I've dug into cases with more.  :-)

> I wonder if there other rename detection bugs lurking beyond the
> horizon,

I was trying to dig around for related issues so I can fix the class
of problems rather than just the instance.  Reversing the direction of
the merge was just one component of that (and I reported that
particular tweak since it triggered something a little different).

The original motivation for writing merge-ort was to address bugs I
couldn't otherwise fix within merge-recursive's implementation.  I've
put a lot of time into corner cases, many of which (perhaps even the
majority) were not actually motivated by real-life testcases but me
just having an obsession with making Git's merge machinery handle
weird inputs.  Junio even commented on some of my testcases with 'I am
not sure if there is a single "correct" answer everybody can agree on
for each of these "insane" cases, though.'.  Now, obviously I can miss
some inputs, as evidenced by the issue you reported, so there is
always a chance there are more.  However...

> whether we already assume that these bugs exist, and if so,
> whether we should officially document "merge without rename detection"
> as a workaround [1].
>
> [1] Assuming that the workaround works of course. I tried to disable
> rename detection several times, but I couldn't really figure out whether
> I did it correctly or whether it fixed the bug (which is why I didn't
> include it in the initial report.)

Turning off renames and relying on users to correct merge issues may
be reasonable when there are only a few.  When there are more than a
few, my experience in the past with turning off rename detection (or
there being too many renames that rename detection turns itself off)
is that users often:

  * don't match up renamed files and do a three-way merge, but just
pick one of the two conflicting sides, unknowingly discarding changes
made on the other side
  * sometimes notice the files that should have been renames, and
manually hand apply the subset of changes they remember from one file
to the other, and unknowingly discarding the remaining subset of
changes (which were often changes made by people other than the one
doing the merges).

In the particular repository in question, you've got 600+ renames on
one side, and 200+ on the other -- including multiple different entire
directories.  (Also, since lack of rename detection makes renames get
reported as modify/delete conflicts, and you've got 400+ actual
modify/delete conflicts on top of all the renames, users would have
lots of "fun" attempting to sort things all out without tool support.)
 So, I'm worried the "fallback"/"workaround" is likely to put users in
a worse situation rather than a better one.

But, even if your goal really is to have a fallback, why not just use
the `resolve` strategy?  Your testcase doesn't have multiple merge
bases, and the resolve strategy is roughly the recursive strategy
minus the renames and the multiple merge base handling.

(Also, I'm not just avoiding work.  I have already written patches to
turn off rename detection in the ort strategy.  I pointed these out to
Stolee and Dscho for a special internal usecase of theirs, and at
least one of those emails cc'ed the mailing list. so you should be
able to find those patches with a search.  I'm just not convinced of
the need to merge those patches.)



Anyway, that all said, I posted a fix for this issue over at
https://lore.kernel.org/git/pull.1268.git.1655871651.gitgitgadget@gmail.com/.
With it, I can repeat the tensorflow merge you highlighted, in either
direction, without issue.

  reply	other threads:[~2022-06-22  4:30 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-08  0:11 Bug in merge-ort (rename detection can have collisions?) Glen Choo
2022-06-10  6:41 ` Elijah Newren
2022-06-10 16:53   ` Junio C Hamano
2022-06-11  8:56     ` Elijah Newren
2022-06-13 16:52       ` Glen Choo
2022-06-22  4:30         ` Elijah Newren [this message]
2022-06-22 16:58           ` Glen Choo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABPp-BGN0DoSr3bcjTmGZkcoj_dSVzOgFUQ++R=_z8v=nAJsTg@mail.gmail.com' \
    --to=newren@gmail.com \
    --cc=chooglen@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).