git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Bug in merge-ort (rename detection can have collisions?)
@ 2022-06-08  0:11 Glen Choo
  2022-06-10  6:41 ` Elijah Newren
  0 siblings, 1 reply; 7+ messages in thread
From: Glen Choo @ 2022-06-08  0:11 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren


(I'm not 100% what the bug _is_, only that there is one.)

= Report

At $DAYJOB, there was a report that "git merge" was failing on certain
branches. Fortunately, the repo is publicly accessible, so I can share
the full reproduction recipe:

  git clone https://android.googlesource.com/platform/external/tensorflow tensorflow &&
  cd tensorflow &&
  git merge origin/upstream-master # HEAD is at origin/master

This gives:

  Performing inexact rename detection: 100% (4371280/4371280), done.
  Performing inexact rename detection: 100% (12529218/12529218), done.
  Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410. 

This bug seems specific to merge-ort; "git merge -s recursive
origin/upstream-master" seems to work as expected.

In case the branches have changed since then, here are the commit ids:

  $ git rev-parse origin/master
  68e55281824e8a79fa67e1a3061f39bd4c4b2e57
  $ git rev-parse origin/upstream-master
  0be5bb09aeeff3a6825842326fadc8159a5553ab
  $ git merge-base 68e55281824e8a79fa67e1a3061f39bd4c4b2e57 0be5bb09aeeff3a6825842326fadc8159a5553ab
  8e819019081f39d83df42baba4acfced3abf3f90

= Interesting info

I don't understand the merge-ort code enough to understand what's going
on, but I was able to find some (hopefully helpful) details. I added
this log line just above the offending assert() call:

	trace2_printf("0 %s, 1 %s, 2 %s, fm %d, dm %d", ci->pathnames[0],
    ci->pathnames[1], ci->pathnames[2], ci->filemask, ci->dirmask);

Here are the lines I thought were suspicious:

  0 <path1>, 1 <path1>, 2 <path1>, fm 2, dm 0
  [...]
  0 <path2>, 1 <path1>, 2 <path2>, fm 6, dm 0 # this is the last line

Notice that the last line detected a rename from <path2> to <path1>, but
we already saw <path1> earlier.

IIUC "(ci->filemask == 2 || ci->filemask == 4)" can be read as "the path
either exists on only the left side or only the right side of the
merge", so ci->filemask == 6 should mean "the path exists on both sides
of the merge"?

"-s recursive" seems to handle the rename just fine (it picks <path2>
IIRC).

I also dug into each commit to see which paths were present:

  head="origin/master"
  other="origin/upstream-master"
  merge_base="$(git merge-base origin/master origin/upstream-master)"
  path1="tensorflow/lite/g3doc/convert/metadata_writer_tutorial.ipynb"
  path2="tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb"

  git rev-parse "$head:$path1" # (exists)
  git rev-parse "$head:$path2" # (doesn't exist)

  git rev-parse "$other:$path1" # (doesn't exist)
  git rev-parse "$other:$path2" # (exists)

  git rev-parse "$merge_base:$path1" # (doesn't exist)
  git rev-parse "$merge_base:$path2" # (doesn't exist)

i.e. both files are new and are renames of each other. I haven't tried
using this property to create a minimally-reproducing recipe though.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug in merge-ort (rename detection can have collisions?)
  2022-06-08  0:11 Bug in merge-ort (rename detection can have collisions?) Glen Choo
@ 2022-06-10  6:41 ` Elijah Newren
  2022-06-10 16:53   ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Elijah Newren @ 2022-06-10  6:41 UTC (permalink / raw)
  To: Glen Choo; +Cc: Git Mailing List

On Tue, Jun 7, 2022 at 5:11 PM Glen Choo <chooglen@google.com> wrote:
>
> (I'm not 100% what the bug _is_, only that there is one.)
>
> = Report
>
> At $DAYJOB, there was a report that "git merge" was failing on certain
> branches. Fortunately, the repo is publicly accessible, so I can share
> the full reproduction recipe:
>
>   git clone https://android.googlesource.com/platform/external/tensorflow tensorflow &&
>   cd tensorflow &&
>   git merge origin/upstream-master # HEAD is at origin/master
>
> This gives:
>
>   Performing inexact rename detection: 100% (4371280/4371280), done.
>   Performing inexact rename detection: 100% (12529218/12529218), done.
>   Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410.
>
> This bug seems specific to merge-ort; "git merge -s recursive
> origin/upstream-master" seems to work as expected.
>
> In case the branches have changed since then, here are the commit ids:
>
>   $ git rev-parse origin/master
>   68e55281824e8a79fa67e1a3061f39bd4c4b2e57
>   $ git rev-parse origin/upstream-master
>   0be5bb09aeeff3a6825842326fadc8159a5553ab
>   $ git merge-base 68e55281824e8a79fa67e1a3061f39bd4c4b2e57 0be5bb09aeeff3a6825842326fadc8159a5553ab
>   8e819019081f39d83df42baba4acfced3abf3f90
>
> = Interesting info
>
> I don't understand the merge-ort code enough to understand what's going
> on, but I was able to find some (hopefully helpful) details. I added
> this log line just above the offending assert() call:
>
>         trace2_printf("0 %s, 1 %s, 2 %s, fm %d, dm %d", ci->pathnames[0],
>     ci->pathnames[1], ci->pathnames[2], ci->filemask, ci->dirmask);
>
> Here are the lines I thought were suspicious:
>
>   0 <path1>, 1 <path1>, 2 <path1>, fm 2, dm 0
>   [...]
>   0 <path2>, 1 <path1>, 2 <path2>, fm 6, dm 0 # this is the last line
>
> Notice that the last line detected a rename from <path2> to <path1>, but
> we already saw <path1> earlier.
>
> IIUC "(ci->filemask == 2 || ci->filemask == 4)" can be read as "the path
> either exists on only the left side or only the right side of the
> merge", so ci->filemask == 6 should mean "the path exists on both sides
> of the merge"?
>
> "-s recursive" seems to handle the rename just fine (it picks <path2>
> IIRC).
>
> I also dug into each commit to see which paths were present:
>
>   head="origin/master"
>   other="origin/upstream-master"
>   merge_base="$(git merge-base origin/master origin/upstream-master)"
>   path1="tensorflow/lite/g3doc/convert/metadata_writer_tutorial.ipynb"
>   path2="tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb"
>
>   git rev-parse "$head:$path1" # (exists)
>   git rev-parse "$head:$path2" # (doesn't exist)
>
>   git rev-parse "$other:$path1" # (doesn't exist)
>   git rev-parse "$other:$path2" # (exists)
>
>   git rev-parse "$merge_base:$path1" # (doesn't exist)
>   git rev-parse "$merge_base:$path2" # (doesn't exist)
>
> i.e. both files are new and are renames of each other. I haven't tried
> using this property to create a minimally-reproducing recipe though.

Thanks for the detailed report; very cool.  Interestingly, if you
reverse the direction of the merge (checkout origin/upstream-master
and merge origin/master) then you get a different error:

    error: cache entry has null sha1:
tensorflow/lite/g3doc/examples/convert/metadata_writer_tutorial.ipynb
    fatal: unable to write .git/index

This merge has a lot of stuff going on, including some big directory
renames, new files, loads of conflicts, etc.  I think the relevant
bits are the following:

merge base -> side1:
    Rename tensorflow/lite/g3doc/models/ -> tensorflow/lite/g3doc/examples
    Add tensorflow/lite/g3doc/convert/metadata_writer_tutorial.ipynb

merge base -> side2:
    Rename tensorflow/lite/g3doc/convert/ ->
tensorflow/lite/g3doc/models/convert/
    Add tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb

So the combination of the above means:
    side1:tensorflow/lite/g3doc/convert/metadata_writer_tutorial.ipynb
would be affected by the side2 directory rename to become
    tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb
which would match the name of the file on side2, BUT the side2 version of
that file, i.e.:
    side2:tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb
would be affected by the side1 directory rename to become
    tensorflow/lite/g3doc/examples/convert/metadata_writer_tutorial.ipynb
and what becomes of the side1 version of that file?  It gets messy...


I have a small reproduction recipe (using the style from t6423 to explain it):

#   Commit O: sub1/file,                 sub2/other
#   Commit A: sub3/file,                 sub2/{other, new_add_add_file_1}
#   Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2}
#
#   In words:
#     A: sub1/ -> sub3/, add sub2/new_add_add_file_1
#     B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2

This small reproduction recipe triggers the same assertion-failure you
reported; further, when the merge direction is reversed this testcase
shows the same alternative error I showed above for your bigger
testcase.  However, interestingly, this simple testcase also triggers
those same null-sha1 and unable-to-write-.git/index errors in
merge-recursive, regardless of the direction of the merge.  I don't
know why my testcase triggers bugs in merge-recursive and your bigger
testcase doesn't, but I wasn't too motivated to find out either; I was
mostly focused on trying to understand the merge-ort side of things.

Now, there's code in both merge-recursive and merge-ort for avoiding
doubly transitive renames (i.e. one side renames directory A/ -> B/,
and the other side renames directory B/ -> C/, and even worse if the
original side also renames C/ -> D/, because this combination would
otherwise make a mess for new files added to A/ on the first side and
wondering which directory they end up in).  However, this is a
testcase where a _leading directory_ of B/ is renamed to C/, which
sidesteps the normal doubly transitive rename check, and then the code
heads down what looks like the wrong path until it is caught by the
assertion check you reported.  In addition to the
funny-doubly-transitive-rename (involving the leading directory), your
testcase also has both sides add a file, one side to A/ and the other
side to B/ (with the same basename but different file contents), which
adds to the "fun".

Anyway, long story short...I don't have a fix yet, but just thought
I'd mention I saw the email and spent some hours digging in.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug in merge-ort (rename detection can have collisions?)
  2022-06-10  6:41 ` Elijah Newren
@ 2022-06-10 16:53   ` Junio C Hamano
  2022-06-11  8:56     ` Elijah Newren
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2022-06-10 16:53 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Glen Choo, Git Mailing List

Elijah Newren <newren@gmail.com> writes:

> On Tue, Jun 7, 2022 at 5:11 PM Glen Choo <chooglen@google.com> wrote:
>>
>> (I'm not 100% what the bug _is_, only that there is one.)
>>
>> = Report
>>
>> At $DAYJOB, there was a report that "git merge" was failing on certain
>> branches. Fortunately, the repo is publicly accessible, so I can share
>> the full reproduction recipe:
>> ...
> Thanks for the detailed report; very cool.  Interestingly, if you
> reverse the direction of the merge (checkout origin/upstream-master
> and merge origin/master) then you get a different error:
> ...
> Anyway, long story short...I don't have a fix yet, but just thought
> I'd mention I saw the email and spent some hours digging in.

Thanks for continued support for the ort strategy.  From the very
beginning, I was hesitant to make our tools try to be too clever
with excessive heuristics, but at least we are not making a silent
mismerge in this case, so it is probably OK, especially with "-s
recursive" still left as an escape hatch.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug in merge-ort (rename detection can have collisions?)
  2022-06-10 16:53   ` Junio C Hamano
@ 2022-06-11  8:56     ` Elijah Newren
  2022-06-13 16:52       ` Glen Choo
  0 siblings, 1 reply; 7+ messages in thread
From: Elijah Newren @ 2022-06-11  8:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Glen Choo, Git Mailing List

On Fri, Jun 10, 2022 at 9:53 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > On Tue, Jun 7, 2022 at 5:11 PM Glen Choo <chooglen@google.com> wrote:
> >>
> >> (I'm not 100% what the bug _is_, only that there is one.)
> >>
> >> = Report
> >>
> >> At $DAYJOB, there was a report that "git merge" was failing on certain
> >> branches. Fortunately, the repo is publicly accessible, so I can share
> >> the full reproduction recipe:
> >> ...
> > Thanks for the detailed report; very cool.  Interestingly, if you
> > reverse the direction of the merge (checkout origin/upstream-master
> > and merge origin/master) then you get a different error:
> > ...
> > Anyway, long story short...I don't have a fix yet, but just thought
> > I'd mention I saw the email and spent some hours digging in.
>
> Thanks for continued support for the ort strategy.  From the very
> beginning, I was hesitant to make our tools try to be too clever
> with excessive heuristics, but at least we are not making a silent
> mismerge in this case, so it is probably OK, especially with "-s
> recursive" still left as an escape hatch.

I'm pretty sure the bug would still trigger even if we removed all the
heuristic differences that the ort strategy has relative to the
recursive one; I don't think those are related to this problem at all.

In fact, the more general problem area here appears to affect the
recursive strategy as well.  I'm glad the specific testcase reported
works under recursive and gave Glen (or his user) a workaround, but
that feels like luck rather than design because my minimal
reproduction testcase not only triggers the same issue he saw with the
ort strategy, but also triggers a previously unknown fatal bug in the
recursive strategy too.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug in merge-ort (rename detection can have collisions?)
  2022-06-11  8:56     ` Elijah Newren
@ 2022-06-13 16:52       ` Glen Choo
  2022-06-22  4:30         ` Elijah Newren
  0 siblings, 1 reply; 7+ messages in thread
From: Glen Choo @ 2022-06-13 16:52 UTC (permalink / raw)
  To: Elijah Newren, Junio C Hamano; +Cc: Git Mailing List


Thanks Elijah, your explanation is really great :)
This looks like a pretty nasty bug. Ugh..

Elijah Newren <newren@gmail.com> writes:

> On Fri, Jun 10, 2022 at 9:53 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Elijah Newren <newren@gmail.com> writes:
>>
>> > On Tue, Jun 7, 2022 at 5:11 PM Glen Choo <chooglen@google.com> wrote:
>> >>
>> >> (I'm not 100% what the bug _is_, only that there is one.)
>> >>
>> >> = Report
>> >>
>> >> At $DAYJOB, there was a report that "git merge" was failing on certain
>> >> branches. Fortunately, the repo is publicly accessible, so I can share
>> >> the full reproduction recipe:
>> >> ...
>> > Thanks for the detailed report; very cool.  Interestingly, if you
>> > reverse the direction of the merge (checkout origin/upstream-master
>> > and merge origin/master) then you get a different error:
>> > ...
>> > Anyway, long story short...I don't have a fix yet, but just thought
>> > I'd mention I saw the email and spent some hours digging in.
>>
>> Thanks for continued support for the ort strategy.  From the very
>> beginning, I was hesitant to make our tools try to be too clever
>> with excessive heuristics, but at least we are not making a silent
>> mismerge in this case, so it is probably OK, especially with "-s
>> recursive" still left as an escape hatch.
>
> In fact, the more general problem area here appears to affect the
> recursive strategy as well.  I'm glad the specific testcase reported
> works under recursive and gave Glen (or his user) a workaround, but
> that feels like luck rather than design because my minimal
> reproduction testcase not only triggers the same issue he saw with the
> ort strategy, but also triggers a previously unknown fatal bug in the
> recursive strategy too.

Yeah, hm. I was surprised that we encountered this bug at all, but it's
not so surprising after seeing how many edge conditions this bug
contains.

I wonder if there other rename detection bugs lurking beyond the
horizon, whether we already assume that these bugs exist, and if so,
whether we should officially document "merge without rename detection"
as a workaround [1].

[1] Assuming that the workaround works of course. I tried to disable
rename detection several times, but I couldn't really figure out whether
I did it correctly or whether it fixed the bug (which is why I didn't
include it in the initial report.)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug in merge-ort (rename detection can have collisions?)
  2022-06-13 16:52       ` Glen Choo
@ 2022-06-22  4:30         ` Elijah Newren
  2022-06-22 16:58           ` Glen Choo
  0 siblings, 1 reply; 7+ messages in thread
From: Elijah Newren @ 2022-06-22  4:30 UTC (permalink / raw)
  To: Glen Choo; +Cc: Junio C Hamano, Git Mailing List

Sorry for the long delay.  I haven't gotten much Git time lately...

On Mon, Jun 13, 2022 at 9:52 AM Glen Choo <chooglen@google.com> wrote:
>
[...]
> Elijah Newren <newren@gmail.com> writes:
>
> > On Fri, Jun 10, 2022 at 9:53 AM Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >> Elijah Newren <newren@gmail.com> writes:
> >>
> >> > On Tue, Jun 7, 2022 at 5:11 PM Glen Choo <chooglen@google.com> wrote:
> >> >>
> >> >> (I'm not 100% what the bug _is_, only that there is one.)
> >> >>
> >> >> = Report
> >> >>
> >> >> At $DAYJOB, there was a report that "git merge" was failing on certain
> >> >> branches. Fortunately, the repo is publicly accessible, so I can share
> >> >> the full reproduction recipe:
> >> >> ...
> >> > Thanks for the detailed report; very cool.  Interestingly, if you
> >> > reverse the direction of the merge (checkout origin/upstream-master
> >> > and merge origin/master) then you get a different error:
> >> > ...
> >> > Anyway, long story short...I don't have a fix yet, but just thought
> >> > I'd mention I saw the email and spent some hours digging in.
> >>
> >> Thanks for continued support for the ort strategy.  From the very
> >> beginning, I was hesitant to make our tools try to be too clever
> >> with excessive heuristics, but at least we are not making a silent
> >> mismerge in this case, so it is probably OK, especially with "-s
> >> recursive" still left as an escape hatch.
> >
> > In fact, the more general problem area here appears to affect the
> > recursive strategy as well.  I'm glad the specific testcase reported
> > works under recursive and gave Glen (or his user) a workaround, but
> > that feels like luck rather than design because my minimal
> > reproduction testcase not only triggers the same issue he saw with the
> > ort strategy, but also triggers a previously unknown fatal bug in the
> > recursive strategy too.
>
> Yeah, hm. I was surprised that we encountered this bug at all, but it's
> not so surprising after seeing how many edge conditions this bug
> contains.

To be fair, I've dug into cases with more.  :-)

> I wonder if there other rename detection bugs lurking beyond the
> horizon,

I was trying to dig around for related issues so I can fix the class
of problems rather than just the instance.  Reversing the direction of
the merge was just one component of that (and I reported that
particular tweak since it triggered something a little different).

The original motivation for writing merge-ort was to address bugs I
couldn't otherwise fix within merge-recursive's implementation.  I've
put a lot of time into corner cases, many of which (perhaps even the
majority) were not actually motivated by real-life testcases but me
just having an obsession with making Git's merge machinery handle
weird inputs.  Junio even commented on some of my testcases with 'I am
not sure if there is a single "correct" answer everybody can agree on
for each of these "insane" cases, though.'.  Now, obviously I can miss
some inputs, as evidenced by the issue you reported, so there is
always a chance there are more.  However...

> whether we already assume that these bugs exist, and if so,
> whether we should officially document "merge without rename detection"
> as a workaround [1].
>
> [1] Assuming that the workaround works of course. I tried to disable
> rename detection several times, but I couldn't really figure out whether
> I did it correctly or whether it fixed the bug (which is why I didn't
> include it in the initial report.)

Turning off renames and relying on users to correct merge issues may
be reasonable when there are only a few.  When there are more than a
few, my experience in the past with turning off rename detection (or
there being too many renames that rename detection turns itself off)
is that users often:

  * don't match up renamed files and do a three-way merge, but just
pick one of the two conflicting sides, unknowingly discarding changes
made on the other side
  * sometimes notice the files that should have been renames, and
manually hand apply the subset of changes they remember from one file
to the other, and unknowingly discarding the remaining subset of
changes (which were often changes made by people other than the one
doing the merges).

In the particular repository in question, you've got 600+ renames on
one side, and 200+ on the other -- including multiple different entire
directories.  (Also, since lack of rename detection makes renames get
reported as modify/delete conflicts, and you've got 400+ actual
modify/delete conflicts on top of all the renames, users would have
lots of "fun" attempting to sort things all out without tool support.)
 So, I'm worried the "fallback"/"workaround" is likely to put users in
a worse situation rather than a better one.

But, even if your goal really is to have a fallback, why not just use
the `resolve` strategy?  Your testcase doesn't have multiple merge
bases, and the resolve strategy is roughly the recursive strategy
minus the renames and the multiple merge base handling.

(Also, I'm not just avoiding work.  I have already written patches to
turn off rename detection in the ort strategy.  I pointed these out to
Stolee and Dscho for a special internal usecase of theirs, and at
least one of those emails cc'ed the mailing list. so you should be
able to find those patches with a search.  I'm just not convinced of
the need to merge those patches.)



Anyway, that all said, I posted a fix for this issue over at
https://lore.kernel.org/git/pull.1268.git.1655871651.gitgitgadget@gmail.com/.
With it, I can repeat the tensorflow merge you highlighted, in either
direction, without issue.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug in merge-ort (rename detection can have collisions?)
  2022-06-22  4:30         ` Elijah Newren
@ 2022-06-22 16:58           ` Glen Choo
  0 siblings, 0 replies; 7+ messages in thread
From: Glen Choo @ 2022-06-22 16:58 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Junio C Hamano, Git Mailing List

Elijah Newren <newren@gmail.com> writes:

> Sorry for the long delay.  I haven't gotten much Git time lately...
>
> On Mon, Jun 13, 2022 at 9:52 AM Glen Choo <chooglen@google.com> wrote:
>>
> [...]
>> Elijah Newren <newren@gmail.com> writes:
>>
>> > On Fri, Jun 10, 2022 at 9:53 AM Junio C Hamano <gitster@pobox.com> wrote:
>> >>
>> >> Elijah Newren <newren@gmail.com> writes:
>> >>
>> >> > On Tue, Jun 7, 2022 at 5:11 PM Glen Choo <chooglen@google.com> wrote:
>> >> >>
>> >> >> (I'm not 100% what the bug _is_, only that there is one.)
>> >> >>
>> >> >> = Report
>> >> >>
>> >> >> At $DAYJOB, there was a report that "git merge" was failing on certain
>> >> >> branches. Fortunately, the repo is publicly accessible, so I can share
>> >> >> the full reproduction recipe:
>> >> >> ...
>> >> > Thanks for the detailed report; very cool.  Interestingly, if you
>> >> > reverse the direction of the merge (checkout origin/upstream-master
>> >> > and merge origin/master) then you get a different error:
>> >> > ...
>> >> > Anyway, long story short...I don't have a fix yet, but just thought
>> >> > I'd mention I saw the email and spent some hours digging in.
>> >>
>> >> Thanks for continued support for the ort strategy.  From the very
>> >> beginning, I was hesitant to make our tools try to be too clever
>> >> with excessive heuristics, but at least we are not making a silent
>> >> mismerge in this case, so it is probably OK, especially with "-s
>> >> recursive" still left as an escape hatch.
>> >
>> > In fact, the more general problem area here appears to affect the
>> > recursive strategy as well.  I'm glad the specific testcase reported
>> > works under recursive and gave Glen (or his user) a workaround, but
>> > that feels like luck rather than design because my minimal
>> > reproduction testcase not only triggers the same issue he saw with the
>> > ort strategy, but also triggers a previously unknown fatal bug in the
>> > recursive strategy too.
>>
>> Yeah, hm. I was surprised that we encountered this bug at all, but it's
>> not so surprising after seeing how many edge conditions this bug
>> contains.
>
> To be fair, I've dug into cases with more.  :-)

:)

>> I wonder if there other rename detection bugs lurking beyond the
>> horizon,
>
> I was trying to dig around for related issues so I can fix the class
> of problems rather than just the instance.  Reversing the direction of
> the merge was just one component of that (and I reported that
> particular tweak since it triggered something a little different).
>
> The original motivation for writing merge-ort was to address bugs I
> couldn't otherwise fix within merge-recursive's implementation.  I've
> put a lot of time into corner cases, many of which (perhaps even the
> majority) were not actually motivated by real-life testcases but me
> just having an obsession with making Git's merge machinery handle
> weird inputs.  Junio even commented on some of my testcases with 'I am
> not sure if there is a single "correct" answer everybody can agree on
> for each of these "insane" cases, though.'.  Now, obviously I can miss
> some inputs, as evidenced by the issue you reported, so there is
> always a chance there are more.  However...
>
>> whether we already assume that these bugs exist, and if so,
>> whether we should officially document "merge without rename detection"
>> as a workaround [1].
>>
>> [1] Assuming that the workaround works of course. I tried to disable
>> rename detection several times, but I couldn't really figure out whether
>> I did it correctly or whether it fixed the bug (which is why I didn't
>> include it in the initial report.)
>
> Turning off renames and relying on users to correct merge issues may
> be reasonable when there are only a few.  When there are more than a
> few, my experience in the past with turning off rename detection (or
> there being too many renames that rename detection turns itself off)
> is that users often:
>
>   * don't match up renamed files and do a three-way merge, but just
> pick one of the two conflicting sides, unknowingly discarding changes
> made on the other side
>   * sometimes notice the files that should have been renames, and
> manually hand apply the subset of changes they remember from one file
> to the other, and unknowingly discarding the remaining subset of
> changes (which were often changes made by people other than the one
> doing the merges).
>
> In the particular repository in question, you've got 600+ renames on
> one side, and 200+ on the other -- including multiple different entire
> directories.  (Also, since lack of rename detection makes renames get
> reported as modify/delete conflicts, and you've got 400+ actual
> modify/delete conflicts on top of all the renames, users would have
> lots of "fun" attempting to sort things all out without tool support.)
>  So, I'm worried the "fallback"/"workaround" is likely to put users in
> a worse situation rather than a better one.

Ah, that's a good point. I hadn't given too much thought to the
complexity of manually resolving a merge without rename detection, but,
like you said, it's probably not a useful fallback anyway.


> But, even if your goal really is to have a fallback, why not just use
> the `resolve` strategy?  Your testcase doesn't have multiple merge
> bases, and the resolve strategy is roughly the recursive strategy
> minus the renames and the multiple merge base handling.

Oh, today I learned about the `resolve` strategy..


> (Also, I'm not just avoiding work.  I have already written patches to
> turn off rename detection in the ort strategy.  I pointed these out to
> Stolee and Dscho for a special internal usecase of theirs, and at
> least one of those emails cc'ed the mailing list. so you should be
> able to find those patches with a search.  I'm just not convinced of
> the need to merge those patches.)
>
>
>
> Anyway, that all said, I posted a fix for this issue over at
> https://lore.kernel.org/git/pull.1268.git.1655871651.gitgitgadget@gmail.com/.
> With it, I can repeat the tensorflow merge you highlighted, in either
> direction, without issue.

That's fantastic. Thanks for keeping me in the loop :)

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-06-22 16:59 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-08  0:11 Bug in merge-ort (rename detection can have collisions?) Glen Choo
2022-06-10  6:41 ` Elijah Newren
2022-06-10 16:53   ` Junio C Hamano
2022-06-11  8:56     ` Elijah Newren
2022-06-13 16:52       ` Glen Choo
2022-06-22  4:30         ` Elijah Newren
2022-06-22 16:58           ` Glen Choo

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).