git@vger.kernel.org mailing list mirror (one of many)
From: Glen Choo <chooglen@google.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>
Subject: Bug in merge-ort (rename detection can have collisions?)
Date: Tue, 07 Jun 2022 17:11:09 -0700
Message-ID: <kl6lee006mle.fsf@chooglen-macbookpro.roam.corp.google.com>


(I'm not 100% sure what the bug _is_, only that there is one.)

= Report

At $DAYJOB, there was a report that "git merge" was failing on certain
branches. Fortunately, the repo is publicly accessible, so I can share
the full reproduction recipe:

  git clone https://android.googlesource.com/platform/external/tensorflow tensorflow &&
  cd tensorflow &&
  git merge origin/upstream-master # HEAD is at origin/master

This gives:

  Performing inexact rename detection: 100% (4371280/4371280), done.
  Performing inexact rename detection: 100% (12529218/12529218), done.
  Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410. 

This bug seems specific to merge-ort; "git merge -s recursive
origin/upstream-master" seems to work as expected.

In case the branches have changed since then, here are the commit ids:

  $ git rev-parse origin/master
  68e55281824e8a79fa67e1a3061f39bd4c4b2e57
  $ git rev-parse origin/upstream-master
  0be5bb09aeeff3a6825842326fadc8159a5553ab
  $ git merge-base 68e55281824e8a79fa67e1a3061f39bd4c4b2e57 0be5bb09aeeff3a6825842326fadc8159a5553ab
  8e819019081f39d83df42baba4acfced3abf3f90
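
If the branches have moved, the same merge can presumably be replayed from
the recorded commit ids instead of the branch names. A minimal sketch,
assuming it is run inside the clone from the recipe above; merge_pinned is
a hypothetical helper name, not a git command:

```shell
# Commit ids copied from the report above.
ours=68e55281824e8a79fa67e1a3061f39bd4c4b2e57
theirs=0be5bb09aeeff3a6825842326fadc8159a5553ab

# Hypothetical helper: detach HEAD at "$1", then merge "$2" into it.
merge_pinned() {
	git checkout --detach "$1" &&
	git merge "$2"
}

# Inside the cloned tensorflow repository:
#   merge_pinned "$ours" "$theirs"
```

Detaching HEAD keeps the local branches untouched, so the attempt can be
repeated without resetting anything.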

= Interesting info

I don't understand the merge-ort code well enough to know what's going
on, but I was able to find some (hopefully helpful) details. I added
this log line just above the offending assert() call:

  trace2_printf("0 %s, 1 %s, 2 %s, fm %d, dm %d", ci->pathnames[0],
                ci->pathnames[1], ci->pathnames[2], ci->filemask, ci->dirmask);

Here are the lines I thought were suspicious:

  0 <path1>, 1 <path1>, 2 <path1>, fm 2, dm 0
  [...]
  0 <path2>, 1 <path1>, 2 <path2>, fm 6, dm 0 # this is the last line

Notice that the last line detected a rename from <path2> to <path1>, but
we already saw <path1> earlier.

IIUC "(ci->filemask == 2 || ci->filemask == 4)" can be read as "the path
either exists on only the left side or only the right side of the
merge", so ci->filemask == 6 should mean "the path exists on both sides
of the merge"?
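
To make that reading concrete, here is a small shell sketch that decodes a
mask value into the sides it names. It assumes the bit layout 1 = merge
base, 2 = side 1, 4 = side 2, which is consistent with the reading above
but should be checked against merge-ort.c; describe_filemask is a
hypothetical helper, not part of git:

```shell
# Assumed bit layout: 1 = merge base, 2 = side 1, 4 = side 2.
describe_filemask() {
	mask=$1
	out=
	if [ $((mask & 1)) -ne 0 ]; then out="$out base"; fi
	if [ $((mask & 2)) -ne 0 ]; then out="$out side1"; fi
	if [ $((mask & 4)) -ne 0 ]; then out="$out side2"; fi
	# Strip the leading space left by the first append.
	echo "${out# }"
}

describe_filemask 2	# prints "side1"
describe_filemask 6	# prints "side1 side2"
```

Under that assumption, the "fm 6" in the last trace line says the entry is
a file on both sides of the merge and absent from the merge base, which is
exactly the case the assertion rules out.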

"-s recursive" seems to handle the rename just fine (it picks <path2>
IIRC).

I also dug into each commit to see which paths were present:

  head="origin/master"
  other="origin/upstream-master"
  merge_base="$(git merge-base origin/master origin/upstream-master)"
  path1="tensorflow/lite/g3doc/convert/metadata_writer_tutorial.ipynb"
  path2="tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb"

  git rev-parse "$head:$path1" # (exists)
  git rev-parse "$head:$path2" # (doesn't exist)

  git rev-parse "$other:$path1" # (doesn't exist)
  git rev-parse "$other:$path2" # (exists)

  git rev-parse "$merge_base:$path1" # (doesn't exist)
  git rev-parse "$merge_base:$path2" # (doesn't exist)

That is, both files are new relative to the merge base, and are renames
of each other. I haven't tried using this property to create a minimal
reproduction recipe, though.
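
The presence checks above can be wrapped in a small helper so the result is
printed rather than annotated by hand. This is only a sketch: path_status
and presence_row are hypothetical names, and it relies on
"git cat-file -e" exiting non-zero when the named object does not exist:

```shell
# Hypothetical helper: report whether path "$2" exists in commit "$1".
path_status() {
	if git cat-file -e "$1:$2" 2>/dev/null
	then
		echo "exists"
	else
		echo "missing"
	fi
}

# Hypothetical helper: print the status of one path in each given commit.
presence_row() {
	path=$1
	shift
	for rev in "$@"
	do
		printf '%s\t%s\t%s\n' "$rev" "$(path_status "$rev" "$path")" "$path"
	done
}

# With the variables defined above:
#   presence_row "$path1" "$head" "$other" "$merge_base"
#   presence_row "$path2" "$head" "$other" "$merge_base"
```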
