From: Philippe Blain <levraiphilippeblain@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: git <git@vger.kernel.org>
Subject: Re: rebase - "fixup!" conflict applied at the wrong code location, why ?
Date: Mon, 13 Jul 2020 22:42:07 -0400 [thread overview]
Message-ID: <D99E27A0-04BC-40B7-A073-66E1301FFE5D@gmail.com> (raw)
In-Reply-To: <CABPp-BEu=7xSLk5AE8sQX+S-vvYXxgn+avQ8+9ttdSKDjikb9w@mail.gmail.com>
Hi Elijah,
> Le 9 juil. 2020 à 00:07, Elijah Newren <newren@gmail.com> a écrit :
>
> On Wed, Jul 8, 2020 at 5:07 PM Philippe Blain
> <levraiphilippeblain@gmail.com> wrote:
>>
>> Hello,
>>
>> I've been working on a branch for a while. I've been using `git commit --fixup` and `git commit --squash`
>> when I noticed that I had forgotten to add something to a previous commit.
>> Today I did `git rebase --autosquash` to clean up my history, and the rebase failed at the
>> first 'fixup!' commit with a conflict. However, the conflict is not located at the right place
>> in the code (it's not in the right subroutine!). This is very surprising to me, and I would
>> like to understand why it happens.
>>
>> Steps to reproduce:
>>
>> git clone -b branch-to-be-rebased https://github.com/phil-blain/CICE.git cice
>> cd cice
>> git rebase -i --autosquash my-first-commit
>> # save the todo list without modifications
>> Auto-merging <file>
>> CONFLICT (content): Merge conflict in <file>
>> error: could not apply e8bfa55... fixup! <commit message of f4e1ae6>
>> # the rebase stops at f4e1ae6
>> git diff
>> # tangential question : for some reason the hunk header does not appear here, I don't know why...
>> git diff -2 # but it appears here
>> git grep -p -e '<<<<<<< HEAD' -e '>>>>>>> e8bfa55...' # or here
>> # ok, the conflict appears in subroutine 'picard_solver'
>> git show REBASE_HEAD -U5
>> # but the original "fixup!" commit only modifies the subroutine 'anderson_solver' !!
>>
>> I would have expected that the conflict be created around lines 1118-1132
>> (line numbers in f4e1ae6), in the 'anderson_solver' subroutine.
>>
>> I don't know if this plays a part here, but commit f4e1ae6 (where the rebase stops)
>> is the commit where the 'anderson_solver' subroutine is added to the code...
>>
>> Thanks,
>>
>> Philippe.
>
> If you take a look where the rebase stops, you see:
First, thanks a lot for your answer. I have a few questions.
>
> $ git ls-files -u
> 100644 ee4377f1ec6836fa05573976a473373906c37d9f 1
> cicecore/cicedynB/dynamics/ice_dyn_vp.F90
> 100644 30c699ac371c2a751052fa98d04317e84a96ec47 2
> cicecore/cicedynB/dynamics/ice_dyn_vp.F90
> 100644 276f224e9048fe0bbd7c25822695049547362c87 3
> cicecore/cicedynB/dynamics/ice_dyn_vp.F90
>
> The difference from the merge base to "other" (index 3) is pretty
> tiny, you just moved one line in the "anderson_solver" subroutine
> about 10 lines down.
Yes, the output from
$ git diff :1:cicecore/cicedynB/dynamics/ice_dyn_vp.F90 :3:cicecore/cicedynB/dynamics/ice_dyn_vp.F90
seems to be the same as the one from
$ git show REBASE_HEAD
This is a little confusing to me, in the sense that I don't understand why
the merge-base is what it is. At this point, I do
$ git merge-base HEAD REBASE_HEAD
f4e1ae67b7d6ca36c6f3ea7c9da43d81caf24067
Ok, so 'git merge-base' finds that the merge-base between HEAD and
REBASE_HEAD is HEAD; this makes sense to me (no previous commits
have been rewritten so far, so REBASE_HEAD is directly ahead of HEAD).
But, if I try to find the commit that contains
the blob ee4377f1ec6836fa05573976a473373906c37d9f (index 1), I find
REBASE_HEAD's parent:
$ git log --all --find-object=ee4377f1ec6836fa05573976a473373906c37d9f --format='commmit %H%ntree %T%nparent %P%n%n %s%n'
commmit e8bfa557d3c81b75116d6557784b0439b792a308
tree f6fecb8193c3b877f22bcb8f4d8d2c203e17f06f
parent 7a8d5a82984dfedd7fac1d7ed7c7fbd1781c1f61
fixup! Add Anderson acceleration for Picard iteration
commmit 7a8d5a82984dfedd7fac1d7ed7c7fbd1781c1f61
tree 11fd096851015c0c16b793d9bbb5db039776483b
parent 63d4c73c1dd973f620307833bd363a1d5069d090
ice_dyn_vp: introduce 'CICE_USE_LAPACK' preprocessor macro
$ blob=ee4377f1ec6836fa05573976a473373906c37d9f
$ parent=7a8d5a82984dfedd7fac1d7ed7c7fbd1781c1f61
$ git ls-tree -r $parent^{tree} | grep $blob
100644 blob ee4377f1ec6836fa05573976a473373906c37d9f cicecore/cicedynB/dynamics/ice_dyn_vp.F90
I don't understand why at this point of the rebase, Git
determines that the merge-base between HEAD and
REBASE_HEAD is REBASE_HEAD's parent... this commit
is not even an ancestor of HEAD (or maybe I don't understand
what the "merge-base" is in this context?)...
> The more interesting difference is from the
> merge base to "ours" (index 2), seen with:
>
> $ git diff :1:cicecore/cicedynB/dynamics/ice_dyn_vp.F90
> :2:cicecore/cicedynB/dynamics/ice_dyn_vp.F90
>
> If you search for "anderson_solver" in that diff, you see that the
> reconstructed diff looks like "subroutine anderson_solver" is renamed
> to "subroutine picard_solver" with similar but slightly different
> arguments. The diff routine successfully finds lots of lines common
> between these two subroutines so it seems quite logical as a way of
> representing the changes, especially since they occur near each other
> in the order of the file. If you keep looking further in the diff, it
> says that a few hundred lines later there's another subroutine rename,
> this time from "subroutine calc_zeta_Pr" to "subroutine
> anderson_solver". Looking at it this way, it's quite natural to apply
> the change from the rebased commit to the picard_solver since the
> other side of history "renamed" anderson_solver to picard_solver.
>
> This might not make as much sense to you, since you tend to think in
> terms of parse trees and "subroutines" are a high level unit. But
> diffs work in terms of lines and have no knowledge of any kind of file
> structure or semantics. If you have a string of functions A,B,C and
> insert a new function Z on one side of history somewhere in a file
> that happens to look a lot like existing functions (meaning several
> identical lines between the two), then diffs won't necessarily treat
> it as a contiguous block of inserted lines but instead compare
> function Z on one side to function B on the other, then function B on
> the first side to function C on the second, etc.
>
> There are alternative diff algorithms that try to minimize the number
> of changed lines shown by the diff, such as --histogram and
> --patience, and they do shrink this particular diff, but they both
> yield the same treatment of comparing "subroutine anderson_solver" to
> "subroutine picard_solver" essentially treating it as a rename. To
> get a merge like you want, you'd need some kind of higher level
> semantic merge, or at least need functions to not have so many lines
> in common between them.
Thanks for these explanations. The behaviour makes sense.
>
> This kind of problem also causes big issues when reordering functions
> in a file. line-by-line diffs and the diff3 merge algorithm tend to
> really struggle with those. (See e.g.
> https://www.cis.upenn.edu/~bcpierce/papers/diff3-short.pdf if you want
> to read up more on these.)
Thanks for the pointer!
>
> Hope that helps,
> Elijah
Thanks,
Philippe.
next prev parent reply other threads:[~2020-07-14 2:42 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-08 23:03 rebase - "fixup!" conflict applied at the wrong code location, why ? Philippe Blain
2020-07-08 23:07 ` Philippe Blain
2020-07-09 4:07 ` Elijah Newren
2020-07-14 2:42 ` Philippe Blain [this message]
2020-07-14 5:45 ` Elijah Newren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=D99E27A0-04BC-40B7-A073-66E1301FFE5D@gmail.com \
--to=levraiphilippeblain@gmail.com \
--cc=git@vger.kernel.org \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).