git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* git rebase --rebase-merges information loss (and other woes)
       [not found] <1874143044.520636715.1617442122946.JavaMail.root@zimbra39-e7>
@ 2021-04-03 10:42 ` ydirson
  2021-04-03 14:06   ` Sergey Organov
  0 siblings, 1 reply; 4+ messages in thread
From: ydirson @ 2021-04-03 10:42 UTC (permalink / raw)
  To: git

I've been going through a couple of "rebase -i -r" lately, and would
like to share a couple of thoughts, starting with something looking
like a bug.

1. when a merge has been done with "-s ours", rebase replays it without
any special options, I proceed with the manual resolution, and if I just
--continue, the rebase mechanism believes I want to drop the commit, which
could not be more wrong.  I can still be careful myself, and use "git commit
--allow-empty" before --continue, but this feels awkward.

Is there any compelling reason not record the merge here ?


2. more generally, when a merge has been done with special options, it
would be a useful help in solving conflicts if rebase could use the same
options.  Maybe we could allow the rebase "merge" instruction to use more
merge options.  The user would still have to edit the instruction sheet
manually for those, however, and we could then want "rebase -i" to fill
them automatically, but that would seem to require recording the merge
options somewhere to start with - maybe in a note.


3. while it's made clear that any conflict resolution and amendments
have to be redone, maybe we could provide some support for a common
use case, namely "sink that commit/fixup down".  The conflict
resolution would then be like "checkout $OLD && cherry-pick -n $FIXUP".

Maybe this could be activated by a merge option in rebase-interactive
instructions, like "merge -C$OLD --fixup $F1 --fixup $F2".

Would that seem reasonable ?

-- 
Yann

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: git rebase --rebase-merges information loss (and other woes)
  2021-04-03 10:42 ` git rebase --rebase-merges information loss (and other woes) ydirson
@ 2021-04-03 14:06   ` Sergey Organov
  2021-04-03 16:09     ` ydirson
  0 siblings, 1 reply; 4+ messages in thread
From: Sergey Organov @ 2021-04-03 14:06 UTC (permalink / raw)
  To: ydirson; +Cc: git

ydirson@free.fr writes:

> I've been going through a couple of "rebase -i -r" lately, and would
> like to share a couple of thoughts, starting with something looking
> like a bug.

I feel your pain and sympathize deeply.

>
> 1. when a merge has been done with "-s ours", rebase replays it without
> any special options, I proceed with the manual resolution, and if I just
> --continue, the rebase mechanism believes I want to drop the commit, which
> could not be more wrong.  I can still be careful myself, and use "git commit
> --allow-empty" before --continue, but this feels awkward.
>
> Is there any compelling reason not record the merge here ?

This looks like rather easy case to fix indeed. I mean empty commit
issue, not the original cause of the problem.

>
> 2. more generally, when a merge has been done with special options, it
> would be a useful help in solving conflicts if rebase could use the same
> options.  Maybe we could allow the rebase "merge" instruction to use more
> merge options.  The user would still have to edit the instruction sheet
> manually for those, however, and we could then want "rebase -i" to fill
> them automatically, but that would seem to require recording the merge
> options somewhere to start with - maybe in a note.

That could help now an then, but doesn't solve the problem in general,
as, first, the behavior of merge algorithms could change over time, and,
second, the merge could have been performed with external merge
algorithm in the first place, including entirely manual merge, and after
all, the person rebasing may have no idea at all how the original merge
has been achieved.

Recording information about merges at merge time has similar problems to
recording information about renames, both being "obvious" solutions that
in fact end-up being sub-optimal.

Fortunately, we still have the original merge handy, that Git simply
doesn't care to take into account, see below.

>
> 3. while it's made clear that any conflict resolution and amendments
> have to be redone, maybe we could provide some support for a common
> use case, namely "sink that commit/fixup down".  The conflict
> resolution would then be like "checkout $OLD && cherry-pick -n $FIXUP".
>
> Maybe this could be activated by a merge option in rebase-interactive
> instructions, like "merge -C$OLD --fixup $F1 --fixup $F2".
>
> Would that seem reasonable ?

I still (as this has been already heavily discussed some time ago)
believe that the most reasonable solution to all this is to rebase
merges rather than to throw them away. Redoing them, as Git does, is
wrong choice in most cases as what it means is that Git, despite the
option name --rebase-merges (and even better old name
--preserve-merges), simply still throws away your precious merge
commits, only then it substitutes something potentially entirely
different for them, often silently.

In addition to the problems you've encountered, silent drop of user
content is possible, and what's worse than that for a content preserving
tool? As a result, to be on the safe side, with current approach to
handling merges during rebase, any non-trivial merge that is expected to
be rebased (and how would one be sure it never will?) is to be very
carefully performed in 2 commits: merge itself and fixups, otherwise
chances are high fixups are silently lost during rebase.

Further, even this two-step approach doesn't solve all the problems. For
instance, issues with merges being originally performed with non-default
algorithm still remain (as in your case 1.) Moreover, if we notice that
default (or any thereof) algorithm itself could change over time,
inherent problems with the policy of recreating merge commits from
scratch during rebase get even more obvious.

Overall, to get this right, Git should finally refrain (at least by
default) from generally hopeless attempts to re-create merges from
scratch on rebase. Instead it should try to actually rebase existing
merges when user asks to preserve history shape. When and if automatic
rebase fails, one of the options to resolve the issue, besides fixing
rebase conflicts, is indeed to redo the merge, but then the user will be
perfectly aware of particular re-merge, and will be responsible for the
end result himself.

-- Sergey

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: git rebase --rebase-merges information loss (and other woes)
  2021-04-03 14:06   ` Sergey Organov
@ 2021-04-03 16:09     ` ydirson
  2021-04-03 19:01       ` Sergey Organov
  0 siblings, 1 reply; 4+ messages in thread
From: ydirson @ 2021-04-03 16:09 UTC (permalink / raw)
  To: Sergey Organov; +Cc: git

Hi Sergey,

> > 1. when a merge has been done with "-s ours", rebase replays it
> > without
> > any special options, I proceed with the manual resolution, and if I
> > just
> > --continue, the rebase mechanism believes I want to drop the
> > commit, which
> > could not be more wrong.  I can still be careful myself, and use
> > "git commit
> > --allow-empty" before --continue, but this feels awkward.
> >
> > Is there any compelling reason not record the merge here ?
> 
> This looks like rather easy case to fix indeed. I mean empty commit
> issue, not the original cause of the problem.
> 
> >
> > 2. more generally, when a merge has been done with special options,
> > it
> > would be a useful help in solving conflicts if rebase could use the
> > same
> > options.  Maybe we could allow the rebase "merge" instruction to
> > use more
> > merge options.  The user would still have to edit the instruction
> > sheet
> > manually for those, however, and we could then want "rebase -i" to
> > fill
> > them automatically, but that would seem to require recording the
> > merge
> > options somewhere to start with - maybe in a note.
> 
> That could help now an then, but doesn't solve the problem in
> general,
> as, first, the behavior of merge algorithms could change over time,
> and,
> second, the merge could have been performed with external merge
> algorithm in the first place, including entirely manual merge, and
> after
> all, the person rebasing may have no idea at all how the original
> merge
> has been achieved.
> 
> Recording information about merges at merge time has similar problems
> to
> recording information about renames, both being "obvious" solutions
> that
> in fact end-up being sub-optimal.
> 
> Fortunately, we still have the original merge handy, that Git simply
> doesn't care to take into account, see below.
> 
> >
> > 3. while it's made clear that any conflict resolution and
> > amendments
> > have to be redone, maybe we could provide some support for a common
> > use case, namely "sink that commit/fixup down".  The conflict
> > resolution would then be like "checkout $OLD && cherry-pick -n
> > $FIXUP".
> >
> > Maybe this could be activated by a merge option in
> > rebase-interactive
> > instructions, like "merge -C$OLD --fixup $F1 --fixup $F2".
> >
> > Would that seem reasonable ?
> 
> I still (as this has been already heavily discussed some time ago)
> believe that the most reasonable solution to all this is to rebase
> merges rather than to throw them away. Redoing them, as Git does, is
> wrong choice in most cases as what it means is that Git, despite the
> option name --rebase-merges (and even better old name
> --preserve-merges), simply still throws away your precious merge
> commits, only then it substitutes something potentially entirely
> different for them, often silently.
> 
> In addition to the problems you've encountered, silent drop of user
> content is possible, and what's worse than that for a content
> preserving
> tool? As a result, to be on the safe side, with current approach to
> handling merges during rebase, any non-trivial merge that is expected
> to
> be rebased (and how would one be sure it never will?) is to be very
> carefully performed in 2 commits: merge itself and fixups, otherwise
> chances are high fixups are silently lost during rebase.

This reminds me of the approach used in git-reintegrate: to get merges
redone without loosing the fixup, it allows you to do exactly this, and
then is able to use rerere information to redo that often-incomplete part
of conflict resolution.  Then it squashes the fixup commit in the merge,
and is able to do that as long as the fixup commit is reachable.

One thing we could do given a merge commit, and provided that 1. we have
access to rerere cache, 2. a "standard merge" was done, and 3. the merge
algorithm did not change, we can pretty easily derive the two "separate
commits" (or arguably "separate parts of the merge").

That alone could maybe form the basis of the "redo merge" you're suggesting,
and would already cover a good number of use-cases.

For the case where the rerere cache is not available any more, I saw
we have a contrib/rerere-train.sh script, although I never tried it, as I
had written mine at the time, though I felt it had left it had too many rough edges
to share.  I'm attaching it for reference, as it also creates the separate fixup
commit (originally for use by git-reintegrate).

In fact, I wonder how much replaying merges created by other strategies would
perform, if we simply try to apply this idea to them too.


However, before I get too high on the idea, I have to say that in the rebase
that triggered this mail the rerere cache failed to get used in a couple of
situations: I did not care to check but I'd wager those were the commits in
conflict with precisely the fixup I was bringing down below the merges.
In this case (quite lots of conflicts to re-resolve because of a one-line
conflict, I felt so bad), the "apply the one-line by hand and juste resolve
*that* conflict" approach was really effective - so maybe it makes sense to
provide the two options, which may be suitable for different situations.

> 
> Further, even this two-step approach doesn't solve all the problems.
> For
> instance, issues with merges being originally performed with
> non-default
> algorithm still remain (as in your case 1.) Moreover, if we notice
> that
> default (or any thereof) algorithm itself could change over time,
> inherent problems with the policy of recreating merge commits from
> scratch during rebase get even more obvious.
> 
> Overall, to get this right, Git should finally refrain (at least by
> default) from generally hopeless attempts to re-create merges from
> scratch on rebase. Instead it should try to actually rebase existing
> merges when user asks to preserve history shape. When and if
> automatic
> rebase fails, one of the options to resolve the issue, besides fixing
> rebase conflicts, is indeed to redo the merge, but then the user will
> be
> perfectly aware of particular re-merge, and will be responsible for
> the
> end result himself.
> 
> -- Sergey
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: git rebase --rebase-merges information loss (and other woes)
  2021-04-03 16:09     ` ydirson
@ 2021-04-03 19:01       ` Sergey Organov
  0 siblings, 0 replies; 4+ messages in thread
From: Sergey Organov @ 2021-04-03 19:01 UTC (permalink / raw)
  To: ydirson; +Cc: git

ydirson@free.fr writes:

> Hi Sergey,

Hi Yann,


[...]

> This reminds me of the approach used in git-reintegrate: to get merges
> redone without loosing the fixup, it allows you to do exactly this, and
> then is able to use rerere information to redo that often-incomplete part
> of conflict resolution.  Then it squashes the fixup commit in the merge,
> and is able to do that as long as the fixup commit is reachable.
>
> One thing we could do given a merge commit, and provided that 1. we have
> access to rerere cache, 2. a "standard merge" was done, and 3. the merge
> algorithm did not change, we can pretty easily derive the two "separate
> commits" (or arguably "separate parts of the merge").

Unfortunately, rerere is unreliable as you may rebase in a different
repository in the first place. Fortunately, all the needed information
is still there in the original merge commit though.

I suggest you read the following. You will find that it exactly talks
about 2 separate parts of the merge, but does not need rerere to do the
job:

https://public-inbox.org/git/87r2oxe3o1.fsf@javad.com/

To give you even more background, here is a reference to "Git Rev News"
that discusses the issue:

https://git.github.io/rev_news/2018/04/18/edition-38/#general

Unfortunately I've turned to other issues and lost track of what current
situation is, but according to your question the cart remains there
still.

>
> That alone could maybe form the basis of the "redo merge" you're suggesting,
> and would already cover a good number of use-cases.
>
> For the case where the rerere cache is not available any more, I saw
> we have a contrib/rerere-train.sh script, although I never tried it,
> as I had written mine at the time, though I felt it had left it had
> too many rough edges to share. I'm attaching it for reference, as it
> also creates the separate fixup commit (originally for use by
> git-reintegrate).
>
> In fact, I wonder how much replaying merges created by other
> strategies would perform, if we simply try to apply this idea to them
> too.

I deeply believe that Git should not care. You already have a merge
commit. What "strategies" or algorithm have been used to create that
commit should not matter for Git when it rebases *that commit*, the same
way it doesn't care how exactly you've created a non-merge commit.

I think that only after the basic rebasing is done right, some
additional niceties, such as guessing the strategy, could be
implemented, on top of fundamentally correct rebasing.

>
> However, before I get too high on the idea, I have to say that in the
>rebase that triggered this mail the rerere cache failed to get used in
>a couple of situations: I did not care to check but I'd wager those
>were the commits in conflict with precisely the fixup I was bringing
>down below the merges. In this case (quite lots of conflicts to
>re-resolve because of a one-line conflict, I felt so bad), the "apply
>the one-line by hand and juste resolve *that* conflict" approach was
>really effective - so maybe it makes sense to provide the two options,
>which may be suitable for different situations.

I'd not rely on rerere for rebasing merges if at all possible, and it
*is* possible, see the aforementioned reference. Options are definitely
good to have, but the default one must be the safest, that currently is
not the case at all.

If Git tried to actually rebase the merge, it could have happened there
would be no conflict, or else, but the worst situation currently is when
Git silently replaces original merge with something rather different
that just happens to result in no textual conflicts.

-- Sergey Organov

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-04-03 19:02 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1874143044.520636715.1617442122946.JavaMail.root@zimbra39-e7>
2021-04-03 10:42 ` git rebase --rebase-merges information loss (and other woes) ydirson
2021-04-03 14:06   ` Sergey Organov
2021-04-03 16:09     ` ydirson
2021-04-03 19:01       ` Sergey Organov

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).