git@vger.kernel.org mailing list mirror (one of many)
* rebase: confusing behaviour since --fork-point
@ 2017-03-03 16:37 Laszlo Kiss
From: Laszlo Kiss @ 2017-03-03 16:37 UTC
  To: git; +Cc: ldubox-coding101

Hi all,

This will be a bit long, sorry in advance.

I've hit a problem with how rebase works since the introduction of the
--fork-point option. I initially thought it was a bug until the kind
folks over at git-for-windows patiently told me otherwise.

Consider the following:

-------8<-------
# On branch master which is at origin/master
# hack hack hack

git commit -am'First topic commit'
# optionally followed by more commits

# realize I want all this on a different branch instead
# (maybe I just forgot to create one; would be typical)

git checkout -b topic --track
git branch -f master origin/master

# optionally add more commits to 'topic' and/or update 'master'
# with changes to 'origin/master', then finally:

git rebase    # or git pull --rebase
------->8-------

What happens is that 'First topic commit' is silently discarded. When
rebase is run without an explicit upstream, it defaults to
--fork-point, which works out from master's reflog that the commit in
question was previously part of the upstream branch. The rebase is then
carried out under the assumption that, even though that specific commit
is no longer in master, something equivalent to it or superseding it
is, or else it is meant to be dropped. That would likely be the case if
a complicated history rewrite had happened, but it is not the case
here.

There are two reasons why I think this is confusing and should be
changed:

(1) My mental model of git after some 5 years of using it is such that
    I view the reflog as auxiliary information for purposes of mining
    history or recovering from mistakes, and I rely only on the
    contents and parent-child relationships of commits to work out what
    a command I'm about to run will do. I would expect the vast
    majority of users to have a similar mental model, in which case the
    fact that the same commits can be rebased differently depending on
    the data in the reflog will be equally confusing to those users.

(2) Most people don't like to rebase too often; consequently, the time
    elapsed from a user creating the topic branch and resetting its
    upstream until actually rebasing it can be days or weeks, by the
    end of which they are unlikely to remember the circumstances of
    creating the branch. Even more alarmingly, they may have committed
    dozens of further changes, and therefore may not even notice that
    the first few of those silently disappeared. (I'm not making this
    up: see discussion links below in PS.)

I believe the correct design would be to always make --no-fork-point
the default for rebase, and only use --fork-point when explicitly
specified. This would potentially be inconvenient for those who rebase
on top of complicated history rewrites, but said inconvenience would be
mitigated by several factors:
- Anyone doing rebases like that will already know that they can go
  wrong in a million ways.
- Perhaps more crucially, they will have a way of noticing if it went
  wrong, with all relevant information in short-term memory, so would
  recover easily.
- If the rebase turns out to be really complex, they are likely to
  resort to rebase -i, which shows full details of what is about to
  happen. Moreover, it seems relatively simple to enhance the contents
  of the rebase-todo file so that full information about the merge-base
  and fork-point is readily available.
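
For comparison, this is roughly what the proposed default would look
like when requested explicitly today. It is a sketch under the same
assumptions as the scenario above (scratch repository, 'base' standing
in for origin/master, made-up file names), with one extra commit on
master so that the rebase has real work to do.

```shell
#!/bin/sh
# Sketch: same setup as the scenario above, but rebasing with
# --no-fork-point so that master's reflog is not consulted.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m 'base'
base=$(git rev-parse HEAD)

echo one > first.txt
git add first.txt
git commit -q -m 'First topic commit'     # accidentally made on master

git checkout -q -b topic --track master
git branch -f master "$base"

# Let the upstream advance, as it would after fetching new changes.
git checkout -q master
echo up > upstream.txt
git add upstream.txt
git commit -q -m 'Upstream commit'
git checkout -q topic

# Only commit ancestry decides what gets replayed.
git rebase -q --no-fork-point
subjects=$(git log --format=%s topic)
echo "$subjects"
```

Naming the upstream on the command line (git rebase master) has the
same effect, because rebase only defaults to --fork-point when no
upstream argument is given.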

If changing the default is not an option, e.g. because of backwards
compatibility concerns, then some configurability could still be
helpful, e.g. rebase.useForkPoint = never / auto / always (default
auto, to keep the current behaviour), although I suspect this would
just lead to everyone suggesting setting this to 'never'. In any case,
this would provide a way to ensure that any git newbies in my
organization don't spend days trying to figure this out like I've just
done.
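
Until something like that exists, an alias is one possible stopgap. The
sketch below is only an illustration: the alias name 'rbn' is
hypothetical, and it is set in a scratch repository rather than in
anyone's real configuration.

```shell
#!/bin/sh
# Sketch: define an alias that always passes --no-fork-point.
# 'rbn' is a made-up name; pick whatever suits your team.
set -e
cd "$(mktemp -d)"
git init -q
git config alias.rbn 'rebase --no-fork-point'

# Read the alias back to confirm what 'git rbn' will expand to.
alias_value=$(git config --get alias.rbn)
echo "$alias_value"
```

With 'git config --global' instead, the alias applies to every
repository for that user. It only helps people who remember to type
'git rbn' rather than 'git rebase', which is why a real setting would
be preferable.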

Assuming there is agreement to do one of the above (I don't even know
whose agreement is required), what's the process for getting it
implemented? Sorry, that probably counts as a dumb question, but I've
never been around open-source projects much and need someone to show
me the ropes.

Many thanks
Laszlo

PS. Further reading about the same topic if anyone is interested:
- http://marc.info/?l=git&m=140991293402880&w=2 (from this same mailing
  list 2+ years ago, but I can see no clear conclusion)
- https://github.com/git-for-windows/git/issues/1076 (my bug report
  where contacting this list was suggested)
- http://stackoverflow.com/questions/22790765 and
  http://stackoverflow.com/questions/35320740 (various SO users being
  confused and asking about / discussing the same thing)
