git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [RFD] should all merge bases be equal?
Date: Thu, 9 Feb 2017 15:44:09 +0100	[thread overview]
Message-ID: <8a9b3f20-eed2-c59b-f7ea-3c68b3c30bf5@alum.mit.edu> (raw)
In-Reply-To: <xmqqmvi2sj8f.fsf@gitster.mtv.corp.google.com>

On 10/18/2016 12:28 AM, Junio C Hamano wrote:
> [...]
> Being accustomed how fast my merges go, there is one merge that
> hiccups for me every once in a few days: merging back from 'master'
> to 'next'.  [...]
> 
> The reason why this merge is slow is because it typically have many
> merge bases.  [...]

I overlooked this topic until just now :-(

I spent a lot of time looking at merge bases a couple of years ago [1],
originally motivated by the crappy diffs you get from

    git diff master...branch

when the merge base is chosen poorly. In that email I include a lot of
data and suggest a different heuristic, namely to define the "best"
merge base $M to be the one that minimizes the number of non-merge
commits between $M and either of the branch tips (it doesn't matter
which one you choose); i.e., the one that minimizes

    git rev-list --count --no-merges $M..$TIP

. The idea is that a merge base that is "closer" content-wise to the
tips will probably yield smaller diffs. I would expect that merge base
also to yield simpler merges, though I didn't test that. Relying on the
number of commits (rather than some other measure of how much the
content has been changed) is only a heuristic, but it seems to work well
and it can be implemented pretty cheaply.

We actually use an algorithm like the one I described at GitHub, though
it is implemented as a script rather than ever having been integrated
into git. And (for no particular reason) we include merge commits in the
commit count (it doesn't make much difference whether merges are
included or excluded).

Your idea to look at the first-parent histories of the two branch tips
is an interesting one and has the nice theoretical property that it is
based on the DAG topology rather than a count of commits. I'd be very
curious to see how the sizes of asymmetric diffs differ between your
method and mine, because for me smaller and more readable diffs are one
of the main benefits of better merge bases.

I would worry a bit that your proposed algorithm won't perform as well
for people who use less disciplined workflows than git.git or the Linux
kernel. For example, many people merge a lot more frequently and
chaotically, maybe even with the parents reversed from the canonical order.

Anyway, I mostly wanted to remind you of the earlier discussion of this
topic. There's a lot more information there.

Michael

[1] http://public-inbox.org/git/539A25BF.4060501@alum.mit.edu/


  parent reply	other threads:[~2017-02-09 14:44 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-17 22:28 [RFD] should all merge bases be equal? Junio C Hamano
2016-10-19  4:23 ` [PATCH 0/7] Rejecting useless merge bases Junio C Hamano
2016-10-19  4:23   ` [PATCH 1/7] commit: simplify fastpath of merge-base computation Junio C Hamano
2016-10-19  4:23   ` [PATCH 2/7] sha1_name: remove ONELINE_SEEN bit Junio C Hamano
2016-10-19  4:23   ` [PATCH 3/7] merge-base: stop moving commits around in remove_redundant() Junio C Hamano
2016-10-19  4:23   ` [PATCH 4/7] merge-base: expose get_merge_bases_many_0() a bit more Junio C Hamano
2016-10-19  4:23   ` [PATCH 5/7] merge-base: mark bases that are on first-parent chain Junio C Hamano
2016-10-19 17:42     ` Junio C Hamano
2016-10-19  4:23   ` [PATCH 6/7] merge-base: limit the output to " Junio C Hamano
2016-10-19  4:23   ` [PATCH 7/7] merge: allow to use only the fp-only merge bases Junio C Hamano
2016-10-19 21:34   ` [PATCH 0/7] Rejecting useless " Junio C Hamano
2017-02-09 14:44 ` Michael Haggerty [this message]
2017-02-09 16:57   ` [RFD] should all merge bases be equal? Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8a9b3f20-eed2-c59b-f7ea-3c68b3c30bf5@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).