git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Michael Haggerty <mhagger@alum.mit.edu>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>,
	Jakub Narebski <jnareb@gmail.com>,
	Jeff Hostetler <git@jeffhostetler.com>,
	Jacob Keller <jacob.keller@gmail.com>,
	Stefan Beller <sbeller@google.com>,
	Elijah Newren <newren@gmail.com>
Subject: Re: commit-graph: change in "best" merge-base when ambiguous
Date: Tue, 22 May 2018 08:48:04 -0400	[thread overview]
Message-ID: <8b480e9e-1fd3-35ff-2974-653fadd49fa7@gmail.com> (raw)
In-Reply-To: <3705af00-00b7-b620-cc77-eef8f0a73bc1@alum.mit.edu>

On 5/22/2018 1:39 AM, Michael Haggerty wrote:
> On 05/21/2018 08:10 PM, Derrick Stolee wrote:
>> [...]
>> In the Discussion section of the `git merge-base` docs [1], we have the
>> following:
>>
>>      When the history involves criss-cross merges, there can be more than
>> one best common ancestor for two commits. For example, with this topology:
>>
>>      ---1---o---A
>>          \ /
>>           X
>>          / \
>>      ---2---o---o---B
>>
>>      both 1 and 2 are merge-bases of A and B. Neither one is better than
>> the other (both are best merge bases). When the --all option is not
>> given,     it is unspecified which best one is output.
>>
>> This means our official documentation mentions that we do not have a
>> concrete way to differentiate between these choices. This makes me think
>> that this change in behavior is not a bug, but it _is_ a change in
>> behavior. It's worth mentioning, but I don't think there is any value in
>> making sure `git merge-base` returns the same output.
>>
>> Does anyone disagree? Is this something we should solidify so we always
>> have a "definitive" merge-base?
>> [...]
> This may be beyond the scope of what you are working on, but there are
> significant advantages to selecting a "best" merge base from among the
> candidates. Long ago [1] I proposed that the "best" merge base is the
> merge base candidate that minimizes the number of non-merge commits that
> are in
>
>      git rev-list $candidate..$branch
>
> that are already in master:
>
>      git rev-list $master
>
> (assuming merging branch into master), which is equivalent to choosing
> the merge base that minimizes
>
>      git rev-list --count $candidate..$branch
>
> In fact, this criterion is symmetric if you exchange branch ↔ master,
> which is a nice property, and indeed generalizes pretty simply to
> computing the merge base of more than two commits.
>
> In that email I also included some data showing that the "best" merge
> base almost always results in either the same or a shorter diff than the
> more or less arbitrary algorithm that we currently use. Sometimes the
> difference in diff length is dramatic.
>
> To me it feels like the best *deterministic* merge base would be based
> on the above criterion, maybe with first-parent reachability, commit
> times, and SHA-1s used (in that order) to break ties.

Thanks, everyone, for your perspective on this. I'm walking away with 
these conclusions:

1. While this is a change in behavior, it is not a regression. We do not 
need to act immediately to preserve old behavior in these ambiguous cases.

2. We should (eventually) define tie-breaking conditions. I like 
Michael's suggestion above.

Thanks,
-Stolee

  reply	other threads:[~2018-05-22 12:48 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-21 18:10 commit-graph: change in "best" merge-base when ambiguous Derrick Stolee
2018-05-21 18:33 ` Elijah Newren
2018-05-21 21:50   ` Jeff King
2018-05-21 22:28     ` Stefan Beller
2018-05-21 21:54 ` Jeff King
2018-05-21 22:25   ` Jacob Keller
2018-05-22  5:39 ` Michael Haggerty
2018-05-22 12:48   ` Derrick Stolee [this message]
2018-05-24 22:08     ` Jakub Narebski
2018-05-25  6:03       ` Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8b480e9e-1fd3-35ff-2974-653fadd49fa7@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob.keller@gmail.com \
    --cc=jnareb@gmail.com \
    --cc=mhagger@alum.mit.edu \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).