git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Andreas Ericsson <ae@op5.se>
Cc: darxus@chaosreigns.com, git@vger.kernel.org
Subject: Re: Feature request: don't require both bad and good when bisecting
Date: Mon, 19 Mar 2012 11:30:06 -0400	[thread overview]
Message-ID: <20120319153006.GD24848@sigill.intra.peff.net> (raw)
In-Reply-To: <4F67468B.4070502@op5.se>

On Mon, Mar 19, 2012 at 03:45:31PM +0100, Andreas Ericsson wrote:

> On 03/18/2012 10:29 PM, darxus@chaosreigns.com wrote:
> > I'd like to be able to tell get only that I know the latest commit is bad,
> > and have it go find a good commit, then do the bisecting.  Maybe something
> > like the opposite of a binary search, start with the last commit, then
> > second to last, then 4th to last, 8th to last, etc., till it finds a good
> > commit.
> > 
> 
> Assuming the good commit is the 13'th from HEAD, you'd get the same nr
> of attempts by just specifying a commit 100 revisions in the past and
> doing the already implemented binary search as you would from trying 4
> commits at a time to get at the good one.
> 
> Binary search is a "divide and conquer" algorithm (running in O(log n)
> time), so it handles extremely large datasets very efficiently.

Yeah. The OP's suggestion is to search backwards, increasing the stride
exponentially. That would end up finding a good commit in O(lg n),
though not with any great accuracy (e.g., for an old bug, you'd end up
considering the whole first half of history as a single stride).  Since
bisection would then narrow the result in O(lg n), I think
asymptotically you are not any better off than you would be just
arbitrarily checking the root commit[1], and then starting the bisection
from there.

But both schemes run into a problem where old commits are often not very
testable. For example, when I am bisecting in git.git, I will run into
something like this:

  1. Some feature is introduced in v1.7.0.

  2. A bug in the feature is introduced in v1.7.2.

  3. Somebody notices and reports the bug in v1.7.5.

There is no point in testing anything prior to v1.7.0, as your test
cannot succeed before the feature existed. And worse, it will actively
break a bisection. Pre-v1.7.0 versions will appear buggy, but it is in
fact a _different_ bug than the one you are searching for (the bug is
that the feature isn't there yet). This has been discussed many times on
the list, but the short of it is that you will not get sensible
bisection results if you have multiple bugs (or a bug that comes and
goes throughout history).

So bisect really needs some input from the user to find a sensible
boundary. And finding that boundary (if the user doesn't already know
it) is generally a manual thing. Because it is usually easy for a human
to recognize that the failure mode for points (1) and points (3) above
are different, but hard to write a script that correctly tests for it.

IOW, my procedure for a bug like the above is usually to walk backwards
along major tagged versions, manually interpreting the results. When I
try v1.6.0 and my test blows up (because the feature isn't implemented),
I recognize it, dig a little with "git log" to find where it was
implemented, and only then write a script for automated bisection.

-Peff

[1] There can also be multiple roots, which makes a backwards-walking
    algorithm much more complex. I think instead you could simply test
    and mark all the roots, and then start the bisection from there. But
    again, you are unlikely to have written a test script that will work
    on such antique versions of the project.

  reply	other threads:[~2012-03-19 15:30 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-18 21:29 Feature request: don't require both bad and good when bisecting darxus
2012-03-19 14:45 ` Andreas Ericsson
2012-03-19 15:30   ` Jeff King [this message]
2012-03-19 16:24     ` Andreas Ericsson
2012-03-19 16:45       ` Jeff King
2012-03-19 16:49       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120319153006.GD24848@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=ae@op5.se \
    --cc=darxus@chaosreigns.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).