git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Bisect marking new commits incorrectly
@ 2017-11-22 14:39 Adam Dinwoodie
  2017-11-22 16:21 ` Christian Couder
  0 siblings, 1 reply; 5+ messages in thread
From: Adam Dinwoodie @ 2017-11-22 14:39 UTC (permalink / raw)
  To: git

In trying to do a bisect on the Git repository, I seem to have come
across surprising behavior where the order in which `git bisect` appears
to forget that previous commits were marked as new.

You can see this behaviour in the following commands, run on the Git
repository, where the order of the `git bisect new` commands affects the
results.

Case 1:

    $ git bisect start; git bisect old v2.15.0

    $ git bisect new 95a731ce
    Bisecting: 40 revisions left to test after this (roughly 5 steps)
    [934e330c9d0d12f7a0dd82b9699456c891e4dd4a] Merge branch 'ad/5580-unc-tests-on-cygwin' into maint

    $ git bisect new 14c63a9d
    Bisecting: 153 revisions left to test after this (roughly 7 steps)
    [421f21c98f8b515412ca683ae3743013a8b3bda2] Merge branch 'js/mingw-redirect-std-handles'

    $ for h in 95a731ce 14c63a9d; do git log -1 --format=oneline --decorate "$h"; done
    95a731ce92c7576d927f0d8a9b27c206cb58c2e6 (origin/maint) Almost ready for 2.15.1
    14c63a9dc093d6738454f6369a4f5663ca732cf7 (origin/master, origin/HEAD, refs/bisect/new) Sync with maint

Case 2:

    $ git bisect start; git bisect old v2.15.0

    $ git bisect new 14c63a9d
    Bisecting: 153 revisions left to test after this (roughly 7 steps)
    [421f21c98f8b515412ca683ae3743013a8b3bda2] Merge branch 'js/mingw-redirect-std-handles'

    $ git bisect new 95a731ce
    Bisecting: 40 revisions left to test after this (roughly 5 steps)
    [934e330c9d0d12f7a0dd82b9699456c891e4dd4a] Merge branch 'ad/5580-unc-tests-on-cygwin' into maint

    $ for h in 95a731ce 14c63a9d; do git log -1 --format=oneline --decorate "$h"; done
    95a731ce92c7576d927f0d8a9b27c206cb58c2e6 (origin/maint, refs/bisect/new) Almost ready for 2.15.1
    14c63a9dc093d6738454f6369a4f5663ca732cf7 (origin/master, origin/HEAD) Sync with maint

As you can see, in both cases, only the most recent "new" command
appears to have any effect.  I'd have expected that both commits would
have been marked as "new", and the bisect run would use both facts to
work out which commit is the target of the bisection.

This is using v2.15.0.  It's possibly relevant that 95a731ce is a
direct parent of 14c63a9d.

Is this a bug, or intentional behaviour?  Am I missing something that
means it's actually sensible to have Git silently discard some bisect
commands in this sort of circumstance?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bisect marking new commits incorrectly
  2017-11-22 14:39 Bisect marking new commits incorrectly Adam Dinwoodie
@ 2017-11-22 16:21 ` Christian Couder
  2017-11-22 17:01   ` Adam Dinwoodie
  0 siblings, 1 reply; 5+ messages in thread
From: Christian Couder @ 2017-11-22 16:21 UTC (permalink / raw)
  To: Adam Dinwoodie; +Cc: git

On Wed, Nov 22, 2017 at 3:39 PM, Adam Dinwoodie <adam@dinwoodie.org> wrote:
> In trying to do a bisect on the Git repository, I seem to have come
> across surprising behavior where the order in which `git bisect` appears
> to forget that previous commits were marked as new.

Yeah, the algorithm uses many "old" commits and only one "new" commit.

[...]

> As you can see, in both cases, only the most recent "new" command
> appears to have any effect.  I'd have expected that both commits would
> have been marked as "new", and the bisect run would use both facts to
> work out which commit is the target of the bisection.

I didn't look at your test case, but the algorithm to find another
commit to test is described here:

https://git-scm.com/docs/git-bisect-lk2009.html#_bisection_algorithm

and you can see that the first rule of the algorithm is to keep only
the commits that are ancestors of the "new" commit (including the
"new" commit itself).

The reason for this rule is that other commits cannot have introduced
the behavior we are looking for. The result of this rule is that git
bisect will always ask you to test a commit that is an ancestor of the
"new" commit.

So if you test a commit that git bisect asks you to test, and it
appears that this commit is "new", then you can just discard the
previous "new" commit because it will give you less information than
the new "new" one.
(The old "new" will not let you discard any commits that the new "new"
commit allows you to discard, because it is a descendant of the new
"new" commit.)

If you don't test the commit that git bisect asks you to test, then
git bisect assumes that you know better and discards the old "new".

> This is using v2.15.0.  It's possibly relevant that 95a731ce is a
> direct parent of 14c63a9d.
>
> Is this a bug, or intentional behaviour?  Am I missing something that
> means it's actually sensible to have Git silently discard some bisect
> commands in this sort of circumstance?

Well, instead of silently discarding a the old "new" commit when the
new "new" commit is not an ancestor of it, git bisect could perhaps
warn or ask you to confirm that you know what you are doing.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bisect marking new commits incorrectly
  2017-11-22 16:21 ` Christian Couder
@ 2017-11-22 17:01   ` Adam Dinwoodie
  2017-11-22 21:37     ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Adam Dinwoodie @ 2017-11-22 17:01 UTC (permalink / raw)
  To: Christian Couder; +Cc: git

On Wednesday 22 November 2017 at 05:21 pm +0100, Christian Couder wrote:
> On Wed, Nov 22, 2017 at 3:39 PM, Adam Dinwoodie <adam@dinwoodie.org> wrote:
> > In trying to do a bisect on the Git repository, I seem to have come
> > across surprising behavior where the order in which `git bisect` appears
> > to forget that previous commits were marked as new.
> 
> Yeah, the algorithm uses many "old" commits and only one "new" commit.
> 
> [...]
> 
> > As you can see, in both cases, only the most recent "new" command
> > appears to have any effect.  I'd have expected that both commits would
> > have been marked as "new", and the bisect run would use both facts to
> > work out which commit is the target of the bisection.
> 
> I didn't look at your test case, but the algorithm to find another
> commit to test is described here:
> 
> https://git-scm.com/docs/git-bisect-lk2009.html#_bisection_algorithm
> 
> and you can see that the first rule of the algorithm is to keep only
> the commits that are ancestors of the "new" commit (including the
> "new" commit itself).
> 
> The reason for this rule is that other commits cannot have introduced
> the behavior we are looking for. The result of this rule is that git
> bisect will always ask you to test a commit that is an ancestor of the
> "new" commit.
> 
> So if you test a commit that git bisect asks you to test, and it
> appears that this commit is "new", then you can just discard the
> previous "new" commit because it will give you less information than
> the new "new" one.
> (The old "new" will not let you discard any commits that the new "new"
> commit allows you to discard, because it is a descendant of the new
> "new" commit.)
> 
> If you don't test the commit that git bisect asks you to test, then
> git bisect assumes that you know better and discards the old "new".

Ah, that makes sense, thank you for the explanation.

In my case, I knew two commits were "new", but didn't know which would
provide more information to the bisect algorithm; I'd assumed if I
provided both, Git would work out the appropriate thing to do with the
combined information without me needing to work it out.

> > This is using v2.15.0.  It's possibly relevant that 95a731ce is a
> > direct parent of 14c63a9d.
> >
> > Is this a bug, or intentional behaviour?  Am I missing something that
> > means it's actually sensible to have Git silently discard some bisect
> > commands in this sort of circumstance?
> 
> Well, instead of silently discarding a the old "new" commit when the
> new "new" commit is not an ancestor of it, git bisect could perhaps
> warn or ask you to confirm that you know what you are doing.

Warning the user seems a sensible suggestion to me.  I'll add it to my
list of things to spend some time on at some point in the future :)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bisect marking new commits incorrectly
  2017-11-22 17:01   ` Adam Dinwoodie
@ 2017-11-22 21:37     ` Jeff King
  2017-11-24  5:29       ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2017-11-22 21:37 UTC (permalink / raw)
  To: Adam Dinwoodie; +Cc: Christian Couder, git

On Wed, Nov 22, 2017 at 05:01:29PM +0000, Adam Dinwoodie wrote:

> > So if you test a commit that git bisect asks you to test, and it
> > appears that this commit is "new", then you can just discard the
> > previous "new" commit because it will give you less information than
> > the new "new" one.
> > (The old "new" will not let you discard any commits that the new "new"
> > commit allows you to discard, because it is a descendant of the new
> > "new" commit.)
> > 
> > If you don't test the commit that git bisect asks you to test, then
> > git bisect assumes that you know better and discards the old "new".
> 
> Ah, that makes sense, thank you for the explanation.
> 
> In my case, I knew two commits were "new", but didn't know which would
> provide more information to the bisect algorithm; I'd assumed if I
> provided both, Git would work out the appropriate thing to do with the
> combined information without me needing to work it out.

Yeah, I really would have expected this to work, too. Should we be
taking the merge base of the set of "new" commits, and using that as the
true "new"?

I.e., with this graph:

  A--B--C--D
      \
       E--F

if I mark D and F as bad, can we assume that B, C, and E are "new", too,
and treat "B" as the new single "new"? That breaks down if the bug was
introduced independently on each branch, but I think bisect explicitly
doesn't handle such a case.

What might give it trouble is if there are multiple merge bases, like:


    G--H-----
   /      \  \
  A--B--C--D  \
      \        \
       E--------F

The flip to "new" could have been on the G-H branch, or in B.

So maybe that's not workable.

(I've never really dug into the bisect algorithm, and this is written
largely off the cuff, so I'm fine if the response is "nope, you have no
idea what you are talking about").

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bisect marking new commits incorrectly
  2017-11-22 21:37     ` Jeff King
@ 2017-11-24  5:29       ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-11-24  5:29 UTC (permalink / raw)
  To: Jeff King; +Cc: Adam Dinwoodie, Christian Couder, git

Jeff King <peff@peff.net> writes:

> Yeah, I really would have expected this to work, too. Should we be
> taking the merge base of the set of "new" commits, and using that as the
> true "new"?
> ...
> So maybe that's not workable.
>
> (I've never really dug into the bisect algorithm, and this is written
> largely off the cuff, so I'm fine if the response is "nope, you have no
> idea what you are talking about").

You reached the right conclusion, I think.  

A single merge-base case still might be worth optimizing for (and
IIRC, unlike the "git log A..B" traversal that could be fooled by
clock skew, merge-base reliably rejects falsely independent bases in
the presence of clock skew, so it should be safe to do), but in a
scenario your second illustration depicts, the merge-bases would not
help us that much.  

When starting from D and F, both of which are known to be "bad", all
we know is that some of the merge bases between them are "bad",
while other bases are not "bad" as they do not inherit the badness
in the common ancestor of those "bad" bases.  

And taking a merge base of these multiple merge-bases recursively
would not help us there.  We started from "all of these, i.e. D and
F, are known to be bad", which is different from "some of these
among multiple bases are bad, but we do not know which ones", so
even if we were to go recursive, the first step needs to be "now
let's see which one of these multiple bases are bad, so that we can
take merge-base across them".

      

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-11-24  5:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-22 14:39 Bisect marking new commits incorrectly Adam Dinwoodie
2017-11-22 16:21 ` Christian Couder
2017-11-22 17:01   ` Adam Dinwoodie
2017-11-22 21:37     ` Jeff King
2017-11-24  5:29       ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).