git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] Documentation/bisect: improve on (bad|new) and (good|bad)
@ 2017-01-13 14:44 Christian Couder
  2017-01-13 19:14 ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Christian Couder @ 2017-01-13 14:44 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Manuel Ullmann, Matthieu Moy, Christian Couder

The following part of the description:

git bisect (bad|new) [<rev>]
git bisect (good|old) [<rev>...]

may be a bit confusing, as a reader may wonder if instead it should be:

git bisect (bad|good) [<rev>]
git bisect (old|new) [<rev>...]

Of course the difference between "[<rev>]" and "[<rev>...]" should hint
that there is a good reason for the way it is.

But we can further clarify and complete the description by adding
"<term-new>" and "<term-old>" to the "bad|new" and "good|old"
alternatives.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/git-bisect.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt
index 2bb9a577a2..bdd915a66b 100644
--- a/Documentation/git-bisect.txt
+++ b/Documentation/git-bisect.txt
@@ -18,8 +18,8 @@ on the subcommand:
 
  git bisect start [--term-{old,good}=<term> --term-{new,bad}=<term>]
 		  [--no-checkout] [<bad> [<good>...]] [--] [<paths>...]
- git bisect (bad|new) [<rev>]
- git bisect (good|old) [<rev>...]
+ git bisect (bad|new|<term-new>) [<rev>]
+ git bisect (good|old|<term-old>) [<rev>...]
  git bisect terms [--term-good | --term-bad]
  git bisect skip [(<rev>|<range>)...]
  git bisect reset [<commit>]
-- 
2.11.0.313.g11b7cc88e6.dirty


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] Documentation/bisect: improve on (bad|new) and (good|bad)
  2017-01-13 14:44 [PATCH] Documentation/bisect: improve on (bad|new) and (good|bad) Christian Couder
@ 2017-01-13 19:14 ` Junio C Hamano
  2017-01-15 14:51   ` Christian Couder
  2017-01-16  9:17   ` Matthieu Moy
  0 siblings, 2 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-01-13 19:14 UTC (permalink / raw)
  To: Christian Couder; +Cc: git, Manuel Ullmann, Matthieu Moy, Christian Couder

Christian Couder <christian.couder@gmail.com> writes:

> The following part of the description:
>
> git bisect (bad|new) [<rev>]
> git bisect (good|old) [<rev>...]
>
> may be a bit confusing, as a reader may wonder if instead it should be:
>
> git bisect (bad|good) [<rev>]
> git bisect (old|new) [<rev>...]
>
> Of course the difference between "[<rev>]" and "[<rev>...]" should hint
> that there is a good reason for the way it is.
>
> But we can further clarify and complete the description by adding
> "<term-new>" and "<term-old>" to the "bad|new" and "good|old"
> alternatives.
>
> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
> ---
>  Documentation/git-bisect.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Thanks.  The patch looks good.

A related tangent.  

Last night, I was trying to think if there is a fundamental reason
why "bad/new/term-new" cannot take more than one <rev>s on the newer
side of the bisection, and couldn't quite think of any before
falling asleep.

Currently we keep track of a single bisect/bad, while marking all the
revs given as good previously as bisect/good-<SHA-1>.

Because the next "bad" is typically chosen from the region of the
commit DAG that is bounded by bad and good commits, i.e. "rev-list
bisect/bad --not bisect/good-*", the current bisect/bad will always
be an ancestor of all bad commits that used to be bisect/bad, and
keeping previous bisect/bad as bisect/bad-<SHA-1> won't change the
region of the commit DAG yet to be explored.

As a reason why we need to use only a single bisect/bad, the above
description is understandable.  But as a reason why we cannot have
more than one, it is tautological.  It merely says "if we start from
only one and dig history to find older culprit, we need only one
bad".

I fell asleep last night without thinking further than that.

I think the answer to the question "why do we think we need a single
bisect/bad?" is "because bisection is about assuming that there is
only one commit that flips the tree state from 'old' to 'new' and
finding that single commit".  That would mean that even if we had
bisect/bad-A and bisect/bad-B, e.g.

                          o---o---o---bad-A
                         /
    -----Good---o---o---o
                         \
                          o---o---o---bad-B


where 'o' are all commits whose goodness is not yet known, because
bisection is valid only when we are hunting for a single commit that
flips the state from good to bad, that commit MUST be at or before
the merge base of bad-A and bad-B.  So even if we allowed

	$ git bisect bad bad-A bad-B

on the command line, we won't have to set bisect/bad-A and
bisect/bad-B.  We only need a single bisect/bad that points at the
merge base of these two.

But what if bad-A and bad-B have more than one merge bases?  We
won't know which side the badness came from.

                          o---o---o---bad-A
                         /     \ / 
    -----Good---o---o---o       / 
                         \     / \
                          o---o---o---bad-B

Being able to bisect the region of DAG bound by "^Good bad-A bad-B"
may have value in such a case.  I dunno.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Documentation/bisect: improve on (bad|new) and (good|bad)
  2017-01-13 19:14 ` Junio C Hamano
@ 2017-01-15 14:51   ` Christian Couder
  2017-01-16  9:17   ` Matthieu Moy
  1 sibling, 0 replies; 5+ messages in thread
From: Christian Couder @ 2017-01-15 14:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Manuel Ullmann, Matthieu Moy, Christian Couder

On Fri, Jan 13, 2017 at 8:14 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> The following part of the description:
>>
>> git bisect (bad|new) [<rev>]
>> git bisect (good|old) [<rev>...]
>>
>> may be a bit confusing, as a reader may wonder if instead it should be:
>>
>> git bisect (bad|good) [<rev>]
>> git bisect (old|new) [<rev>...]
>>
>> Of course the difference between "[<rev>]" and "[<rev>...]" should hint
>> that there is a good reason for the way it is.
>>
>> But we can further clarify and complete the description by adding
>> "<term-new>" and "<term-old>" to the "bad|new" and "good|old"
>> alternatives.
>>
>> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
>> ---
>>  Documentation/git-bisect.txt | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Thanks.  The patch looks good.
>
> A related tangent.
>
> Last night, I was trying to think if there is a fundamental reason
> why "bad/new/term-new" cannot take more than one <rev>s on the newer
> side of the bisection, and couldn't quite think of any before
> falling asleep.
>
> Currently we keep track of a single bisect/bad, while marking all the
> revs given as good previously as bisect/good-<SHA-1>.
>
> Because the next "bad" is typically chosen from the region of the
> commit DAG that is bounded by bad and good commits, i.e. "rev-list
> bisect/bad --not bisect/good-*", the current bisect/bad will always
> be an ancestor of all bad commits that used to be bisect/bad, and
> keeping previous bisect/bad as bisect/bad-<SHA-1> won't change the
> region of the commit DAG yet to be explored.
>
> As a reason why we need to use only a single bisect/bad, the above
> description is understandable.  But as a reason why we cannot have
> more than one, it is tautological.  It merely says "if we start from
> only one and dig history to find older culprit, we need only one
> bad".
>
> I fell asleep last night without thinking further than that.
>
> I think the answer to the question "why do we think we need a single
> bisect/bad?" is "because bisection is about assuming that there is
> only one commit that flips the tree state from 'old' to 'new' and
> finding that single commit".  That would mean that even if we had
> bisect/bad-A and bisect/bad-B, e.g.
>
>                           o---o---o---bad-A
>                          /
>     -----Good---o---o---o
>                          \
>                           o---o---o---bad-B
>
>
> where 'o' are all commits whose goodness is not yet known, because
> bisection is valid only when we are hunting for a single commit that
> flips the state from good to bad, that commit MUST be at or before
> the merge base of bad-A and bad-B.  So even if we allowed
>
>         $ git bisect bad bad-A bad-B
>
> on the command line, we won't have to set bisect/bad-A and
> bisect/bad-B.  We only need a single bisect/bad that points at the
> merge base of these two.
>
> But what if bad-A and bad-B have more than one merge bases?  We
> won't know which side the badness came from.
>
>                           o---o---o---bad-A
>                          /     \ /
>     -----Good---o---o---o       /
>                          \     / \
>                           o---o---o---bad-B
>
> Being able to bisect the region of DAG bound by "^Good bad-A bad-B"
> may have value in such a case.  I dunno.

I agree that there could be improvements in this area. Though from my
experience with special cases, like when a good commit is not an
ancestor of the bad commit (where there are probably bugs still
lurking), I think it could be tricky to implement correctly in all
cases, and it could make it even more difficult, than it sometimes
already is, to explain the resulting behavior to users.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Documentation/bisect: improve on (bad|new) and (good|bad)
  2017-01-13 19:14 ` Junio C Hamano
  2017-01-15 14:51   ` Christian Couder
@ 2017-01-16  9:17   ` Matthieu Moy
  2017-01-17 19:58     ` Junio C Hamano
  1 sibling, 1 reply; 5+ messages in thread
From: Matthieu Moy @ 2017-01-16  9:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Christian Couder, git, Manuel Ullmann, Christian Couder

Junio C Hamano <gitster@pobox.com> writes:

> Christian Couder <christian.couder@gmail.com> writes:
>
>> The following part of the description:
>>
>> git bisect (bad|new) [<rev>]
>> git bisect (good|old) [<rev>...]
>>
>> may be a bit confusing, as a reader may wonder if instead it should be:
>>
>> git bisect (bad|good) [<rev>]
>> git bisect (old|new) [<rev>...]
>>
>> Of course the difference between "[<rev>]" and "[<rev>...]" should hint
>> that there is a good reason for the way it is.
>>
>> But we can further clarify and complete the description by adding
>> "<term-new>" and "<term-old>" to the "bad|new" and "good|old"
>> alternatives.
>>
>> Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
>> ---
>>  Documentation/git-bisect.txt | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Thanks.  The patch looks good.

Looks good to me too.

> I think the answer to the question "why do we think we need a single
> bisect/bad?" is "because bisection is about assuming that there is
> only one commit that flips the tree state from 'old' to 'new' and
> finding that single commit".

I wouldn't say it's about "assuming" there's only one commit, but it's
about finding *one* such commit, i.e. it works if there are several such
commits, but won't find them all.

> But what if bad-A and bad-B have more than one merge bases?  We
> won't know which side the badness came from.
>
>                           o---o---o---bad-A
>                          /     \ / 
>     -----Good---o---o---o       / 
>                          \     / \
>                           o---o---o---bad-B
>
> Being able to bisect the region of DAG bound by "^Good bad-A bad-B"
> may have value in such a case.  I dunno.

I could help finding several guilty commits, but anyway you can't
guarantee you'll find them all as soon as you use a binary search: if
the history looks like

--- Good --- Bad --- Good --- Good --- Bad --- Good --- Bad

then without examining all commits, you can't tell how many good->bad
switches occured.

But keeping several bad commits wouldn't help keeping the set of
potentially guilty commits small: bad commits appear on the positive
side in "^Good bad-A bad-B", so having more bad commits mean having a
larger DAG to explore (which is a bit counter-intuitive: without
thinking about it I'd have said "more info => less commits to explore").

So, if finding all guilty commits is not possible, I'm not sure how
valuable it is to try to find several of them.

OTOH, keeping several good commits is needed to find a commit for which
all parents are good and the commit is bad, i.e. distinguish

Good
    \
     Bad <-- this is the one.
    /
Good

and

Good
    \
     Bad <-- need to dig further
    /
 Bad

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Documentation/bisect: improve on (bad|new) and (good|bad)
  2017-01-16  9:17   ` Matthieu Moy
@ 2017-01-17 19:58     ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-01-17 19:58 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Christian Couder, git, Manuel Ullmann, Christian Couder

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:

>> But what if bad-A and bad-B have more than one merge bases?  We
>> won't know which side the badness came from.
>>
>>                           o---o---o---bad-A
>>                          /     \ / 
>>     -----Good---o---o---o       / 
>>                          \     / \
>>                           o---o---o---bad-B
>>
>> Being able to bisect the region of DAG bound by "^Good bad-A bad-B"
>> may have value in such a case.  I dunno.
>
> I could help finding several guilty commits, but anyway you can't
> guarantee you'll find them all as soon as you use a binary search: if
> the history looks like
>
> --- Good --- Bad --- Good --- Good --- Bad --- Good --- Bad
>
> then without examining all commits, you can't tell how many good->bad
> switches occured.
>
> But keeping several bad commits wouldn't help keeping the set of
> potentially guilty commits small: bad commits appear on the positive
> side in "^Good bad-A bad-B", so having more bad commits mean having a
> larger DAG to explore (which is a bit counter-intuitive: without
> thinking about it I'd have said "more info => less commits to explore").
>
> So, if finding all guilty commits is not possible, I'm not sure how
> valuable it is to try to find several of them.

The criss-cross merge example, is not trying to find multiple
sources of badness.  It still assumes [*1*] that there is only one
event that introduced the badness observed at bad-A and bad-B, both
of which inherited the badness from the same such event.  Unlike a
case with a single/unique merge-base, we cannot say "we can start
from the merge-base, as their common badness must be coming from the
same place".  The badness may exist in the first 'o' on the same
line as bad-A in the above picture, which is an ancestor of one
merge-base on that line and does not break the other merge base on
the same line as bad-B, for example.

> OTOH, keeping several good commits is needed to find a commit for which
> all parents are good and the commit is bad.

Yes, that is correct.


[Footnote]

*1* The assumption is what makes "bisect" workable.  If the
    assumption does not hold, then "bisect" would not give a useful
    answer "where did I screw up?".  It gives a fairly useless "I
    found one bad commit whose parent is good---there is no
    guarantee if that has anything to do with the badness you are
    seeing at the tip".

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-01-17 19:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-13 14:44 [PATCH] Documentation/bisect: improve on (bad|new) and (good|bad) Christian Couder
2017-01-13 19:14 ` Junio C Hamano
2017-01-15 14:51   ` Christian Couder
2017-01-16  9:17   ` Matthieu Moy
2017-01-17 19:58     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).