git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Olliver Schinagl <oliver@schinagl.nl>
To: Junio C Hamano <gitster@pobox.com>, phillip.wood123@gmail.com
Cc: phillip.wood@dunelm.org.uk, git@vger.kernel.org,
	Christian Couder <christian.couder@gmail.com>,
	Stefan Haller <lists@haller-berlin.de>
Subject: Re: [RFC] bisect: Introduce skip-when to automatically skip commits
Date: Wed, 10 Apr 2024 12:39:03 +0200	[thread overview]
Message-ID: <116dd27e-2e30-4915-a131-6c71c999fccd@schinagl.nl> (raw)
In-Reply-To: <xmqqzfu3dcl1.fsf@gitster.g>

On 08-04-2024 18:49, Junio C Hamano wrote:
> phillip.wood123@gmail.com writes:
>
>>>> get_terms() wants to read the first line into `term_bad` and the
>>>> second line into `term_good` so it makes sense that it uses two
>>>> calls to `strbuf_getline()` to do that. It does not want to read
>>>> the whole file into a single buffer as we do here.
>>> Right, but I why not use strbuf_getline()?
>> Because you want the whole file, not just one line as the script name
>> could potentially contain a newline
> It is technically true, but it somehow sounds like an implausible
> scenario to me.  The real reason why read_file() is preferrable is
> because you do not have to write, and we do not want to see you write,
> the whole "open (and handle error), read, chomp, and return" sequence.
>
> I would even suspect that get_terms() is a poorly written
> anti-pattern.  If I were adding that function to the system today, I
> wouldn't be surprised if I did read_file() the whole thing and
> worked in-core to split two items out.

So I've peaked at it, and think something like:

+int bisect_read_terms(const char **read_bad, const char **read_good)
+{
+       struct strbuf sb = STRBUF_INIT;
+
+       if (!strbuf_read_file(&sb, git_path_bisect_terms(), 0)) {
+               *read_bad = "bad";
+               *read_good = "good";
+               return -1;
+       }
+
+       terms = strbuf_split(&sb);
+       *term_bad = strbuf_detach(terms[0], NULL);
+       *term_good = strbuf_detach(terms[1], NULL);
+
+       strbuf_release(&sb);
+
+       return 0;
+}

would do the trick. This function could then be called from 
builtin/bisect.c as well to have a single interface. Right now, there's 
two ways to do the same thing, just because the arguments to the 
function are different, and the body is slightly different, but the same.

Shall I send a MR with this?

>
>> If I understand correctly we're encouraging the user to run "git
>> bisect skip" from the post checkout script. Doesn't that mean we'll
>> end up with a set of processes that look like
>>
>> 	- git bisect start
>> 	  - post checkout script
>>              - git bisect skip
>>                - post checkout script
>>                  - git bisect skip
>>                    ...
>>
>> as the "git bisect start" is waiting for the post checkout script to
>> finish running, but that script is waiting for "git bisect skip" to
>> finish running and so on. Each of those processes takes up system
>> resources, similar to how a recursive function can exhaust the
>> available stack space by calling itself over and over again.
> True.  What such a post-checkout script can do is to only mark the
> HEAD as "untestable", just like a run script given to "bisect run"
> signals that fact by returnint 125.  And at that point, I doubt it
> makes sense to add such a post-checkout script for the purpose of
> allowing "bisect skip".
>
> Having said that, a post-checkout script and pre-resume script may
> have a huge value in helping those whose tests cannot be automated
> (in other words, they cannot do "git bisect run") when they need to
> tweak the working tree during bisection.  We all have seen, during a
> bisection session that spans a segment of history that has another
> bug that affects our test *but* is orthogonal to the bug we are
> chasing, that we "cherry-pick --no-commit" the fix for that other
> problem inside "git bisect run" script.  It might look something
> like
>
>      #!/bin/sh
>      if git merge-base --is-ancestor $the_other_bug HEAD
>      then
> 	# we need the fix
> 	git cherry-pick --no-commit $fix_for_the_other_bug ||
> 	exit 125
>      fi
>
>      make test
>      status=$?
>      git reset --hard ;# undo the cherry-pick
>      exit $status
>
> But to those whose test is not a good match to "git bisect run", if
> we had a mechanism to tweak the checked out working tree after the
> "bisect next" (which is an internal mechanism that "bisect good",
> "bisect bad", and "bisect skip" share to give you the next HEAD and
> the working tree to test) checks out the working tree before it
> gives the control back to you, we could split the above script into
> two parts and throw the "conditionally cherry-pick the fix" part
> into that mechanism.  We'd need to have a companion script to "redo
> the damage" (the "reset --hard" in the above illustration) if this
> were to work seamlessly.  That obviously is totally orthogonal to
> what we are discussing in this thread, but may make a good #leftoverbits
> material (but not for novices).

While completly orthonogal, I agree; it would be nice to have and 
'abuse' for the bisect-skip usecase. So if we ignore the fact that it 
can be abused for this (which I don't think is a bad thing, it just 
risks the recursive issue Phillip mentioned.


As I'm not familiar with the deeps of bisect, I just use it as a dumb 
simple user, e.g. start; good, bad; I'm not sure the usecase you are 
describing is completly clear to me.

Are you saying 'git bisect run` is great, but not useful in all 
situations, and so in some cases, we want what you just said? Or would 
this also be part of git bisect run?

I've drafted the post-checkout and pre-resume here: 
https://gitlab.com/olliver/git/-/commit/6b5415377600551c0d94a359fd4b8ca7a3678dcf 
where I'm not clear on what the best points are for for the pre/post 
points. I've put the 'pre' bit in the bisect_state function, as that was 
being triggered by many suboptions, but might not be correct based on 
your answer to the above 
(https://gitlab.com/olliver/git/-/commit/6b5415377600551c0d94a359fd4b8ca7a3678dcf#46324e17f99db64a67eb9a5983ffc3a680914ee3_1001_1028). 
The post-checkout part, I've put in bisect_next 
(https://gitlab.com/olliver/git/-/commit/43993fca32f174f1005c7a445887c0ba5c4036b5#46324e17f99db64a67eb9a5983ffc3a680914ee3_672_717) 
which seems to match what you described.


Olliver



  reply	other threads:[~2024-04-10 10:39 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-30  8:10 [RFC] bisect: Introduce skip-when to automatically skip commits Olliver Schinagl
2024-04-05  6:50 ` Olliver Schinagl
2024-04-06  1:08   ` Junio C Hamano
2024-04-06 10:06     ` Olliver Schinagl
2024-04-06 13:50       ` Phillip Wood
2024-04-06 19:17         ` Olliver Schinagl
2024-04-07 14:09           ` phillip.wood123
2024-04-07 14:52             ` Olliver Schinagl
2024-04-07 15:12               ` phillip.wood123
2024-04-07 21:11                 ` Olliver Schinagl
2024-04-08 16:49                 ` Junio C Hamano
2024-04-10 10:39                   ` Olliver Schinagl [this message]
2024-04-10 16:47                     ` Junio C Hamano
2024-04-10 19:22                       ` Olliver Schinagl
2024-04-10 19:31                         ` Junio C Hamano
2024-04-10 19:39                           ` Olliver Schinagl
2024-04-12 13:35                   ` Phillip Wood
2024-04-06 17:07       ` Junio C Hamano
2024-04-06 19:19         ` Olliver Schinagl
2024-04-06 10:22     ` Olliver Schinagl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=116dd27e-2e30-4915-a131-6c71c999fccd@schinagl.nl \
    --to=oliver@schinagl.nl \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=lists@haller-berlin.de \
    --cc=phillip.wood123@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).