From: "Miriam R." <mirucam@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: git <git@vger.kernel.org>
Subject: Re: [PATCH v3 09/12] bisect--helper: reimplement `bisect_state` & `bisect_head` shell functions in C
Date: Sat, 20 Jun 2020 10:04:46 +0200
Message-ID: <CAN7CjDAnksyerW_sEDLFS=GT0g7rLt53dUA09a8jAqnZb9P_8w@mail.gmail.com>
In-Reply-To: <nycvar.QRO.7.76.6.2005222311120.56@tvgsbejvaqbjf.bet>
Hi Johannes,
I'm finishing the next version of the patch series and I have an issue
with one of your suggestions:
On Sat, 23 May 2020 at 0:06, Johannes Schindelin
(<Johannes.Schindelin@gmx.de>) wrote:
>
> Hi Miriam,
>
> On Thu, 23 Apr 2020, Miriam Rubio wrote:
>
> > From: Pranit Bauva <pranit.bauva@gmail.com>
> >
> > Reimplement the `bisect_state()` shell functions in C and also add a
> > subcommand `--bisect-state` to `git-bisect--helper` to call them from
> > git-bisect.sh .
> >
> > Using `--bisect-state` subcommand is a temporary measure to port shell
> > function to C so as to use the existing test suite. As more functions
> > are ported, this subcommand will be retired and will be called by some
> > other methods.
> >
> > `bisect_head()` is only called from `bisect_state()`, thus it is not
> > required to introduce another subcommand.
> >
> > Mentored-by: Lars Schneider <larsxschneider@gmail.com>
> > Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> > Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> > Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
> > Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
> > Signed-off-by: Miriam Rubio <mirucam@gmail.com>
> > ---
> > builtin/bisect--helper.c | 70 +++++++++++++++++++++++++++++++++++++++-
> > git-bisect.sh | 55 +++----------------------------
> > 2 files changed, 73 insertions(+), 52 deletions(-)
> >
> > diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
> > index 2d8660c79f..9db72f5891 100644
> > --- a/builtin/bisect--helper.c
> > +++ b/builtin/bisect--helper.c
> > @@ -31,6 +31,8 @@ static const char * const git_bisect_helper_usage[] = {
> > N_("git bisect--helper --bisect-next"),
> > N_("git bisect--helper --bisect-auto-next"),
> > N_("git bisect--helper --bisect-autostart"),
> > + N_("git bisect--helper --bisect-state (bad|new) [<rev>]"),
> > + N_("git bisect--helper --bisect-state (good|old) [<rev>...]"),
> > NULL
> > };
> >
> > @@ -834,6 +836,64 @@ static int bisect_autostart(struct bisect_terms *terms)
> > return bisect_start(terms, 0, NULL, 0);
> > }
> >
> > +static int bisect_head(struct object_id *oid)
> > +{
> > + if (!file_exists(git_path_bisect_head()))
> > + return get_oid("HEAD", oid);
> > +
> > + return get_oid("BISECT_HEAD", oid);
>
> This can be easily reduced to
>
> return get_oid(file_exists(git_path_bisect_head()) ?
> "BISECT_HEAD" : "HEAD", oid);
>
> At the same time, it is wrong, just like the shell script version was
> wrong: in particular in light of the `hn/reftable` effort, we do _not_
> want to assume that all refs are backed by files!
>
> So really, what this should do instead is this:
>
> enum get_oid_result res = get_oid("BISECT_HEAD", oid);
>
> if (res == MISSING_OBJECT)
> res = get_oid("HEAD", oid);
>
> Given that this is still only three lines long, the overhead of having it
> in its own function for just a _single_ call seems excessive. I'd prefer
> it to be inlined in `bisect_state()`.
>
> > +}
> > +
> > +static enum bisect_error bisect_state(struct bisect_terms *terms, const char **argv,
> > + int argc)
> > +{
>
> I offered a lengthy discussion about this function in
> https://lore.kernel.org/git/nycvar.QRO.7.76.6.2002272244150.9783@tvgsbejvaqbjf.bet/
>
> It does not look, however, as if v3 benefitted from the entirety of my
> analysis: All the `check_expected_revs()` function does is to verify that
> the passed list of revs matches exactly the contents of the
> `BISECT_EXPECTED_REV` file.
>
> That can be done in a much simpler way, though, by first reading the file
> and parsing the contents into an OID, and then comparing to that parsed
> OID instead.
>
> Besides, `check_expected_revs()` is only used to check one rev at a time.
>
> In other words, it could be simplified to something like this:
>
> static void check_expected_rev(struct object_id *oid)
> {
> struct object_id expected;
> struct strbuf buf = STRBUF_INIT;
>
> if (strbuf_read_file(&buf, git_path_bisect_expected_rev(), 0)
> < the_hash_algo->hexsz ||
> get_oid_hex(buf.buf, &expected) < 0)
> return; /* Ignore invalid file contents */
>
> if (!oideq(oid, &expected)) {
> ... unlink ...
> return;
> }
> }
>
> But even that would be wasteful, as we would read the file over and over
> and over again.
>
> The good news is that we do not even _need_ `check_expected_rev()`.
> Because we do not need to have two call sites, we can simplify the code
> much further. See below:
>
> > + const char *state;
> > + const char *hex;
> > + int i;
> > + struct oid_array revs = OID_ARRAY_INIT;
> > + struct object_id oid;
> > +
> > + if (!argc)
> > + return error(_("Please call `--bisect-state` with at least one argument"));
> > + state = argv[0];
> > + if (check_and_set_terms(terms, state) ||
> > + !one_of(state, terms->term_good,terms->term_bad, "skip", NULL))
> > + return BISECT_FAILED;
> > + argv++;
> > + argc--;
> > + if (!strcmp(state, terms->term_bad) && (argc > 1))
> > + return error(_("'git bisect %s' can take only one argument."),terms->term_bad);
> > + if (argc == 0) {
> > + if (bisect_head(&oid))
> > + return error(_("Bad bisect_head rev input"));
> > + hex = oid_to_hex(&oid);
> > + if (bisect_write(state, hex, terms, 0))
> > + return BISECT_FAILED;
> > + check_expected_revs(&hex, 1);
> > + return bisect_auto_next(terms, NULL);
> > + }
> > +
> > + /* Here argc > 0 */
> > + for (; argc; argc--, argv++) {
> > + struct object_id oid;
> > + if (get_oid(*argv, &oid))
> > + return error(_("Bad rev input: %s"), *argv);
> > + oid_array_append(&revs, &oid);
> > + }
>
> It really does not make sense to parse the arguments into an OID array,
> _then_ iterate over the array once, and then immediately releasing it.
> That OID array is not needed at all.
>
> So we'll end up with this loop in case `argc > 0` (where we now call
> `get_oid()`, too), and note how the loop body looks _eerily_ similar to
> the conditional `argc == 0` code block above?
>
> > +
> > + for (i = 0; i < revs.nr; i++) {
> > + hex = oid_to_hex(&revs.oid[i]);
> > + if (bisect_write(state, hex, terms, 0)) {
> > + oid_array_clear(&revs);
> > + return BISECT_FAILED;
> > + }
> > + check_expected_revs(&hex, 1);
> > + }
> > +
> > + oid_array_clear(&revs);
> > + return bisect_auto_next(terms, NULL);
> > +}
>
> So really, this function pretty much _wants_ to look this way (modulo
> bugs, as I did not even test-compile the code):
>
> static enum bisect_error bisect_state(struct bisect_terms *terms,
> const char **argv, int argc)
> {
> const char *state;
> int i, verify_expected = 1;
> struct object_id oid, expected;
> struct strbuf buf = STRBUF_INIT;
>
> if (!argc)
> return error(_("Please call `--bisect-state` with at least one argument"));
>
> state = argv[0];
> if (check_and_set_terms(terms, state) ||
> !one_of(state, terms->term_good, terms->term_bad, "skip", NULL))
> return BISECT_FAILED;
>
> argv++;
> argc--;
> if (argc > 1 && !strcmp(state, terms->term_bad))
> return error(_("'git bisect %s' can take only one argument."), terms->term_bad);
>
> if (strbuf_read_file(&buf, git_path_bisect_expected_rev(), 0) < the_hash_algo->hexsz ||
> get_oid_hex(buf.buf, &expected) < 0)
> verify_expected = 0; /* Ignore invalid file contents */
>
>
> for (i = 0; i < argc + !argc; i++) {
> if (argc) {
> if (get_oid(argv[i], &oid)) {
> error(_("Bad rev input: %s"), argv[i]);
> return BISECT_FAILED;
> }
> } else {
> enum get_oid_result res = get_oid("BISECT_HEAD", &oid);
>
> if (res == MISSING_OBJECT)
> res = get_oid("HEAD", &oid);
> if (res) {
> error(_("Bad bisect_head rev input"));
> return BISECT_FAILED;
> }
> }
>
> if (bisect_write(state, oid_to_hex(&oid), terms, 0))
> return BISECT_FAILED;
>
> if (verify_expected && !oideq(&oid, &expected)) {
> unlink_or_warn(git_path_bisect_ancestors_ok());
> unlink_or_warn(git_path_bisect_expected_rev());
> verify_expected = 0;
> }
> }
>
> return bisect_auto_next(terms, NULL);
> }
>
> There, not bad, is it?
>
After implementing this solution, some tests failed. Debugging them, I
found the cause: in Pranit's solution all the arguments are first parsed
into an OID array, so if bisect receives a junk rev the function
returns before bisect_write() is ever executed.
With the new solution, if a junk rev is received after a valid rev,
bisect_write() has already been executed for the valid rev by the time
the function returns with an error for the junk rev.
So garbage is left in the file, and, for example, bisect-porcelain
test number 5 ('bisect fails if given any junk instead of revs')
fails when it executes 'test -z'.
Should I keep the original patch and add a comment in the code that
explains why we use an OID array?
(I have also implemented an alternative solution that deletes all the
refs already written when a junk rev is found, but maybe it is too
complicated or not totally correct:
https://gitlab.com/mirucam/git/-/commit/93f669877b87d09a30a5d07f0967667b22026511
)
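
To make the difference concrete, here is a minimal, self-contained
sketch of the two behaviours. It is not the code from the patch:
`parse_rev()`, `write_state()`, `struct state_log` and both `mark_*()`
functions are hypothetical stand-ins (roughly for `get_oid()`,
`bisect_write()` and the two shapes of `bisect_state()`), and treating
any argument starting with "junk" as invalid is an assumption made only
for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_REVS 16

/* Hypothetical stand-in for the state that bisect_write() would record. */
struct state_log {
	const char *revs[MAX_REVS];
	int nr;
};

/* Hypothetical stand-in for get_oid(): 0 for a valid rev, -1 for junk. */
static int parse_rev(const char *rev)
{
	return strncmp(rev, "junk", 4) ? 0 : -1;
}

/* Hypothetical stand-in for bisect_write(): remembers the rev. */
static void write_state(struct state_log *log, const char *rev)
{
	assert(log->nr < MAX_REVS);
	log->revs[log->nr++] = rev;
}

/*
 * Single pass: parsing and writing are interleaved, so every valid rev
 * that precedes a junk rev has already been written when we bail out.
 */
static int mark_single_pass(struct state_log *log, const char **argv, int argc)
{
	int i;

	for (i = 0; i < argc; i++) {
		if (parse_rev(argv[i]))
			return -1;
		write_state(log, argv[i]);
	}
	return 0;
}

/*
 * Two passes: validate every argument first and write only when all of
 * them parsed, so a junk rev leaves no state behind.
 */
static int mark_two_pass(struct state_log *log, const char **argv, int argc)
{
	int i;

	for (i = 0; i < argc; i++)
		if (parse_rev(argv[i]))
			return -1;
	for (i = 0; i < argc; i++)
		write_state(log, argv[i]);
	return 0;
}
```

Called with a valid rev followed by a junk rev, both functions report an
error, but only the single-pass variant leaves an entry behind. In the
real code the first pass would also have to keep the parsed OIDs
somewhere so they are not resolved twice, which is the role the OID
array plays in the original patch.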
> > +
> > int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
> > {
> > enum {
> > @@ -847,7 +907,8 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
> > BISECT_START,
> > BISECT_NEXT,
> > BISECT_AUTO_NEXT,
> > - BISECT_AUTOSTART
> > + BISECT_AUTOSTART,
> > + BISECT_STATE
> > } cmdmode = 0;
> > int no_checkout = 0, res = 0, nolog = 0;
> > struct option options[] = {
> > @@ -873,6 +934,8 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
> > N_("verify the next bisection state then checkout the next bisection commit"), BISECT_AUTO_NEXT),
> > OPT_CMDMODE(0, "bisect-autostart", &cmdmode,
> > N_("start the bisection if BISECT_START is empty or missing"), BISECT_AUTOSTART),
> > + OPT_CMDMODE(0, "bisect-state", &cmdmode,
> > + N_("mark the state of ref (or refs)"), BISECT_STATE),
> > OPT_BOOL(0, "no-checkout", &no_checkout,
> > N_("update BISECT_HEAD instead of checking out the current commit")),
> > OPT_BOOL(0, "no-log", &nolog,
> > @@ -945,6 +1008,11 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
> > set_terms(&terms, "bad", "good");
> > res = bisect_autostart(&terms);
> > break;
> > + case BISECT_STATE:
> > + set_terms(&terms, "bad", "good");
> > + get_terms(&terms);
> > + res = bisect_state(&terms, argv, argc);
> > + break;
> > default:
> > BUG("unknown subcommand %d", (int)cmdmode);
> > }
> > diff --git a/git-bisect.sh b/git-bisect.sh
> > index 049ffacdff..2da0810b1a 100755
> > --- a/git-bisect.sh
> > +++ b/git-bisect.sh
> > @@ -39,16 +39,6 @@ _x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
> > TERM_BAD=bad
> > TERM_GOOD=good
> >
> > -bisect_head()
> > -{
> > - if test -f "$GIT_DIR/BISECT_HEAD"
> > - then
> > - echo BISECT_HEAD
> > - else
> > - echo HEAD
> > - fi
> > -}
> > -
> > bisect_skip() {
> > all=''
> > for arg in "$@"
> > @@ -61,43 +51,7 @@ bisect_skip() {
> > esac
> > all="$all $revs"
> > done
> > - eval bisect_state 'skip' $all
> > -}
> > -
> > -bisect_state() {
> > - git bisect--helper --bisect-autostart
> > - state=$1
> > - git bisect--helper --check-and-set-terms $state $TERM_GOOD $TERM_BAD || exit
> > - get_terms
> > - case "$#,$state" in
> > - 0,*)
> > - die "Please call 'bisect_state' with at least one argument." ;;
> > - 1,"$TERM_BAD"|1,"$TERM_GOOD"|1,skip)
> > - bisected_head=$(bisect_head)
> > - rev=$(git rev-parse --verify "$bisected_head") ||
> > - die "$(eval_gettext "Bad rev input: \$bisected_head")"
> > - git bisect--helper --bisect-write "$state" "$rev" "$TERM_GOOD" "$TERM_BAD" || exit
> > - git bisect--helper --check-expected-revs "$rev" ;;
> > - 2,"$TERM_BAD"|*,"$TERM_GOOD"|*,skip)
> > - shift
> > - hash_list=''
> > - for rev in "$@"
> > - do
> > - sha=$(git rev-parse --verify "$rev^{commit}") ||
> > - die "$(eval_gettext "Bad rev input: \$rev")"
> > - hash_list="$hash_list $sha"
> > - done
> > - for rev in $hash_list
> > - do
> > - git bisect--helper --bisect-write "$state" "$rev" "$TERM_GOOD" "$TERM_BAD" || exit
> > - done
> > - git bisect--helper --check-expected-revs $hash_list ;;
> > - *,"$TERM_BAD")
> > - die "$(eval_gettext "'git bisect \$TERM_BAD' can take only one argument.")" ;;
> > - *)
> > - usage ;;
> > - esac
> > - git bisect--helper --bisect-auto-next
> > + eval git bisect--helper --bisect-state 'skip' $all
> > }
> >
> > bisect_visualize() {
> > @@ -185,8 +139,7 @@ exit code \$res from '\$command' is < 0 or >= 128" >&2
> > state="$TERM_GOOD"
> > fi
> >
> > - # We have to use a subshell because "bisect_state" can exit.
> > - ( bisect_state $state >"$GIT_DIR/BISECT_RUN" )
> > + git bisect--helper --bisect-state $state >"$GIT_DIR/BISECT_RUN"
> > res=$?
> >
> > cat "$GIT_DIR/BISECT_RUN"
> > @@ -201,7 +154,7 @@ exit code \$res from '\$command' is < 0 or >= 128" >&2
> > if [ $res -ne 0 ]
> > then
> > eval_gettextln "bisect run failed:
> > -'bisect_state \$state' exited with error code \$res" >&2
> > +'git bisect--helper --bisect-state \$state' exited with error code \$res" >&2
>
> This is not your fault, of course, but it does make me shudder to see such
> an obvious implementation detail in a user-facing error message.
>
> Maybe something to fix up in a follow-up?
>
> Ciao,
> Dscho
>
> > exit $res
> > fi
> >
> > @@ -242,7 +195,7 @@ case "$#" in
> > start)
> > git bisect--helper --bisect-start "$@" ;;
> > bad|good|new|old|"$TERM_BAD"|"$TERM_GOOD")
> > - bisect_state "$cmd" "$@" ;;
> > + git bisect--helper --bisect-state "$cmd" "$@" ;;
> > skip)
> > bisect_skip "$@" ;;
> > next)
> > --
> > 2.25.0
> >
> >
Best,
Miriam.