git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Philip Oakley <philipoakley@iee.email>,
	Phillip Wood <phillip.wood123@gmail.com>
Subject: Re: [PATCH v2] sequencer: fix edit handling for cherry-pick and revert messages
Date: Tue, 30 Mar 2021 12:37:47 -0700	[thread overview]
Message-ID: <CABPp-BGwAtpsQJ8U5N1q21PMkideptY2MB2PNgbPqvya+XuyHg@mail.gmail.com> (raw)
In-Reply-To: <nycvar.QRO.7.76.6.2103301200020.52@tvgsbejvaqbjf.bet>

On Tue, Mar 30, 2021 at 3:13 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
> On Tue, 30 Mar 2021, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > save_opts() should save any non-default values.  It was intended to do
> > this, but since most options in struct replay_opts default to 0, it only
> > saved non-zero values.  Unfortunately, this does not always work for
> > options.edit.  Roughly speaking, options.edit had a default value of 0
> > for cherry-pick but a default value of 1 for revert.  Make save_opts()
> > record a value whenever it differs from the default.
> >
> > options.edit was also overly simplistic; we had more than two cases.
> > The behavior that previously existed was as follows:
> >
> >                        Non-conflict commits    Right after Conflict
> >     revert             Edit iff isatty(0)      Edit (ignore isatty(0))
> >     cherry-pick        No edit                 See above
> >     Specify --edit     Edit (ignore isatty(0)) See above
> >     Specify --no-edit  (*)                     See above
> >
> >     (*) Before stopping for conflicts, No edit is the behavior.  After
> >         stopping for conflicts, the --no-edit flag is not saved so see
> >         the first two rows.
> >
> > However, the expected behavior is:
> >
> >                        Non-conflict commits    Right after Conflict
> >     revert             Edit iff isatty(0)      Edit iff isatty(0)
> >     cherry-pick        No edit                 Edit iff isatty(0)
> >     Specify --edit     Edit (ignore isatty(0)) Edit (ignore isatty(0))
> >     Specify --no-edit  No edit                 No edit
> >
> > In order to get the expected behavior, we need to change options.edit
> > to a tri-state: unspecified, false, or true.  When specified, we follow
> > what it says.  When unspecified, we need to check whether the current
> > commit being created is resolving a conflict as well as consulting
> > options.action and isatty(0).  While at it, add a should_edit() utility
> > function that compresses options.edit down to a boolean based on the
> > additional information for the non-conflict case.
> >
> > continue_single_pick() is the function responsible for resuming after
> > conflict cases, regardless of whether there is one commit being picked
> > or many.  Make this function stop assuming edit behavior in all cases,
> > so that it can correctly handle !isatty(0) and specific requests to not
> > edit the commit message.
>
> Nicely explained!
>
> I'll allow myself one tangent: the subject of the sequencer's Unix shell
> script heritage seems to come up with an increasing frequency, in
> particular the awful "let's write out one file per setting" strategy.
>
> I would _love_ for `save_opts()` to write a JSON instead (or an INI via
> the `git_config_*()` family of functions, as is done already by the
> cherry-pick/revert stuff), now that we no longer have any shell script
> backend (apart from `--preserve-merges`, but that one is on its way out
> anyway).
>
> The one thing that concerns me with this idea is that I know for a fact
> that some enterprisey users play games with those files inside
> `<GIT_DIR>/rebase-merge` that should be considered internal implementation
> details. Not sure how to deprecate that properly, I don't think we have a
> sane way to detect whether users rely on these implementation details
> other than breaking their expectations, which is not really a gentle way
> to ask them to update their scripts.

Ooh, I'm glad you took this tangent.  May I follow it for a second?
I'd really, really like this too, for three reasons:

1) I constantly get confused about the massive duplication and
difference in control structures and which ones affect which type of
operations in sequencer.c.  Having them both use an ini-file would
allow us to remove lots of that duplication.  I'm sure there'll still
be some rebase-specific or cherry-pick-specific options, but we don't
need a separate parallel structure for every piece of config.

2) In my merge-ort optimization work, rebasing 35 commits in linux.git
across a massive set of 26K upstream renames has dropped rename
detection time from the top spot.  And it also dropped several other
things in the merge machinery from their spots as well.  Do you know
what's the slowest now?  Wasted time from sequencer.c: the unnecessary
process forking and all the useless disk writing (which I suspect is
mostly updating the index and working directory but also writing all
the individual control files under .git/rebase-merge/).  And this
stuff from sequencer.c is not just barely the slowest part, it's the
slowest by a wide margin.

3) I also want to allow cherry-picking or rebasing branches that
aren't even checked out (assuming no conflicts are triggered;
otherwise an error can be shown with the user asked to repeat with the
operation connected to an active working directory).  For such an
operation, the difference between "cherry-pick" and "rebase" is nearly
irrelevant so you'd expect the code to be the same; every time I look
at the code, though, it seems that the control structures are
separating these two types of operations in more areas than just the
reading and writing of the config.

> > diff --git a/builtin/revert.c b/builtin/revert.c
> > index 314a86c5621b..81441020231a 100644
> > --- a/builtin/revert.c
> > +++ b/builtin/revert.c
> > @@ -182,7 +182,7 @@ static int run_sequencer(int argc, const char **argv, struct replay_opts *opts)
> >                               "--signoff", opts->signoff,
> >                               "--no-commit", opts->no_commit,
> >                               "-x", opts->record_origin,
> > -                             "--edit", opts->edit,
> > +                             "--edit", opts->edit == 1,
>
> Honestly, I'd prefer `> 0` here.

WFM; I'll change it.

>
> >                               NULL);
> >
> >       if (cmd) {
> > @@ -230,8 +230,6 @@ int cmd_revert(int argc, const char **argv, const char *prefix)
> >       struct replay_opts opts = REPLAY_OPTS_INIT;
> >       int res;
> >
> > -     if (isatty(0))
> > -             opts.edit = 1;
> >       opts.action = REPLAY_REVERT;
> >       sequencer_init_config(&opts);
> >       res = run_sequencer(argc, argv, &opts);
> > diff --git a/sequencer.c b/sequencer.c
> > index 848204d3dc3f..d444c778a097 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -1860,14 +1860,26 @@ static void record_in_rewritten(struct object_id *oid,
> >               flush_rewritten_pending();
> >  }
> >
> > +static int should_edit(struct replay_opts *opts) {
> > +     assert(opts->edit >= -1 && opts->edit <= 1);
>
> Do we really want to introduce more of these useless `assert()`s? I know
> that we stopped converting them to `BUG()`, but I really dislike
> introducing new ones: they have very little effect, being no-ops by
> default in most setups.

What do you mean by "useless"?  They serve two purposes:
  * It's a very concise comment understood by readers of the code
  * Many asserts are helpful guardrails for future people attempting
to change the code and assumptions they were written under

Neither of these have anything to do with users running code in
production -- if that mattered, I would have used BUG().  Since
production use isn't relevant here, I could use code comments, but
those tend to be much more verbose, and less helpful in catching areas
where assumptions were important to the code for any future folks
(possibly myself in a few years) who want to change the code in ways
that might call into question previous assumptions.

> > +     if (opts->edit == -1)
>
> Maybe `< 0`, as we do elsewhere for "not specified"?

Makes sense; will change.

> > +             /*
> > +              * Note that we only handle the case of non-conflicted
> > +              * commits; continue_single_pick() handles the conflicted
> > +              * commits itself instead of calling this function.
> > +              */
> > +             return (opts->action == REPLAY_REVERT && isatty(0)) ? 1 : 0;
>
> Apart from the extra parentheses, that makes sense to me.

The extra parentheses make it more readable to me; since the statement
still makes sense to you, I'll leave them in.  :-)

> > +     return opts->edit;
> > +}
> > +
> >  static int do_pick_commit(struct repository *r,
> >                         enum todo_command command,
> >                         struct commit *commit,
> >                         struct replay_opts *opts,
> >                         int final_fixup, int *check_todo)
> >  {
> > -     unsigned int flags = opts->edit ? EDIT_MSG : 0;
> > -     const char *msg_file = opts->edit ? NULL : git_path_merge_msg(r);
> > +     unsigned int flags = should_edit(opts) ? EDIT_MSG : 0;
> > +     const char *msg_file = should_edit(opts) ? NULL : git_path_merge_msg(r);
> >       struct object_id head;
> >       struct commit *base, *next, *parent;
> >       const char *base_label, *next_label;
> > @@ -3101,9 +3113,9 @@ static int save_opts(struct replay_opts *opts)
> >       if (opts->no_commit)
> >               res |= git_config_set_in_file_gently(opts_file,
> >                                       "options.no-commit", "true");
> > -     if (opts->edit)
> > -             res |= git_config_set_in_file_gently(opts_file,
> > -                                     "options.edit", "true");
> > +     if (opts->edit != -1)
>
> s/!= -1/>= 0/

Will fix.

> > +             res |= git_config_set_in_file_gently(opts_file, "options.edit",
> > +                                                  opts->edit ? "true" : "false");
> >       if (opts->allow_empty)
> >               res |= git_config_set_in_file_gently(opts_file,
> >                                       "options.allow-empty", "true");
> > @@ -4077,7 +4089,7 @@ static int pick_commits(struct repository *r,
> >       prev_reflog_action = xstrdup(getenv(GIT_REFLOG_ACTION));
> >       if (opts->allow_ff)
> >               assert(!(opts->signoff || opts->no_commit ||
> > -                      opts->record_origin || opts->edit ||
> > +                      opts->record_origin || should_edit(opts) ||
> >                        opts->committer_date_is_author_date ||
> >                        opts->ignore_date));
> >       if (read_and_refresh_cache(r, opts))
> > @@ -4370,14 +4382,35 @@ static int pick_commits(struct repository *r,
> >       return sequencer_remove_state(opts);
> >  }
> >
> > -static int continue_single_pick(struct repository *r)
> > +static int continue_single_pick(struct repository *r, struct replay_opts *opts)
> >  {
> > -     const char *argv[] = { "commit", NULL };
> > +     struct strvec argv = STRVEC_INIT;
> > +     int want_edit;
>
> Do we really want that extra `want_edit` variable? I think the code would
> be easier to read without it, and still be obvious enough.
>
> > +     int ret;
> >
> >       if (!refs_ref_exists(get_main_ref_store(r), "CHERRY_PICK_HEAD") &&
> >           !refs_ref_exists(get_main_ref_store(r), "REVERT_HEAD"))
> >               return error(_("no cherry-pick or revert in progress"));
> > -     return run_command_v_opt(argv, RUN_GIT_CMD);
> > +
> > +     strvec_push(&argv, "commit");
> > +
> > +     /*
> > +      * continue_single_pick() handles the case of recovering from a
> > +      * conflict.  should_edit() doesn't handle that case; for a conflict,
> > +      * we want to edit if the user asked for it, or if they didn't specify
> > +      * and stdin is a tty.
> > +      */
> > +     want_edit = (opts->edit == 1) || ((opts->edit == -1) && isatty(0));
> > +     if (!want_edit)
>
> Here is what I would prefer:
>
>         if (!opts->edit || (opts->edit < 0 && !isatty(0)))

Given the changes elsewhere to use >0 as true, <0 as unspecified, and
==0 for false, this change makes perfect sense.  I'll include it.

> The rest looks good, and the comments are _really_ helpful.
>
> And the remainder of the patch also looks good, so I will spare readers
> time by not even quoting it.
>
> Thank you!
> Dscho

  parent reply	other threads:[~2021-03-30 19:38 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-26  7:16 [PATCH] sequencer: fix edit handling for cherry-pick and revert messages Elijah Newren via GitGitGadget
2021-03-26 12:27 ` Philip Oakley
2021-03-26 15:12   ` Elijah Newren
2021-03-28  1:30     ` Junio C Hamano
2021-03-29  9:23 ` Phillip Wood
2021-03-29 20:52   ` Junio C Hamano
2021-03-29 21:25   ` Elijah Newren
2021-03-30  2:09 ` [PATCH v2] " Elijah Newren via GitGitGadget
2021-03-30 10:13   ` Johannes Schindelin
2021-03-30 18:47     ` Junio C Hamano
2021-03-30 20:16       ` Elijah Newren
2021-03-31 17:36         ` Ævar Arnfjörð Bjarmason
2021-03-31 17:52           ` Elijah Newren
2021-03-31 18:01         ` Junio C Hamano
2021-04-01 16:31           ` Elijah Newren
2021-03-30 19:37     ` Elijah Newren [this message]
2021-03-31 13:48       ` unifying sequencer's options persisting, was " Johannes Schindelin
2021-04-02 11:28         ` Phillip Wood
2021-04-02 13:10           ` Phillip Wood
2021-04-02 21:01           ` Junio C Hamano
2021-04-02 22:18             ` Elijah Newren
2021-04-02 22:27               ` Junio C Hamano
2021-04-08  2:40                 ` Johannes Schindelin
2021-04-08 17:45                   ` Junio C Hamano
2021-04-08 19:58                   ` Christian Couder
2021-04-09 13:53                     ` Johannes Schindelin
2021-03-31  6:52   ` [PATCH v3] " Elijah Newren via GitGitGadget
2021-03-31 14:38     ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BGwAtpsQJ8U5N1q21PMkideptY2MB2PNgbPqvya+XuyHg@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=philipoakley@iee.email \
    --cc=phillip.wood123@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).