From: Orgad Shaneh <orgads@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Orgad Shaneh via GitGitGadget <gitgitgadget@gmail.com>,
git <git@vger.kernel.org>
Subject: Re: [PATCH v3] hooks: allow input from stdin for commit-related hooks
Date: Thu, 19 Nov 2020 22:41:47 +0200 [thread overview]
Message-ID: <CAGHpTBKenJLG7rdhQ+NptjTozKKgn9TW_8pHgeqSf5_0epDYRA@mail.gmail.com> (raw)
In-Reply-To: <xmqqwnyhxilp.fsf@gitster.c.googlers.com>
On Thu, Nov 19, 2020 at 9:16 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Orgad Shaneh via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Orgad Shaneh <orgads@gmail.com>
> >
> > Let hooks receive user input if applicable.
> >
> > Closing stdin originates in f5bbc3225 (Port git commit to C,
> > 2007). Looks like the original shell implementation did have
> > stdin open. Not clear why the author chose to close it on
> > the C port (maybe copy&paste).
>
> Please drop unsubstanciated guess in the parentheses. If anything,
> we've learned during the discussion thread that it is a bad idea to
> leave the standard input open when spawning hooks in general, and to
> me it looks like a lot more plausible reason why we tightened the
> interface when it was made from "only to run hooks for 'git commit'"
> to an interface that more widely usable to run any hook. If we are
> not going to record that finding in the log message to help people
> to find out what we knew at the time when the commit was created,
> then we shouldn't mislead them with "maybe copy&paste" that is not
> backed by anything other than a hunch.
Done
> > Allow stdin only for commit-related hooks. Some of the other
> > hooks pass their own input to the hook, so don't change them.
>
> Before this paragraph that gives orders to the code to "be like so",
> the log message needs to explain why it is a good idea to make such
> a change. Which hook benefits by being able to read the standard
> input? Describe what becomes possible in terms of end-user visible
> effects (i.e. "now reading standard input becomes possible for
> pre-commit hook" is *not* an answer. What new things a pre-commit
> hook that now can read from the standard input do for the end user?)
> to justify why such a change is a good thing to have, before this
> paragraph to justify why leaving the standard input open for hooks
> run by "git commit" is a good idea and is a safe thing to do.
Added justification.
> Note that even "git commit" may compete for its standard input with
> hooks. "git commit -F - <message" currently may read the message to
> EOF before doing anything interesting like spawning a hook, but it
> is not implausible that the reading of the message may want to
> happen much later in a future codebase, at which point the hook may
> end up stealing the beginning of the message by reading from the
> standard input. So ideally, if we can find a way to selectively
> close the standard input for the hooks if "git commit" itself uses
> the standard input, that would be better than unconditionally
> leaving it open.
Good catch! I found that pre-commit is called before reading stdin,
so -F - is broken if pre-commit reads the input. Not sure what's the
best way to solve this. Should I pass the flag to run_commit_hook
everywhere? Or maybe add a new opt-out flag for run_commit_hook,
and pass 0 on most calls?
> Let's reorder the patch hunks to see the bottom layer first, as the
> callers are mostly the same.
>
> > diff --git a/run-command.h b/run-command.h
> > index 6472b38bde..e6a850c6fe 100644
> > --- a/run-command.h
> > +++ b/run-command.h
> > @@ -201,11 +201,16 @@ int run_command(struct child_process *);
> > */
> > const char *find_hook(const char *name);
> >
> > +#define RUN_HOOK_ALLOW_STDIN 1
> > +
> > /**
> > * Run a hook.
> > - * The first argument is a pathname to an index file, or NULL
> > - * if the hook uses the default index file or no index is needed.
> > - * The second argument is the name of the hook.
> > + * The first argument is an array of environment variables, or NULL
> > + * if the hook uses the default environment and doesn't require
> > + * additional variables.
> > + * The second argument is zero or RUN_HOOK_ALLOW_STDIN, which enables
> > + * stdin for the child process (the default is no_stdin).
> > + * The third argument is the name of the hook.
> > * The further arguments correspond to the hook arguments.
> > * The last argument has to be NULL to terminate the arguments list.
> > * If the hook does not exist or is not executable, the return
> > @@ -215,8 +220,8 @@ const char *find_hook(const char *name);
> > * On execution, .stdout_to_stderr and .no_stdin will be set.
> > */
> > LAST_ARG_MUST_BE_NULL
> > -int run_hook_le(const char *const *env, const char *name, ...);
> > -int run_hook_ve(const char *const *env, const char *name, va_list args);
> > +int run_hook_le(const char *const *env, int opt, const char *name, ...);
> > +int run_hook_ve(const char *const *env, int opt, const char *name, va_list args);
>
> Is this new parameter meant to be used as an enum? When the
> run_hook interface gets extended the next time and we want a new
> option, is the option expected to be mutually incompatible with
> allow-stdin?
You found it :)
> I suspect that it would make this a more useful API if this new
> parameter is not used as an enum but as a collection of flag bits.
>
> If so, a few things must change in the above:
>
> - The description of the second parameter in the comment shouldn't
> say "zero or RUN_HOOK_ALLOW_STDIN"; it should rather say "an
> OR'ed collection of feature bits like RUN_HOOK_ALLOW_STDIN
> defined above"
Done, thanks.
> - The second parameter should be 'unsigned flags', not 'int opt'.
I copied from run_command_v_opt*, which have int opt for flags. Changed anyway.
> It is my understanding that "git commit" only needs run_hook_ve() to
> drive its hook scripts. Isn't it premature to touch run_hook_le(),
> in which nobody wants to leave the standard input open while running
> hooks? It _might_ be a better idea to allow users of _le() to do
> the same eventually, but then perhaps it is a good idea to do so in
> a separate step at the end, as "only to be complete" patch. That
> is, the structure of the topic ought to be something like:
>
> - [PATCH 1/2] add the "unsigned flags" word to _ve(), assign the
> RUN_HOOK_ALLOW_STDIN bit, and update commit.c::run_commit_hook()
> to pass RUN_HOOK_ALLOW_STDIN to it.
>
> - [PATCH 2/2] after surveying the options "git commit" takes, find
> out the condition where "git commit" itself would want to consume
> the standard input (e.g. "commit -F -", there may be others), and
> tell run_commit_hook() *not* to pass RUN_HOOK_ALLOW_STDIN when we
> use the standard input ourselves (i.e. forbid hooks to read from
> it).
Accepted. Waiting for your feedback to implement this part.
> - [PATCH 3/2] add the same "unsigned flags" word to _le(), and
> teach all callers to pass 0, as a "just for completeness" step.
>
> Personally, I think we should stop at [2/2], and do not do [3/2], as
> there is no real demonstrated use of the standard input for hooks.
> Especially because users of the _le() interface includes programs
> like receive-pack whose standard input should not be molested, I'd
> feel safer not to see [3/2] done at all (for that matter, I'm not
> happy with [1/2] unless it comes with [2/2], either).
Agreed.
> > + if (!(opt & RUN_HOOK_ALLOW_STDIN))
> > + hook.no_stdin = 1;
>
> OK, so you are using the parameter as a flag word after all. Then
> "int opt" should definitely be "unsigned flags". And these two
> lines would be more readable, when written like so:
>
> hook.no_stdin = !(flags & RUN_HOOK_ALLOW_STDIN);
>
> I would think.
Done.
> > hook.stdout_to_stderr = 1;
> > hook.trace2_hook_name = name;
> >
> > return run_command(&hook);
> > }
> >
> > -int run_hook_le(const char *const *env, const char *name, ...)
> > +int run_hook_le(const char *const *env, int opt, const char *name, ...)
> > {
> > va_list args;
> > int ret;
> >
> > va_start(args, name);
> > - ret = run_hook_ve(env, name, args);
> > + ret = run_hook_ve(env, opt, name, args);
>
> I'd rather not to see the function signature of _le() changed in
> patches [1/2] and [2/2]; instead we can just pass hardcoded 0 from
> here to the underlying _ve().
>
> Now, what is left is individual commands that use the run_hook
> interface.
>
> > diff --git a/commit.c b/commit.c
> > index fe1fa3dc41..775019ec9d 100644
> > --- a/commit.c
> > +++ b/commit.c
> > @@ -1646,7 +1646,7 @@ int run_commit_hook(int editor_is_used, const char *index_file,
> > strvec_push(&hook_env, "GIT_EDITOR=:");
> >
> > va_start(args, name);
> > - ret = run_hook_ve(hook_env.v, name, args);
> > + ret = run_hook_ve(hook_env.v, RUN_HOOK_ALLOW_STDIN, name, args);
> > va_end(args);
> > strvec_clear(&hook_env);
> >
>
> This is good for [1/2]. We should avoid "git commit" from competing
> with hooks for its standard input by conditionally passing
> ALLOW_STDIN from here---only when the program itself does not use
> the standard input in [2/2].
>
> > diff --git a/builtin/am.c b/builtin/am.c
>
> "git am <mbox" reads from the mailbox. "git am -i" interacts with
> the end user via its stdin/stdout. There may be other situations
> where the hooks should not touch the standard input.
>
> > diff --git a/builtin/checkout.c b/builtin/checkout.c
>
> I do not offhand think of a reason why "git checkout" would compete
> with its hooks for the standard input. If we were to allow hooks to
> read from the standard input, that should come as an independent
> patch for each program after patch [3/2], I think. The ones I don't
> mention below should never leave the standard input open for hooks.
>
> > diff --git a/builtin/clone.c b/builtin/clone.c
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > diff --git a/builtin/merge.c b/builtin/merge.c
> > diff --git a/builtin/rebase.c b/builtin/rebase.c
> > diff --git a/reset.c b/reset.c
>
> Ditto.
>
>
> > diff --git a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
> > index b3485450a2..e915ffe546 100755
> > --- a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
> > +++ b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
> > @@ -7,6 +7,7 @@ test_description='pre-commit and pre-merge-commit hooks'
> > HOOKDIR="$(git rev-parse --git-dir)/hooks"
> > PRECOMMIT="$HOOKDIR/pre-commit"
> > PREMERGE="$HOOKDIR/pre-merge-commit"
> > +POSTCOMMIT="$HOOKDIR/post-commit"
> >
> > # Prepare sample scripts that write their $0 to actual_hooks
> > test_expect_success 'sample script setup' '
> > @@ -28,11 +29,15 @@ test_expect_success 'sample script setup' '
> > echo $0 >>actual_hooks
> > test $GIT_PREFIX = "success/"
> > EOF
> > - write_script "$HOOKDIR/check-author.sample" <<-\EOF
> > + write_script "$HOOKDIR/check-author.sample" <<-\EOF &&
> > echo $0 >>actual_hooks
> > test "$GIT_AUTHOR_NAME" = "New Author" &&
> > test "$GIT_AUTHOR_EMAIL" = "newauthor@example.com"
> > EOF
> > + write_script "$HOOKDIR/user-input.sample" <<-\EOF
> > + ! read -r line || echo "$line" > hook_input
> > + exit 0
>
> Style (Documentation/CodingGuidelines)
Fixed.
> - Redirection operators should be written with space before, but no
> space after them. In other words, write 'echo test >"$file"'
> instead of 'echo test> $file' or 'echo test > $file'.
>
> So, when our "read" immediately hits EOF (or I/O error, but let's
> not worry about that case for now), we leave hook_input file alone,
> but otherwise we write the single line we read to hook_input, and
> regardless of an error, we report success to the invoking "git
> commit". Which makes sense.
>
> We report success even when we had trouble writing into hook_input
> file, though. Perhaps you should lose the "exit 0" at the end?
Removed.
> > + EOF
> > '
> >
> > test_expect_success 'root commit' '
> > @@ -278,4 +283,34 @@ test_expect_success 'check the author in hook' '
> > test_cmp expected_hooks actual_hooks
> > '
> >
> > +test_expect_success 'with user input' '
> > + test_when_finished "rm -f \"$PRECOMMIT\" user_input hook_input" &&
> > + cp "$HOOKDIR/user-input.sample" "$PRECOMMIT" &&
> > + echo "user input" > user_input &&
>
> Style (I won't repeat from here on).
Fixed.
> > + echo "more" >>file &&
> > + git add file &&
> > + git commit -m "more" < user_input &&
> > + test_cmp user_input hook_input
> > +'
>
> This is probably a good place to also test
>
> git commit -F - <user_input
>
> and see what happens.
Added a test (currently failing).
> Thanks.
Thank you! Your feedback is thorough and helpful.
- Orgad
next prev parent reply other threads:[~2020-11-19 20:45 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-17 15:02 [PATCH] hooks: allow input from stdin Orgad Shaneh via GitGitGadget
2020-11-17 19:59 ` Junio C Hamano
2020-11-19 15:50 ` [PATCH v2] " Orgad Shaneh via GitGitGadget
2020-11-19 15:56 ` [PATCH v3] hooks: allow input from stdin for commit-related hooks Orgad Shaneh via GitGitGadget
2020-11-19 19:16 ` Junio C Hamano
2020-11-19 20:41 ` Orgad Shaneh [this message]
2020-11-19 20:56 ` [PATCH v4 0/2] " Orgad Shaneh via GitGitGadget
2020-11-19 20:56 ` [PATCH v4 1/2] " Orgad Shaneh via GitGitGadget
2020-11-19 21:23 ` Eric Sunshine
2020-11-19 21:32 ` Junio C Hamano
2020-11-20 5:23 ` Orgad Shaneh
2020-11-20 6:38 ` Eric Sunshine
2020-11-20 6:48 ` Eric Sunshine
2020-11-20 7:16 ` Orgad Shaneh
2020-11-20 18:13 ` Junio C Hamano
2020-11-20 10:59 ` Ævar Arnfjörð Bjarmason
2020-11-20 12:34 ` Orgad Shaneh
2020-11-19 20:56 ` [PATCH v4 2/2] commit: fix stdin conflict between message and hook Orgad Shaneh via GitGitGadget
2020-12-09 20:06 ` [PATCH v5 0/2] hooks: allow input from stdin for commit-related hooks Orgad Shaneh via GitGitGadget
2020-12-09 20:06 ` [PATCH v5 1/2] hooks: lay foundations for passing stdin to hooks Orgad Shaneh via GitGitGadget
2020-12-09 20:06 ` [PATCH v5 2/2] hooks: allow input from stdin for commit-related hooks Orgad Shaneh via GitGitGadget
2020-12-09 22:37 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAGHpTBKenJLG7rdhQ+NptjTozKKgn9TW_8pHgeqSf5_0epDYRA@mail.gmail.com \
--to=orgads@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).