From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-Status: No, score=-3.6 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_PASS, SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by dcvr.yhbt.net (Postfix) with ESMTP id 8486F1F66F for ; Thu, 19 Nov 2020 20:45:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726107AbgKSUmA (ORCPT ); Thu, 19 Nov 2020 15:42:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36792 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725890AbgKSUl7 (ORCPT ); Thu, 19 Nov 2020 15:41:59 -0500 Received: from mail-qk1-x741.google.com (mail-qk1-x741.google.com [IPv6:2607:f8b0:4864:20::741]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8B2F5C0613CF for ; Thu, 19 Nov 2020 12:41:59 -0800 (PST) Received: by mail-qk1-x741.google.com with SMTP id r7so6820851qkf.3 for ; Thu, 19 Nov 2020 12:41:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=P2mgr5scbNp50XjdD8N5FKNfmCw0q0A072DmH/YlIc0=; b=UML8pvIdK7Sej9/P+wWlS+CR/HLHShqLdx97/zVK3FGXHzZLuUt3fMdxiYu4m9LIie zOZYUoke30kf4V2OcPfOMKQiV3LfPxlLaxaW+4MhQA7YEGXuqHmHS5dmSpfY9mH+VqpT RBwRztQ7WBfXqDabaD4wdhV7jFxs+AvxuDsUzYp8GKwsFL9OQcEEzts+FJqEJB4fw33u iayW5SD+ZW79mPQcyfYT3B3yWCh2bQOycs2NGWkWbUkKz+bzokmXRREcj7ikBcDHXROI LuSgFpRlD3hv4lJ2YJyW9msZ4Jj1IU1MZNhIBE5tUCCg+e8yUFsx5Cml0W6zkMZZgVi+ u3pA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=P2mgr5scbNp50XjdD8N5FKNfmCw0q0A072DmH/YlIc0=; b=LMfkVP8euMAt2G4Gi4jBFx4lPP2Jjur0AV8VX1vgTuyiuxgMUQFMx+zxt37leYvkm0 xUSp8KcT6i9f6axX7lNGHU4w3CnUhO6WjSnpjqZkwbp7BYYXyYyPtlGgj9WiMhfj6kbC 77l0/+CzWyQNZYBHW/Yhf51FaXeOAMvWT6N8GyaKa4K5DQxLd2wH5O2/iIyVmnAZ87EV kOfdQSiuNWixAiha/KEv76PTNTVkCXPiBvx/EwaqdwN3ArEvTLQn2V7fGI0wjNZltaNi tUHWQunQ/t9BkZppEz2JU0V5N+X/49i2b9u+VFmn9ND8hlB8PfTFnEnnxb3YKqdu4EQV IUZQ== X-Gm-Message-State: AOAM530RH2RPCMdcwD/ylcncQdU669y6C3T/FmtVMpg4xEymySWOI/Ri HSAsqjm6bdfElOOHLrjR2m2lVrar9njQKf3hvWA= X-Google-Smtp-Source: ABdhPJz2wQXFpboW6s667Q6bI5Hob/GH7dSQUXD+K0ej1vnS4Xr6AFZITC0Gl7hLUZbljWZ8Kxa1w/Gq+p6qbwZOHoQ= X-Received: by 2002:a05:620a:64d:: with SMTP id a13mr13659713qka.482.1605818518484; Thu, 19 Nov 2020 12:41:58 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Orgad Shaneh Date: Thu, 19 Nov 2020 22:41:47 +0200 Message-ID: Subject: Re: [PATCH v3] hooks: allow input from stdin for commit-related hooks To: Junio C Hamano Cc: Orgad Shaneh via GitGitGadget , git Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Thu, Nov 19, 2020 at 9:16 PM Junio C Hamano wrote: > > "Orgad Shaneh via GitGitGadget" writes: > > > From: Orgad Shaneh > > > > Let hooks receive user input if applicable. > > > > Closing stdin originates in f5bbc3225 (Port git commit to C, > > 2007). Looks like the original shell implementation did have > > stdin open. Not clear why the author chose to close it on > > the C port (maybe copy&paste). > > Please drop unsubstanciated guess in the parentheses. If anything, > we've learned during the discussion thread that it is a bad idea to > leave the standard input open when spawning hooks in general, and to > me it looks like a lot more plausible reason why we tightened the > interface when it was made from "only to run hooks for 'git commit'" > to an interface that more widely usable to run any hook. If we are > not going to record that finding in the log message to help people > to find out what we knew at the time when the commit was created, > then we shouldn't mislead them with "maybe copy&paste" that is not > backed by anything other than a hunch. Done > > Allow stdin only for commit-related hooks. Some of the other > > hooks pass their own input to the hook, so don't change them. > > Before this paragraph that gives orders to the code to "be like so", > the log message needs to explain why it is a good idea to make such > a change. Which hook benefits by being able to read the standard > input? Describe what becomes possible in terms of end-user visible > effects (i.e. "now reading standard input becomes possible for > pre-commit hook" is *not* an answer. What new things a pre-commit > hook that now can read from the standard input do for the end user?) > to justify why such a change is a good thing to have, before this > paragraph to justify why leaving the standard input open for hooks > run by "git commit" is a good idea and is a safe thing to do. Added justification. > Note that even "git commit" may compete for its standard input with > hooks. "git commit -F - EOF before doing anything interesting like spawning a hook, but it > is not implausible that the reading of the message may want to > happen much later in a future codebase, at which point the hook may > end up stealing the beginning of the message by reading from the > standard input. So ideally, if we can find a way to selectively > close the standard input for the hooks if "git commit" itself uses > the standard input, that would be better than unconditionally > leaving it open. Good catch! I found that pre-commit is called before reading stdin, so -F - is broken if pre-commit reads the input. Not sure what's the best way to solve this. Should I pass the flag to run_commit_hook everywhere? Or maybe add a new opt-out flag for run_commit_hook, and pass 0 on most calls? > Let's reorder the patch hunks to see the bottom layer first, as the > callers are mostly the same. > > > diff --git a/run-command.h b/run-command.h > > index 6472b38bde..e6a850c6fe 100644 > > --- a/run-command.h > > +++ b/run-command.h > > @@ -201,11 +201,16 @@ int run_command(struct child_process *); > > */ > > const char *find_hook(const char *name); > > > > +#define RUN_HOOK_ALLOW_STDIN 1 > > + > > /** > > * Run a hook. > > - * The first argument is a pathname to an index file, or NULL > > - * if the hook uses the default index file or no index is needed. > > - * The second argument is the name of the hook. > > + * The first argument is an array of environment variables, or NULL > > + * if the hook uses the default environment and doesn't require > > + * additional variables. > > + * The second argument is zero or RUN_HOOK_ALLOW_STDIN, which enables > > + * stdin for the child process (the default is no_stdin). > > + * The third argument is the name of the hook. > > * The further arguments correspond to the hook arguments. > > * The last argument has to be NULL to terminate the arguments list. > > * If the hook does not exist or is not executable, the return > > @@ -215,8 +220,8 @@ const char *find_hook(const char *name); > > * On execution, .stdout_to_stderr and .no_stdin will be set. > > */ > > LAST_ARG_MUST_BE_NULL > > -int run_hook_le(const char *const *env, const char *name, ...); > > -int run_hook_ve(const char *const *env, const char *name, va_list args); > > +int run_hook_le(const char *const *env, int opt, const char *name, ...); > > +int run_hook_ve(const char *const *env, int opt, const char *name, va_list args); > > Is this new parameter meant to be used as an enum? When the > run_hook interface gets extended the next time and we want a new > option, is the option expected to be mutually incompatible with > allow-stdin? You found it :) > I suspect that it would make this a more useful API if this new > parameter is not used as an enum but as a collection of flag bits. > > If so, a few things must change in the above: > > - The description of the second parameter in the comment shouldn't > say "zero or RUN_HOOK_ALLOW_STDIN"; it should rather say "an > OR'ed collection of feature bits like RUN_HOOK_ALLOW_STDIN > defined above" Done, thanks. > - The second parameter should be 'unsigned flags', not 'int opt'. I copied from run_command_v_opt*, which have int opt for flags. Changed anyway. > It is my understanding that "git commit" only needs run_hook_ve() to > drive its hook scripts. Isn't it premature to touch run_hook_le(), > in which nobody wants to leave the standard input open while running > hooks? It _might_ be a better idea to allow users of _le() to do > the same eventually, but then perhaps it is a good idea to do so in > a separate step at the end, as "only to be complete" patch. That > is, the structure of the topic ought to be something like: > > - [PATCH 1/2] add the "unsigned flags" word to _ve(), assign the > RUN_HOOK_ALLOW_STDIN bit, and update commit.c::run_commit_hook() > to pass RUN_HOOK_ALLOW_STDIN to it. > > - [PATCH 2/2] after surveying the options "git commit" takes, find > out the condition where "git commit" itself would want to consume > the standard input (e.g. "commit -F -", there may be others), and > tell run_commit_hook() *not* to pass RUN_HOOK_ALLOW_STDIN when we > use the standard input ourselves (i.e. forbid hooks to read from > it). Accepted. Waiting for your feedback to implement this part. > - [PATCH 3/2] add the same "unsigned flags" word to _le(), and > teach all callers to pass 0, as a "just for completeness" step. > > Personally, I think we should stop at [2/2], and do not do [3/2], as > there is no real demonstrated use of the standard input for hooks. > Especially because users of the _le() interface includes programs > like receive-pack whose standard input should not be molested, I'd > feel safer not to see [3/2] done at all (for that matter, I'm not > happy with [1/2] unless it comes with [2/2], either). Agreed. > > + if (!(opt & RUN_HOOK_ALLOW_STDIN)) > > + hook.no_stdin = 1; > > OK, so you are using the parameter as a flag word after all. Then > "int opt" should definitely be "unsigned flags". And these two > lines would be more readable, when written like so: > > hook.no_stdin = !(flags & RUN_HOOK_ALLOW_STDIN); > > I would think. Done. > > hook.stdout_to_stderr = 1; > > hook.trace2_hook_name = name; > > > > return run_command(&hook); > > } > > > > -int run_hook_le(const char *const *env, const char *name, ...) > > +int run_hook_le(const char *const *env, int opt, const char *name, ...) > > { > > va_list args; > > int ret; > > > > va_start(args, name); > > - ret = run_hook_ve(env, name, args); > > + ret = run_hook_ve(env, opt, name, args); > > I'd rather not to see the function signature of _le() changed in > patches [1/2] and [2/2]; instead we can just pass hardcoded 0 from > here to the underlying _ve(). > > Now, what is left is individual commands that use the run_hook > interface. > > > diff --git a/commit.c b/commit.c > > index fe1fa3dc41..775019ec9d 100644 > > --- a/commit.c > > +++ b/commit.c > > @@ -1646,7 +1646,7 @@ int run_commit_hook(int editor_is_used, const char *index_file, > > strvec_push(&hook_env, "GIT_EDITOR=:"); > > > > va_start(args, name); > > - ret = run_hook_ve(hook_env.v, name, args); > > + ret = run_hook_ve(hook_env.v, RUN_HOOK_ALLOW_STDIN, name, args); > > va_end(args); > > strvec_clear(&hook_env); > > > > This is good for [1/2]. We should avoid "git commit" from competing > with hooks for its standard input by conditionally passing > ALLOW_STDIN from here---only when the program itself does not use > the standard input in [2/2]. > > > diff --git a/builtin/am.c b/builtin/am.c > > "git am the end user via its stdin/stdout. There may be other situations > where the hooks should not touch the standard input. > > > diff --git a/builtin/checkout.c b/builtin/checkout.c > > I do not offhand think of a reason why "git checkout" would compete > with its hooks for the standard input. If we were to allow hooks to > read from the standard input, that should come as an independent > patch for each program after patch [3/2], I think. The ones I don't > mention below should never leave the standard input open for hooks. > > > diff --git a/builtin/clone.c b/builtin/clone.c > > diff --git a/builtin/gc.c b/builtin/gc.c > > diff --git a/builtin/merge.c b/builtin/merge.c > > diff --git a/builtin/rebase.c b/builtin/rebase.c > > diff --git a/reset.c b/reset.c > > Ditto. > > > > diff --git a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh > > index b3485450a2..e915ffe546 100755 > > --- a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh > > +++ b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh > > @@ -7,6 +7,7 @@ test_description='pre-commit and pre-merge-commit hooks' > > HOOKDIR="$(git rev-parse --git-dir)/hooks" > > PRECOMMIT="$HOOKDIR/pre-commit" > > PREMERGE="$HOOKDIR/pre-merge-commit" > > +POSTCOMMIT="$HOOKDIR/post-commit" > > > > # Prepare sample scripts that write their $0 to actual_hooks > > test_expect_success 'sample script setup' ' > > @@ -28,11 +29,15 @@ test_expect_success 'sample script setup' ' > > echo $0 >>actual_hooks > > test $GIT_PREFIX = "success/" > > EOF > > - write_script "$HOOKDIR/check-author.sample" <<-\EOF > > + write_script "$HOOKDIR/check-author.sample" <<-\EOF && > > echo $0 >>actual_hooks > > test "$GIT_AUTHOR_NAME" = "New Author" && > > test "$GIT_AUTHOR_EMAIL" = "newauthor@example.com" > > EOF > > + write_script "$HOOKDIR/user-input.sample" <<-\EOF > > + ! read -r line || echo "$line" > hook_input > > + exit 0 > > Style (Documentation/CodingGuidelines) Fixed. > - Redirection operators should be written with space before, but no > space after them. In other words, write 'echo test >"$file"' > instead of 'echo test> $file' or 'echo test > $file'. > > So, when our "read" immediately hits EOF (or I/O error, but let's > not worry about that case for now), we leave hook_input file alone, > but otherwise we write the single line we read to hook_input, and > regardless of an error, we report success to the invoking "git > commit". Which makes sense. > > We report success even when we had trouble writing into hook_input > file, though. Perhaps you should lose the "exit 0" at the end? Removed. > > + EOF > > ' > > > > test_expect_success 'root commit' ' > > @@ -278,4 +283,34 @@ test_expect_success 'check the author in hook' ' > > test_cmp expected_hooks actual_hooks > > ' > > > > +test_expect_success 'with user input' ' > > + test_when_finished "rm -f \"$PRECOMMIT\" user_input hook_input" && > > + cp "$HOOKDIR/user-input.sample" "$PRECOMMIT" && > > + echo "user input" > user_input && > > Style (I won't repeat from here on). Fixed. > > + echo "more" >>file && > > + git add file && > > + git commit -m "more" < user_input && > > + test_cmp user_input hook_input > > +' > > This is probably a good place to also test > > git commit -F - > and see what happens. Added a test (currently failing). > Thanks. Thank you! Your feedback is thorough and helpful. - Orgad